docker-mailserver/docs/content/config/advanced/full-text-search.md
Brennan Kinney a0ee472501
docs(chore): Normalize for consistency (#2206)
"Brief" summary/overview of changes. See the PR discussion or individual commits from the PR for more details.

2021-09-23 11:29:37 +12:00


title
Advanced | Full-Text Search

Overview

Full-text search (FTS) indexes all messages so that mail clients can quickly and efficiently search them by their full text content. Dovecot supports a variety of community-supported FTS indexing backends.

docker-mailserver comes pre-installed with two FTS plugins, Xapian and Solr, each of which can be enabled with a Dovecot config file.

Please be aware that indexing consumes memory and takes up additional disk space.

Xapian

The dovecot-fts-xapian plugin makes use of Xapian. Xapian enables embedding an FTS engine without the need for additional backends.

The indexes will be stored as a subfolder named xapian-indexes inside your local mail-data folder (/var/mail internally). With the default settings, 10GB of email data may generate around 4GB of indexed data.
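
To see how much space the indexes actually take on a given host, a minimal sketch along these lines may help. It assumes the mail data is bind-mounted at `./docker-data/dms/mail-data/` as in the examples on this page. The function name `xapian_index_usage` is hypothetical; only the `xapian-indexes` folder name comes from the text above.

```shell
#!/bin/sh
# Sketch: report the disk usage of every Xapian index folder below a mail-data directory.
# Assumption: indexes live in folders named "xapian-indexes" somewhere below the given path.
xapian_index_usage() {
    mail_data_dir="${1:-./docker-data/dms/mail-data}"
    # -prune stops find from descending into an index folder once it has matched.
    find "$mail_data_dir" -type d -name xapian-indexes -prune -exec du -sh {} +
}
```

Calling `xapian_index_usage` with no argument checks the default path; pass another directory if your volume is mounted elsewhere.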

While indexing is memory intensive, you can configure the plugin to limit the amount of memory consumed by the index workers. Because Xapian is small and fast, this plugin is a better fit than Solr for low-memory environments (2GB).

Setup

  1. To configure fts-xapian as a dovecot plugin, create a file at docker-data/dms/config/dovecot/fts-xapian-plugin.conf and place the following in it:

    mail_plugins = $mail_plugins fts fts_xapian
    
    plugin {
        fts = xapian
        fts_xapian = partial=3 full=20 verbose=0
    
        fts_autoindex = yes
        fts_enforced = yes
    
        # disable indexing of folders
        # fts_autoindex_exclude = \Trash
    
        # Index attachments
        # fts_decoder = decode2text
    }
    
    service indexer-worker {
        # limit size of indexer-worker RAM usage, ex: 512MB, 1GB, 2GB
        vsz_limit = 1GB
    }
    
    # service decode2text {
    #     executable = script /usr/libexec/dovecot/decode2text.sh
    #     user = dovecot
    #     unix_listener decode2text {
    #         mode = 0666
    #     }
    # }
    

    Adjust the settings to tune memory limits, exclude folders, and enable searching text inside attachments.

  2. Update docker-compose.yml to load the previously created dovecot plugin config file:

      version: '3.8'
      services:
        mailserver:
          image: docker.io/mailserver/docker-mailserver:latest
          container_name: mailserver
          hostname: mail
          domainname: example.com
          env_file: mailserver.env
          ports:
            - "25:25"    # SMTP  (explicit TLS => STARTTLS)
            - "143:143"  # IMAP4 (explicit TLS => STARTTLS)
            - "465:465"  # ESMTP (implicit TLS)
            - "587:587"  # ESMTP (explicit TLS => STARTTLS)
            - "993:993"  # IMAP4 (implicit TLS)
          volumes:
            - ./docker-data/dms/mail-data/:/var/mail/
            - ./docker-data/dms/mail-state/:/var/mail-state/
            - ./docker-data/dms/mail-logs/:/var/log/mail/
            - ./docker-data/dms/config/:/tmp/docker-mailserver/
            - ./docker-data/dms/config/dovecot/fts-xapian-plugin.conf:/etc/dovecot/conf.d/10-plugin.conf:ro
            - /etc/localtime:/etc/localtime:ro
          restart: always
          stop_grace_period: 1m
          cap_add:
            - NET_ADMIN
            - SYS_PTRACE
    
  3. Recreate containers:

    docker-compose down
    docker-compose up -d
    
  4. Initialize indexing on all users for all mail:

    docker-compose exec mailserver doveadm index -A -q \*
    
  5. Run the following command in a daily cron job:

    docker-compose exec mailserver doveadm fts optimize -A
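
For the daily job in step 5, note that cron does not provide a terminal, so `docker-compose exec` generally needs the `-T` flag to skip pseudo-TTY allocation. A minimal sketch of a wrapper follows. The function name `fts_optimize` and the `COMPOSE_CMD` override are assumptions for illustration, not part of docker-mailserver.

```shell
#!/bin/sh
# Sketch of a cron-friendly wrapper around the optimize command from step 5.
# Assumptions: the service is named "mailserver" and docker-compose.yml sits in
# the directory passed as the first argument.
fts_optimize() {
    compose_dir="${1:?usage: fts_optimize <dir containing docker-compose.yml>}"
    (
        # Run in a subshell so the caller's working directory is left unchanged.
        cd "$compose_dir" || exit 1
        # -T disables pseudo-TTY allocation, which is unavailable under cron.
        # COMPOSE_CMD is a hypothetical override, e.g. for "docker compose".
        ${COMPOSE_CMD:-docker-compose} exec -T mailserver doveadm fts optimize -A
    )
}
```

A crontab entry could then source the file and call the function once a day, for example `0 3 * * * . /path/to/fts-optimize.sh && fts_optimize /path/to/compose-dir`, where both paths are placeholders.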
    

Solr

The dovecot-solr plugin is used in conjunction with Apache Solr running in a separate container. It can be set up with the following instructions.

Solr is a mature and fast indexing backend that runs on the JVM. The indexes are relatively compact compared to the size of your total email.

However, Solr also requires a fair bit of RAM. While Solr is highly tuneable, it may require a bit of testing to get it right.

Setup

  1. docker-compose.yml:

      solr:
        image: lmmdock/dovecot-solr:latest
        volumes:
          - ./docker-data/dms/config/dovecot/solr-dovecot:/opt/solr/server/solr/dovecot
        restart: always
    
      mailserver:
        depends_on:
          - solr
        image: docker.io/mailserver/docker-mailserver:latest
        ...
        volumes:
          ...
          - ./docker-data/dms/config/dovecot/10-plugin.conf:/etc/dovecot/conf.d/10-plugin.conf:ro
        ...
    
  2. ./docker-data/dms/config/dovecot/10-plugin.conf:

    mail_plugins = $mail_plugins fts fts_solr
    
    plugin {
      fts = solr
      fts_autoindex = yes
      fts_solr = url=http://solr:8983/solr/dovecot/
    }
    
  3. Recreate containers:

    docker-compose down
    docker-compose up -d

  4. Flag all user mailbox FTS indexes as invalid, so they are rescanned on demand when they are next searched:

    docker-compose exec mailserver doveadm fts rescan -A
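
Step 4 only marks the indexes as invalid; they are rebuilt when a mailbox is next searched. If rebuilding up front is preferred, the rescan can be followed by the `doveadm index` command shown in the Xapian section. The sketch below assumes that command behaves the same way with the Solr backend; `fts_reindex` and `COMPOSE_CMD` are hypothetical names.

```shell
#!/bin/sh
# Sketch: invalidate all FTS indexes, then queue a full reindex for every user.
# Assumption: run from the directory containing docker-compose.yml.
fts_reindex() {
    # COMPOSE_CMD is a hypothetical override, e.g. for "docker compose".
    compose="${COMPOSE_CMD:-docker-compose}"
    # Flag every user's FTS index as invalid (step 4 above).
    $compose exec mailserver doveadm fts rescan -A || return 1
    # Queue indexing of all mailboxes; quoting '*' keeps the host shell from expanding it.
    $compose exec mailserver doveadm index -A -q '*'
}
```

On a large mail store the second command can take a while to work through the queue, so it may be worth running outside busy hours.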

Further Discussion

See #905