From 4473b881cf571de840f6853d590cbad310888976 Mon Sep 17 00:00:00 2001
From: eleith
Date: Mon, 5 Jul 2021 03:25:26 -0700
Subject: [PATCH] add dovecot-fts-xapian (#2064)

* add dovecot-fts-xapian

update Docker to build from debian bullseye slim, as it contains packages for fts-xapian.
update Docker to install dovecot-fts-xapian.
update docs with instructions on how to enable fts-xapian or fts-solr and what considerations to take into when deciding.

* address review feedback

* update backport method to previously proposed approach (which was lost in a forced push)
---
 Dockerfile                                    |   8 +-
 .../config/advanced/full-text-search.md       | 115 +++++++++++++++++-
 2 files changed, 114 insertions(+), 9 deletions(-)

diff --git a/Dockerfile b/Dockerfile
index d44ff19b..cc839ba1 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -39,6 +39,8 @@ SHELL ["/bin/bash", "-o", "pipefail", "-c"]
 # -----------------------------------------------
 
 RUN \
+  # Backport repo for dovecot-fts-xapian package. This can be removed once Debian 11 is used as base image.
+  echo 'deb http://deb.debian.org/debian buster-backports main' > /etc/apt/sources.list.d/buster-backports.list && \
   apt-get -qq update && \
   apt-get -qq install apt-utils 2>/dev/null && \
   apt-get -qq dist-upgrade && \
@@ -47,9 +49,9 @@ RUN \
   # A - D
   altermime amavisd-new apt-transport-https arj binutils bzip2 bsd-mailx \
   ca-certificates cabextract clamav clamav-daemon cpio curl \
-  dbconfig-no-thanks dovecot-core dovecot-imapd dovecot-ldap \
-  dovecot-lmtpd dovecot-managesieved dovecot-pop3d dovecot-sieve \
-  dovecot-solr dumb-init \
+  dbconfig-no-thanks dovecot-core dovecot-fts-xapian dovecot-imapd \
+  dovecot-ldap dovecot-lmtpd dovecot-managesieved dovecot-pop3d \
+  dovecot-sieve dovecot-solr dumb-init \
   # E - O
   ed fetchmail file gamin gnupg gzip iproute2 iptables \
   locales logwatch lhasa libdate-manip-perl liblz4-tool \
diff --git a/docs/content/config/advanced/full-text-search.md b/docs/content/config/advanced/full-text-search.md
index 7a37fabe..dc23df4a 100644
--- a/docs/content/config/advanced/full-text-search.md
+++ b/docs/content/config/advanced/full-text-search.md
@@ -4,11 +4,115 @@ title: 'Advanced | Full-Text Search'
 
 ## Overview
 
-Full-text search allows all messages to be indexed, so that mail clients can quickly and efficiently search messages by their full text content.
+Full-text search allows all messages to be indexed, so that mail clients can quickly and efficiently search messages by their full text content. Dovecot supports a variety of community supported [FTS indexing backends](https://doc.dovecot.org/configuration_manual/fts/).
+
+Docker-mailserver comes pre-installed with two plugins that can be enabled with a dovecot config file.
+
+Please be aware that indexing consumes memory and takes up additional disk space.
+
+### Xapian
+
+The [dovecot-fts-xapian](https://github.com/grosjo/fts-xapian) plugin makes use of [Xapian](https://xapian.org/). Xapian enables embedding an FTS engine without the need for additional backends.
+
+The indexes will be stored as a subfolder named `xapian-indexes` inside your `mail` folder. With the default settings, 10GB of email data may generate around 4GB of indexed data.
+
+While indexing is memory intensive, you can configure the plugin to limit the amount of memory consumed by the index workers. With Xapian being small and fast, this plugin is a good choice for low memory environments (2GB) as compared to Solr.
+
+#### Setup
+
+1. To configure fts-xapian as a dovecot plugin, create a `fts-xapian-plugin.conf` file and place the following in it:
+
+    ```
+    mail_plugins = $mail_plugins fts fts_xapian
+
+    plugin {
+        fts = xapian
+        fts_xapian = partial=3 full=20 verbose=0
+
+        fts_autoindex = yes
+        fts_enforced = yes
+
+        # disable indexing of folders
+        # fts_autoindex_exclude = \Trash
+
+        # Index attachments
+        # fts_decoder = decode2text
+    }
+
+    service indexer-worker {
+        # limit size of indexer-worker RAM usage, ex: 512MB, 1GB, 2GB
+        vsz_limit = 1GB
+    }
+
+    # service decode2text {
+    #     executable = script /usr/libexec/dovecot/decode2text.sh
+    #     user = dovecot
+    #     unix_listener decode2text {
+    #         mode = 0666
+    #     }
+    # }
+    ```
+
+    Adjust the settings to tune for your desired memory limits, exclude folders and enable searching text inside of attachments.
+
+2. Update `docker-compose.yml` to load the previously created dovecot plugin config file:
+
+    ```yaml
+
+    version: '3.8'
+    services:
+      mailserver:
+        image: mailserver/docker-mailserver:latest
+        hostname: mail
+        domainname: example.com
+        container_name: mailserver
+        env_file: mailserver.env
+        ports:
+          - "25:25"    # SMTP  (explicit TLS => STARTTLS)
+          - "143:143"  # IMAP4 (explicit TLS => STARTTLS)
+          - "465:465"  # ESMTP (implicit TLS)
+          - "587:587"  # ESMTP (explicit TLS => STARTTLS)
+          - "993:993"  # IMAP4 (implicit TLS)
+        volumes:
+          - ./data/mail:/var/mail
+          - ./data/state:/var/mail-state
+          - ./data/logs:/var/log/mail
+          - /etc/localtime:/etc/localtime:ro
+          - ./config/:/tmp/docker-mailserver/
+          - ./fts-xapian-plugin.conf:/etc/dovecot/conf.d/10-plugin.conf:ro
+        restart: always
+        stop_grace_period: 1m
+        cap_add: [ "NET_ADMIN", "SYS_PTRACE" ]
+    ```
+
+3. Recreate containers:
+
+    ```
+    docker-compose down
+    docker-compose up -d
+    ```
+
+4. Initialize indexing on all users for all mail:
+
+    ```
+    docker-compose exec mailserver doveadm index -A -q \*
+    ```
+
+5. Run the following command in a daily cron job:
+
+    ```
+    docker-compose exec mailserver doveadm fts optimize -A
+    ```
+
+### Solr
 
 The [dovecot-solr Plugin](https://wiki2.dovecot.org/Plugins/FTS/Solr) is used in conjunction with [Apache Solr](https://lucene.apache.org/solr/) running in a separate container. This is quite straightforward to setup using the following instructions.
 
-## Setup Steps
+Solr is a mature and fast indexing backend that runs on the JVM. The indexes are relatively compact compared to the size of your total email.
+
+However, Solr also requires a fair bit of RAM. While Solr is [highly tuneable](https://solr.apache.org/guide/7_0/query-settings-in-solrconfig.html), it may require a bit of testing to get it right.
+
+#### Setup
 
 1. `docker-compose.yml`:
 
@@ -47,10 +151,9 @@ The [dovecot-solr Plugin](https://wiki2.dovecot.org/Plugins/FTS/Solr) is used in
     ```
 
 3. Recreate containers: `docker-compose down ; docker-compose up -d`
+
 4. Flag all user mailbox FTS indexes as invalid, so they are rescanned on demand when they are next searched: `docker-compose exec mailserver doveadm fts rescan -A`
 
-## Further Discussion
+#### Further Discussion
 
-See [#905][github-issue-905]
-
-[github-issue-905]: https://github.com/docker-mailserver/docker-mailserver/issues/905
+See [#905](https://github.com/docker-mailserver/docker-mailserver/issues/905)
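
Note on step 5 of the Xapian setup: the patch asks for a daily cron job but does not show one. A minimal sketch of a host crontab entry follows. The schedule and the directory `/opt/mailserver` are hypothetical placeholders (assumed to be where `docker-compose.yml` lives), and neither appears in the patch. The `-T` flag tells `docker-compose exec` not to allocate a pseudo-TTY, which is usually necessary when the command runs from cron rather than an interactive shell.

```
# Hypothetical host crontab entry (edit with `crontab -e`); adjust the path and time to your setup.
# Runs the Xapian index optimization once a day at 03:30.
30 3 * * * cd /opt/mailserver && docker-compose exec -T mailserver doveadm fts optimize -A
```

If `docker-compose` is not on cron's default `PATH`, the entry may need the binary's absolute path.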