docker-mailserver/docs/content/config/advanced/full-text-search.md

160 lines
5.1 KiB
Markdown
Raw Normal View History

docs(refactor): Large refactor + additions + fixes Consistency pass, formatting cleanup and fixes, introduce admonitions, add front-matter. --- docs: Add front-matter --- docs: Fix and format links - Some links were invalid (eg files moved or renamed) - Some were valid but had invalid section headers (content removed or migrated) - Some use `http://` instead of `https://` when the website supports a secure connection. - Some already used the `[name][reference]` convention but often with a number that wasn't as useful for maintenance. - All referenced docs needed URLs replaced. Opted for the `[name][reference]` approach to group them all clearly at the bottom of the doc, especially with the relative URLs and in some cases many duplicate entries. - All `tomav` references from the original repo prior to switch to an organization have been corrected. - Minor cosmetic changes to the `name` part of the URL, such as for referencing issues to be consistent. - Some small changes to text body, usually due to duplicate URL reference that was unnecessary (open relay, youtous) - Switched other links to use the `[name][reference]` format when there was a large group of URLs such as wikipedia or kubernetes. Github repos that reference projects related to `docker-mailserver` also got placed here so they're noticed better by maintainers. This also helped quite a bit with `mermaid` external links that are very long. - There was a Github Wiki supported syntax in use `[[name | link]]` for `fetchmail` page that isn't compatible by default with MkDocs (needs a plugin), converted to `[name][reference]` instead since it's a relative link. --- docs: Update commit link for LDAP override script Logic moved to another file, keeping the permalink commit reference so it's unaffected by any changes in the file referenced in future. --- docs: Heading corrections Consistency pass. Helps with the Table of Contents (top-right UI) aka Document Outline. docs: codefence cleanup --- docs: misc cleanup --- docs: Add Admonitions Switches `<details>` usage for collapsible admonitions (`???`) while other text content is switched to the visually more distinct admoniton (`!!!` or `???+`) style. This does affect editor syntax highlighting a bit and markdown linting as it's custom non-standard markdown syntax.
2021-03-01 10:41:19 +00:00
---
title: 'Advanced | Full-Text Search'
docs(refactor): Large refactor + additions + fixes Consistency pass, formatting cleanup and fixes, introduce admonitions, add front-matter. --- docs: Add front-matter --- docs: Fix and format links - Some links were invalid (eg files moved or renamed) - Some were valid but had invalid section headers (content removed or migrated) - Some use `http://` instead of `https://` when the website supports a secure connection. - Some already used the `[name][reference]` convention but often with a number that wasn't as useful for maintenance. - All referenced docs needed URLs replaced. Opted for the `[name][reference]` approach to group them all clearly at the bottom of the doc, especially with the relative URLs and in some cases many duplicate entries. - All `tomav` references from the original repo prior to switch to an organization have been corrected. - Minor cosmetic changes to the `name` part of the URL, such as for referencing issues to be consistent. - Some small changes to text body, usually due to duplicate URL reference that was unnecessary (open relay, youtous) - Switched other links to use the `[name][reference]` format when there was a large group of URLs such as wikipedia or kubernetes. Github repos that reference projects related to `docker-mailserver` also got placed here so they're noticed better by maintainers. This also helped quite a bit with `mermaid` external links that are very long. - There was a Github Wiki supported syntax in use `[[name | link]]` for `fetchmail` page that isn't compatible by default with MkDocs (needs a plugin), converted to `[name][reference]` instead since it's a relative link. --- docs: Update commit link for LDAP override script Logic moved to another file, keeping the permalink commit reference so it's unaffected by any changes in the file referenced in future. --- docs: Heading corrections Consistency pass. Helps with the Table of Contents (top-right UI) aka Document Outline. docs: codefence cleanup --- docs: misc cleanup --- docs: Add Admonitions Switches `<details>` usage for collapsible admonitions (`???`) while other text content is switched to the visually more distinct admoniton (`!!!` or `???+`) style. This does affect editor syntax highlighting a bit and markdown linting as it's custom non-standard markdown syntax.
2021-03-01 10:41:19 +00:00
---
2020-10-08 22:36:39 +00:00
## Overview
Full-text search allows all messages to be indexed, so that mail clients can quickly and efficiently search messages by their full text content. Dovecot supports a variety of community supported [FTS indexing backends](https://doc.dovecot.org/configuration_manual/fts/).
Docker-mailserver comes pre-installed with two plugins that can be enabled with a dovecot config file.
Please be aware that indexing consumes memory and takes up additional disk space.
### Xapian
The [dovecot-fts-xapian](https://github.com/grosjo/fts-xapian) plugin makes use of [Xapian](https://xapian.org/). Xapian enables embedding an FTS engine without the need for additional backends.
The indexes will be stored as a subfolder named `xapian-indexes` inside your `mail` folder. With the default settings, 10GB of email data may generate around 4GB of indexed data.
While indexing is memory intensive, you can configure the plugin to limit the amount of memory consumed by the index workers. With Xapian being small and fast, this plugin is a good choice for low memory environments (2GB) as compared to Solr.
#### Setup
1. To configure fts-xapian as a dovecot plugin, create a `fts-xapian-plugin.conf` file and place the following in it:
```
mail_plugins = $mail_plugins fts fts_xapian
plugin {
fts = xapian
fts_xapian = partial=3 full=20 verbose=0
fts_autoindex = yes
fts_enforced = yes
# disable indexing of folders
# fts_autoindex_exclude = \Trash
# Index attachements
# fts_decoder = decode2text
}
service indexer-worker {
# limit size of indexer-worker RAM usage, ex: 512MB, 1GB, 2GB
vsz_limit = 1GB
}
# service decode2text {
# executable = script /usr/libexec/dovecot/decode2text.sh
# user = dovecot
# unix_listener decode2text {
# mode = 0666
# }
# }
```
adjust the settings to tune for your desired memory limits, exclude folders and enable searching text inside of attachments
2. Update `docker-compose.yml` to load the previously created dovecot plugin config file:
```yaml
version: '3.8'
services:
mailserver:
image: mailserver/docker-mailserver:latest
hostname: mail
domainname: example.com
container_name: mailserver
env_file: mailserver.env
ports:
- "25:25" # SMTP (explicit TLS => STARTTLS)
- "143:143" # IMAP4 (explicit TLS => STARTTLS)
- "465:465" # ESMTP (implicit TLS)
- "587:587" # ESMTP (explicit TLS => STARTTLS)
- "993:993" # IMAP4 (implicit TLS)
volumes:
- ./data/mail:/var/mail
- ./data/state:/var/mail-state
- ./data/logs:/var/log/mail
- /etc/localtime:/etc/localtime:ro
- ./config/:/tmp/docker-mailserver/
- ./fts-xapian-plugin.conf:/etc/dovecot/conf.d/10-plugin.conf:ro
restart: always
stop_grace_period: 1m
cap_add: [ "NET_ADMIN", "SYS_PTRACE" ]
```
3. Recreate containers:
```
docker-compose down
docker-compose up -d
```
4. Initialize indexing on all users for all mail:
```
docker-compose exec mailserver doveadm index -A -q \*
```
5. Run the following command in a daily cron job:
```
docker-compose exec mailserver doveadm fts optimize -A
```
### Solr
2020-10-08 22:37:26 +00:00
2020-10-08 22:36:39 +00:00
The [dovecot-solr Plugin](https://wiki2.dovecot.org/Plugins/FTS/Solr) is used in conjunction with [Apache Solr](https://lucene.apache.org/solr/) running in a separate container. This is quite straightforward to setup using the following instructions.
Solr is a mature and fast indexing backend that runs on the JVM. The indexes are relatively compact compared to the size of your total email.
However, Solr also requires a fair bit of RAM. While Solr is [highly tuneable](https://solr.apache.org/guide/7_0/query-settings-in-solrconfig.html), it may require a bit of testing to get it right.
#### Setup
2020-10-08 22:36:39 +00:00
docs(refactor): Large refactor + additions + fixes Consistency pass, formatting cleanup and fixes, introduce admonitions, add front-matter. --- docs: Add front-matter --- docs: Fix and format links - Some links were invalid (eg files moved or renamed) - Some were valid but had invalid section headers (content removed or migrated) - Some use `http://` instead of `https://` when the website supports a secure connection. - Some already used the `[name][reference]` convention but often with a number that wasn't as useful for maintenance. - All referenced docs needed URLs replaced. Opted for the `[name][reference]` approach to group them all clearly at the bottom of the doc, especially with the relative URLs and in some cases many duplicate entries. - All `tomav` references from the original repo prior to switch to an organization have been corrected. - Minor cosmetic changes to the `name` part of the URL, such as for referencing issues to be consistent. - Some small changes to text body, usually due to duplicate URL reference that was unnecessary (open relay, youtous) - Switched other links to use the `[name][reference]` format when there was a large group of URLs such as wikipedia or kubernetes. Github repos that reference projects related to `docker-mailserver` also got placed here so they're noticed better by maintainers. This also helped quite a bit with `mermaid` external links that are very long. - There was a Github Wiki supported syntax in use `[[name | link]]` for `fetchmail` page that isn't compatible by default with MkDocs (needs a plugin), converted to `[name][reference]` instead since it's a relative link. --- docs: Update commit link for LDAP override script Logic moved to another file, keeping the permalink commit reference so it's unaffected by any changes in the file referenced in future. --- docs: Heading corrections Consistency pass. Helps with the Table of Contents (top-right UI) aka Document Outline. docs: codefence cleanup --- docs: misc cleanup --- docs: Add Admonitions Switches `<details>` usage for collapsible admonitions (`???`) while other text content is switched to the visually more distinct admoniton (`!!!` or `???+`) style. This does affect editor syntax highlighting a bit and markdown linting as it's custom non-standard markdown syntax.
2021-03-01 10:41:19 +00:00
1. `docker-compose.yml`:
2020-10-08 22:36:39 +00:00
docs(refactor): Large refactor + additions + fixes Consistency pass, formatting cleanup and fixes, introduce admonitions, add front-matter. --- docs: Add front-matter --- docs: Fix and format links - Some links were invalid (eg files moved or renamed) - Some were valid but had invalid section headers (content removed or migrated) - Some use `http://` instead of `https://` when the website supports a secure connection. - Some already used the `[name][reference]` convention but often with a number that wasn't as useful for maintenance. - All referenced docs needed URLs replaced. Opted for the `[name][reference]` approach to group them all clearly at the bottom of the doc, especially with the relative URLs and in some cases many duplicate entries. - All `tomav` references from the original repo prior to switch to an organization have been corrected. - Minor cosmetic changes to the `name` part of the URL, such as for referencing issues to be consistent. - Some small changes to text body, usually due to duplicate URL reference that was unnecessary (open relay, youtous) - Switched other links to use the `[name][reference]` format when there was a large group of URLs such as wikipedia or kubernetes. Github repos that reference projects related to `docker-mailserver` also got placed here so they're noticed better by maintainers. This also helped quite a bit with `mermaid` external links that are very long. - There was a Github Wiki supported syntax in use `[[name | link]]` for `fetchmail` page that isn't compatible by default with MkDocs (needs a plugin), converted to `[name][reference]` instead since it's a relative link. --- docs: Update commit link for LDAP override script Logic moved to another file, keeping the permalink commit reference so it's unaffected by any changes in the file referenced in future. --- docs: Heading corrections Consistency pass. Helps with the Table of Contents (top-right UI) aka Document Outline. docs: codefence cleanup --- docs: misc cleanup --- docs: Add Admonitions Switches `<details>` usage for collapsible admonitions (`???`) while other text content is switched to the visually more distinct admoniton (`!!!` or `???+`) style. This does affect editor syntax highlighting a bit and markdown linting as it's custom non-standard markdown syntax.
2021-03-01 10:41:19 +00:00
```yaml
solr:
image: lmmdock/dovecot-solr:latest
volumes:
- solr-dovecot:/opt/solr/server/solr/dovecot
restart: always
2020-10-08 22:36:39 +00:00
docs(refactor): Large refactor + additions + fixes Consistency pass, formatting cleanup and fixes, introduce admonitions, add front-matter. --- docs: Add front-matter --- docs: Fix and format links - Some links were invalid (eg files moved or renamed) - Some were valid but had invalid section headers (content removed or migrated) - Some use `http://` instead of `https://` when the website supports a secure connection. - Some already used the `[name][reference]` convention but often with a number that wasn't as useful for maintenance. - All referenced docs needed URLs replaced. Opted for the `[name][reference]` approach to group them all clearly at the bottom of the doc, especially with the relative URLs and in some cases many duplicate entries. - All `tomav` references from the original repo prior to switch to an organization have been corrected. - Minor cosmetic changes to the `name` part of the URL, such as for referencing issues to be consistent. - Some small changes to text body, usually due to duplicate URL reference that was unnecessary (open relay, youtous) - Switched other links to use the `[name][reference]` format when there was a large group of URLs such as wikipedia or kubernetes. Github repos that reference projects related to `docker-mailserver` also got placed here so they're noticed better by maintainers. This also helped quite a bit with `mermaid` external links that are very long. - There was a Github Wiki supported syntax in use `[[name | link]]` for `fetchmail` page that isn't compatible by default with MkDocs (needs a plugin), converted to `[name][reference]` instead since it's a relative link. --- docs: Update commit link for LDAP override script Logic moved to another file, keeping the permalink commit reference so it's unaffected by any changes in the file referenced in future. --- docs: Heading corrections Consistency pass. Helps with the Table of Contents (top-right UI) aka Document Outline. docs: codefence cleanup --- docs: misc cleanup --- docs: Add Admonitions Switches `<details>` usage for collapsible admonitions (`???`) while other text content is switched to the visually more distinct admoniton (`!!!` or `???+`) style. This does affect editor syntax highlighting a bit and markdown linting as it's custom non-standard markdown syntax.
2021-03-01 10:41:19 +00:00
mailserver:
2021-04-12 10:08:17 +00:00
depends_on:
- solr
image: mailserver/docker-mailserver:latest
docs(refactor): Large refactor + additions + fixes Consistency pass, formatting cleanup and fixes, introduce admonitions, add front-matter. --- docs: Add front-matter --- docs: Fix and format links - Some links were invalid (eg files moved or renamed) - Some were valid but had invalid section headers (content removed or migrated) - Some use `http://` instead of `https://` when the website supports a secure connection. - Some already used the `[name][reference]` convention but often with a number that wasn't as useful for maintenance. - All referenced docs needed URLs replaced. Opted for the `[name][reference]` approach to group them all clearly at the bottom of the doc, especially with the relative URLs and in some cases many duplicate entries. - All `tomav` references from the original repo prior to switch to an organization have been corrected. - Minor cosmetic changes to the `name` part of the URL, such as for referencing issues to be consistent. - Some small changes to text body, usually due to duplicate URL reference that was unnecessary (open relay, youtous) - Switched other links to use the `[name][reference]` format when there was a large group of URLs such as wikipedia or kubernetes. Github repos that reference projects related to `docker-mailserver` also got placed here so they're noticed better by maintainers. This also helped quite a bit with `mermaid` external links that are very long. - There was a Github Wiki supported syntax in use `[[name | link]]` for `fetchmail` page that isn't compatible by default with MkDocs (needs a plugin), converted to `[name][reference]` instead since it's a relative link. --- docs: Update commit link for LDAP override script Logic moved to another file, keeping the permalink commit reference so it's unaffected by any changes in the file referenced in future. --- docs: Heading corrections Consistency pass. Helps with the Table of Contents (top-right UI) aka Document Outline. docs: codefence cleanup --- docs: misc cleanup --- docs: Add Admonitions Switches `<details>` usage for collapsible admonitions (`???`) while other text content is switched to the visually more distinct admoniton (`!!!` or `???+`) style. This does affect editor syntax highlighting a bit and markdown linting as it's custom non-standard markdown syntax.
2021-03-01 10:41:19 +00:00
...
volumes:
...
- ./etc/dovecot/conf.d/10-plugin.conf:/etc/dovecot/conf.d/10-plugin.conf:ro
...
2020-10-08 22:36:39 +00:00
docs(refactor): Large refactor + additions + fixes Consistency pass, formatting cleanup and fixes, introduce admonitions, add front-matter. --- docs: Add front-matter --- docs: Fix and format links - Some links were invalid (eg files moved or renamed) - Some were valid but had invalid section headers (content removed or migrated) - Some use `http://` instead of `https://` when the website supports a secure connection. - Some already used the `[name][reference]` convention but often with a number that wasn't as useful for maintenance. - All referenced docs needed URLs replaced. Opted for the `[name][reference]` approach to group them all clearly at the bottom of the doc, especially with the relative URLs and in some cases many duplicate entries. - All `tomav` references from the original repo prior to switch to an organization have been corrected. - Minor cosmetic changes to the `name` part of the URL, such as for referencing issues to be consistent. - Some small changes to text body, usually due to duplicate URL reference that was unnecessary (open relay, youtous) - Switched other links to use the `[name][reference]` format when there was a large group of URLs such as wikipedia or kubernetes. Github repos that reference projects related to `docker-mailserver` also got placed here so they're noticed better by maintainers. This also helped quite a bit with `mermaid` external links that are very long. - There was a Github Wiki supported syntax in use `[[name | link]]` for `fetchmail` page that isn't compatible by default with MkDocs (needs a plugin), converted to `[name][reference]` instead since it's a relative link. --- docs: Update commit link for LDAP override script Logic moved to another file, keeping the permalink commit reference so it's unaffected by any changes in the file referenced in future. --- docs: Heading corrections Consistency pass. Helps with the Table of Contents (top-right UI) aka Document Outline. docs: codefence cleanup --- docs: misc cleanup --- docs: Add Admonitions Switches `<details>` usage for collapsible admonitions (`???`) while other text content is switched to the visually more distinct admoniton (`!!!` or `???+`) style. This does affect editor syntax highlighting a bit and markdown linting as it's custom non-standard markdown syntax.
2021-03-01 10:41:19 +00:00
volumes:
solr-dovecot:
driver: local
```
2020-10-08 22:36:39 +00:00
2. `etc/dovecot/conf.d/10-plugin.conf`:
docs(refactor): Large refactor + additions + fixes Consistency pass, formatting cleanup and fixes, introduce admonitions, add front-matter. --- docs: Add front-matter --- docs: Fix and format links - Some links were invalid (eg files moved or renamed) - Some were valid but had invalid section headers (content removed or migrated) - Some use `http://` instead of `https://` when the website supports a secure connection. - Some already used the `[name][reference]` convention but often with a number that wasn't as useful for maintenance. - All referenced docs needed URLs replaced. Opted for the `[name][reference]` approach to group them all clearly at the bottom of the doc, especially with the relative URLs and in some cases many duplicate entries. - All `tomav` references from the original repo prior to switch to an organization have been corrected. - Minor cosmetic changes to the `name` part of the URL, such as for referencing issues to be consistent. - Some small changes to text body, usually due to duplicate URL reference that was unnecessary (open relay, youtous) - Switched other links to use the `[name][reference]` format when there was a large group of URLs such as wikipedia or kubernetes. Github repos that reference projects related to `docker-mailserver` also got placed here so they're noticed better by maintainers. This also helped quite a bit with `mermaid` external links that are very long. - There was a Github Wiki supported syntax in use `[[name | link]]` for `fetchmail` page that isn't compatible by default with MkDocs (needs a plugin), converted to `[name][reference]` instead since it's a relative link. --- docs: Update commit link for LDAP override script Logic moved to another file, keeping the permalink commit reference so it's unaffected by any changes in the file referenced in future. --- docs: Heading corrections Consistency pass. Helps with the Table of Contents (top-right UI) aka Document Outline. docs: codefence cleanup --- docs: misc cleanup --- docs: Add Admonitions Switches `<details>` usage for collapsible admonitions (`???`) while other text content is switched to the visually more distinct admoniton (`!!!` or `???+`) style. This does affect editor syntax highlighting a bit and markdown linting as it's custom non-standard markdown syntax.
2021-03-01 10:41:19 +00:00
```conf
mail_plugins = $mail_plugins fts fts_solr
plugin {
fts = solr
fts_autoindex = yes
fts_solr = url=http://solr:8983/solr/dovecot/
}
```
2020-10-08 22:36:39 +00:00
2021-04-12 10:08:17 +00:00
3. Recreate containers: `docker-compose down ; docker-compose up -d`
docs(refactor): Large refactor + additions + fixes Consistency pass, formatting cleanup and fixes, introduce admonitions, add front-matter. --- docs: Add front-matter --- docs: Fix and format links - Some links were invalid (eg files moved or renamed) - Some were valid but had invalid section headers (content removed or migrated) - Some use `http://` instead of `https://` when the website supports a secure connection. - Some already used the `[name][reference]` convention but often with a number that wasn't as useful for maintenance. - All referenced docs needed URLs replaced. Opted for the `[name][reference]` approach to group them all clearly at the bottom of the doc, especially with the relative URLs and in some cases many duplicate entries. - All `tomav` references from the original repo prior to switch to an organization have been corrected. - Minor cosmetic changes to the `name` part of the URL, such as for referencing issues to be consistent. - Some small changes to text body, usually due to duplicate URL reference that was unnecessary (open relay, youtous) - Switched other links to use the `[name][reference]` format when there was a large group of URLs such as wikipedia or kubernetes. Github repos that reference projects related to `docker-mailserver` also got placed here so they're noticed better by maintainers. This also helped quite a bit with `mermaid` external links that are very long. - There was a Github Wiki supported syntax in use `[[name | link]]` for `fetchmail` page that isn't compatible by default with MkDocs (needs a plugin), converted to `[name][reference]` instead since it's a relative link. --- docs: Update commit link for LDAP override script Logic moved to another file, keeping the permalink commit reference so it's unaffected by any changes in the file referenced in future. --- docs: Heading corrections Consistency pass. Helps with the Table of Contents (top-right UI) aka Document Outline. docs: codefence cleanup --- docs: misc cleanup --- docs: Add Admonitions Switches `<details>` usage for collapsible admonitions (`???`) while other text content is switched to the visually more distinct admoniton (`!!!` or `???+`) style. This does affect editor syntax highlighting a bit and markdown linting as it's custom non-standard markdown syntax.
2021-03-01 10:41:19 +00:00
4. Flag all user mailbox FTS indexes as invalid, so they are rescanned on demand when they are next searched: `docker-compose exec mailserver doveadm fts rescan -A`
2020-10-08 22:36:39 +00:00
#### Further Discussion
2020-10-08 22:38:12 +00:00
See [#905](https://github.com/docker-mailserver/docker-mailserver/issues/905)