* tests: Ensure excessive FD limits are avoided
Processes that run as daemons (`postsrsd` and `fail2ban-server`) initialize by closing all FDs (File Descriptors).
This behaviour queries that maximum limit and iterates through the entire range even if only a few FDs are open. In some environments (Docker, limit configured by distro) this can be a range exceeding 1 billion (from kernel default of 1024 soft, 4096 hard), causing an 8 minute delay with heavy CPU activity.
`postsrsd` has since been updated to use `close_range()` syscall, and `fail2ban` will now iterate through `/proc/self/fd` (open FDs) which should resolve the performance hit. Until those updates reach our Docker image, we need to workaround it with `--ulimit` option.
NOTE: If `docker.service` on a distro sets `LimitNOFILE=` to approx 1 million or lower, it should not be an issue. On distros such as Fedora 36, it is `LimitNOFILE=infinity` (approx 1 billion) that causes excessive delays.
* chore: Use Docker host limits instead
Typically on modern distros with systemd, this should equate to 1024 (soft) and 512K (hard) limits. A distro may override the built-in global defaults systemd sets via setting `DefaultLimitNOFILE=` in `/etc/systemd/user.conf` and `/etc/systemd/system.conf`.
* tests(fix): Better prevent non-deterministic failures
- `no_containers.bats` tests the external script `setup.sh` (without `-c`). It's expected that no existing DMS container is running - otherwise it may attempt to use that container and fail. Detect this and fail early via `setup_file()` step.
- `mail_hostname.bats` had a odd timing failure with teardown due to the last tests bringing the containers down earlier (`docker stop` paired with the `docker run --rm`). Adding a moment of delay via `sleep` helps avoid that false positive scenario.
## Quick Summary
Resolves a `TODO` task with `addmailuser`.
## Overview
The main change is adding three new methods in `common.bash`, which replace the completion delay in `addmailuser` / `setup email add` command.
Other than that:
- I swapped `sh -c 'addmailuser ...'` to `setup email add ...`.
- Improved three tests in `setup-cli.bats` for `setup email add|update|del` (_logic remains effectively the same still_).
- Rewrote the `TODO` comment for `setup-cli.bats` test on `setup email del` to better clarify the concern, but the test itself was no longer affected due to changes prior to this PR, so I enabled the commented out assertion.
- Removed unnecessary waits. The two `skip` tests in `test/tests.bats` could be enabled again after this PR.
- Additional fixes to tests were made during the PR (see discussion comments for details), resolving race conditions.
Individual commit messages of the PR provide additional details if helpful.
---
## Relevant commit messages
* chore: Remove creation delay in `addmailuser`
This was apparently only for supporting tests that need to wait on account creation being ready to test against.
As per the removed inline docs, it should be fine to remove once tests are updated to work correctly without it.
* tests(feat): Add two new common helper methods
`wait_until_account_maildir_exists()` provides the same logic `addmailuser` command was carrying, to wait upon the account dir creation in `/var/mail`.
As this was specifically to support tests, it makes more sense as a test method.
`add_mail_account_then_wait_until_ready()` was added to handle the common pattern of creating account and waiting on it. An internal assert will ensure the account was successfully created first during the test before attempting to wait.
* tests(feat): Add common helper for waiting on change event to be processed
The current helper is more complicated for no real benefit, it only detects when a change is made that would trigger a change event in the `changedetector` service. Our usage of this in tests however is only interested in waiting out the completion of the change event.
Remove unnecessary change event waits. These waits should not be necessary if handled correctly.
* tests: `addmailuser` to `add_mail_account_then_wait_until_ready mail()`
This helper method is used where appropriate.
- A password is not relevant (optional).
- We need to wait on the creation on the account (Dovecot and `/var/mail` directory).
* tests: `setup-cli` revise `add`, `update`, `del` tests
The delete test was failing as the `/var/mail` directory did not yet exist.
There is now a proper delay imposed in the `add` test now shares the same account for both `update` and `del` tests resolving that failure.
Additionally tests use better asserts where appropriate and the wait + sleep logic in `add` has been improved (now takes 10 seconds to complete, approx half the time than before).
The `del` test TODO while not technically addressed is no longer relevant due to the tests being switched to `-c` option (there is a separate `no container` test file, but it doesn't provide a `del` test).
* tests(fix): Ensure Postfix is reachable after waiting on ClamAV
There is not much reason to check before waiting on ClamAV.
It is more helpful to debug failures from `nc` mail send commands if we know that nothing went wrong inbetween the ClamAV wait time.
Additionally added an assertion which should provide more information if this part of the test setup fails again.
* tests(fix): Move health check to the top
This test is a bit fragile. It relies on defaults for the healthcheck with intervals of 30 seconds.
If the check occurs while Postfix is down due a change event from earlier tests and the healthcheck kicks in at that point, then if there is not enough time to refresh the health status from `unhealthy`, the test will fail with a false-positive as Postfix is actually working and up again..
* tests(fix): Wait on directory to be removed
Workaround that tries not to introduce heavier delays by waiting on a full change event to complete in the previous `email update` if possible.
There is a chance that the account has the folder deleted, but restored from an active change event (for password update, then the account delete).
* tests: Update testing submodules (bats-assert, bats-support)
These two submodules were migrated to the `bats-core` organization, where they continued to receive updates.
* tests: Use tagged release of `bats-core/bats-support`
This is technically one commit backwards, but no relevant difference has been made since, other than moving the submodule to the `bats-core` organization.
* tests: Bump `bats-assert` to August 2022 (master)
No official release tag since Nov 2018, but a fair amount of changes since then.
* tests: Bump `bats-core` to `v1.7.0` release
* tests(fix): Correctly use assertions
Some tests were updated as the upgrade of bats submodules had `assert` methods raise awareness of incorrect usage.
This additionally revealed some existing tests that weren't meant to be using `run`, which swallowed failures from surfacing.
See the associated PR for more detailed commentary on specific changes.
### Commands refactored:
- User (**All:** add / list / update / del + _dovecot-master variants_)
- Quota (**All:** set / del)
- Virtual Alias (**All:** add / list /del)
- Relay (**All:** add-relayhost / add-sasl / exclude-domain)
### Overall changes involve:
- **Fairly common structure:**
- `_main` method at the top provides an overview of logical steps:
- After all methods are declared beneath it (_and imported from the new `helpers/database/db.sh`_), the `_main` is called at the bottom of the file.
- `delmailuser` additionally processes option support for `-y` prior to calling `_main`.
- `__usage` is now consistent with each of these commands, along with the `help` command.
- Most logic delegated to new helper scripts. Some duplicate content remains on the basis that it's low-risk to maintenance and avoids less hassle to jump between files to check a single line, usually this is arg validation.
- Error handling should be more consistent, along with var names (_no more `USER`/`EMAIL`/`FULL_EMAIL` to refer to the same expected value_).
- **Three new management scripts** (in `helpers/database/manage/`) using a common structure for managing changes to their respective "Database" config file.
- `postfix-accounts.sh` unified not only add and update commands, but also all the dovecot-master versions, a single password call for all 4 of them, with a 5th consumer of the password prompt from the relay command `addsaslpassword`.
- These scripts delegate actual writes to `helpers/database/db.sh` which provides a common API to support the changes made.
- This is more verbose/complex vs the current inline operations each command currently has, as it provides generic support instead of slightly different variations being maintained, along with handling some edge cases that existed and would lead to bugs (notably substring matches).
- Centralizing changes here seems wiser than scattered about. I've tried to make it easy to grok, hopefully it's not worse than the current situation.
- List operations were kept in their respective commands, `db.sh` is only really managing writes. I didn't see a nice way for removing the code duplication for list commands as the duplication was fairly minimal, especially for `listalias` and `listdovecotmasteruser` which were quite simple in their differences in the loop body.
- `listmailuser` and `delmailuser` also retain methods exclusive to respective commands, I wasn't sure if there was any benefit to move those, but they were refactored.
* chore: Create bare new test file `setup-cli.bats`
Bare minimum to setup a new test.
* chore: Transfer over relevant tests
* chore: `mail` container name to dynamic `${TEST_NAME}`
Only applied where it's relevant. Next commit will handle the config path correction.
* chore: Use `TEST_TMP_CONFIG` for referencing local config directory
Could technically use the existing function call. Some paths were using a hard-coded config location.
Both have been converted to `TEST_TMP_CONFIG` and related `grep` calls normalizing the quote mark usage, escaping doesn't seem necessary.
* tests(fix): Create container without providing extra args reference var
If a variable name (of an array) was not provided to reference, this would fail trying to reference `'`.
* chore: Remove `SYS_PTRACE` capability from docs and configs
* chore: Remove `SYS_PTRACE` capability from tests
Doesn't seem to be required. It was originally added when the original change detection feature PR apparently needed it to function.
This helper was to support an earlier ENV for SASL auth support. When extracting logic into individual helpers, it was assumed this was separate from relay support, which it appears was not the case.
---
The `SASL_PASSWD` ENV is specified in tests but no longer used. There is no `external-domain.com` relay configured or tested against anywhere in the project.
The ENV was likely used in tests prior to improved relay support that allowed for adding more than a single set of relay credentials.
---
It likewise has no real relevance anywhere else outside of `relay.sh` as it's the only portion of code to operate with it.
It's only relevant for SASL auth as an SMTP client, not the SMTP server (`smtpd`) SASL support that is delegated to Dovecot. Functionality has been completely migrated into `relay.sh` as a result.
Documentation is poor for this ENV, it is unlikely in wide use? Should consider for removal.
---
The ENV has been dependent upon `RELAY_HOST` to actually enable postfix to use `/etc/postfix/sasl_passwd`, thus not likely relevant in existing setups?
---
Migrate `/etc/postfix/sasl_passwd` check from `tests.bats` as it belongs to relay tests.
* chore: Fix typo
* chore: Apply explicit chroot default for `sender-cleanup`
The implicit default is set to `y` as a compatibility fallback, but otherwise it is [advised to set to `n` going forward](http://www.postfix.org/COMPATIBILITY_README.html#chroot).
Test was changed to catch any backwards-compatibility logs, not just those for `chroot=y`. `using` added as a prefix to avoid catching log message whenever a setting is changed that the default compatibility level is active.
* chore: Set `compatibility_level` in `main.cf`
We retain the level`2` value previously set via scripts. This avoids log noise that isn't helpful.
Applied review feedback to give maintainers some context with this setting and why we have it presently set to `2`.
* tests(fix): Increase some timeouts
Running tests locally via a VM these tests would fail sometimes due to the time from being queued and Amavis actually processing being roughly around 30 seconds.
There should be no harm in raising this to 60 seconds, other than delaying a failure case which will ripple through other time sensitive tests.
It's better to pass when functionality is actually correct but just needs a bit longer to complete.
* tests(fix): Don't setup an invalid hostname
During container startup `helpers/dns.sh` would panic with `hostname -f` failing.
Dropping `--domainname` for this container is fine and does not affect the point of it's test.
---
It's unclear why this does not occur in CI. Possibly changes within the docker daemon since as CI runs docker on Ubuntu 20.04? (2020).
For clarity, this may be equivalent to setting a hostname of `domain.com.domain.com`, or `--hostname` value truncated the NIS domain (`--domainname`) of the same value.
IIRC, it would still fail with both options using different values if `--hostname` was multi-label. I believe I've documented how non-deterministic these options can be across different environments.
`--hostname` should be preferred. There doesn't seem to be any reason to actually need `--domainname` (which is NIS domain name, unrelated to the DNS domain name). We still need to properly investigate reworking our ENV support that `dns.sh` manages.
---
Containers were also not removing themselves after failures either (missing teardown). Which would cause problems when running tests again.
* chore: Normalize white-space
Sets a consistent indent size of 2 spaces. Previously this varied a fair bit, sometimes with tabs or mixed tabs and spaces.
Some formatting with blank lines.
Easier to review with white-space in diff ignored. Some minor edits besides blank lines, but no change in functionality.
* fix: `setup.sh` target container under test
Some of the `setup.sh` commands did not specify the container which was problematic if another `docker-mailserver` container was running, causing test failures.
This probably doesn't help with `test/no_container.bats`, but at least prevents `test/tests.bats` failing at this point.
* disabled unreliable test
The "quota exceeded" test is unreliable and failed too often lately for
my taste. Therefore, I'd like to disable it because there is no use in
having such a test.
* corrected PR id in URL
* refactored scripts located under `target/bin/`
The scripts under `target/bin/` now use the new log and I replaced some
`""` with `''` on the way. The functionality stays the same, this mostly
style and log.
* corrected fail2ban (script and tests)
* corrected OpenDKIM log output in tests
* reverted (some) changes to `sedfile`
Moreover, a few messages for BATS were streamlined and a regression in
the linting script reverted.
* apple PR feedback
* improve log output from `fail2ban` script
The new output has a single, clear message with the '[ ERROR ] '
prefix, and then output that explains the error afterwards. This is
coherent with the logging style which should be used while providing
more information than just a single line about IPTables not functioning.
* simplified `setquota` script
* consistently named the `__usage` function
Before, scripts located under `target/bin/` were using `usage` or
`__usage`. Now, they're using `__usage` as they should.
* improved `sedfile`
With `sedfile`, we cannot use the helper functions in a nice way because
it is used early in the Dockerfile at a stage where the helper scripts
are not yet copied. The script has been adjusted to be canonical with
all the other scripts under `target/bin/`.
* fixed tests
* removed `__usage` from places where it does not belong
`__usage` is to be used on wrong user input, not on other failures as
well. This was fixed in `delquota` and `setquota`.
* apply PR review feedback
The new setup will now set env variables on one place and on one place
only. The old setup used two separate places wich is not DRY and
confusing.
Some default values changed:
1. PFLOGSUMM_TRIGGER: logrotate => none
2. REPORT_SENDER: mailserver-report@HOSTNAME => mailserver-report@DOMAIN
3. REPORT_RECIPIENT: "0" => POSTMASTER_ADDRESS
One env variable was renamed: REPORT_INTERVAL => LOGROTATE_INTERVAL
I believe these defaults to be more sensible, especially the REPORT_RECIPIENT
address. The PFLOGSUMM_TRIGGER value was changed to `none` because otherwise
people would start getting daily Postfix log summary reports automatically.
Now, this is opt-in, and reports are sent only when enabled properly.
Some of the variables changed were marked as deprecated. I removed the note,
as the variables now bear some (sane) defaults again for other variables
(i.e.) REPORT_RECIPIENT is now default for other recipient addresses.
Co-authored-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>
Co-authored-by: Casper <casperklein@users.noreply.github.com>
* chore(refactor): DRY up the `_setup_ssl` method
- `/etc/postfix/ssl` was a bit misleading in usage here. As a maintainer (of my own contribution!) I was confused why only `/etc/postfix/ssl` was referenced and not `/etc/dovecot/ssl`.
- The postfix specific path is unnecessary, dovecot was referencing it via it's config, the same can be done from postfix to a generic DMS specific config location instead.
- This location is defined and created early as `/etc/dms/tls` (with var `DMS_TLS_PATH`). All usage of `/etc/postfix/ssl` has been replaced, making it easier to grok. Several `mkdir` commands related to this have been dropped as a result.
- Likewise, a related `TMP_DMS_TLS_PATH` var provides a reference to the config volume path `/tmp/docker-mailserver` which is used for conditions on presently hard-coded paths.
- Other values that benefit from being DRY have been lifted up into vars. Definitely easier to follow now and makes some further opportunities clearer to tackle in a future refactor.
- `chmod` has been updated where appropriate. Public key/cert is acceptable to have as readable by non-root users (644). The custom type with single fullchain file was not root accessible only, but should as it contains a private key.
- That said, the security benefit can be a bit moot due to source files that were copied remain present, the user would be responsible to ensure similar permissions on their source files.
- I've not touched LetsEncrypt section as I don't have time to investigate into that yet (not familiar with that portion).
---
* chore: Remove mkcert logic and dovecot cert
- No longer serving a purpose.
- Our own TLS startup script handles a variety of cert scenarios, while the dropped code was always generating a self-signed cert and persisting an unused cert regardless with `ONE_DIR=1`.
- To avoid similar issues that DH params had with doveadm validating filepath values in the SSL config, the default dummy values match postfix pointing to "snakeoil" cert. That serves the same purpose as mkcert was covering in the image.
- Bonus, no more hassle with differing mkcert target paths for users replacing our supplied Dovecot with the latest community edition.
---
* Error handling for SSL_TYPE
- Added a panic utility to exit early when SSL_TYPE conditions are misconfigured.
- Some info text had order of key/cert occurrence swapped to be consistent with key then cert.
- Some existing comments moved and rephrased.
- Additional comments added.
- `-f` test for cert files instead of `-e` (true also for directories/devices/symlinks).
- _notify messages lifted out of conditionals so that they always output when the case is hit.
- ~~Empty SSL_TYPE collapsed into catch all panic, while it's contents is now mapped to a new 'disabled' value.~~
---
* Use sedfile + improve sed expressions + update case style
- Uses sedfile when appropriate (file change intentional, not optional match/check).
- sed expressions modified to be DRY and reduce escaping via `-r` flag (acceptable if actual text content contains no `?`,`+`,`()` or `{}` characters, [otherwise they must be escaped](https://www.gnu.org/software/sed/manual/html_node/Extended-regexps.html)).
- sed captures anything matched between the parenthesis`()` and inserts it via `\1` as part of the replacement.
- case statements adopt the `(` prefix, adopting recent shell style for consistency.
---
* Refactor SSL_TYPE=disabled
- Postfix is also disabled now.
- Included heavy inline documentation reference for maintainers.
- Dropped an obsolete postfix config option 'use_tls' on the relayhost function, it was replaced by 'security_level'.
---
* I'm a friggin' sed wizard now
- The `modern` TLS_LEVEL is the default values for the configs they modify. As such, `sedfile` outputs an "Error" which isn't an actual concern, back to regular `sed`.
- I realized that multiple edits for the same file can all be done at once via `-e` (assuming other sed options are the same for each operation), and that `g` suffix is global scope for single line match, not whole file (default as sed iterates through individual lines).
- Some postfix replacements have `smtp` and `smtpd` lines, collapsed into a single `smtpd?` instead now that I know sed better.
---
* tests(fix): Tests that require SSL/TLS to pass
- SSL_TYPE=snakeoil added as temporary workaround.
- nmap tests are being dropped. These were added about 4-5 years ago, I have since made these redundant with the `testssl.sh` tests.
- Additionally the `--link` option is deprecated and IIRC these grades were a bit misleading when I initially used nmap in my own TLS cipher suite update PRs in the past.
- The removed SSL test is already handled in mail_ssl_manual.bats
ldap test:
- Replace `--link` alias option with `--network` and alias assignment.
- Parameterized some values and added the `SSL_TYPE` to resolve the starttls test failure.
privacy test:
- Also needed `SSL_TYPE` to pass the starttls test.
`tests.bats` had another starttls test for imap:
- Workaround for now is to give the main test container `SSL_TYPE=snakeoil`.
---
* Remove the expired lets-encrypt cert
This expired in March 2021. It was originally required when first added back in 2016 as LetsEncrypt was fairly new and not as broadly accepted into OS trust stores.
No longer the case today.
---
* chore: Housekeeping
Not required for this PR branch, little bit of tidying up while working on these two test files.
- privacy test copied over content when extracted from `tests.bats` that isn't relevant.
- ldap test was not as easy to identify the source of DOVECOT_TLS. Added comment to make the prefix connection to `configomat.sh` and `.ext` files more easier to find.
- Additionally converted the two localhost FQDN to vars.
---
* Default SSL_TYPE becomes `''` (aka equivalent to desired `disabled` case)
- This is to prevent other tests from failing by hitting the panic catchall case.
- More ideal would be adjusting tests to default to `disabled`, rather than treating `disabled` as an empty / unset SSL_TYPE value.
---
* Add inline documentation for `dms_panic`
- This could later be better formatted and placed into contributor docs.
Panic with kill (shutdown) not exit (errex):
- `kill 1` from `_shutdown` will send SIGTERM signal to PID 1 (init process).
- `exit 1` within the `start-mailserver.sh` init scripts context, will just exit the initialization script leaving the container running when it shouldn't.
The two previous `_shutdown` methods can benefit from using `dms_panic` wrapper instead to standardize on panic messages.
Decoupling setup process from `setup.sh` script by introducing a setup script _inside_ the container that coordinates the setup process.
**This is not a breaking change**. This way, we do not have to keep track of versions of `setup.sh`.
This change brings the additional benefit for Kubernetes users to be able to make use of `setup` now, without the need for `setup.sh`.
---
* move setup process into container; setup.sh versioning not needed anymore
* add tilde functionality to docs
Co-authored-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>
Co-authored-by: Casper <casperklein@users.noreply.github.com>
* docker_container first, then fall back to docker_image
+ test changes to support
+ test change to wait for smtp port to fix flakey tests since https://github.com/docker-mailserver/docker-mailserver/pull/2104
* quick fix
* Update setup.sh
Co-authored-by: Georg Lauterbach <44545919+georglauterbach@users.noreply.github.com>
Co-authored-by: Casper <casperklein@users.noreply.github.com>
* splitting start-mailserver.sh
* refactoring part 2
* refactored setup-stack.sh
* stzarted adjusting target/bin/*.sh to use new usage format
* corrected lowercase-uppercase test error
* better handling of .bashrc variable export
* linting tests and fix for default assignements
* last stylistic changes and rebase
* provide complete refactoring of openDKIM usage and tests
* fix leftover linting errors
* correct defualt key size and README usage
* provide independent order for arguments
* added `config` and adjusted usage information
* fixing shift in setup.sh
* adjust usage information to use new style and rename script
* use updated argument keysize instead of size
* first migration steps
* altered issue templates
* altered README
* removed .travis.yml
* adjusting registry & repository, Dockerfile and compose.env
* Close stale issues automatically
* Integrated CI with Github Actions (#3)
* feat: integrated ci with github actions
* fix: use secrets for docker org and update image
* docs: clarify why we use -t if no tty exists
* fix: correct remaining references to old repo
chore: prettier automatically updated markdown as well
* fix: hardcode docker org
* change testing image to just testing
* ci: add armv7 as a supported platform
* finished migration steps
* corrected linting in build-push action
* corrected linting in build-push action (2)
* minor preps for PR
* correcting push on pull request and minor details
* adjusted workflows to adhere closer to @wernerfred's diagram
* minor patches
* adjusting Dockerfile's installation of base packages
* adjusting schedule for stale issue action
* reverting license text
* improving CONTRIBUTING.md PR text
* Update CONTRIBUTING.md
* a bigger patch at the end
* moved all scripts into one directory under target/scripts/
* moved the quota-warning.sh script into target/scripts/ and removed empty directory /target/dovecot/scripts
* minor fixes here and there
* adjusted workflows for use a fully qualified name (i.e. docker.io/...)
* improved on the Dockerfile layer count
* corrected local tests - now they (actually) work (fine)!
* corrected start-mailserver.sh to make use of defaults consistently
* removed very old, deprecated variables (actually only one)
* various smaller improvements in the end
* last commit before merging #6
* rearranging variables to use alphabetic order
Co-authored-by: casperklein <casperklein@users.noreply.github.com>
Co-authored-by: Nick Pappas <radicand@users.noreply.github.com>
Co-authored-by: William Desportes <williamdes@wdes.fr>