* postfix: remove smtpd_sasl_auth_enable global setting
* tests: disable auth on 25 port
* tests: revert ldap-smtp-auth-spoofed-sender-with-filter-exception.txt
* Skip failing test
The test seems to have been broken from the beginning.
Sadly, no LDAP maintainers can verify. Added a TODO item if ever a LDAP maintainer comes around.
* Apply PR feedback
---------
Co-authored-by: Georg Lauterbach <44545919+georglauterbach@users.noreply.github.com>
* chore: Only replace `CHKSUM_FILE` when a change has been processed
* chore: Change Detection service should be the last daemon started
* chore: Remove 10 second startup delay for change detector
There should be no concern with conflicts as any writes should have already been done by the time this daemon service is started.
* tests(fix): `smtp_delivery.bats` must wait for Amavis
The change event for adding a user can be processed much sooner now, which means Amavis may not yet be ready.
Added extra condition to wait on at least the Amavis port being reachable, and some failure asserts with the mail queue to better catch / debug when this problem occurs.
* chore: Add some minor delay to avoid Amavis failing to connect
* tests(refactor): Make test cases for opendkim keysizes DRY
- These all do roughly the same logic that can be split into two separate methods.
- `_should_generate_dkim_key()` covers a bit more logic as it can be leveraged to handle other test cases that also perform the same logic.
- The `config/opendkim/` doesn't seem necessary for tests. Only the first few test cases here are testing against it, so we can conditionally make that available. `process_check_restart.bats` also depended on it to run OpenDKIM successfully, but this was due to the `setup-stack.sh` config defaults failing to find an "empty" file forcing `supervisord` to constantly restart the process..
- With this, there we inverse the default opendkim config, so we don't have to mount unique / empty subfolders for each test case, followed by copying over the two extra configs.
* tests(refactor): DRY up more test cases
All the remaining test cases but the last one were refactored here for a clean commit diff. The last test case will be refactored in the following commit.
Plenty of repeated logic spread across these test cases, now condensed into shared methods.
* tests(refactor): Make final test case DRY
* chore: Migrate to new testing helpers
* chore: Revise test case descriptions
* tests(refactor): Improve and simplify assertions
* tests(refactor): Use common container setup instead of `docker run`
- As the majority of test cases are only running `open-dkim` helper, we don't actually have to wait for a full container setup. So an alternative container start is called.
- Also improves assertions a bit more instead of just counting lines.
- Some test cases don't bind mount all of `/tmp/docker-mailserver` contents, thus don't raise permission errors on subsequent test runs.
- Instead of `rm -f` on some config files, have opted to mount them read-only instead, or alternatively mount an anonymous empty volume instead.
- Collapsed the first three test cases into one, thus no `setup_file()` necessary.
- Shift the `_wait_for_finished_setup_in_container()` method into `_common_container_setup()` instead since nothing else is using `_common_container_start()` yet, this allows for avoiding the wait.
* tests(refactor): Collapse dkim key size test cases into single test case
This makes these tests a bit more DRY, and enhances the raised quality issue with these tests. Now not only is the domain checked in the generated DNS dkim record, but we also verify the key size is corrected in the public and private keys via openssl.
* chore: Revise container names
* chore: Swap order of test case 1 and 2
* tests(refactor): Assert generated log output
- `__should_have_tables_trustedhosts_for_domain` shifted in each test case to just after generating the domains keys.
- Asserts `open-dkim` logs instead of just counting them.
- Added checks for domains that should not be present in a test case.
- Additional coverage and notes about the alias from vhost `@localdomain.com`
- Single assert statement with switch statement as all are using common args.
* chore: Minor changes
* tests(refactor): Share `find` logic in helpers and tests
* tests(fix): Listing file content does not need to match line order
The order printed from local system vs CI differed causing the CI to fail. The order of lines is irrelevant so `--index` is not required.
Additionally correct the prefix of the called method to be only one `_` now that it's a `common.bash` helper method.
* chore: Collapse custom DKIM selector test into custom DKIM domain test
These cover the same test logic for the most part, the first domain could also be testing the custom selector.
`special_use_folders.bats` + `mailbox_format_dbox` can assert lines instead, removing the need for `--partial`.
* Apply suggestions from code review
Co-authored-by: Georg Lauterbach <44545919+georglauterbach@users.noreply.github.com>
* chore: Split switch statement method into wrapper methods
---------
Co-authored-by: Georg Lauterbach <44545919+georglauterbach@users.noreply.github.com>
* only add Amavis configuration to Postfix when enabled
Since I am running Rspamd nowadays, I noticed there still are ports open
that belong to Amavis. This is because the Amavis configuration is a
fixed part of Postfix's `master.cf`. I changed that. Now, the Amavis
section is added when Amavis really is enabled.
I took the chance and added proper indentation to `master.cf`; hence the
diff is a bit fuzzy. **But**, only the Amavis part was adjusted, the
rest is just styling.
This appears to have been added to replace the `fam` package in an early version of DMS with Courier for IMAP instead of Dovecot on an Ubuntu 14.04 base image.
It does not appear to serve a purpose anymore.
* chore: Remove the wrapper script for `fail2ban`
- This does not appear necessary. The server can be run with foreground mode.
- `daemons-stack.sh` removal of the socket can be handled by the fail2ban server when using the `-x` option.
* chore: Remove `touch /var/log/auth.log`
These were both added as supposed fixes in 2016 for the then Ubuntu 2014 base image.
Removing them causes no failures in tests.
* fix: Install optional python packages for `fail2ban`
These have barely any overhead in layer weight. The DNS package may provide some QoL improvements, while the `pyinotify` is a better alternative than polling logs to check for updates.
We have `gamin` package installed but `fail2ban` would complain in the log that it was not able to initialize the module for it. There only appears to be a `python-gamin` dependent on EOL python 2, no longer available from Debian Bullseye.
* added options to toggle OpenDKIM & OpenDMARC
rspamd can provide DKIM signing and DMARC checking itself, so users
should be able to disable OpenDKIM & OpenDMARC. The default is left at
1, so users have to to opt-in when the want to disable the features.
* misc small enhancements
* adjusted start of rspamd
The order of starting redis + rspamd was reversed (now correct) and
rspamd now starts with the correct user.
* adjusted rspamd core configuration
The main configuration was revised. This includes AV configuration as
well as worker/proxy/controller configuration used to control the main
rspamd processes.
The configuration is not tested extensively, but well enough that I am
confident to go forward with it until we declare rspamd support as
stable.
* update & improve the documentation
* add tests
These are some initial tests which test the most basic functionality.
* tests(refactor): Improve consistency and documentation for test helpers (#3012)
* added `ALWAYS_RUN` target `Makefile` recipies (#3013)
This ensures the recipies are always run.
Co-authored-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
* adjusted rspamd test to refactored test helper functions
* improve documentation
* apply suggestions from code review (no. 1 by @polarthene)
Co-authored-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>
* streamline heredoc (EOM -> EOF)
* adjust rspamd test (remove unnecessary run arguments)
Co-authored-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>
* fix: RSPAM ENV should only add to array if ENV enabled
* fix: Correctly match ownership for Postfix content
- `/var/lib/postfix` dir and content is `postfix:postfix`, not `postfix:root`.
- `/var/spool/postfix` is `root:root` not `postfix:root` like it's content.
- Add additional comments, including ownership changes by Postfix to `/var/spool/postfix` when process starts / restarts.
* fix: Ensure correct `chown -R` user and groups applied
These were all fine except for clamav not using the correct clamav group. `fetchmail` group is `nogroup` as per the group set by the debian package.
Additionally formatted the `-eq 1 ]]` content to align on the same columns, and added additional comment about the purpose of this `chown -R` usage so that it's clear what bug / breakage it's attempting to prevent / fix.
* refactor: `misc-stack.sh` conditional handling
The last condition doesn't get triggered at all AFAIK. Nor does it make sense to make a folder path with `mkdir -p` to symlink to when the container does not have anything to copy over?
- If that was for files, the `mkdir -p` approach seems invalid?
- If it was for a directory that could come up later, it should instead be created in advance? None of the current values for `FILES` seem to hit this path.
Removing as it doesn't seem relevant to current support.
Symlinking was done for each case, I've opted to just perform that after the conditional instead.
Additional inline docs added for additional context.
* chore: Move amavis `chown -R` fix into `misc-stack.sh`
This was handled separately for some reason. It belongs with the other services handling this fix in `misc-stack.sh`.
The `-h` option isn't relevant, when paired with `-R` it has no effect.
* fix: Dockerfile should preserve `clamav` ownership with `COPY --link`
The UID and GID were copied over but would not match `clamav` user and group due to numeric ID mismatch between containers. `--chown=clamav` fixes that.
* chore: Workaround `buildx` bug with separate `chown -R`
Avoids increasing the image weight from this change by leveraging `COPY` in the final stage.
* chore: `COPY --link` from a separate stage instead of relying on scratch
The `scratch` approach wasn't great. A single layer invalidation in the previous stage would result in a new 600MB layer to store.
`make build` with this change seems to barely be affected by such if a change came before copying over the linked stage, although with `buildx` and the `docker-container` driver with `--load` it would take much longer to import and seemed to keep adding storage. Possibly because I was testing with a minimal `buildx` command, that wasn't leveraging proper cache options?
* lint: Appease the linting gods
* chore: Align `misc-stack.sh` paths for `chown -R` operations
Review feedback
Co-authored-by: Casper <casperklein@users.noreply.github.com>
* fix: Reduce one extra cache layer copy
No apparent advantage of a `COPY --link` initially in separate stage.
Just `COPY --chown` in the separate stage and `COPY --link` the stage content. 230MB less in build cache used.
* fix: Remove separate ClamAV stage by adding `clamav` user explicitly
Creating the user before the package is installed allows to ensure a fixed numeric ID that we can provide to `--chown` that is compatible with `--link`.
This keeps the build cache minimal for CI, without being anymore complex as a workaround than the separate stage was for the most part.
* chore: Add reference link regarding users to `misc-stack.sh`
* chore: Co-locate process checking and process restart verification
Extract the test cases for checking a process is running and properly restarts from various test files into a single one:
Core (always running):
opendkim, opendmarc, master (postfix)
ENV dependent:
amavi (amavisd-new), clamd, dovecot, fail2ban-server (fail2ban), fetchmail, postgrey, postsrsd, saslauthd
These now run off a single container with the required ENV and call a common function (the revised version in parallel test cases).
* fix(saslauthd): Quote wrap supervisor config vars
`saslauth.conf` calls `-O` option for most commands defined with an ENV that may be empty/null. This would cause the process to silently fail / die.
This doesn't happen if quote wrapping the ENV, which calls `-O` with an empty string.
Not necessary, but since one of `postgrey` ENV were quote wrapped in `supervisor-app.conf`, I've also done the same there.
* fix(postsrsd): Change supervisor `autorestart` policy to `true`
The PR that introduced the config switched from `true` to `unexpected` without any context. That prevents restart working when the process is killed. Setting to `true` instead will correctly restart the service.
* chore: Remove disabled postgrey test file
`mail_with_postgrey_disabled_by_default.bats` only checked the migrated test cases, removed as no longer serving a purpose.
* tests(refactor): Make `_should_restart_when_killed()` more reliable
The previous version did not ensure that the last checks process was actually restarted, only that it was running.
It turns out that `pkill` is only sending the signal, there can be some delay before the original process is actually killed and restarted.
This can be identified with `pgrep --older <seconds>`. First ensure the process is at a specified age, then after killing check that the process is not running that is at least that old, finally check that there is a younger process actually running.. (_could fail if a process doesn't restart, or there is a delay such as imposed by `sleep` in wrapper scripts for postfix and fail2ban_)
The helper method is not used anywhere else now, move it into this test instead. It has been refactored to accomodate the needs for `--older`, and `--list-full` provides some output that can be matched (similar for `pkill --echo`).
* test(docs): Add inline notes about processes
* chore: Compress test cases into single case with loop
Moves the list of processes into array vars to iterate through instead.
If a failure occurs, the process name is visible along with line number in `_should_restart_when_killed()` to identify what went wrong.
* chore: Handle `FETCHMAIL_PARALLEL=1` process checks as well
* tests: Add test case for disabled ENV
Additional coverage to match what other test files were doing before, ensuring that these ENV can prevent their respective service from running.
* chore: Move `clamd` enabled check to it's own test case
Not sure about this.
It reduces the time of CPU activity (sustained full load on a thread) and increase in memory usage (1GB+ loading signatures database), but as a separate test case it also adds 10 seconds without reducing the time of the test case it was extracted from.
* chore: Make `disabled` variant the 1st test case
* fix: Adjust test cases to pass when using slower wrapper scripts
* tests(refactor): `mail_fetchmail.bats` updated to new format
Additionally merges in the parallel test file.
* chore: Move `config/fetchmail.cf` into separate sub-directory
Keep out of the default base config for tests.
* chore: Change `fetchmail.cf` FQDNs to `.test` TLD
Changed the first configs remote and local user values to more clearly document what their values should represent (_and that they don't need to be a full mail address, that's just what our Dovecot is configured with for login_).
Shifted the `here` to the end of the `is` line. It's optional syntax, only intended to contrast with the remote `there` for readability.
Additionally configured imap protocol. Not tested or verified if that's correct configuration for usage with imap protocol instead. The fetchmail feature tests are currently lacking.
Added an inline doc into the fetchmail test to reference a PR about the importance of the trailing `.` in the config. Updated the partial matching to ensure it matches for that in the value as well.
* chore: Finalize `process-check-restart.bats`
Few minor adjustments. The other ENV for clamd doesn't seem to provide any benefit, trim out the noise. Added a note about why it's been split out.
Fetchmail parallel configs are matching the config file path in the process command that is returned. The `.rc` suffix is just to add further clarity to that.
* tests(refactor): `mail_changedetector.bats` - Leverage DRY methods
`supervisorctl tail` is not the most reliably way to get logs for the latest change detection and has been known to be fragile in the past.
We can instead read the full log for the service directly with `tac` and `sed` to extract all log content since the last change detection.
Common asserts have also been extracted out into separate methods.
* tests(chore): Remove sleep and redundant change event
Container 1 is still blocked at this point from an existing lock and change event.
Make the lock stale immediately and no extra sleep is required when paired with the helper method to wait until the event is processed (which should remove the stale lock).
* tests(refactor): Add more DRY methods
- Simplify the test case so it's easier to grok.
- 2nd test case (blocking) extracts out initial setup into a separate method and merges the later service restart logic which is redundant.
- Additional comments for improved context of what is going on / expected.
* tests(chore): Revise the change detection helper method
- Add explicit counting arg to change detection support.
- Extract revised logic into it's own generic helper method.
- Utilize this for a separate method that monitors for a change event having started, but not waiting for completion.
This allows dropping the 40 sec of remaining `sleep` in `mail_changedetector` test. It was also required due to potentially missing the timing of a change event completing concurrently in a 2nd container that needed to be waited on and then checked.
* tests(chore): Migrate to current test conventions
- Switch to common container setup helpers
- Update container name and change usage to variables instead.
- Adopt the new convention of prefix variable for test cases (revised test case descriptions).
* tests(chore): Remove legacy change detection
This has since been replaced with the new helper watches the `changedetector` service logs directly instead of only detecting a change has occurred via checksum comparison.
No tests use this method anymore, it was originally for `tests.bats`. Thus the tests in `test_helper.bats` are being dropped too. The new helper has test coverage in `changedetector` tests.
* chore: Lock removal should not incur `sleep 5` afterwards
- A new lock should be created by this script after removal. The sleep doesn't help avoid a race condition with lock file creation after removal.
- Reduces test time as a bonus.
- Added some additional comments to test.
* tests(chore): `tls_letsencrypt.bats` leverage improved change detection
- No need to wait on the change detection service anymore during container startup.
- No need to count change events processed either as waiting a fixed duration is no longer relied on.
- This makes the reload count method redundant, dropped.
* tests(chore): Convert `setup-cli.bats` to new test conventions
This test file was already adapted to the original common setup helpers.
- `TEST_NAME` replaced with `CONTAINER_NAME`.
- Prefix var added, test case descriptions drop explicit prefix.
- No other changes.
* tests(chore): Extract out helpers related to change-detection
- New helper file for sharing these helpers to tests.
- Includes the helpful log method from changedetector tests.
- No longer need to maintain duplicate copies of these methods during the test migration. All tests that use them are now importing the separate helper file.
- `tls_letsencrypt.bats` has switched to using the log helper.
- Generic log count helper is removed from `test_helper/common.bash` as any test that needs it in future can adapt to `helper/common.bash`.
* tests(refactor): `tls_letsencrypt.bats` remove `_get_service_logs()`
This helper does not seem useful as moving away from `supervisorctl tail` and no other tests had a need for it.
* tests(chore): Remove common setup methods from `test_helper/common.bash`
No other tests depend on this. Future tests will adopt the revised versions from `helper/setup.bash`.
Additionally updates `helper/setup.bash` comments that are no longer applicable to `TEST_TMP_CONFIG` and `CONTAINER_NAME`.
* chore: Use `|| true` to simplify setting `EXPECTED_COUNT` correctly
* chore: Drop ENV `ENABLE_POSTFIX_VIRTUAL_TRANSPORT`
* tests(chore): Remove redundant `dovecot-lmtp` config
None of this is needed. Only relevant change is changing the LMTP service listener for Dovecot and that can be delegated to `user-patches.sh`.
* tests(refactor): Use `user-patches.sh` instead of replacing config file
The only relevant changes in `test/config/dovecot-lmtp` regarding LMTP was:
- `/etc/dovecot/dovecot.conf` (`protocols = imap lmtp`) and `/etc/dovecot/protocols.d/` (`protocols = $protocols lmtp`).
- `conf.d/10-master.conf` only changed the LMTP service listener from a unix socket to TCP on port 24 (_this was the only change required for the test to pass_).
None of those configs are required as:
- `protocols = imap pop3 lmtp` [is the upstream default](https://doc.dovecot.org/settings/core/#core_setting-protocols), no need to add `lmtp`.
- The LMTP service listener is now configured for the test with `user-patches.sh`.
* tests(refactor): `mail_lmtp_ip.bats`
- Converted to new testing conventions and common container helpers.
- `ENABLE_POSTFIX_VIRTUAL_TRANSPORT` was not relevant, dropped.
- Revised test cases, logic remains the same.
- Large custom config used was not documented and doesn't appear to serve any purpose. Simplified by replacing with a single modification with `user-patches.sh`.
- Added some additional comments for context of test and improvements that could be made.
* tests(chore): Adjust comments
The comment from `mail_hostname` provides no valid context, it was likely copied over from `tests.bats` in Oct 2020 by accident.
The email sent is just for testing, nothing relevant to LMTP.
---
Added additional comment for test to reference extra information from.
* tests(chore): Update similar log line matching
Extracts out the match pattern and formatting commands into separate vars (reduces horizontal scrolling), and includes extra docs about what the matched line should be expected to look like.
* fix: Workaround `postconf` write settle logic
After updating `main.cf`, to avoid an enforced delay from reading the config by postfix tools, we can ensure the modified time is at least 2 seconds in the past as a workaround. This should be ok with our usage AFAIK.
Shaves off 2+ seconds roughly off each container startup, reduces roughly 2+ minutes off tests.
* chore: Only modify `mtime` if less than 2 seconds ago
- Slight improvement by avoiding unnecessary writes with a conditional check on the util method.
- Can more comfortably call this during `postfix reload` in the change detection cycle now.
- Identified other tests that'd benefit from this, created a helper method to call instead of copy/paste.
- The `setup email restrict` command also did a modification and reload. Added util method here too.
* tests(fix): `mail_smtponly.bats` should wait for Postfix
- `postfix reload` fails if the service is not ready yet.
- `service postfix reload` and `/etc/init.d/postfix reload` presumably wait until it is ready? (as these work regardless)
* chore: Review feedback - Move reload method into utilities
* chore: Set `TLS_INTERMEDIATE_SUITE` to only use TLS 1.2 ciphersuites
Removes support of the following cipher suites that are only valid for TLS 1.0 + 1.1:
- `ECDHE-ECDSA-AES128-SHA`
- `ECDHE-RSA-AES128-SHA`
- `ECDHE-ECDSA-AES256-SHA`
- `ECDHE-RSA-AES256-SHA`
- `DHE-RSA-AES128-SHA`
- `DHE-RSA-AES256-SHA`
* chore: Update TLS version min and ignore settings
These are now the same as modern settings.
* fix: Remove min TLS support workaround
No longer required now that outdated TLS versions have been dropped.
* tests: Remove support for TLS 1.0 and 1.1 ciphersuites
* tests: Remove support for TLS 1.0 and 1.1 ciphersuites (Port 25)
The removed SHA1 cipher suites are still supported in TLS 1.2, thus they've been excluded for port 25 via the `SHA1` exclusion pattern in `main.cf`.
With `reload` a change detection event during local testing can be processed in less than a second according to logs. Previously this was 5+ seconds (_plus additional downtime for Postfix/Dovecot to become available again_).
In the past it was apparently an issue to use `<service> reload` due to a concern with the PID for wrapper scripts that `supervisorctl` managed, thus `supervisorctl <service> restart` had been used. Past discussions with maintainers suggest this is not likely an issue anymore, and `reload` should be fine to switch to now 👍
---
**NOTE:** It may not be an issue in the CI, but on _**local systems running tests may risk failure in `setup-cli.bats` from a false positive**_ due to 1 second polling window of the test helper method, and a change event being possible to occur entirely between the two checks undetected by the current approach.
If this is a problem, we may need to think of a better way to catch the change. The `letsencrypt` test counts how many change events are expected to have been processed, and this could technically be leveraged by the test helper too.
---
**NOTE:** These two lines (_with regex pattern for postfix_) are output in the terminal when using the services respective `reload` commands:
```
postfix/master.*: reload -- version .*, configuration /etc/postfix
dovecot: master: Warning: SIGHUP received - reloading configuration
```
I wasn't sure how to match them as they did not appear in the `changedetector` log (_**EDIT:** they appear in the main log output, eg `docker logs <container name>`_).
Instead I've just monitored the `changedetector` log messages, which should be ok for logic that previously needed to ensure Dovecot / Postfix was back up after the `restart` was issued.
---
Commit history:
* chore: Change events `reload` Dovecot and Postfix instead of `restart`
Reloading is faster than restarting the processes.
Restarting is a bit heavy handed here and may no longer be necessary for general usage?
* tests: Adapt tests to support service `reload` instead of `restart`
* chore: Additional logging for debugging change event logs
* fix: Wait on change detection, then verify directory created
Change detection is too fast now (0-1 seconds vs 5+).
Directory being waited on here was created near the end of a change event, reducing that time to detect a change by the utility method further.
We can instead check that the directory exists after the change detection event is completed.
* chore: Keep using the maildir polling check
We don't presently use remote storage in tests, but it might be relevant in future when testing NFS.
This at least avoids any confusing failure happening when that scenario is tested.
As per deprecation notice from v11.3 release notes, and a related prior PR; this ENV is to be removed.
It's no longer considered useful, and none of the tests that configured it were actually using it for relaying anything.