* fix: Workaround `postconf` write settle logic
After updating `main.cf`, to avoid an enforced delay from reading the config by postfix tools, we can ensure the modified time is at least 2 seconds in the past as a workaround. This should be ok with our usage AFAIK.
Shaves off 2+ seconds roughly off each container startup, reduces roughly 2+ minutes off tests.
* chore: Only modify `mtime` if less than 2 seconds ago
- Slight improvement by avoiding unnecessary writes with a conditional check on the util method.
- Can more comfortably call this during `postfix reload` in the change detection cycle now.
- Identified other tests that'd benefit from this, created a helper method to call instead of copy/paste.
- The `setup email restrict` command also did a modification and reload. Added util method here too.
* tests(fix): `mail_smtponly.bats` should wait for Postfix
- `postfix reload` fails if the service is not ready yet.
- `service postfix reload` and `/etc/init.d/postfix reload` presumably wait until it is ready? (as these work regardless)
* chore: Review feedback - Move reload method into utilities
With `reload` a change detection event during local testing can be processed in less than a second according to logs. Previously this was 5+ seconds (_plus additional downtime for Postfix/Dovecot to become available again_).
In the past it was apparently an issue to use `<service> reload` due to a concern with the PID for wrapper scripts that `supervisorctl` managed, thus `supervisorctl <service> restart` had been used. Past discussions with maintainers suggest this is not likely an issue anymore, and `reload` should be fine to switch to now 👍
---
**NOTE:** It may not be an issue in the CI, but on _**local systems running tests may risk failure in `setup-cli.bats` from a false positive**_ due to 1 second polling window of the test helper method, and a change event being possible to occur entirely between the two checks undetected by the current approach.
If this is a problem, we may need to think of a better way to catch the change. The `letsencrypt` test counts how many change events are expected to have been processed, and this could technically be leveraged by the test helper too.
---
**NOTE:** These two lines (_with regex pattern for postfix_) are output in the terminal when using the services respective `reload` commands:
```
postfix/master.*: reload -- version .*, configuration /etc/postfix
dovecot: master: Warning: SIGHUP received - reloading configuration
```
I wasn't sure how to match them as they did not appear in the `changedetector` log (_**EDIT:** they appear in the main log output, eg `docker logs <container name>`_).
Instead I've just monitored the `changedetector` log messages, which should be ok for logic that previously needed to ensure Dovecot / Postfix was back up after the `restart` was issued.
---
Commit history:
* chore: Change events `reload` Dovecot and Postfix instead of `restart`
Reloading is faster than restarting the processes.
Restarting is a bit heavy handed here and may no longer be necessary for general usage?
* tests: Adapt tests to support service `reload` instead of `restart`
* chore: Additional logging for debugging change event logs
* fix: Wait on change detection, then verify directory created
Change detection is too fast now (0-1 seconds vs 5+).
Directory being waited on here was created near the end of a change event, reducing that time to detect a change by the utility method further.
We can instead check that the directory exists after the change detection event is completed.
* chore: Keep using the maildir polling check
We don't presently use remote storage in tests, but it might be relevant in future when testing NFS.
This at least avoids any confusing failure happening when that scenario is tested.
Allows for using `load` with an absolute path instead of a relative one, which makes it possible to group tests into different directories.
Parallel tests differ slightly, loading the newer `helper/common.bash` and `helper/setup.bash` files instead of the older `test_helper/common.bash` which serial tests continue to use.
## Quick Summary
Resolves a `TODO` task with `addmailuser`.
## Overview
The main change is adding three new methods in `common.bash`, which replace the completion delay in `addmailuser` / `setup email add` command.
Other than that:
- I swapped `sh -c 'addmailuser ...'` to `setup email add ...`.
- Improved three tests in `setup-cli.bats` for `setup email add|update|del` (_logic remains effectively the same still_).
- Rewrote the `TODO` comment for `setup-cli.bats` test on `setup email del` to better clarify the concern, but the test itself was no longer affected due to changes prior to this PR, so I enabled the commented out assertion.
- Removed unnecessary waits. The two `skip` tests in `test/tests.bats` could be enabled again after this PR.
- Additional fixes to tests were made during the PR (see discussion comments for details), resolving race conditions.
Individual commit messages of the PR provide additional details if helpful.
---
## Relevant commit messages
* chore: Remove creation delay in `addmailuser`
This was apparently only for supporting tests that need to wait on account creation being ready to test against.
As per the removed inline docs, it should be fine to remove once tests are updated to work correctly without it.
* tests(feat): Add two new common helper methods
`wait_until_account_maildir_exists()` provides the same logic `addmailuser` command was carrying, to wait upon the account dir creation in `/var/mail`.
As this was specifically to support tests, it makes more sense as a test method.
`add_mail_account_then_wait_until_ready()` was added to handle the common pattern of creating account and waiting on it. An internal assert will ensure the account was successfully created first during the test before attempting to wait.
* tests(feat): Add common helper for waiting on change event to be processed
The current helper is more complicated for no real benefit, it only detects when a change is made that would trigger a change event in the `changedetector` service. Our usage of this in tests however is only interested in waiting out the completion of the change event.
Remove unnecessary change event waits. These waits should not be necessary if handled correctly.
* tests: `addmailuser` to `add_mail_account_then_wait_until_ready mail()`
This helper method is used where appropriate.
- A password is not relevant (optional).
- We need to wait on the creation on the account (Dovecot and `/var/mail` directory).
* tests: `setup-cli` revise `add`, `update`, `del` tests
The delete test was failing as the `/var/mail` directory did not yet exist.
There is now a proper delay imposed in the `add` test now shares the same account for both `update` and `del` tests resolving that failure.
Additionally tests use better asserts where appropriate and the wait + sleep logic in `add` has been improved (now takes 10 seconds to complete, approx half the time than before).
The `del` test TODO while not technically addressed is no longer relevant due to the tests being switched to `-c` option (there is a separate `no container` test file, but it doesn't provide a `del` test).
* tests(fix): Ensure Postfix is reachable after waiting on ClamAV
There is not much reason to check before waiting on ClamAV.
It is more helpful to debug failures from `nc` mail send commands if we know that nothing went wrong inbetween the ClamAV wait time.
Additionally added an assertion which should provide more information if this part of the test setup fails again.
* tests(fix): Move health check to the top
This test is a bit fragile. It relies on defaults for the healthcheck with intervals of 30 seconds.
If the check occurs while Postfix is down due a change event from earlier tests and the healthcheck kicks in at that point, then if there is not enough time to refresh the health status from `unhealthy`, the test will fail with a false-positive as Postfix is actually working and up again..
* tests(fix): Wait on directory to be removed
Workaround that tries not to introduce heavier delays by waiting on a full change event to complete in the previous `email update` if possible.
There is a chance that the account has the folder deleted, but restored from an active change event (for password update, then the account delete).
* tests: Update testing submodules (bats-assert, bats-support)
These two submodules were migrated to the `bats-core` organization, where they continued to receive updates.
* tests: Use tagged release of `bats-core/bats-support`
This is technically one commit backwards, but no relevant difference has been made since, other than moving the submodule to the `bats-core` organization.
* tests: Bump `bats-assert` to August 2022 (master)
No official release tag since Nov 2018, but a fair amount of changes since then.
* tests: Bump `bats-core` to `v1.7.0` release
* tests(fix): Correctly use assertions
Some tests were updated as the upgrade of bats submodules had `assert` methods raise awareness of incorrect usage.
This additionally revealed some existing tests that weren't meant to be using `run`, which swallowed failures from surfacing.
* chore: Create bare new test file `setup-cli.bats`
Bare minimum to setup a new test.
* chore: Transfer over relevant tests
* chore: `mail` container name to dynamic `${TEST_NAME}`
Only applied where it's relevant. Next commit will handle the config path correction.
* chore: Use `TEST_TMP_CONFIG` for referencing local config directory
Could technically use the existing function call. Some paths were using a hard-coded config location.
Both have been converted to `TEST_TMP_CONFIG` and related `grep` calls normalizing the quote mark usage, escaping doesn't seem necessary.
* tests(fix): Create container without providing extra args reference var
If a variable name (of an array) was not provided to reference, this would fail trying to reference `'`.
* chore: Extract change-detection method to it's own helper
This doesn't really belong in `helpers/ssl.sh`. Moving to it's own helper script.
* chore: Co-locate related change-detection method from container startup
It seems relevant to migrate the related support during startup for the change detection feature into this helper.
I opted to move the call from `start-mailserver.sh` into the `_setup` call at the end for a more explicit/visible location.
* chore: Move `CHKSUM_FILE` into `helpers/change-detection.sh`
It belongs there, not in `helpers/index.sh`.
* chore: Revise inline documentation
* tests(fix): Ensure correct functionality
Presently `test/test_helper.bats` is using it's own `CHKSUM_FILE` instead of sourcing the var for the filepath.
`test_helper/common.bash` was calling a method to check for changes, but this helper may not correctly detect letsencrypt related changes as these are not ENV rely on, but global vars handled by `helpers/dns.sh`, so that should be run first like it is for `check-for-changes.sh`.
* tests(chore): Use `CHKSUM_FILE` var from helper
* chore: `addmailuser` should use `CHKSUM_FILE` var
* chore: Update `check-for-changes.sh` log message with correct path
* tests(fix): Increase some timeouts
Running tests locally via a VM these tests would fail sometimes due to the time from being queued and Amavis actually processing being roughly around 30 seconds.
There should be no harm in raising this to 60 seconds, other than delaying a failure case which will ripple through other time sensitive tests.
It's better to pass when functionality is actually correct but just needs a bit longer to complete.
* tests(fix): Don't setup an invalid hostname
During container startup `helpers/dns.sh` would panic with `hostname -f` failing.
Dropping `--domainname` for this container is fine and does not affect the point of it's test.
---
It's unclear why this does not occur in CI. Possibly changes within the docker daemon since as CI runs docker on Ubuntu 20.04? (2020).
For clarity, this may be equivalent to setting a hostname of `domain.com.domain.com`, or `--hostname` value truncated the NIS domain (`--domainname`) of the same value.
IIRC, it would still fail with both options using different values if `--hostname` was multi-label. I believe I've documented how non-deterministic these options can be across different environments.
`--hostname` should be preferred. There doesn't seem to be any reason to actually need `--domainname` (which is NIS domain name, unrelated to the DNS domain name). We still need to properly investigate reworking our ENV support that `dns.sh` manages.
---
Containers were also not removing themselves after failures either (missing teardown). Which would cause problems when running tests again.
* chore: Normalize white-space
Sets a consistent indent size of 2 spaces. Previously this varied a fair bit, sometimes with tabs or mixed tabs and spaces.
Some formatting with blank lines.
Easier to review with white-space in diff ignored. Some minor edits besides blank lines, but no change in functionality.
* fix: `setup.sh` target container under test
Some of the `setup.sh` commands did not specify the container which was problematic if another `docker-mailserver` container was running, causing test failures.
This probably doesn't help with `test/no_container.bats`, but at least prevents `test/tests.bats` failing at this point.
* chore: Extract letsencrypt logic into methods
This allows other scripts to share the functionality to discover the correct letsencrypt folder from the 3 possible locations (where specific order is important).
As these methods should now return a string value, the `return 1` after a panic is now dropped.
* chore: Update comments
The todo is resolved with this PR, `_setup_ssl` will be called by both cert conditional statements with purpose for each better documented to maintainers at the start of the logic block.
* refactor: Defer most logic to helper/ssl.sh
The loop is no longer required, extraction is delegated to `_setup_ssl` now.
For the change event prevention, we retrieve the relevant FQDN via the new helper method, beyond that it's just indentation diff.
`check-for-changes.sh` adjusted to allow locally scoped var declarations by wrapping a function. Presently no loop control flow is needed so this seems fine. Made it clear that `CHANGED` is local and `CHKSUM_FILE` is not.
Panic scope doesn't require `SSL_TYPE` for context, it's clearly`letsencrypt`.
* fix: Correctly match wildcard results
Now that the service configs are properly updated, when the services restart they will return a cert with the SAN `DNS:*.example.test`, which is valid for `mail.example.test`, however the test function did not properly account for this in the regexp query.
Resolved by truncating the left-most DNS label from FQDN and adding a third check to match a returned wildcard DNS result.
Extracted out the common logic to create the regexp query and renamed the methods to communicate more clearly that they check the FQDN is supported, not necessarily explicitly listed by the cert.
* tests(letsencrypt): Enable remaining tests
These will now pass. Adjusted comments accordingly.
Added an additional test on a fake FQDN that should still be valid to a wildcard cert (SNI validation in a proper setup would reject the connection afterwards).
Co-authored-by: Georg Lauterbach <44545919+georglauterbach@users.noreply.github.com>
* chore: Normalize container setup
Easier to grok what is different between configurations.
- Container name usage replaced with variable
- Volumes defined earlier and redeclared when relevant (only real difference is `VOLUME_LETSENCRYPT`)
- Contextual comment about the `acme.json` copy.
- Quoting `SSL_TYPE`, `SSL_DOMAIN` and `-h` values for syntax highlighting.
- Moved `-t` and `${NAME}` to separate line.
- Consistent indentation.
* chore: DRY test logic
Extracts out repeated test logic into methods
* chore: Scope configs to individual test cases (1/3)
- Preparation step for shifting out the container configs to their own scoped test cases. Split into multiple commits to ease reviewing by diffs for this change.
- Re-arrange the hostname and domain configs to match the expected order of the new test cases.
- Shuffle the hostname and domainname grouped tests into tests per container config scope.
- Collapse the `acme.json` test cases into single test case.
* chore: Scope configs to individual test cases (2/3)
- Shifts the hostname and domainname container configs into their respective scoped test cases.
- Moving the `acme.json` container config produces a less favorable diff, so is deferred to a follow-up commit.
- Test cases updated to refer to their `${CONTAINER_NAME}` var instead of the hard-coded string name.
* chore: Scope configs to individual test cases (3/3)
Final commit to shift out the container configs.
- Common vars are exported in `setup_file()` for the test cases to use without needing to repeat the declaration in each test case.
- `teardown_file()` shifts container removal at end of scoped test case.
* chore: Adapt to `common_container_setup` template
- `CONTAINER_NAME` becomes `TEST_NAME` (`common.bash` helper via `init_with_defaults`).
- `docker run ...` and related configuration is now outsourced to the `common.bash` helper, only extra args that the default template does not cover are defined in the test case.
- `TARGET_DOMAIN`establishes the domain folder name for `/etc/letsencrypt/live`.
- `_should*` methods no longer manage a `CONTAINER_NAME` arg, instead using the `TEST_NAME` global that should be valid as test is run as a sequence of test cases.
- `PRIVATE_CONFIG` and the `private_config_path ...` are now using the global `TEST_TMP_CONFIG` initialized at the start of each test case, slightly different as not locally defined/scoped like `PRIVATE_CONFIG` would be within the test case, hence the explicit choice of a different name for context.
* chore: Minor tweaks
- Test case comment descriptions.
- DRY: `docker rm -f` lines moved to `teardown()`
- Use `wait_for_service` helper instead of checking the `changedetector` script itself is running.
- There is a startup delay before the `changedetector` begins monitoring, wait until it ready event is logged.
- Added a helper to query logs for a service (useful later).
- `/bin/sh` commands reduced to `sh`.
- Change the config check to match and compare output, not number of lines returned. Provides better failure output by bats to debug against.
* chore: Add more test functions for `acme.json`
This just extracts out existing logic from the test case to functions to make the test case itself more readable/terse.
* chore: Housekeeping
No changes, just moving logic around and grouping into inline functions, with some added comments.
* chore: Switch to `example.test` certs
This also required copying the source files to match the expected letsencrypt file structure expected in the test/container usage.
* chore: Delete `test/config/letsencrypt/`
No longer necessary, using the `example.test/` certs instead.
These letsencrypt certs weren't for the domains they were used for, and of course long expired.
* chore: Housekeeping
Add more maintainer comments, rename some functions.
* tests: Expand `acme.json` extraction coverage
Finally able to add more test coverage! :)
- Two new methods to validate expected success/failure of extraction for a given FQDN.
- Added an RSA test prior to the wildcard to test a renewal simulation (just with different cert type).
- Added extra method to make sure we're detecting multiple successful change events, not just a previous logged success (false positive).
* tests: Refactor the negotiate_tls functionality
Covers all ports (except POP) and correctly tests against expected verification status with new `example.test` certs.
The `FQDN` var will be put to use in a follow-up commit.
* tests: Verify the certs contain the expected FQDNs
* chore: Extract TLS test methods into a separate helper script
Can be useful for other TLS tests to utilize.
* chore: Housekeeping
* chore: Fix test typo
There was a mismatch between the output and expected output between these two files "find key for" and "find key & cert for". Changed to "find key and/or cert for" to make the warning more clear that it's issued for either or both failure conditions.
Co-authored-by: Georg Lauterbach <44545919+georglauterbach@users.noreply.github.com>
These are improvements for better supporting the requirements of other tests.
- Opted for passing an array reference instead of an ENV file. This seems to be a better approach and supports more than just ENV changes.
- Likewise, shifted to a `create` + `start` approach, instead of `docker run` for added flexibility.
- Using `TEST_TMP_CONFIG` instead of `PRIVATE_CONFIG` to make the difference in usage with config volume in tests more clear.
- Changed the config volume from read-only volume mount to be read-write instead, which seems required for other tests.
- Added notes about logged failures from a read-only config volume during container startup.
- Added `TEST_CA_CERT` as a default CA cert path for the test files volume. This can be used by default by openssl methods.
* fix: Spam bounced test copy/paste typo
* tests(docs): Expand inline documentation
Should assist maintainers like myself that are not yet familiar with this functionality, saving some time :)
* Refactor bounced test + Introduce initial container template
DRY'd up the test and extracted a common init pattern for other tests to adopt in future.
The test does not need to run distinct containers at once, so a common name is fine, although the `init_with_defaults()` method could be given an arg to add a suffix: `init_with_defaults "_${BATS_TEST_NUMBER}"` which could be called in `setup()` for tests that can benefit from being run in parallel.
Often it seems the containers only need the bare minimum config such as accounts provided to actually make the container happy to perform a test, so sharing a `:ro` config mount is fine, or in future this could be better addressed.
---
The test would fail if the test cases requiring smtp access ran before postfix was ready (_only a few seconds after setup scripts announce being done_). Added the wait condition for smtp, took a while to track that failure down.
* docker_container first, then fall back to docker_image
+ test changes to support
+ test change to wait for smtp port to fix flakey tests since https://github.com/docker-mailserver/docker-mailserver/pull/2104
* quick fix
* Update setup.sh
Co-authored-by: Georg Lauterbach <44545919+georglauterbach@users.noreply.github.com>
Co-authored-by: Casper <casperklein@users.noreply.github.com>
* first migration steps
* altered issue templates
* altered README
* removed .travis.yml
* adjusting registry & repository, Dockerfile and compose.env
* Close stale issues automatically
* Integrated CI with Github Actions (#3)
* feat: integrated ci with github actions
* fix: use secrets for docker org and update image
* docs: clarify why we use -t if no tty exists
* fix: correct remaining references to old repo
chore: prettier automatically updated markdown as well
* fix: hardcode docker org
* change testing image to just testing
* ci: add armv7 as a supported platform
* finished migration steps
* corrected linting in build-push action
* corrected linting in build-push action (2)
* minor preps for PR
* correcting push on pull request and minor details
* adjusted workflows to adhere closer to @wernerfred's diagram
* minor patches
* adjusting Dockerfile's installation of base packages
* adjusting schedule for stale issue action
* reverting license text
* improving CONTRIBUTING.md PR text
* Update CONTRIBUTING.md
* a bigger patch at the end
* moved all scripts into one directory under target/scripts/
* moved the quota-warning.sh script into target/scripts/ and removed empty directory /target/dovecot/scripts
* minor fixes here and there
* adjusted workflows for use a fully qualified name (i.e. docker.io/...)
* improved on the Dockerfile layer count
* corrected local tests - now they (actually) work (fine)!
* corrected start-mailserver.sh to make use of defaults consistently
* removed very old, deprecated variables (actually only one)
* various smaller improvements in the end
* last commit before merging #6
* rearranging variables to use alphabetic order
Co-authored-by: casperklein <casperklein@users.noreply.github.com>
Co-authored-by: Nick Pappas <radicand@users.noreply.github.com>
Co-authored-by: William Desportes <williamdes@wdes.fr>