* chore: Co-locate process checking and process restart verification
Extract the test cases for checking a process is running and properly restarts from various test files into a single one:
Core (always running):
opendkim, opendmarc, master (postfix)
ENV dependent:
amavi (amavisd-new), clamd, dovecot, fail2ban-server (fail2ban), fetchmail, postgrey, postsrsd, saslauthd
These now run off a single container with the required ENV and call a common function (the revised version in parallel test cases).
* fix(saslauthd): Quote wrap supervisor config vars
`saslauth.conf` calls `-O` option for most commands defined with an ENV that may be empty/null. This would cause the process to silently fail / die.
This doesn't happen if quote wrapping the ENV, which calls `-O` with an empty string.
Not necessary, but since one of `postgrey` ENV were quote wrapped in `supervisor-app.conf`, I've also done the same there.
* fix(postsrsd): Change supervisor `autorestart` policy to `true`
The PR that introduced the config switched from `true` to `unexpected` without any context. That prevents restart working when the process is killed. Setting to `true` instead will correctly restart the service.
* chore: Remove disabled postgrey test file
`mail_with_postgrey_disabled_by_default.bats` only checked the migrated test cases, removed as no longer serving a purpose.
* tests(refactor): Make `_should_restart_when_killed()` more reliable
The previous version did not ensure that the last checks process was actually restarted, only that it was running.
It turns out that `pkill` is only sending the signal, there can be some delay before the original process is actually killed and restarted.
This can be identified with `pgrep --older <seconds>`. First ensure the process is at a specified age, then after killing check that the process is not running that is at least that old, finally check that there is a younger process actually running.. (_could fail if a process doesn't restart, or there is a delay such as imposed by `sleep` in wrapper scripts for postfix and fail2ban_)
The helper method is not used anywhere else now, move it into this test instead. It has been refactored to accomodate the needs for `--older`, and `--list-full` provides some output that can be matched (similar for `pkill --echo`).
* test(docs): Add inline notes about processes
* chore: Compress test cases into single case with loop
Moves the list of processes into array vars to iterate through instead.
If a failure occurs, the process name is visible along with line number in `_should_restart_when_killed()` to identify what went wrong.
* chore: Handle `FETCHMAIL_PARALLEL=1` process checks as well
* tests: Add test case for disabled ENV
Additional coverage to match what other test files were doing before, ensuring that these ENV can prevent their respective service from running.
* chore: Move `clamd` enabled check to it's own test case
Not sure about this.
It reduces the time of CPU activity (sustained full load on a thread) and increase in memory usage (1GB+ loading signatures database), but as a separate test case it also adds 10 seconds without reducing the time of the test case it was extracted from.
* chore: Make `disabled` variant the 1st test case
* fix: Adjust test cases to pass when using slower wrapper scripts
* tests(refactor): `mail_fetchmail.bats` updated to new format
Additionally merges in the parallel test file.
* chore: Move `config/fetchmail.cf` into separate sub-directory
Keep out of the default base config for tests.
* chore: Change `fetchmail.cf` FQDNs to `.test` TLD
Changed the first configs remote and local user values to more clearly document what their values should represent (_and that they don't need to be a full mail address, that's just what our Dovecot is configured with for login_).
Shifted the `here` to the end of the `is` line. It's optional syntax, only intended to contrast with the remote `there` for readability.
Additionally configured imap protocol. Not tested or verified if that's correct configuration for usage with imap protocol instead. The fetchmail feature tests are currently lacking.
Added an inline doc into the fetchmail test to reference a PR about the importance of the trailing `.` in the config. Updated the partial matching to ensure it matches for that in the value as well.
* chore: Finalize `process-check-restart.bats`
Few minor adjustments. The other ENV for clamd doesn't seem to provide any benefit, trim out the noise. Added a note about why it's been split out.
Fetchmail parallel configs are matching the config file path in the process command that is returned. The `.rc` suffix is just to add further clarity to that.
- `disabled_clamav_spamassassin`:
- Just shuffling the test order around, and removing the restart test at the end which doesn't make sense.
- `postscreen`:
- Now uses common helper for getting container IP
- Does not appear to need the `NET_ADMIN` capability?
- Reduced startup time for the 2nd container + additional context about it's relevance.
- Test cases are largely the same, but refactored the `nc` alternative that properly waits it's turn. This only needs to run once. Added additional commentary and made into a generic method if needed in other tests.
- `fail2ban`:
- Use the common container IP helper method.
- Postscreen isn't affecting this test, it's not required to do the much slower exchange with the mail server when sending a login failure.
- IP being passed into ENV is no longer necessary.
- `sleep 5` in the related test cases doesn't seem necessary, can better rely on polling with timeout.
- `sleep 10` for `setup.sh` also doesn't appear to be necessary.
- `postgrey`:
- Reduced POSTGREY_DELAY to 3, which shaves a fair amount of wasted time while still verifying the delay works.
- One of the checks in `main.cf` doesn't seem to need to know about the earlier spamhaus portion of the line to work, removed.
- Better test case descriptions.
- Improved log matching via standard method that better documents the expected triplet under test.
- Removed a redundant whitelist file and test that didn't seem to have any relevance. Added a TODO with additional notes about a concern with these tests.
- Reduced test time as 8 second timeouts from `-w 8` don't appear to be required, better to poll with grep instead.
- Replaced `wc -l` commands with a new method to assert expected line count, better enabling assertions on the actual output.
- `undef_spam_subject`:
- Split to two separate test cases, and initialize each container in their case instead of `setup_file()`, allowing for using the default `teardown()` method (and slight benefit if running in parallel).
- `permit_docker`:
- Not a parallel test, but I realized that the repeat helper methods don't necessarily play well with `run` as the command (can cause false positive of what was successful).
- The usual serial to parallel test conversion to utilize the `setup.bash` common setup structure, and adding a `TEST_PREFIX` var for each test case to leverage.
- Standardize on parallel test naming conventions for variables / values.
- More consistent use of `bash -c` instead of `/bin/bash -c` or `/bin/sh -c`.
- Using the `_run_in_container` helper instead of `run docker exec ${CONTAINER_NAME}`.
- Updates tests to use the `check_if_process_is_running` helper.
---
chore: Revise inline docs for the `ssl_letsencrypt` test
- Moves the override to be in closer proximity to the `initial_setup` call, and better communicates the intent to override.
- Removes top comment block that is no longer providing value or correct information to maintainers.
- Revised `acme.json` test case inline doc comments.