Published on May 12, 2025 · Reading time: 3 minutes
Recently, I ran into an issue in one of my personal projects: after a few days of uptime, Django started responding to all incoming requests with 400 (Bad Request). That’s right, no error messages and no exception tracebacks at all, just a simple HTML message. Docker Compose was not aware of the issue because the main process had not died; if it had, the restart policy would have restarted the container.
Compose offers an optional healthcheck feature, but I didn’t use it until now
because it didn’t really help on its own. You can configure the check command
and other parameters, but Compose is not an orchestrator and can’t restart
unhealthy containers by itself. There’s Swarm, but it wouldn’t work for this
app, which needs to run as root and have access to the host’s /dev directory.
With Swarm, you also can’t build images and restart the current deployment with
a single command, and I haven’t found a way to show logs for all containers in
the deployment at once. There were many more issues, but I have forgotten them
by now.
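For context, the check command is whatever the healthcheck's test option in compose.yml runs inside the container. Below is a minimal sketch of such a command, assuming curl is available in the image. The status_ok helper and the URL in the usage comment are hypothetical, not taken from my project.

```shell
#!/bin/sh
# Hypothetical check command for a Compose healthcheck.
# Assumption: curl exists inside the image; the URL is passed as an argument.

# Succeeds for 2xx/3xx status codes and fails for anything else, including
# 400 responses and the "000" that curl reports when the connection fails.
status_ok() {
  case "$1" in
    2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Probe only when a URL is given, e.g.: ./check.sh http://localhost:8000/
if [ "$#" -gt 0 ]; then
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1") || code=000
  status_ok "$code" || exit 1
fi
```

A non-zero exit status is what marks the container as unhealthy once the configured number of retries is exhausted.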
I could use the popular willfarrell/autoheal image along with Compose, but I
don’t feel comfortable giving access to the Docker socket to a container based
on an old Alpine release [1], mostly because there are other apps running on
this machine.
Solution
Can I have a simple systemd service that can restart faulty containers? It doesn’t look that difficult to implement. Let’s start with this one-liner:
docker compose ps --format '{{ .Service }} {{ .Status }}' | grep -i unhealthy | cut -d " " -f 1 | xargs -r docker compose restart
Using standard Unix commands, this script will:
- list all services of the deployment in a simplified format
- get only unhealthy services
- get the service name (first word of each line)
- pass the result to docker compose restart, but only if the list is not empty
Of course, it needs to be run from a directory containing a compose.yml file,
and service names can’t contain whitespace or the string “unhealthy”.
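One way to relax the second restriction, sketched here as an assumption rather than something I use, is to look for “unhealthy” only in the status part of each line. The unhealthy_services function name is made up for illustration, and the sample lines stand in for real docker compose ps output.

```shell
# Reads "<service> <status...>" lines on stdin and prints the names of
# services whose status (everything after the first word) mentions "unhealthy".
unhealthy_services() {
  awk '{
    name = $1
    $1 = ""    # drop the service name so only the status is matched
    if (tolower($0) ~ /unhealthy/) print name
  }'
}

# Sample input standing in for `docker compose ps --format ...` output.
printf '%s\n' \
  'backend Up 2 hours (unhealthy)' \
  'unhealthy-proxy Up 2 hours (healthy)' \
  'db Up 3 days' | unhealthy_services
# prints: backend
```

In the one-liner, this function would take the place of the grep and cut stages. Service names containing whitespace are still not supported, and the rest of this post keeps the original grep and cut version.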
This bash script has to be run in a loop:
#!/usr/bin/env bash
set -e
# Check every 2 seconds and restart any service reported as unhealthy.
while true
do
    docker compose ps --format '{{ .Service }} {{ .Status }}' | grep -i unhealthy | cut -d " " -f 1 | xargs -r docker compose restart
    sleep 2
done
…have an executable bit:
chmod +x healthcheck.sh
…and work as a systemd service:
[Unit]
Description=Restart unhealthy services
After=docker.service
Requires=docker.service
[Service]
Type=simple
User=my_user_name
Group=docker
WorkingDirectory=/home/my_user_name/my_app_name
ExecStart=/home/my_user_name/my_app_name/healthcheck.sh
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
On Debian, this service file can be saved to /etc/systemd/system/my_app_name.service, and enabled like this:
systemctl daemon-reload
systemctl enable my_app_name.service
systemctl start my_app_name.service
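One detail worth checking is how set -e in healthcheck.sh interacts with the pipeline and with Restart=always in the unit. The sketch below replays a single loop iteration with stand-ins: input comes from stdin instead of docker compose ps, and the command passed as arguments replaces docker compose restart. The iteration function name is hypothetical.

```shell
#!/usr/bin/env bash
# One loop iteration with stand-ins: stdin replaces `docker compose ps`,
# and "$@" replaces `docker compose restart`.
iteration() {
  grep -i unhealthy | cut -d " " -f 1 | xargs -r "$@"
}

# No unhealthy line: grep exits with 1, but without pipefail the pipeline's
# status comes from xargs, so `set -e` would not stop the loop here.
printf 'db Up 3 days (healthy)\n' | iteration echo restart
echo "no match -> exit status $?"

# Failing restart command: xargs reports the failure, so under `set -e`
# the script would exit at this point.
printf 'backend Up 2 hours (unhealthy)\n' | iteration false \
  || echo "failed restart -> non-zero status"
```

So an empty result keeps the loop going, while a failed restart ends the script, and systemd starts it again after RestartSec=5.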
You can check the status and recent logs of the service to make sure unhealthy services are actually restarted:
systemctl status my_app_name.service
● my_app_name.service - Restart unhealthy services
Loaded: loaded (/etc/systemd/system/my_app_name.service; enabled; preset: enabled)
Active: active (running) since Sun 2025-05-11 10:32:12 CEST; 5h 34min ago
Main PID: 2077 (bash)
Tasks: 2 (limit: 4757)
CPU: 20min 31.831s
CGroup: /system.slice/my_app_name.service
├─ 2077 bash /home/my_user_name/my_app_name/healthcheck.sh
└─489754 sleep 2
maj 11 10:32:12 localhost systemd[1]: Started my_app_name.service - Restart unhealthy services.
maj 11 10:32:13 localhost healthcheck.sh[46057]: Container my_app_name-backend-1 Restarting
maj 11 10:32:13 localhost healthcheck.sh[46057]: Container my_app_name-backend-1 Started
maj 11 10:33:46 localhost healthcheck.sh[51393]: Container my_app_name-backend-1 Restarting
maj 11 10:33:47 localhost healthcheck.sh[51393]: Container my_app_name-backend-1 Started
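To see how often each container gets restarted, the log lines can be summarized with a small filter. This is only a sketch: it assumes the log lines end with the container name followed by “Restarting”, as in the output above, and the count_restarts name is made up.

```shell
# Counts "Container <name> Restarting" lines per container.
# Assumption: the container name is the second-to-last word of such lines.
count_restarts() {
  awk '$NF == "Restarting" { n[$(NF-1)]++ }
       END { for (c in n) print c, n[c] }' | sort
}

# Sample input copied from the log excerpt above.
printf '%s\n' \
  'maj 11 10:32:13 localhost healthcheck.sh[46057]: Container my_app_name-backend-1 Restarting' \
  'maj 11 10:32:13 localhost healthcheck.sh[46057]: Container my_app_name-backend-1 Started' \
  'maj 11 10:33:46 localhost healthcheck.sh[51393]: Container my_app_name-backend-1 Restarting' \
  'maj 11 10:33:47 localhost healthcheck.sh[51393]: Container my_app_name-backend-1 Started' \
  | count_restarts
# prints: my_app_name-backend-1 2

# Against the real journal (may require appropriate permissions):
#   journalctl -u my_app_name.service --no-pager | count_restarts
```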
[1] Alpine 3.21 is the latest version as of May 2025, but the image still uses 3.18.