swap: finish the secondary swap units' jobs if deactivation of the primary swap unit...
authorHATAYAMA Daisuke <d.hatayama@fujitsu.com>
Thu, 25 Jul 2019 03:54:48 +0000 (23:54 -0400)
committerThe Plumber <50238977+systemd-rhel-bot@users.noreply.github.com>
Fri, 24 Apr 2020 11:51:53 +0000 (13:51 +0200)
commit0c0129bd214df25318d267fbbfe2cde6fc18b465
treef54fb0efc0e9b0df3ddd874aba25f0fd912e5584
parent77e4fe56bc4fdebb629e5250c113aa1750b0d0a2
swap: finish the secondary swap units' jobs if deactivation of the primary swap unit fails

Currently, if deactivation of the primary swap unit fails:

    # LANG=C systemctl --no-pager stop dev-mapper-fedora\\x2dswap.swap
    Job for dev-mapper-fedora\x2dswap.swap failed.
    See "systemctl status "dev-mapper-fedora\\x2dswap.swap"" and "journalctl -xe" for details.

then there are still the running stop jobs for all the secondary swap units
that follow the primary one:

    # systemctl list-jobs
     JOB UNIT                                                                                                         TYPE STATE
     3233 dev-disk-by\x2duuid-2dc8b9b1\x2da0a5\x2d44d8\x2d89c4\x2d6cdd26cd5ce0.swap                                    stop running
     3232 dev-dm\x2d1.swap                                                                                             stop running
     3231 dev-disk-by\x2did-dm\x2duuid\x2dLVM\x2dyuXWpCCIurGzz2nkGCVnUFSi7GH6E3ZcQjkKLnF0Fil0RJmhoLN8fcOnDybWCMTj.swap stop running
     3230 dev-disk-by\x2did-dm\x2dname\x2dfedora\x2dswap.swap                                                          stop running
     3234 dev-fedora-swap.swap                                                                                         stop running

    5 jobs listed.

This remains endlessly because their JobTimeoutUSec is infinity:

    # LANG=C systemctl show -p JobTimeoutUSec dev-fedora-swap.swap
    JobTimeoutUSec=infinity

If this issue happens during system shutdown, the system shutdown appears to
get hang and the system will be forcibly shutdown or rebooted 30 minutes later
by the following configuration:

    # grep -E "^JobTimeout" /usr/lib/systemd/system/reboot.target
    JobTimeoutSec=30min
    JobTimeoutAction=reboot-force

The scenario in the real world seems that there is some service unit with
KillMode=none, processes whose memory is being swapped out are not killed
during stop operation in the service unit and then swapoff command fails.

On the other hand, it works well in successful case of swapoff command because
the secondary jobs monitor /proc/swaps file and can detect deletion of the
corresponding swap file.

This commit fixes the issue by finishing the secondary swap units' jobs if
deactivation of the primary swap unit fails.

Fixes: #11577
(cherry picked from commit 9c1f969d40f84d5cc98d810bab8b24148b2d8928)
(cherry picked from commit b89a1a9d19aa806feb984c8dba25116b5b5a52bc)

Resolves: #1821372
src/core/swap.c