Jan Synacek [Wed, 19 Feb 2020 14:36:13 +0000 (15:36 +0100)]
meson: allow setting the version string during configuration
Very loosely based on upstream commits
e1ca734edd17a90a325d5b566a4ea96e66c206e5 and
681bd2c524ed71ac04045c90884ba8d55eee7b66.
Resolves: #1804252
Chris Down [Mon, 30 Sep 2019 17:36:13 +0000 (18:36 +0100)]
cgroup: Mark memory protections as explicitly set in transient units
A later version of the DefaultMemory{Low,Min} patch changed these to
require explicitly setting memory_foo_set, but we only set that in
load-fragment, not dbus-cgroup.
Without these, we may fall back to either DefaultMemoryFoo or
CGROUP_LIMIT_MIN when we really shouldn't.
(cherry picked from commit 184e989d7da4648bd36511ffa28a9f2b469589d1)
Related: #1763435
Chris Down [Mon, 30 Sep 2019 17:25:09 +0000 (18:25 +0100)]
cgroup: Respect DefaultMemoryMin when setting memory.min
This is an oversight from https://github.com/systemd/systemd/pull/12332.
Sadly the tests didn't catch it since it requires a real cgroup
hierarchy to see, and it wasn't seen in prod since we're only currently
using DefaultMemoryLow, not DefaultMemoryMin. :-(
(cherry picked from commit 64fe532e90b3e99bf7821ded8a1107c239099e40)
Related: #1763435
Chris Down [Mon, 30 Sep 2019 17:24:26 +0000 (18:24 +0100)]
cgroup: Check ancestor memory min for unified memory config
Otherwise we might not enable it when we should, i.e. DefaultMemoryMin is
set in a parent, but not MemoryMin in the current unit.
(cherry picked from commit 7c9d2b79935d413389a603918a711df75acd3f48)
Related: #1763435
Chris Down [Fri, 3 May 2019 12:40:11 +0000 (08:40 -0400)]
cgroup: Test that it's possible to set memory protection to 0 again
The previous commit fixes this up, and this should prevent it
regressing.
(cherry picked from commit 465ace74d9820824968ab5e82c81e42c2f1894b0)
Related: #1763435
Chris Down [Fri, 3 May 2019 12:32:41 +0000 (08:32 -0400)]
cgroup: Support 0-value for memory protection directives
These make sense to be explicitly set at 0 (which has a different effect
than the default, since it can affect processing of `DefaultMemoryXXX`).
Without this, it's not easily possible to relinquish memory protection
for a subtree, which is not great.
(cherry picked from commit 22bf131be278b95a4a204514d37a4344cf6365c6)
Related: #1763435
Chris Down [Fri, 3 May 2019 12:19:05 +0000 (08:19 -0400)]
cgroup: Readd some plumbing for DefaultMemoryMin
Somehow these got lost in the previous PR, rendering DefaultMemoryMin
not very useful.
(cherry picked from commit 7e7223b3d57c950b399352a92e1d817f7c463602)
Related: #1763435
Chris Down [Tue, 30 Apr 2019 18:22:04 +0000 (14:22 -0400)]
cgroup: Polish hierarchically aware protection docs a bit
I missed adding a section in `systemd.resource-control` about
DefaultMemoryMin in #12332.
Also, add a NEWS entry going over the general concept.
(cherry picked from commit acdb4b5236f38bbefbcc4a47fdbb9cd558b4b5c5)
Related: #1763435
Chris Down [Tue, 16 Apr 2019 17:44:05 +0000 (18:44 +0100)]
unit: Add DefaultMemoryMin
(cherry picked from commit 7ad5439e0663e39e36619957fa37eefe8026bcab)
Related: #1763435
Chris Down [Tue, 16 Apr 2019 17:14:09 +0000 (18:14 +0100)]
cgroup: Create UNIT_DEFINE_ANCESTOR_MEMORY_LOOKUP
This is in preparation for creating unit_get_ancestor_memory_min.
(cherry picked from commit 6264b85e92aeddb74b8d8808a08c9eae8390a6a5)
Related: #1763435
Chris Down [Thu, 28 Mar 2019 12:50:50 +0000 (12:50 +0000)]
cgroup: Implement default propagation of MemoryLow with DefaultMemoryLow
In cgroup v2 we have protection tunables -- currently MemoryLow and
MemoryMin (there will be more in future for other resources, too). The
design of these protection tunables requires not only intermediate
cgroups to propagate protections, but also the units at the leaf of that
resource's operation to accept it (by setting MemoryLow or MemoryMin).
This makes sense from a low-level API design perspective, but it's a
good idea to also have a higher-level abstraction that can, by default,
propagate these resources to children recursively. In this patch, this
happens by having descendants set memory.low to N if their ancestor has
DefaultMemoryLow=N -- assuming they don't set a separate MemoryLow
value.
Any affected unit can opt out of this propagation by manually setting
`MemoryLow` to some value in its unit configuration. A unit can also
stop further propagation by setting `DefaultMemoryLow=` with no
argument. This removes further propagation in the subtree, but has no
effect on the unit itself (for that, use `MemoryLow=0`).
Our use case in production is simplifying the configuration of machines
which heavily rely on memory protection tunables, but currently require
tweaking a huge number of unit files to make that a reality. This
directive makes that significantly less fragile, and decreases the risk
of misconfiguration.
After this patch is merged, I will implement DefaultMemoryMin= using the
same principles.
(cherry picked from commit c52db42b78f6fbeb7792cc4eca27e2767a48b6ca)
Related: #1763435
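The propagation rule from the DefaultMemoryLow commit above can be modelled roughly as follows. This is an illustrative Python sketch, not systemd's C implementation: the `Unit` class and `effective_memory_low()` are hypothetical names, `None` stands for "not set", and 0 stands for "no protection".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Unit:
    """Hypothetical stand-in for a unit in the slice hierarchy."""
    name: str
    parent: Optional["Unit"] = None
    memory_low: Optional[int] = None          # MemoryLow=, None if unset
    default_memory_low: Optional[int] = None  # DefaultMemoryLow=, None if unset


def effective_memory_low(unit: Unit) -> int:
    """Return the value this model would write to memory.low for a unit."""
    # An explicit MemoryLow= on the unit always wins, including 0.
    if unit.memory_low is not None:
        return unit.memory_low
    # Otherwise the nearest ancestor that sets DefaultMemoryLow= decides.
    ancestor = unit.parent
    while ancestor is not None:
        if ancestor.default_memory_low is not None:
            return ancestor.default_memory_low
        ancestor = ancestor.parent
    return 0  # nothing set anywhere: no protection


root = Unit("-.slice", default_memory_low=64 * 1024 * 1024)
apps = Unit("apps.slice", parent=root)
web = Unit("web.service", parent=apps)
batch = Unit("batch.service", parent=apps, memory_low=0)  # opts out
```

In this model `web.service` inherits the 64M default, while `batch.service` opts out with `MemoryLow=0`. The default set on `-.slice` does not apply to `-.slice` itself, which matches the statement that `DefaultMemoryLow=` has no effect on the unit that sets it.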
Filipe Brandenburger [Wed, 12 Sep 2018 06:15:09 +0000 (23:15 -0700)]
test: remove support for suffix in get_testdata_dir()
Instead, use path_join() in callers wherever needed.
(cherry picked from commit 55890a40c3ec0c061c04d1395a38c26313132d12)
Related: #1763435
Yu Watanabe [Mon, 6 Aug 2018 04:42:14 +0000 (13:42 +0900)]
core: introduce cgroup_add_device_allow()
(cherry picked from commit fd870bac25c2dd36affaed0251b5a7023f635306)
Related: #1763435
Tejun Heo [Sat, 9 Jun 2018 00:33:14 +0000 (17:33 -0700)]
core: add MemoryMin
The kernel added support for a new cgroup memory controller knob memory.min in
bf8d5d52ffe8 ("memcg: introduce memory.min") which was merged during v4.18
merge window.
Add MemoryMin to support memory.min.
(cherry picked from commit 484226357789991de0b3363beb69258be06b4c92)
Resolves: #1763435
David Rheinsberg [Thu, 14 Mar 2019 12:34:13 +0000 (13:34 +0100)]
sd-bus: skip sending formatted UIDs via SASL
The dbus external authentication takes as optional argument the UID the
sender wants to authenticate as. This uid is purely optional. The
AF_UNIX socket already conveys the same information through the
auxiliary socket data, so we really don't have to provide that
information.
Unfortunately, there is no way to send empty arguments, since they are
interpreted as "missing argument", which has a different meaning. The
SASL negotiation thus changes from:
    AUTH EXTERNAL <uid>
    NEGOTIATE_UNIX_FD (optional)
    BEGIN
to:
    AUTH EXTERNAL
    DATA
    NEGOTIATE_UNIX_FD (optional)
    BEGIN
And thus the replies we expect as a client change from:
    OK <server-id>
    AGREE_UNIX_FD (optional)
to:
    DATA
    OK <server-id>
    AGREE_UNIX_FD (optional)
Since the old sd-bus server implementation used the wrong reply for
"AUTH" requests that do not carry the arguments inlined, we decided to
make sd-bus clients accept this as well. Hence, sd-bus now allows
"OK <server-id>\r\n" replies instead of "DATA\r\n" replies.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
(cherry picked from commit 1ed4723d38cd0d1423c8fe650f90fa86007ddf55)
Resolves: #1838081
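The client side of the new exchange described in the commit above could be sketched as follows. The function names are hypothetical and this is not the sd-bus code. The initial NUL byte that D-Bus clients send before `AUTH` is left out, and the server ID in the usage is a made-up value.

```python
def client_handshake(negotiate_fds: bool = True) -> bytes:
    """Build the client lines for authentication without an inline UID."""
    lines = ["AUTH EXTERNAL", "DATA"]
    if negotiate_fds:
        lines.append("NEGOTIATE_UNIX_FD")  # optional step
    lines.append("BEGIN")
    return "".join(line + "\r\n" for line in lines).encode("ascii")


def first_reply_ok(reply: bytes) -> bool:
    """Accept 'DATA' or, for older sd-bus servers, 'OK <server-id>'."""
    line = reply.decode("ascii").rstrip("\r\n")
    return line == "DATA" or line.startswith("OK ")
```

Accepting both replies mirrors the compatibility decision stated in the commit message: older sd-bus servers answered an argument-less `AUTH` with `OK` instead of `DATA`.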
David Rheinsberg [Thu, 14 Mar 2019 12:33:28 +0000 (13:33 +0100)]
sd-bus: fix SASL reply to empty AUTH
The correct way to reply to "AUTH <protocol>" without any payload is to
send "DATA" rather than "OK". The "DATA" reply triggers the client to
respond with the requested payload.
In fact, adding the data as hex-encoded argument like
"AUTH <protocol> <hex-data>" is an optimization that skips the "DATA"
roundtrip. The standard way to perform an authentication is to send the
"DATA" line.
This commit fixes sd-bus to properly send the "DATA" line. Surprisingly
no existing implementation depends on this, as they all pass the data
directly as argument to "AUTH". This will not work if we want to pass
an empty argument, though.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
(cherry picked from commit 2010873b4b49b223e0cc07d28205b09c693ef005)
Related: #1838081
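The two forms of the `AUTH` line discussed in the commit above can be illustrated with a small sketch. `auth_line()` and `needs_data_roundtrip()` are hypothetical helpers, and the payload `b"1000"` is only an example value.

```python
from typing import Optional


def auth_line(mechanism: str, payload: Optional[bytes] = None) -> str:
    """Format an AUTH line, inlining the payload as hex when one is given."""
    if payload is None:
        return f"AUTH {mechanism}\r\n"
    return f"AUTH {mechanism} {payload.hex()}\r\n"


def needs_data_roundtrip(line: str) -> bool:
    """True if the AUTH line has no inline payload, so the server replies DATA."""
    parts = line.strip().split(" ")
    return parts[0] == "AUTH" and len(parts) == 2
```

The inline form saves the `DATA` roundtrip, but it cannot express an empty argument, which is the case the commit is preparing for.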
David Rheinsberg [Thu, 14 Mar 2019 12:26:50 +0000 (13:26 +0100)]
sd-bus: avoid magic number in SASL length calculation
Let's avoid magic numbers and use a constant `strlen()` instead.
Signed-off-by: David Rheinsberg <david.rheinsberg@gmail.com>
(cherry picked from commit 3cacdab925c40a5d9b7cf3f67719201bbaa17f67)
Related: #1838081
Filipe Brandenburger [Thu, 24 Jan 2019 04:19:44 +0000 (20:19 -0800)]
core: downgrade CPUQuotaPeriodSec= clamping logs to debug
After the first warning log, further messages are downgraded to LOG_DEBUG.
(cherry picked from commit 527ede0c638b47b62a87900438a8a09dea42889e)
Related: #1770379
Filipe Brandenburger [Fri, 2 Nov 2018 16:21:57 +0000 (09:21 -0700)]
core: add CPUQuotaPeriodSec=
This new setting allows configuration of CFS period on the CPU cgroup, instead
of using a hardcoded default of 100ms.
Tested:
- Legacy cgroup + Unified cgroup
- systemctl set-property
- systemctl show
- Confirmed that the cgroup settings (such as cpu.cfs_period_us) were set
appropriately, including updating the CPU quota (cpu.cfs_quota_us) when
CPUQuotaPeriodSec= is updated.
- Checked that clamping works properly when either period or (quota * period)
are below the resolution of 1ms, or if period is above the max of 1s.
(cherry picked from commit 10f28641115733c61754342d5dcbe70b083bea4b)
Resolves: #1770379
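One way to satisfy the clamping constraints listed in the CPUQuotaPeriodSec= commit above (period at least 1ms, quota per period at least 1ms, period at most 1s) is sketched below. The function names and the exact formula are assumptions for illustration, not a copy of the systemd code. All values are in microseconds, with the quota expressed per second of wall-clock time.

```python
USEC_PER_MSEC = 1_000
USEC_PER_SEC = 1_000_000


def adjust_period(period_usec: int, quota_per_sec_usec: int) -> int:
    """Clamp the period so that period and quota-per-period are both >= 1ms."""
    # Smallest period for which quota * period / 1s still reaches 1ms.
    lower = max(USEC_PER_MSEC, USEC_PER_MSEC * USEC_PER_SEC // quota_per_sec_usec)
    return min(max(period_usec, lower), USEC_PER_SEC)


def quota_in_period(period_usec: int, quota_per_sec_usec: int) -> int:
    """Scale the per-second quota to the (adjusted) period."""
    period = adjust_period(period_usec, quota_per_sec_usec)
    return quota_per_sec_usec * period // USEC_PER_SEC
```

For example, a 10% quota (100000 µs per second) with a requested 1ms period is raised to a 10ms period in this sketch, so the quota per period lands exactly on the 1ms resolution.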
Lennart Poettering [Tue, 20 Nov 2018 18:45:02 +0000 (19:45 +0100)]
cgroup: use structured initialization
(cherry picked from commit de8a711a5849f9239c93aefa5554a62986dfce42)
Related: #1770379
Filipe Brandenburger [Thu, 24 Jan 2019 03:48:54 +0000 (19:48 -0800)]
time-util: Introduce parse_sec_def_infinity
This works like parse_sec() but defaults to USEC_INFINITY when passed an
empty string or only whitespace.
Also introduce config_parse_sec_def_infinity, which can be used to parse
config options using this function.
This is useful for time options that use "infinity" as the default and that
can be reset by unsetting them.
Introduce a test case to ensure it works as expected.
(cherry picked from commit 7b61ce3c44ef5908e817009ce4f9d2a7a37722be)
Related: #1770379
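The behaviour of parse_sec_def_infinity described above can be sketched in a few lines. The `parse_sec()` here is a heavily reduced stand-in that only understands plain seconds and the word "infinity"; the real parser accepts many more formats.

```python
import math


def parse_sec(text: str) -> float:
    """Reduced stand-in for parse_sec(): plain seconds or 'infinity' only."""
    text = text.strip()
    if text == "infinity":
        return math.inf
    return float(text)


def parse_sec_def_infinity(text: str) -> float:
    """Like parse_sec(), but empty or whitespace-only input means infinity."""
    if text.strip() == "":
        return math.inf
    return parse_sec(text)
```

With this, resetting an option by assigning it an empty value yields the "infinity" default rather than a parse error.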
Tejun Heo [Wed, 13 Jun 2018 21:16:35 +0000 (14:16 -0700)]
core: add IODeviceLatencyTargetSec
This adds support for the following proposed latency based IO control
mechanism.
https://lkml.org/lkml/2018/6/5/428
(cherry picked from commit 6ae4283cb14c4e4a895f4bbba703804e4128c86c)
Resolves: #1831519
ypf791 [Fri, 19 Jul 2019 10:28:04 +0000 (18:28 +0800)]
core: coldplug possible nop_job
When a unit is in the INACTIVE or DEACTIVATING state, a JOB_TRY_RESTART or
JOB_TRY_RELOAD job is collapsed to JOB_NOP and uses u->nop_job instead
of u->job.
If a JOB_NOP job is in the waiting state, a parallel daemon-reload
just installs it during deserialization. Without a coldplug, the job will
not be in m->run_queue, which results in a hung try-restart or
try-reload process.
Reproduce:
run systemctl try-restart test.service (inactive) repeatedly in a terminal.
run systemctl daemon-reload repeatedly in other terminals.
After a successful reproduction, systemctl list-jobs will list the hung job.
Upstream:
systemd/systemd#13124
(cherry picked from commit b49e14d5f3081dfcd363d8199a14c0924ae9152f)
Resolves: #1829798
David Tardon [Tue, 17 Mar 2020 09:49:44 +0000 (10:49 +0100)]
mount: don't add Requires for tmp.mount
This is a follow-up to #1619292.
rhel-only
Resolves: #1748840
Filipe Brandenburger [Tue, 26 Jun 2018 01:07:48 +0000 (18:07 -0700)]
resolvconf: fixes for the compatibility interface
Also use compat_main() when called as `resolvconf`, since the interface
is closer to that of `systemd-resolve`.
Use a heap allocated string to set arg_ifname, since a stack allocated
one would be lost after the function returns. (This last one broke the
case where an interface name was suffixed with a dot, such as in
`resolvconf -a tap0.dhcp`.)
Tested:
$ build/resolvconf -a nonexistent.abc </etc/resolv.conf
Unknown interface 'nonexistent': No such device
Fixes #9423.
(cherry picked from commit 5a01b3f35d7b6182c78b6973db8d99bdabd4f9c3)
Resolves: #1835594
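The interface-name handling exercised by the `resolvconf -a tap0.dhcp` case above amounts to dropping everything from the first dot onwards. A minimal sketch follows, with `interface_from_arg()` as a hypothetical name. The original bug was about the lifetime of the C string holding the result, which has no direct Python equivalent.

```python
def interface_from_arg(arg: str) -> str:
    """Return the interface part of an 'IFACE[.LABEL]' argument."""
    name, _, _ = arg.partition(".")
    return name
```

This matches the tested output in the commit message, where `nonexistent.abc` is reported as interface 'nonexistent'.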
Andreas Henriksson [Sun, 14 Oct 2018 12:53:09 +0000 (14:53 +0200)]
sulogin-shell: Use force if SYSTEMD_SULOGIN_FORCE set
When the root account is locked, sulogin will either inform you of
this and not allow you in or, if --force is used, hand
you passwordless root (if using a recent enough version of util-linux).
Not being allowed a shell is of course inconvenient, but at the same
time handing out passwordless root unconditionally is probably not
a good idea everywhere.
This patch thus lets you choose the behaviour by setting the
SYSTEMD_SULOGIN_FORCE environment variable to true or false, e.g. by
adding this via 'systemctl edit rescue.service' (or emergency.service):
    [Service]
    Environment=SYSTEMD_SULOGIN_FORCE=1
Distributions that use locked root accounts and want the passwordless
behaviour could thus simply drop in the override file in
/etc/systemd/system/rescue.service.d/override.conf
Fixes: #7115
Addresses: https://bugs.debian.org/802211
(cherry picked from commit 33eb44fe4a8d7971b5614bc4c2d90f8d91cce66c)
Resolves: #1625929
Zbigniew Jędrzejewski-Szmek [Wed, 19 Dec 2018 22:05:48 +0000 (23:05 +0100)]
tmpfiles: fix crash with NULL in arg_root and other fixes and tests
The function to insert replacement paths into the configuration file list was broken.
Apart from the crash with empty root prefix, it would incorrectly handle the
case where root *was* set, and the replacement file was supposed to override
an existing file.
prefix_root is used instead of path_join because prefix_root removes duplicate
slashes (when --root=dir/ is used).
A test is added.
Fixes #11124.
(cherry picked from commit 082bb1c59bd4300bcdc08488c94109680cfadf57)
Resolves: #1836024
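The duplicate-slash point made in the tmpfiles commit above can be shown with a small model. The `prefix_root()` below is a hypothetical Python approximation of the behaviour described (it is not the C helper), and it also covers the empty-root case.

```python
def prefix_root(root: str, path: str) -> str:
    """Prefix an absolute path with root without producing '//'."""
    return root.rstrip("/") + "/" + path.lstrip("/")


# Plain concatenation keeps the duplicate slash when --root=dir/ is used.
naive = "dir/" + "/etc/tmpfiles.d/a.conf"
```

Here `naive` ends up as `dir//etc/tmpfiles.d/a.conf`, whereas `prefix_root("dir/", ...)` yields a single slash, so the path compares equal to an existing entry that it is meant to override.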
Jan Synacek [Thu, 4 Jun 2020 14:55:52 +0000 (16:55 +0200)]
seccomp: fix __NR__sysctl usage
Loosely based on
https://github.com/systemd/systemd/pull/14032 and
https://github.com/systemd/systemd/pull/14268.
Related: #1843871
Zbigniew Jędrzejewski-Szmek [Tue, 30 Oct 2018 08:02:26 +0000 (09:02 +0100)]
fuzz-compress: add fuzzer for compression and decompression
(cherry picked from commit 029427043b2e0523a21f54374f872b23cf744350)
Resolves: #1843871
Zbigniew Jędrzejewski-Szmek [Mon, 29 Oct 2018 13:55:33 +0000 (14:55 +0100)]
journal: adapt for new improved LZ4_decompress_safe_partial()
With lz4 1.8.3, this function can now decompress partial results into a smaller
buffer. The release news don't say anything interesting, but the test case that
was previously failing now works OK.
Fixes #10259.
A test is added. It shows that with *older* lz4, a partial decompression can
occur with the returned size smaller than the requested number of bytes _and_
smaller than the size of the compressed data:
(lz4-libs-1.8.2-1.fc29.x86_64)
Compressed 4194304 → 16464
Decompressed → 4194304
Decompressed partial 12/4194304 → 4194304
Decompressed partial 1/1 → -2 (bad)
Decompressed partial 2/2 → -2 (bad)
Decompressed partial 3/3 → -2 (bad)
Decompressed partial 4/4 → -2 (bad)
Decompressed partial 5/5 → -2 (bad)
Decompressed partial 6/6 → 6 (good)
Decompressed partial 7/7 → 6 (good)
Decompressed partial 8/8 → 6 (good)
Decompressed partial 9/9 → 6 (good)
Decompressed partial 10/10 → 6 (good)
Decompressed partial 11/11 → 6 (good)
Decompressed partial 12/12 → 6 (good)
Decompressed partial 13/13 → 6 (good)
Decompressed partial 14/14 → 6 (good)
Decompressed partial 15/15 → 6 (good)
Decompressed partial 16/16 → 6 (good)
Decompressed partial 17/17 → 6 (good)
Decompressed partial 18/18 → -16459 (bad)
(lz4-libs-1.8.3-1.fc29.x86_64)
Compressed 4194304 → 16464
Decompressed → 4194304
Decompressed partial 12/4194304 → 12
Decompressed partial 1/1 → 1 (good)
Decompressed partial 2/2 → 2 (good)
Decompressed partial 3/3 → 3 (good)
Decompressed partial 4/4 → 4 (good)
...
If we got such a short "successful" decompression in decompress_startswith() as
implemented before this patch, we could be confused and return a false negative
result. But it turns out that this only occurs with small output buffer
sizes. We use greedy_realloc() to manage the buffer, so it is always at least
64 bytes. I couldn't hit a case where decompress_startswith() would actually
return a bogus result. But since the lack of proof is not conclusive, the code
for *older* lz4 is changed too, just to be safe. We cannot rule out that on a
different architecture or with some unlucky compressed string we could hit this
corner case.
The fallback code is guarded by a version check. The check uses a function, not
the compile-time define, because there was no soversion bump in lz4 or new
symbols, and we could be compiled against a newer lz4 and linked at runtime
with an older one. (This happens routinely e.g. when somebody upgrades a subset
of distro packages.)
(cherry picked from commit e41ef6fd0027d3619dc1cf062100b2d224d0ee7e)
Resolves: #1843871
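The runtime version guard described in the lz4 commit above boils down to comparing the library's version number against 1.8.3. The sketch below assumes the usual lz4 encoding of major * 10000 + minor * 100 + release; in C the runtime value would come from `LZ4_versionNumber()`. The Python names are hypothetical.

```python
def lz4_version_number(major: int, minor: int, release: int) -> int:
    """Encode a version the way lz4 does: major*100*100 + minor*100 + release."""
    return major * 100 * 100 + minor * 100 + release


LZ4_1_8_3 = lz4_version_number(1, 8, 3)


def can_trust_partial_decompress(runtime_version: int) -> bool:
    """True if the library loaded at runtime has the improved partial API."""
    return runtime_version >= LZ4_1_8_3
```

The check takes the runtime value as a parameter because, as the commit notes, the version compiled against and the version linked at runtime can differ.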
Zbigniew Jędrzejewski-Szmek [Mon, 29 Oct 2018 21:21:28 +0000 (22:21 +0100)]
test-compress: add test for short decompress_startswith calls
I thought this might fail with lz4 < 1.8.3, but it seems that because of
greedy_realloc, we always use a buffer that is large enough, and it always
passes.
(cherry picked from commit ba17efce44e6a1e139c1671205e9a6ed3824af1b)
Resolves: #1843871
Zbigniew Jędrzejewski-Szmek [Mon, 29 Oct 2018 17:32:51 +0000 (18:32 +0100)]
Drop support for lz4 < 1.3.0
lz4-r130 was released on May 29th, 2015. Let's drop the work-around for older
versions. In particular, we won't test any new code against those ancient
releases, so we shouldn't pretend they are supported.
(cherry picked from commit e0a1d4b049e6991919a0eacd5d96f7f39dc6ddd1)
Resolves: #1843871
Anita Zhang [Sat, 29 Jun 2019 00:02:30 +0000 (17:02 -0700)]
core: ExecCondition= for services
Closes #10596
(cherry picked from commit 31cd5f63ce86a0784c4ef869c4d323a11ff14adc)
Resolves: #1737283
Zbigniew Jędrzejewski-Szmek [Tue, 26 Mar 2019 10:38:55 +0000 (11:38 +0100)]
test-execute: provide custom failure message
test_exec_ambientcapabilities: exec-ambientcapabilities-nobody.service: exit status 0, expected 1
Sometimes we get just the last line, for example from the failure summary,
so make it as useful as possible.
(cherry picked from commit 6aed6a11577b108b9a39f26aeae5e45d98f20c90)
Related: #1737283
Zbigniew Jędrzejewski-Szmek [Fri, 15 Mar 2019 12:42:55 +0000 (13:42 +0100)]
test-execute: allow filtering test cases by pattern
When debugging failure in one of the cases, it's annoying to have to wade
through the output from all the other cases. Let's allow picking select
cases.
(cherry picked from commit 9efb96315ae502dabeb94ab35816ea8955563b7a)
Related: #1737283
Lennart Poettering [Mon, 19 Nov 2018 13:48:28 +0000 (14:48 +0100)]
tests: always use the right vtable wrapper calls
Prompted by https://github.com/systemd/systemd/pull/10836#discussion_r234598868
(cherry picked from commit bd7989a3d90e5d97e09f1eef33d09b2469a79f4d)
Related: #1737283
Lennart Poettering [Tue, 13 Nov 2018 22:28:09 +0000 (23:28 +0100)]
core: log a recognizable message when a unit succeeds, too
We already are doing it on failure, let's do it on success, too.
Fixes: #10265
(cherry picked from commit 523ee2d41471bfb738f52d59de9b469301842644)
Related: #1737283
Lennart Poettering [Tue, 13 Nov 2018 20:25:22 +0000 (21:25 +0100)]
core: make log messages about units entering a 'failed' state recognizable
Let's make this recognizable, and carry result information in a
structured fashion.
(cherry picked from commit 7c047d7443347c109daf67023a01c118b5f361eb)
Related: #1737283
Lennart Poettering [Mon, 10 Dec 2018 19:56:57 +0000 (20:56 +0100)]
core: split out all logic that updates a Job on a unit's unit_notify() invocation
Just some refactoring, no change in behaviour.
(cherry picked from commit 16c74914d233ec93012d77e5f93cf90e42939669)
Related: #1737283
Lennart Poettering [Wed, 14 Nov 2018 10:08:16 +0000 (11:08 +0100)]
job: when a job was skipped due to a failed condition, log about it
Previously we'd neither show console status output nor log output. Let's
fix that, and still log something.
(cherry picked from commit 9a80f2f4533883d272e6a436512aa7e88cedc549)
Related: #1737283
Lennart Poettering [Tue, 13 Nov 2018 18:57:43 +0000 (19:57 +0100)]
core: move unit_status_emit_starting_stopping_reloading() and related calls to job.c
This call is only used by job.c and very specific to job handling.
Moreover the very similar logic of job_emit_status_message() is already
in job.c.
Hence, let's clean this up, and move both sets of functions to job.c,
and rename them a bit so that they express precisely what they do:
1. unit_status_emit_starting_stopping_reloading() →
job_emit_begin_status_message()
2. job_emit_status_message() → job_emit_done_status_message()
The first call is after all what we call when we begin with the
execution of a job, and the second call what we call when we are done
with it.
Just some moving and renaming, no other changes, and hence no change in
behaviour.
(cherry picked from commit 33a3fdd9781329379f74e11a7a2707816aad8c61)
Related: #1737283
Evgeny Vereshchagin [Mon, 17 Sep 2018 07:12:38 +0000 (07:12 +0000)]
nspawn: chown() the legacy hierarchy when it's used in a container
This is a follow-up to 720f0a2f3c928cc9379501a52146be9fbb4d9be2.
Closes https://github.com/systemd/systemd/issues/10026
Closes https://github.com/systemd/systemd/issues/9563
(cherry picked from commit 89f180201cd8c0f3ce5cb6e8dd7e2b3cbcf71527)
Resolves: 1837094
Lennart Poettering [Tue, 5 Mar 2019 17:57:53 +0000 (18:57 +0100)]
nspawn: move payload to sub-cgroup first, then sync cgroup trees
If we sync the legacy and unified trees before moving to the right
subcgroup then ultimately the cgroup paths in the hierarchies will be
out-of-sync... Hence, let's move the payload first, and sync then.
Addresses: https://github.com/systemd/systemd/pull/9762#issuecomment-441187979
(cherry picked from commit 27da7ef0d09e00eae821f3ef26e1a666fe7aa087)
Resolves: #1837094
Zsolt Dollenstein [Tue, 3 Jul 2018 19:22:29 +0000 (12:22 -0700)]
Add support for opening files for appending
Addresses part of #8983
(cherry picked from commit 566b7d23eb747e9c5a74e5647693077b52395fc5)
Resolves: #1809175
Lennart Poettering [Mon, 1 Apr 2019 15:30:45 +0000 (17:30 +0200)]
man: be clearer that .timer time expressions need to be reset to override them
Let's be clearer about the overriding concept for OnCalendar= settings.
Prompted by this thread:
https://lists.freedesktop.org/archives/systemd-devel/2019-March/042351.html
(cherry picked from commit 58031d99c6320855b86f4890baa9165597e3d841)
Resolves: #1816908
Joerg Steffens [Tue, 21 Nov 2017 11:21:49 +0000 (12:21 +0100)]
udev-rules: make tape-changers also appear in /dev/tape/by-path/
It is important to be able to access tape changers ("Medium Changers") by
persistent name.
While tape devices can be accessed via /dev/tape/by-id/ and
/dev/tape/by-path/, tape-changers could only be accessed by
/dev/tape/by-id/.
However, in some cases, especially when accessing Amazon Web Services
Storage Gateway VTLs (or accessing iSCSI VTLs in general?) this does not
work, as all tape devices and the tape changer have the same ENV{ID_SERIAL}.
The result is that only the last device is available in
/dev/tape/by-id/, as the former devices have been overwritten.
As this behavior is hard to change without breaking consistency,
this additional device in /dev/tape/by-path/ can be used to access the medium changers.
The tape devices can also be accessed by this path.
The content of the directory will now look like:
# SCSI tape device, rewind (unchanged)
/dev/tape/by-path/$env{ID_PATH} -> ../../st*
# SCSI tape device, no-rewind (unchanged)
/dev/tape/by-path/$env{ID_PATH}-nst -> ../../nst*
# SCSI tape changer device (newly added)
/dev/tape/by-path/$env{ID_PATH}-changer -> ../../sg*
Tape devices and tape changers have different ID_PATHs.
SCSI tape changers get the suffix "-changer"
to make them easier to distinguish from tape devices.
(cherry picked from commit 7f8ddf96a25162f06bd94a684cf700c128d18142)
Resolves: #1820112
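The naming scheme listed in the udev-rules commit above can be summarised in a short sketch. `by_path_link()` is a hypothetical helper and the ID_PATH value in the usage is invented for illustration. The actual change is a udev rule, not Python.

```python
# Suffix per device kind, as listed in the commit message.
SUFFIX = {"st": "", "nst": "-nst", "changer": "-changer"}


def by_path_link(id_path: str, kind: str) -> str:
    """Return the /dev/tape/by-path/ symlink name for a device kind."""
    return f"/dev/tape/by-path/{id_path}{SUFFIX[kind]}"


example = by_path_link("pci-0000:00:1f.2-scsi-0:0:0:1", "changer")
```

Because the link is keyed on ID_PATH rather than ID_SERIAL, devices that share a serial no longer overwrite each other's symlink.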
Lennart Poettering [Tue, 26 Nov 2019 08:46:00 +0000 (09:46 +0100)]
pid1: add new kernel cmdline arg systemd.cpu_affinity=
Let's allow configuration of the CPU affinity via the kernel cmdline,
overriding CPUAffinity= in /etc/systemd/system.conf
Prompted by:
https://lists.freedesktop.org/archives/systemd-devel/2019-November/043754.html
(cherry picked from commit 68d58f38693e586b5ce5785274f8e42a79625196)
Resolves: #1812894
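Picking the systemd.cpu_affinity= argument out of a kernel command line could look roughly like this. The helper names are hypothetical, and the accepted value syntax (indices and ranges separated by commas or spaces) is assumed to match CPUAffinity=. This is a sketch, not the pid1 parser.

```python
from typing import Optional, Set


def parse_cpu_list(spec: str) -> Set[int]:
    """Parse CPU indices and ranges such as '0-2,5' into a set."""
    cpus: Set[int] = set()
    for token in spec.replace(",", " ").split():
        first, sep, last = token.partition("-")
        lo = int(first)
        hi = int(last) if sep else lo
        cpus.update(range(lo, hi + 1))
    return cpus


def cpu_affinity_from_cmdline(cmdline: str) -> Optional[Set[int]]:
    """Return the CPU set from systemd.cpu_affinity=, or None if absent."""
    for word in cmdline.split():
        key, sep, value = word.partition("=")
        if key == "systemd.cpu_affinity" and sep:
            return parse_cpu_list(value)
    return None
```

A `None` result would mean that the command line does not override the setting, so CPUAffinity= from /etc/systemd/system.conf stays in effect.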
Frantisek Sumsal [Mon, 12 Aug 2019 22:14:54 +0000 (00:14 +0200)]
test: store coredumps in journal
To make debugging much easier, especially for crashes in tests under
QEMU, let's store the entire coredump bundle in the systemd journal,
which is usually kept around by various CIs. Right now, we usually end
up with a journal, but without the coredump itself, which is pretty
useless.
(cherry picked from commit 215bffe1b8d7cb72fe9f72ed53682d52d5c2a9c5)
Related: #1823767
Frantisek Sumsal [Tue, 5 Mar 2019 15:08:00 +0000 (16:08 +0100)]
test: try to determine QEMU_SMP dynamically
If the QEMU_SMP value has not been explicitly set, try to determine it
from the number of online CPUs using the nproc utility. If this approach
fails, fall back to the default value QEMU_SMP=1.
This change should significantly help when running integration tests
under QEMU on multicore systems.
(cherry picked from commit 5bfb2a93a4a36bba0d24199553dcda6e560cbb75)
Related: #1823767
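The fallback order described in the QEMU_SMP commit above (explicit setting, then detected CPU count, then 1) can be sketched as follows. The test suite itself is shell and uses nproc; `os.cpu_count()` is used here as a stand-in, and `qemu_smp()` is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def qemu_smp(env: Optional[Mapping[str, str]] = None) -> int:
    """Pick the SMP value: explicit QEMU_SMP, else CPU count, else 1."""
    env = os.environ if env is None else env
    value = env.get("QEMU_SMP")
    if value:
        return int(value)
    # os.cpu_count() may return None when the count cannot be determined.
    return os.cpu_count() or 1
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real process environment.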
Frantisek Sumsal [Tue, 5 Mar 2019 12:50:28 +0000 (13:50 +0100)]
test: parallelize tasks in TEST-24-UNIT-TESTS
(cherry picked from commit 2f2a0454efd07644a4e0ccb3f00f1db2d7043391)
Related: #1823767
Yu Watanabe [Tue, 11 Sep 2018 00:18:33 +0000 (09:18 +0900)]
test: make test-catalog relocatable
Fixes #10045.
(cherry picked from commit d9b6baa69968132d33e4ad8627c7fe0bd527c859)
Resolves: #1823767
Yu Watanabe [Tue, 11 Sep 2018 00:17:22 +0000 (09:17 +0900)]
test: introduce test_is_running_from_builddir()
(cherry picked from commit 8cb10a4f4dabc508a04f76ea55f23ef517881b61)
Resolves: #1823767
Yu Watanabe [Fri, 14 Sep 2018 06:47:42 +0000 (15:47 +0900)]
test-execute: skip several tests when running in container
(cherry picked from commit 642d1a6d6e98204ade25816bcc429cb67df92a29)
Resolves: #1823767
Yu Watanabe [Wed, 12 Sep 2018 09:18:33 +0000 (18:18 +0900)]
test-execute: also check python3 is installed or not
(cherry picked from commit 738c74d7b163ea18e3c68115c3ed8ceed166cbf7)
Resolves: #1823767
Yu Watanabe [Thu, 20 Sep 2018 07:08:38 +0000 (16:08 +0900)]
test-process-util: skip several verifications when running in unprivileged container
(cherry picked from commit 767eab47501b06327a0e6030e5c54860a3fc427f)
Resolves: #1823767
Yu Watanabe [Fri, 14 Sep 2018 06:51:04 +0000 (15:51 +0900)]
test-fs-util: skip some tests when running in unprivileged container
(cherry picked from commit 9590065f37be040996f1c2b9a246b9952fdc0c0b)
Resolves: #1823767
Yu Watanabe [Wed, 19 Sep 2018 01:54:28 +0000 (10:54 +0900)]
test: make install_keymaps() optionally install more keymaps
(cherry picked from commit ad931fee506e1313e8a520ae0ecc1c8e275d9941)
Resolves: #1823767
Yu Watanabe [Wed, 19 Sep 2018 01:54:16 +0000 (10:54 +0900)]
test: add paths of keymaps in install_keymaps()
It seems that the paths of directories storing keymaps have changed.
(cherry picked from commit 83a7051ee1edbfe8cd2278477d23083beb385409)
Resolves: #1823767
Yu Watanabe [Wed, 12 Sep 2018 18:01:42 +0000 (03:01 +0900)]
test: replace duplicated Makefile by symbolic link
(cherry picked from commit dd75c133d81f07c56c82ee4e7a80f391ffebd9ce)
Resolves: #1823767
Yu Watanabe [Wed, 12 Sep 2018 09:20:31 +0000 (18:20 +0900)]
test: introduce install_zoneinfo()
But it is not called by default.
(cherry picked from commit 7d10ec1cda8fed20c36b16d2387f529583645cda)
Resolves: #1823767
Yu Watanabe [Wed, 12 Sep 2018 09:19:45 +0000 (18:19 +0900)]
test: install libraries required by tests
(cherry picked from commit e3d3dada248c5f30e2978840ca1f0a03a4675b53)
Resolves: #1823767
Yu Watanabe [Fri, 14 Sep 2018 04:25:02 +0000 (13:25 +0900)]
test: do not use global variable to pass error
(cherry picked from commit 0013fac248a15be3acce84c17a65e3ae0377294b)
Resolves: #1823767
Lennart Poettering [Wed, 22 Jan 2020 11:04:38 +0000 (12:04 +0100)]
logind: check PolicyKit before allowing VT switch
Let's lock this down a bit. Effectively nothing much changes, since the
default PK policy will allow users on the VT to change VT. Only users
with no local VT session won't be able to switch VTs.
(cherry picked from commit 4acf0cfd2f92edb94ad48d04f1ce6c9ab4e19d55)
Resolves: #1797679
Zbigniew Jędrzejewski-Szmek [Tue, 13 Nov 2018 13:53:04 +0000 (14:53 +0100)]
udev: downgrade message when we fail to set inotify watch up
My logs are full of:
systemd-udevd[6586]: seq 13515 queued, 'add' 'block'
systemd-udevd[6586]: seq 13516 queued, 'change' 'block'
systemd-udevd[6586]: seq 13517 queued, 'change' 'block'
systemd-udevd[6586]: seq 13518 queued, 'remove' 'bdi'
systemd-udevd[6586]: seq 13519 queued, 'remove' 'block'
systemd-udevd[9865]: seq 13514 processed
systemd-udevd[9865]: seq 13515 running
systemd-udevd[9865]: GROUP 6 /usr/lib/udev/rules.d/50-udev-default.rules:59
systemd-udevd[9865]: IMPORT builtin 'blkid' /usr/lib/udev/rules.d/60-persistent-storage.rules:95
systemd-udevd[9865]: IMPORT builtin 'blkid' fails: No such file or directory
systemd-udevd[9865]: loop4: Failed to add device '/dev/loop4' to watch: No such file or directory
(the last line is at error level).
If we are too slow to set up a watch and the device is already gone by the time
we try, this is not an error.
(cherry picked from commit 7fe0d0d5c0ad5aa3f069bb282868938d414d7ad1)
Resolves: #1808051
Michal Sekletár [Fri, 27 Mar 2020 16:01:59 +0000 (17:01 +0100)]
sd-journal: remove the dead code and actually fix #14695
journal_file_fstat() returns an error if we call it on an already unlinked
journal file and hence we never reach remove_file_real() which is the
entire point.
I must have made some mistake while testing the fix that made me think
the issue was gone while the opposite was true.
Fixes #14695
(cherry picked from commit 8581b9f9732d4c158bb5f773230a65ce77f2c292)
Resolves: #1796128
Michal Sekletár [Tue, 4 Feb 2020 13:23:14 +0000 (14:23 +0100)]
sd-journal: close journal files that were deleted by journald before we've setup inotify watch
Fixes #14695
(cherry picked from commit 28ca867abdb20d0e4ac1901e2ed669cdb41ea3f6)
Related: #1796128
Anita Zhang [Sat, 25 Jan 2020 15:46:16 +0000 (16:46 +0100)]
core: transition to FINAL_SIGTERM state after ExecStopPost=
Fixes #14566
(cherry picked from commit c1566ef0d22ed786b9ecf4c476e53b8a91e67578)
Resolves: #1766479
Michal Sekletár [Tue, 14 Apr 2020 14:16:45 +0000 (16:16 +0200)]
basic: use comma as separator in cpuset cgroup cpu ranges
This is a workaround for
https://bugzilla.redhat.com/show_bug.cgi?id=1819152 and should be
reverted in RHEL-8.3.
RHEL-only
Related: #1818054
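Formatting a CPU set as comma-separated ranges, the output style named in the cpuset commit subject above, might be sketched like this. `format_cpu_ranges()` is a hypothetical name and this is not the code from basic/.

```python
from typing import Iterable, List


def format_cpu_ranges(cpus: Iterable[int]) -> str:
    """Render CPU indices as comma-separated ranges, e.g. '0-2,5'."""
    ordered = sorted(set(cpus))
    ranges: List[str] = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current range while the next index is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)
```

The only part specific to this commit is the `","` join; everything else is ordinary range compaction.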
Lennart Poettering [Thu, 9 Jan 2020 16:30:31 +0000 (17:30 +0100)]
core: fix re-realization of cgroup siblings
This is a fix-up for eef85c4a3f8054d29383a176f6cebd1ef3a15b9a which
broke this.
Tracked down by @w-simon
Fixes: #14453
(cherry picked from commit 65f6b6bdcb500c576674b5838e4cc4c35e18bfde)
Related: #1818054
Zbigniew Jędrzejewski-Szmek [Sun, 24 Nov 2019 13:14:43 +0000 (14:14 +0100)]
pid1: fix the names of AllowedCPUs= and AllowedMemoryNodes=
The original PR was submitted with CPUSetCpus and CPUSetMems, which was later
changed to AllowedCPUs and AllowedMemoryNodes everywhere (including the parser
used by systemd-run), but not in the parser for unit files.
Since we already released -rc1, let's keep support for the old names. I think
we can remove it in a release or two if anyone remembers to do that.
Fixes #14126. Follow-up for 047f5d63d7a1ab75073f8485e2f9b550d25b0772.
(cherry picked from commit 0b8d3075872a05e0449906d24421ce192f50c29f)
Related: #1818054
Lennart Poettering [Thu, 9 Aug 2018 14:26:27 +0000 (16:26 +0200)]
core: rework StopWhenUnneeded= logic
Previously, we'd act immediately on StopWhenUnneeded= when a unit state
changes. With this rework we'll maintain a queue instead: whenever
there's a chance that StopWhenUnneeded= might have an effect we enqueue
the unit, and process it later when we have nothing better to do.
This should make the implementation a bit more reliable, as the unit notify event
cannot immediately enqueue tons of side-effect jobs that might
contradict each other, but we do so only in a strictly ordered fashion,
from the main event loop.
This slightly changes the check when to consider a unit "unneeded".
Previously, we'd assume that a unit in "deactivating" state could also
be cleaned up. With this new logic we'll only consider units unneeded
that are fully up and have no job queued. This means that whenever
there's something pending for a unit we won't clean it up.
(cherry picked from commit
a3c1168ac293f16d9343d248795bb4c246aaff4a)
Resolves: #
1798046
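The queueing scheme described in the commit above can be sketched roughly as follows. This is a simplified model, not the actual systemd implementation: the Unit and Manager types, their fields, and both function names are assumptions made for illustration, and "stopping" a unit is reduced to flipping its state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { UNIT_INACTIVE, UNIT_ACTIVE, UNIT_DEACTIVATING } UnitState;

/* Hypothetical, heavily reduced stand-ins for the real structures. */
typedef struct Unit {
        UnitState state;
        bool has_job;            /* a job is queued for this unit */
        bool needed;             /* some other active unit still requires it */
        bool in_queue;           /* already waiting for the unneeded check */
        struct Unit *queue_next;
} Unit;

typedef struct Manager {
        Unit *queue_head, *queue_tail;
} Manager;

/* Called whenever a state change might make `u` unneeded. Only records the
 * unit; no job is created here. */
static void manager_enqueue_unneeded_check(Manager *m, Unit *u) {
        assert(m);
        assert(u);

        if (u->in_queue)
                return;

        u->in_queue = true;
        u->queue_next = NULL;
        if (m->queue_tail)
                m->queue_tail->queue_next = u;
        else
                m->queue_head = u;
        m->queue_tail = u;
}

/* Only units that are fully up and have no job queued qualify. */
static bool unit_is_unneeded(const Unit *u) {
        return u->state == UNIT_ACTIVE && !u->has_job && !u->needed;
}

/* Called from the main loop when idle; handles units in FIFO order and
 * returns how many were stopped. */
static unsigned manager_dispatch_unneeded_queue(Manager *m) {
        unsigned stopped = 0;
        Unit *u;

        assert(m);

        while ((u = m->queue_head)) {
                m->queue_head = u->queue_next;
                if (!m->queue_head)
                        m->queue_tail = NULL;
                u->queue_next = NULL;
                u->in_queue = false;

                if (!unit_is_unneeded(u))
                        continue;

                u->state = UNIT_INACTIVE; /* stand-in for enqueuing a stop job */
                stopped++;
        }

        return stopped;
}
```

The in_queue flag keeps a unit from being queued twice, and the check runs at dispatch time rather than at enqueue time, so a unit that has acquired a job or entered "deactivating" in the meantime is left alone.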
ven [Wed, 22 May 2019 06:24:28 +0000 (14:24 +0800)]
bus_open leak sd_event_source when udevadm trigger.
On my host, when executing udevadm trigger, I only receive the change event, which causes a memory leak.
(cherry picked from commit
b2774a3ae692113e1f47a336a6c09bac9cfb49ad)
Resolves: #
1798504
HATAYAMA Daisuke [Tue, 25 Feb 2020 18:35:50 +0000 (13:35 -0500)]
resolved: Recover missing PrivateTmp=yes and ProtectSystem=strict
Since commit b61e8046ebcb28225423fc0073183d68d4c577c4,
systemd-resolved.service often fails to start with the following message:
Failed at step NAMESPACE spawning /usr/bin/mount: Read-only file system
This is because dropping DynamicUser=yes also dropped the implicit PrivateTmp=yes and
the implicit After=systemd-tmpfiles-setup.service, so
systemd-resolved.service can start before systemd-remount-fs.service. As a
result, the mount operations associated with PrivateDevices= can be attempted on
filesystems that are still read-only.
To fix this issue, restore PrivateTmp=yes and
ProtectSystem=strict, just as upstream commit
62fb7e80fcc45a1530ed58a84980be8cfafa9b3e (Revert "resolve: enable DynamicUser=
for systemd-resolved.service") does.
Resolves: #
1810869
HATAYAMA Daisuke [Thu, 25 Jul 2019 03:54:48 +0000 (23:54 -0400)]
swap: finish the secondary swap units' jobs if deactivation of the primary swap unit fails
Currently, if deactivation of the primary swap unit fails:
# LANG=C systemctl --no-pager stop dev-mapper-fedora\\x2dswap.swap
Job for dev-mapper-fedora\x2dswap.swap failed.
See "systemctl status "dev-mapper-fedora\\x2dswap.swap"" and "journalctl -xe" for details.
then there are still the running stop jobs for all the secondary swap units
that follow the primary one:
# systemctl list-jobs
JOB UNIT TYPE STATE
3233 dev-disk-by\x2duuid-2dc8b9b1\x2da0a5\x2d44d8\x2d89c4\x2d6cdd26cd5ce0.swap stop running
3232 dev-dm\x2d1.swap stop running
3231 dev-disk-by\x2did-dm\x2duuid\x2dLVM\x2dyuXWpCCIurGzz2nkGCVnUFSi7GH6E3ZcQjkKLnF0Fil0RJmhoLN8fcOnDybWCMTj.swap stop running
3230 dev-disk-by\x2did-dm\x2dname\x2dfedora\x2dswap.swap stop running
3234 dev-fedora-swap.swap stop running
5 jobs listed.
These jobs remain forever because their JobTimeoutUSec is infinity:
# LANG=C systemctl show -p JobTimeoutUSec dev-fedora-swap.swap
JobTimeoutUSec=infinity
If this issue happens during system shutdown, the shutdown appears to
hang, and the system will be forcibly shut down or rebooted 30 minutes later
by the following configuration:
# grep -E "^JobTimeout" /usr/lib/systemd/system/reboot.target
JobTimeoutSec=30min
JobTimeoutAction=reboot-force
In the real world, this seems to happen when some service unit uses
KillMode=none: processes whose memory is swapped out are not killed
when the service unit is stopped, and the swapoff command then fails.
When swapoff succeeds, on the other hand, everything works because
the secondary jobs monitor the /proc/swaps file and can detect removal of the
corresponding swap file.
This commit fixes the issue by finishing the secondary swap units' jobs if
deactivation of the primary swap unit fails.
Fixes: #11577
(cherry picked from commit
9c1f969d40f84d5cc98d810bab8b24148b2d8928)
Resolves: #
1749622
Ryan Gonzalez [Sat, 23 Feb 2019 05:45:03 +0000 (23:45 -0600)]
cryptsetup: Treat key file errors as a failed password attempt
6f177c7dc092eb68762b4533d41b14244adb2a73 caused key file errors to immediately fail, which would make it hard to correct an issue due to e.g. a crypttab typo or a damaged key file.
Closes #11723.
(cherry picked from commit
c20db3887569e0c0d9c0e2845c5286e7edf0133a)
Related: #
1763155
Frantisek Sumsal [Tue, 3 Mar 2020 14:54:29 +0000 (15:54 +0100)]
test: replace cursor file with a plain cursor
systemd in RHEL 8 doesn't support the --cursor-file option, so let's
fall back to a plain cursor string
Related: #
1808940
rhel-only
Frantisek Sumsal [Sat, 10 Aug 2019 14:05:07 +0000 (16:05 +0200)]
test: drop the missed || exit 1 expression
...as we've already done in the rest of the testsuite, see
cc469c3dfc398210f38f819d367e68646c71d8da
(cherry picked from commit
67c434b03f8a24f5350f017dfb4b2464406046db)
Related: #
1808940
Frantisek Sumsal [Mon, 5 Aug 2019 12:38:45 +0000 (14:38 +0200)]
test: add a simple sanity check for systems without NUMA support
(cherry picked from commit
92f8e978923f962a57d744c5f358520ac06f7892)
Related: #
1808940
Frantisek Sumsal [Mon, 22 Jul 2019 22:56:04 +0000 (00:56 +0200)]
test: give strace some time to initialize
The `coproc` implementation seems to be a little bit different in older
bash versions, so `strace` is sometimes started AFTER `systemctl
daemon-reload`, which causes unexpected failures. Let's help it a little by
sleeping for a bit.
(cherry picked from commit
c7367d7cfdfdcec98f8659f0ed3f1d7b77123903)
Related: #
1808940
Frantisek Sumsal [Tue, 2 Jul 2019 07:52:45 +0000 (09:52 +0200)]
test: skip the test on systems without NUMA support
(cherry picked from commit
b030847163e9bd63d3dd6eec6ac7f336411faba6)
Related: #
1808940
Frantisek Sumsal [Mon, 1 Jul 2019 17:53:45 +0000 (19:53 +0200)]
test: make sure the strace process is indeed dead
It may take a few moments for the strace process to properly terminate
and write all logs to the backing storage
(cherry picked from commit
56425e54a2140f47b4560b51c5db08aa2de199a6)
Related: #
1808940
Frantisek Sumsal [Mon, 1 Jul 2019 11:08:26 +0000 (13:08 +0200)]
test: support MPOL_LOCAL matching in unpatched strace versions
The MPOL_LOCAL constant is not recognized in current strace versions.
Let's match at least the numerical value of this constant until the
strace patch is approved & merged.
(cherry picked from commit
ac14396d027023e1be910327989cb422cb2f6724)
Related: #
1808940
Frantisek Sumsal [Mon, 1 Jul 2019 07:27:59 +0000 (09:27 +0200)]
test: replace `tail -f` with journal cursor which should be...
more reliable
(cherry picked from commit
d0b2178f3e79f302702bd7140766eee03643f734)
Related: #
1808940
Frantisek Sumsal [Tue, 25 Jun 2019 21:01:40 +0000 (23:01 +0200)]
test: introduce TEST-36-NUMAPOLICY
(cherry picked from commit
8f65e26508969610ac934d1aadbade8223bfcaac)
Related: #
1808940
Michal Sekletár [Tue, 3 Mar 2020 10:45:00 +0000 (11:45 +0100)]
cgroup: make sure that cpuset is supported on cgroup v2 and disabled with v1
Resolves: #
1808940
(rhel-only)
Franck Bui [Wed, 2 Oct 2019 09:58:16 +0000 (11:58 +0200)]
pid1: fix DefaultTasksMax initialization
Otherwise DefaultTasksMax is always set to "infinity".
This was broken by fb39af4ce42.
(cherry picked from commit
c0000de87d2c7934cb1f4ba66a533a85277600ff)
Resolves: #
1809037
Pavel Hrdina [Mon, 29 Jul 2019 15:50:05 +0000 (17:50 +0200)]
cgroup: introduce support for cgroup v2 CPUSET controller
Introduce support for configuring cpus and mems for processes using
cgroup v2 CPUSET controller. This allows users to limit which cpus
and memory NUMA nodes can be used by processes to better utilize
system resources.
The cgroup v2 interfaces to control it are cpuset.cpus and cpuset.mems
where the requested configuration is written. However, it doesn't mean
that the requested configuration will be actually used as parent cgroup
may limit the cpus or mems as well. In order to reflect the real
configuration cgroup v2 provides read-only files cpuset.cpus.effective
and cpuset.mems.effective which are exported to users as well.
(cherry picked from commit
047f5d63d7a1ab75073f8485e2f9b550d25b0772)
Related: #
1724617
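The relationship between the requested and the effective cpuset can be modelled with a small sketch. It is a simplification based on the cgroup v2 documentation: hotplug and partition roots are ignored, CPUs are held in a 64-bit mask, and the function name cpuset_effective() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of cpuset.cpus.effective for one cgroup.
 * `requested` is what was written to cpuset.cpus (0 means nothing was
 * written); `parent_effective` is the parent's cpuset.cpus.effective. */
static uint64_t cpuset_effective(uint64_t requested, uint64_t parent_effective) {
        uint64_t granted;

        /* A parent always has at least one CPU available. */
        assert(parent_effective != 0);

        /* A child can never get more than its parent actually has. */
        granted = requested & parent_effective;

        /* Nothing requested, or nothing of the request can be granted:
         * fall back to whatever the parent makes available. */
        return granted != 0 ? granted : parent_effective;
}
```

For example, requesting CPUs 0-7 (mask 0xFF) under a parent limited to CPUs 0-3 (mask 0x0F) yields an effective mask of 0x0F, which is why the requested and the effective values can differ.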
Lennart Poettering [Wed, 20 Mar 2019 19:19:38 +0000 (20:19 +0100)]
core: imply NNP and SUID/SGID restriction for DynamicUser=yes service
Let's be safe, rather than sorry. This way DynamicUser=yes services can
neither benefit from nor create SUID/SGID binaries.
Given that DynamicUser= is only a recent addition, we should be able to
get away with turning this on, even though this is, strictly speaking, a
binary compatibility breakage.
(cherry picked from commit
bf65b7e0c9fc215897b676ab9a7c9d1c688143ba)
Resolves: #
1687512
Lennart Poettering [Wed, 20 Mar 2019 18:52:20 +0000 (19:52 +0100)]
units: turn on RestrictSUIDSGID= in most of our long-running daemons
(cherry picked from commit
62aa29247c3d74bcec0607c347f2be23cd90675d)
Related: #
1687512
Lennart Poettering [Wed, 20 Mar 2019 18:45:32 +0000 (19:45 +0100)]
man: document the new RestrictSUIDSGID= setting
(cherry picked from commit
7445db6eb70e8d5989f481d0c5a08ace7047ae5b)
Related: #
1687512
Lennart Poettering [Wed, 20 Mar 2019 18:20:35 +0000 (19:20 +0100)]
analyze: check for RestrictSUIDSGID= in "systemd-analyze security"
And let's give it a high weight, since it pretty much can be used for
bad things only.
(cherry picked from commit
9d880b70ba5c6ca83c82952f4c90e86e56c7b70c)
Related: #
1687512
Lennart Poettering [Wed, 20 Mar 2019 18:09:09 +0000 (19:09 +0100)]
core: expose SUID/SGID restriction as new unit setting RestrictSUIDSGID=
(cherry picked from commit
f69567cbe26d09eac9d387c0be0fc32c65a83ada)
Related: #
1687512
Jan Synacek [Tue, 12 Nov 2019 12:27:49 +0000 (13:27 +0100)]
test: add test case for restrict_suid_sgid()
(cherry picked from commit
167fc10cb352b04d442c9010dab4f8dc24219749)
Related: #
1687512
Lennart Poettering [Wed, 20 Mar 2019 18:00:28 +0000 (19:00 +0100)]
seccomp: introduce seccomp_restrict_suid_sgid() for blocking chmod() for suid/sgid files
(cherry picked from commit
3c27973b13724ede05a06a5d346a569794cda433)
Related: #
1687512
Lennart Poettering [Thu, 11 Oct 2018 16:31:11 +0000 (18:31 +0200)]
main: introduce a define HIGH_RLIMIT_MEMLOCK similar to HIGH_RLIMIT_NOFILE
(cherry picked from commit
c8884aceefc85245b9bdfb626e2daf27521259bd)
Related: #
1789930
Jan Synacek [Wed, 12 Feb 2020 11:58:54 +0000 (12:58 +0100)]
pid1: make sure to restore correct default values for some rlimits
Commit fb39af4ce42d7ef9af63009f271f404038703704 forgot to restore the default
rlimit values (RLIMIT_NOFILE and RLIMIT_MEMLOCK) while PID1 is reloading.
This patch extracts the code in charge of initializing the default values for
those rlimits in order to create dedicated functions, which take care of their
initialization.
These functions are then called in parse_configuration() so we make sure that
the default values for these rlimits get restored every time PID1 is reloading
its configuration.
(cherry picked from commit
a9fd4cd1206832a61aaf61fff583bb133e6cb965)
Resolves: #
1789930
Lennart Poettering [Thu, 17 Jan 2019 17:31:59 +0000 (18:31 +0100)]
sd-bus: use "queue" message references for managing r/w message queues in connection objects
Let's make use of the new concept the previous commit added.
See: #4846
(cherry picked from commit
c1757a70eac0382c4837a3833d683919f6a48ed7)
Related: CVE-2020-1712
Yu Watanabe [Tue, 28 May 2019 09:07:01 +0000 (18:07 +0900)]
journal: use cleanup attribute at one more place
(cherry picked from commit
627df1dc42b68a74b0882b06366d1185b1a34332)
Conflicts:
src/journal/journald-server.c
Related: #
1788085
Yu Watanabe [Tue, 28 May 2019 03:40:17 +0000 (12:40 +0900)]
journal: do not trigger assertion when journal_file_close() gets NULL
We generally expect destructors to not complain if a NULL argument is passed.
Closes #12400.
(cherry picked from commit
c377a6f3ad3d9bed4ce7e873e8e9ec6b1650c57d)
Resolves: #
1788085
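The convention mentioned above, a destructor that quietly accepts NULL, can be sketched as follows. The Thing type and both functions are hypothetical and only illustrate the pattern; they are not taken from the journal code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object that owns a heap buffer. */
typedef struct Thing {
        int *data;
        size_t n;
} Thing;

static Thing *thing_new(size_t n) {
        Thing *t;

        assert(n > 0);

        t = calloc(1, sizeof(Thing));
        if (!t)
                return NULL;

        t->data = calloc(n, sizeof(int));
        if (!t->data) {
                free(t);
                return NULL;
        }

        t->n = n;
        return t;
}

/* Destructor: a NULL argument is a no-op rather than an assertion failure.
 * Always returns NULL so callers can write `t = thing_free(t);`. */
static Thing *thing_free(Thing *t) {
        if (!t)
                return NULL;

        free(t->data);
        free(t);
        return NULL;
}
```

With this shape, error paths can call the destructor unconditionally, and cleanup helpers do not need to special-case objects that were never allocated.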
Michal Sekletár [Mon, 6 Jan 2020 11:30:58 +0000 (12:30 +0100)]
sysctl: let's by default increase the numeric PID range from 2^16 to 2^22
This should make PID collisions a tiny bit less likely, and thus improve
security and robustness.
2^22 isn't particularly a lot either, but it's the current kernel
limitation.
Bumping this limit was suggested by Linus himself:
https://lwn.net/ml/linux-kernel/CAHk-=wiZ40LVjnXSi9iHLE_-ZBsWFGCgdmNiYZUXn1-V5YBg2g@mail.gmail.com/
Let's experiment with this in systemd upstream first. Downstreams and
users can after all still comment this out easily.
Besides compat concerns, the most often heard issue with such high PIDs is
usability, since they are potentially hard to type. I am not entirely sure though
whether 4194304 (as largest new PID) is that much worse to type or to
copy than 65536.
This should also simplify management of per-system task limits, as with
this move the sysctl /proc/sys/kernel/threads-max becomes the primary
knob to control how many processes to have in parallel.
Resolves: #
1744214
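The numbers in the commit above can be checked with a short sketch. The helper names are made up for illustration. Assuming kernel.pid_max acts as an exclusive upper bound, the largest PID actually handed out would be one less than the value computed here.

```c
#include <assert.h>

/* Number of distinct values in a PID space that is `bits` wide. */
static unsigned long pid_range_size(unsigned bits) {
        assert(bits < sizeof(unsigned long) * 8);
        return 1UL << bits;
}

/* Number of decimal digits needed to type `v`. */
static unsigned decimal_digits(unsigned long v) {
        unsigned n = 1;

        while (v >= 10) {
                v /= 10;
                n++;
        }

        return n;
}
```

pid_range_size(16) is 65536 and pid_range_size(22) is 4194304, so the range grows by a factor of 64 while the longest PID grows from five to seven decimal digits.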