CGroups stands for Control Groups. Google engineers started developing them in 2006, and they were merged into the Linux kernel in version 2.6.24 (released in January 2008), to restrict the resources used by a process or a group of processes. Each type of resource has its own resource controller, also called a CGroup subsystem.
Here is the list of the available resource controllers:
- blkio: sets limits on input/output access to and from block devices (see BlockIOWeight);
- cpu: uses the CPU scheduler to give CGroup tasks access to the CPU. It is mounted together with the cpuacct controller on the same mount (see CPUShares);
- cpuacct: creates automatic reports on CPU resources used by tasks in a CGroup. It is mounted together with the cpu controller on the same mount (see CPUShares);
- cpuset: assigns individual CPUs (on a multicore system) and memory nodes to tasks in a CGroup;
- devices: allows or denies access to devices for tasks in a CGroup;
- freezer: suspends or resumes tasks in a CGroup;
- memory: sets limits on memory use by tasks in a CGroup, and generates automatic reports on memory resources used by those tasks (see MemoryLimit);
- net_cls: tags network packets with a class identifier (classid) that allows the Linux traffic controller (the tc command) to identify packets originating from a particular CGroup task;
- perf_event: enables monitoring CGroups with the perf tool;
- hugetlb: allows tasks to use large virtual memory pages and enforces resource limits on these pages.
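To check which of these controllers a given kernel actually enables, the kernel exposes a table in /proc/cgroups. The sketch below assumes the usual four-column layout of that file (name, hierarchy, number of groups, enabled flag); list_enabled_controllers is a hypothetical helper name, not a system command.

```shell
# Print the names of the enabled controllers.
# $1: file to read (optional, defaults to /proc/cgroups).
# Assumed columns: subsys_name, hierarchy, num_cgroups, enabled.
list_enabled_controllers() {
    # Skip the "#..." header line and keep rows whose 4th column is 1.
    awk '!/^#/ && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}

# Usage on a live system: list_enabled_controllers
```

Passing a file name as the first argument makes it possible to inspect a copy of the table taken from another machine.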
CGroups were already available in RHEL 6. However, with the arrival of Systemd in RHEL 7, many things have changed.
Systemd organizes processes in control groups. For example, all the processes started by an Apache web server will be in the same control group, CGI scripts included. This makes stopping an Apache web server much easier. It also moves the resource management settings from the process level to the application level by binding the CGroup hierarchies to the Systemd unit tree.
The Systemd unit tree is made up of several parts:
- at the top, there is the root slice called -.slice,
- below, there are the system.slice (the default place for all system services), the user.slice (the default place for all user sessions) and the machine.slice (the default place for all virtual machines and Linux containers),
- further down, there are scopes (groups of externally created processes started via fork) and services (groups of processes created through a unit file).
For example, to get the full hierarchy of control groups, type:
# systemd-cgls
├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 24
├─user.slice
│ └─user-0.slice
│   ├─session-56.scope
│   │ ├─19679 sshd: root@pts/1
│   │ ├─19683 -bash
│   │ ├─19714 systemd-cgls
│   │ └─19715 less
│   └─session-40.scope
│     ├─19370 sshd: root@pts/0
│     └─19374 -bash
└─system.slice
  ├─httpd.service
  │ ├─2577 /usr/sbin/httpd -DFOREGROUND
  │ ├─2578 /usr/sbin/httpd -DFOREGROUND
  │ └─2579 /usr/sbin/httpd -DFOREGROUND
  ├─polkit.service
  │ └─730 /usr/lib/polkit-1/polkitd --no-debug
  ├─systemd-udevd.service
  │ └─455 /usr/lib/systemd/systemd-udevd
  ├─lvm2-lvmetad.service
  │ └─450 /usr/sbin/lvmetad -f
  ├─systemd-journald.service
  │ └─449 /usr/lib/systemd/systemd-journald
  ├─dbus.service
  │ └─611 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --sy
  ├─systemd-logind.service
  │ └─604 /usr/lib/systemd/systemd-logind
  ├─chronyd.service
  │ └─613 /usr/sbin/chronyd -u chrony
  ├─crond.service
  │ └─621 /usr/sbin/crond -n
  ├─postfix.service
  │ ├─ 1349 /usr/libexec/postfix/master -w
  │ ├─ 1358 qmgr -l -t unix -u
  │ └─19596 pickup -l -t unix -u
  ├─rsyslog.service
  │ └─589 /usr/sbin/rsyslogd -n
  ├─sshd.service
  │ └─1068 /usr/sbin/sshd -D
  ├─tuned.service
  │ └─583 /usr/bin/python -Es /usr/sbin/tuned -l -P
  ├─firewalld.service
  │ └─580 /usr/bin/python -Es /usr/sbin/firewalld --nofork --nopid
  ├─NetworkManager.service
  │ └─698 /usr/sbin/NetworkManager --no-daemon
  ├─system-getty.slice
  │ └─getty@tty1.service
  │   └─631 /sbin/agetty --noclear tty1
  └─system-serial\x2dgetty.slice
    └─serial-getty@ttyS0.service
      └─630 /sbin/agetty --keep-baud 115200 38400 9600 ttyS0
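The nesting seen in this output (user-0.slice under user.slice, system-getty.slice under system.slice) follows the Systemd naming convention in which each dash in a slice name adds one level to the tree. Here is a minimal sketch of that rule; slice_path is a hypothetical helper name, not a Systemd tool, and the root slice is simply returned unchanged.

```shell
# Expand a slice unit name into its position in the unit tree.
# Runs in a subshell, so the IFS and "set -f" changes do not leak out.
slice_path() (
    set -f                          # no filename expansion on the split words
    case $1 in
        -.slice) printf '%s\n' "$1"; exit 0 ;;   # root slice: nothing to expand
    esac
    name=${1%.slice}                # strip the .slice suffix
    IFS=-                           # split the remaining name on dashes
    path=
    prefix=
    for part in $name; do
        prefix=${prefix:+$prefix-}$part          # e.g. "user", then "user-0"
        path=${path:+$path/}$prefix.slice        # one parent slice per component
    done
    printf '%s\n' "$path"
)

slice_path user-0.slice
slice_path 'system-serial\x2dgetty.slice'
```

An escaped dash (\x2d) is part of the name rather than a separator, which is why the serial getty slice sits directly under system.slice.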
To get the list of control groups ordered by CPU, memory and disk I/O load, type:
# systemd-cgtop
Path                                  Tasks   %CPU   Memory  Input/s Output/s
/                                       213    3.9   829.7M        -        -
/system.slice                             1      -        -        -        -
/system.slice/ModemManager.service        1      -        -        -        -
To kill all the processes associated with an Apache server (CGI scripts included), type:
# systemctl kill httpd
Note: Add the -s option to specify the signal to send (for example SIGTERM, SIGINT or SIGSTOP; the default is SIGTERM).
Systemd Resource Controllers
Through Systemd, several resources can be restricted:
- CPUShares: by default at 1024,
- MemoryLimit: by default without limit, value expressed in bytes or with a K, M, G or T suffix (for example 500M or 1G),
- BlockIOWeight: not set by default, a relative weight (not an absolute limit) between 10 and 1000.
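To see what a MemoryLimit value amounts to in bytes, the sketch below converts a suffixed size, assuming the base-1024 interpretation of the K, M, G and T suffixes used by Systemd. memory_limit_bytes is a hypothetical helper name used only for illustration.

```shell
# Convert a size such as 500M or 1G into bytes (base 1024 assumed).
memory_limit_bytes() {
    case $1 in
        *K) mult=1024 ;;
        *M) mult=1048576 ;;            # 1024^2
        *G) mult=1073741824 ;;         # 1024^3
        *T) mult=1099511627776 ;;      # 1024^4
        *)  mult=1 ;;                  # plain number: already in bytes
    esac
    number=${1%[KMGT]}                 # drop the suffix, if any
    echo $(( $number * $mult ))
}

memory_limit_bytes 500M
memory_limit_bytes 1G
```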
The RHEL 7.2 release brings three new service options:
- StartupCPUShares and StartupBlockIOWeight: they work like CPUShares and BlockIOWeight but only apply during system startup.
- CPUQuota: it restricts CPU time to the specified percentage, even if the machine is otherwise idle.
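Unlike CPUShares, which only matter when services compete, CPUQuota is a hard cap on CPU time per scheduling period. The sketch below shows the arithmetic, assuming a period of 100000 microseconds (100 ms); that period and the helper name cpu_quota_us are assumptions made for illustration.

```shell
# Convert a CPUQuota percentage into microseconds of CPU time per period.
# $1: percentage, with or without a trailing % sign.
# $2: period in microseconds (assumed default: 100000, i.e. 100 ms).
cpu_quota_us() {
    pct=${1%"%"}                   # strip a trailing % if present
    period=${2:-100000}
    echo $(( $pct * $period / 100 ))
}

cpu_quota_us 20%     # a fifth of one CPU
cpu_quota_us 150%    # one and a half CPUs
```

A value above 100% only makes sense on a machine with more than one CPU.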
To put resource limits on a service (here 500 CPUShares), type:
# systemctl set-property httpd CPUShares=500
# systemctl daemon-reload
Note1: The change is persistent: it is written to disk and survives a reboot. Use the --runtime option to apply it only until the next reboot.
Note2: By default, each service owns 1024 CPUShares. Nothing prevents you from giving a smaller or bigger value.
To get the current CPUShares service value, type:
# systemctl show -p CPUShares httpd
CPUShares=500
# systemctl show httpd | grep CPUShares
CPUShares=500
Note: Each time a resource limit is set on a service, a directory named after the unit with the .d suffix is created in /etc/systemd/system. For example, in the previous case, a directory named /etc/systemd/system/httpd.service.d is created, containing a file called 90-CPUShares.conf with the following content:
[Service]
CPUShares=500
Note: The newly created directory (here /etc/systemd/system/httpd.service.d) can also be used to customize the service configuration file.
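The same drop-in layout can be produced by hand when customizing a service. The sketch below writes one [Service] property into a drop-in file; write_dropin and its base-directory parameter are hypothetical, and the 90- prefix simply mirrors the file name mentioned above.

```shell
# Write a drop-in file that sets one [Service] property for a unit.
# $1: base directory (/etc/systemd/system on a real host)
# $2: unit name, $3: property name, $4: property value
write_dropin() {
    dir=$1/$2.d
    mkdir -p "$dir" || return 1
    printf '[Service]\n%s=%s\n' "$3" "$4" > "$dir/90-$3.conf"
    printf '%s\n' "$dir/90-$3.conf"    # report the file that was written
}

# Usage (as root): write_dropin /etc/systemd/system httpd.service CPUShares 500
```

After a manual change under /etc/systemd/system, run systemctl daemon-reload so that Systemd picks up the new file.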
Also, if you need to use RT (Real-Time) services, be ready to apply additional RT configurations.
To better understand CGroups, let’s take an example. You want to run a website but you only have one server.
You plan to use the classical LAMP stack (Linux, here CentOS 7, Apache, MariaDB and PHP).
Your server has 4 gigabytes of memory and you want to allocate resources as follows:
- Apache service (here httpd.service): 40% of CPU, 500M of memory,
- PHP service (here php-fpm.service): 30% of CPU, 1G of memory,
- MariaDB service (here mariadb.service): 30% of CPU, 1G of memory.
This leaves about 1.5G of memory for the other processes (system, etc).
Note1: The values given are only for the sake of the discussion.
Note2: If you don’t configure CGroups, everything will work as in RHEL 6: all the processes will share the CPU and the memory of the server as they need.
As all your LAMP services are started from a Systemd unit file, they will be placed in the system.slice.
Here is the configuration to set up with the systemctl set-property command:
- Apache service: CPUShares=4096 (4 x 1024); MemoryLimit=500M,
- PHP service: CPUShares=3072 (3 x 1024); MemoryLimit=1G,
- MariaDB service: CPUShares=3072 (3 x 1024); MemoryLimit=1G.
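The configuration above can be turned into commands with a small dry-run helper, sketched below. It only prints the systemctl set-property invocations and changes nothing; plan_commands is a hypothetical function name, and the unit names and values are the ones from this example.

```shell
# Print one systemctl set-property command per service of the plan.
plan_commands() {
    while read -r unit shares memory; do
        [ -n "$unit" ] || continue     # ignore blank lines
        printf 'systemctl set-property %s CPUShares=%s MemoryLimit=%s\n' \
            "$unit" "$shares" "$memory"
    done <<'EOF'
httpd.service 4096 500M
php-fpm.service 3072 1G
mariadb.service 3072 1G
EOF
}

plan_commands
```

Once the printed commands have been reviewed, they can be applied as root with plan_commands | sh.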
Note1: When all three services compete for the CPU, the Apache service gets 4096/(4096+3072+3072) = 40% of the CPU time, the PHP service gets 3072/(4096+3072+3072) = 30%, and so does the MariaDB service.
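The proportions can be checked with a few lines of shell. This is a minimal sketch in which share_percent is a hypothetical helper: it treats its arguments as the CPUShares of the services currently competing for the CPU and prints each one as a percentage of the total.

```shell
# Print each CPUShares value as a percentage of the sum of all arguments.
share_percent() {
    total=0
    for s in "$@"; do
        total=$(( $total + $s ))
    done
    for s in "$@"; do
        awk -v s="$s" -v t="$total" 'BEGIN { printf "%.1f%%\n", 100 * s / t }'
    done
}

share_percent 4096 3072 3072
```

Adding a fourth argument of 1024 models one busy service left at its default value: Apache’s part then drops from 40.0% to about 36.4%.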
Note2: There are some other services in the system.slice (crond, postfix, chronyd, etc). But, as they are not very hungry, they will not consume their default allocated CPU resources (1024) and will not noticeably change the situation. However, even when the Apache, MariaDB and PHP services use all their CPU resources, because of the way CPUShares work, there will still be some resources left for the other services.
Caution: Once you set up a CPUShares restriction on one service in the system.slice, CPUShares are activated for all the services there: even if you don’t specify anything, every newly started service will be restricted to 1024 CPUShares by default. It is not possible to CPU-restrict some services and leave the others unrestricted. For a detailed explanation of the mechanism, see the All control groups belong to us! video in the Additional Resources section below.
Additional Resources
On this topic you can also:
- listen to the CGroups (7min/2014) recording for the full CGroups history,
- watch the All control groups belong to us! (55min/2013) video to get some explanations from Systemd’s creators,
- watch Georgios Magklaras’ demo (24min/2014),
- look at Andy Grimm’s Introduction to CGroups (60min/2014) for an explanation of how CPUShares are computed for child groups,
- read Radoslaw Kujawa’s blog posts about CGroups: RHEL/CentOS 7 service resource management with cgroups, and RHEL/CentOS 7 run-time and session resource management with cgroups,
- read this Red Hat article about Controlling resources with cgroups for performance testing,
- read the official documentation: RHEL 7 Resource Management Guide,
- watch Lennart Poettering’s Systemd.conf 2015 presentation (45min/2015).