Limiting CPU and memory on login nodes with cgroups

1. Background

The business cluster provides several Linux servers for users to log in to. Besides ordinary file operations, users can compile code and submit jobs for remote computation. However, some users, deliberately or without realizing it, run computations locally on the login node itself, consuming its CPU and memory. Because the login node is a shared resource, this harms both our business system and other users.

Therefore, the resources available to users urgently need to be limited. The mechanism used for this is cgroups (control groups), a Linux kernel feature.

2. Operation

2.1 Check if cgroup is available

Log in to the node and run service cgconfig status to check that the cgroup service has been started and that the mount point is the /cgroup directory.
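
As a complement to service cgconfig status, the mount points can also be read directly from /proc/mounts. The following is a minimal sketch, not part of the original setup: the helper name cgroup_mountpoint is hypothetical, and it assumes a cgroup v1 layout in which each subsystem appears as a mount option of a cgroup filesystem.

```shell
#!/bin/bash
# Hypothetical helper: print the mount point of a cgroup v1 subsystem.
# Usage: cgroup_mountpoint <subsystem> [mounts_file]
cgroup_mountpoint() {
    local subsys=$1
    local mounts=${2:-/proc/mounts}
    # /proc/mounts fields: device mountpoint fstype options dump pass
    awk -v s="$subsys" '
        $3 == "cgroup" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == s) { print $2; exit }
        }' "$mounts"
}

# Example (run on the login node):
#   cgroup_mountpoint cpu
#   cgroup_mountpoint memory
```

If the helper prints nothing for cpu or memory, that subsystem is not mounted and the script in the next section would have nowhere to write its limits.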

2.2 Configure Restriction Script

Because there are too many logged-in users to configure one by one, the limits are applied directly to the users' processes, which are all placed into one shared cgroup. First create a restriction script as follows:

#!/bin/bash

# 1. Create a per-host child cgroup under the cpu and memory subsystems
CPU_CGROUP=/cgroup/cpu/$(hostname)
MEM_CGROUP=/cgroup/memory/$(hostname)

mkdir -p "${CPU_CGROUP}" "${MEM_CGROUP}"

# 2. Control files to write
CPU_LIMIT=${CPU_CGROUP}/cpu.cfs_quota_us
MEM_SW_LIMIT=${MEM_CGROUP}/memory.memsw.limit_in_bytes
MEM_LIMIT=${MEM_CGROUP}/memory.limit_in_bytes
CPU_TASK=${CPU_CGROUP}/tasks
MEM_TASK=${MEM_CGROUP}/tasks

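# 3. Set the limits
# 6 cores: 600000 us of CPU time per period (cpu.cfs_period_us defaults to 100000 us)
CPUNUM=600000
# 10 GiB of memory; memory+swap is set to the same value, so no additional swap can be used
#MEMNUM=31897681920
MEMNUM=10737418240
MEMSWNUM=10737418240
echo ${CPUNUM} > ${CPU_LIMIT}
# memory.limit_in_bytes must be written before memory.memsw.limit_in_bytes
echo ${MEMNUM} > ${MEM_LIMIT}
echo ${MEMSWNUM} > ${MEM_SW_LIMIT}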

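# 4. Move every process of each logged-in non-root user into both cgroups
for user in $(/usr/bin/who | awk '$1 != "root" {print $1}' | sort -u)
do
	for pid in $(/usr/bin/pgrep -u "${user}")
	do
		# Write errors (for example, the process has already exited) are discarded
		echo "${pid}" >> "${CPU_TASK}" 2>/dev/null
		echo "${pid}" >> "${MEM_TASK}" 2>/dev/null
	done
done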

The script above does three things:

  • Sets the available CPU resources
  • Sets the available memory and the combined memory-plus-swap limit
  • Iterates over the PIDs of each logged-in user's processes and adds them to the CPU and memory cgroups
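
The two numeric limits in the script follow from simple arithmetic. The CPU quota is the number of cores multiplied by the CFS period, assumed here to be the kernel default of 100000 microseconds since the script never writes cpu.cfs_period_us. The memory limit is the size in GiB multiplied by 1024^3. A small sketch with hypothetical helper names:

```shell
#!/bin/bash
# Hypothetical helpers for deriving the values of CPUNUM and MEMNUM.

# cores_to_quota <cores> [period_us]
# Quota in microseconds of CPU time allowed per scheduling period.
cores_to_quota() {
    local cores=$1
    local period=${2:-100000}   # assumed default of cpu.cfs_period_us
    echo $(( cores * period ))
}

# gib_to_bytes <gib>
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

cores_to_quota 6    # prints 600000
gib_to_bytes 10     # prints 10737418240
```

To allow a different number of cores or a different amount of memory, recompute the values this way and substitute them for CPUNUM, MEMNUM and MEMSWNUM.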

2.3 Create a scheduled task

Run crontab -e as root on the login node and add the following entry, which runs the script every minute:

*/1 * * * * /bin/bash /root/cgroup_limit.sh

3. Test

Test with a compute-intensive script and a memory-intensive script separately to check the result of the settings: the CPU usage of the restricted processes is capped at 6 cores in total, and once their memory usage exceeds 10 GiB the OOM killer is triggered.
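
One way to confirm that a test process was actually picked up by the script is to read /proc/<pid>/cgroup, which on cgroup v1 lists one hierarchy-ID:subsystems:path line per hierarchy. The sketch below uses a hypothetical helper, pid_cgroup_path. The expectation that the reported path is / followed by the host name is an assumption derived from the directory names used in the script.

```shell
#!/bin/bash
# Hypothetical helper: print the cgroup path of a process for one subsystem.
# Usage: pid_cgroup_path <subsystem> [cgroup_file]
pid_cgroup_path() {
    local subsys=$1
    local file=${2:-/proc/self/cgroup}
    # Each line has the form  hierarchy-ID:subsys1,subsys2:/path
    awk -F: -v s="$subsys" '
        {
            n = split($2, names, ",")
            for (i = 1; i <= n; i++)
                if (names[i] == s) { print $3; exit }
        }' "$file"
}

# Example (on the login node, for a test process with PID 12345):
#   pid_cgroup_path cpu    /proc/12345/cgroup
#   pid_cgroup_path memory /proc/12345/cgroup
```

A process that still reports / for both subsystems has not been moved yet, which can happen for up to a minute after it starts because the script only runs from cron.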

Posted by tcollie on Mon, 30 May 2022 14:32:59 +0530