AMD has optimized Linux 5.20 for excellent performance
Just a few lines of code: AMD has optimized Linux 5.20 for excellent performance
According to Phoronix, a patch recently submitted by AMD further tunes how the Linux kernel's scheduler handles imbalance across NUMA nodes.
For some workloads, the scheduler patch can significantly improve performance on AMD Zen-based systems, and it also helps on Intel Xeon servers.
The gist of the patch: the fair scheduler now takes CPU affinity into account when a NUMA imbalance is allowed in the find_idlest_group() function. AMD engineer K Prateek Nayak explained:
For systems with multiple LLCs (last-level caches) per socket, such as AMD Zen systems, users want to spread bandwidth-hungry applications across multiple LLCs. Stream is one such representative workload, where optimal performance is achieved by limiting it to one Stream thread per LLC.
To ensure this, users are known to pin the tasks to a specified subset of CPUs, consisting of one CPU per LLC, while running such bandwidth-hungry tasks.
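As an illustration of the pinning described above, here is a minimal Python sketch that picks one CPU per LLC and restricts the current process to that set. The topology dictionary is a made-up example, and the helper names `one_cpu_per_llc` and `pin_current_process` are hypothetical; only `os.sched_getaffinity` and `os.sched_setaffinity` are real calls, and they are available on Linux only.

```python
import os

# Hypothetical topology for illustration: 4 LLCs with 4 CPUs each.
# This layout is an assumption, not data from the article.
EXAMPLE_TOPOLOGY = {
    0: [0, 1, 2, 3],
    1: [4, 5, 6, 7],
    2: [8, 9, 10, 11],
    3: [12, 13, 14, 15],
}


def one_cpu_per_llc(llc_to_cpus):
    """Return a set holding the lowest-numbered CPU of each non-empty LLC."""
    return {min(cpus) for cpus in llc_to_cpus.values() if cpus}


def pin_current_process(cpus):
    """Restrict the calling process to `cpus` (Linux only).

    Only CPUs that are already in the process's affinity mask are used.
    Returns the affinity set that is in effect afterwards.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("CPU affinity is not supported on this platform")
    target = set(cpus) & os.sched_getaffinity(0)
    if not target:
        raise ValueError("none of the requested CPUs are available")
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    chosen = one_cpu_per_llc(EXAMPLE_TOPOLOGY)
    print(sorted(chosen))  # [0, 4, 8, 12]
```

In practice the same effect is usually achieved from the shell with taskset or numactl, and the CPU list has to match the machine's real LLC layout rather than the example mapping used here.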
We can detect and avoid this pile-up of tasks on the few allowed CPUs by checking whether the number of allowed CPUs in the local group is smaller than the number of tasks running in the local group, and use this information to spread the tasks out to the next socket (after all, the goal of this slow path is to find the idlest group and the idlest CPU during the initial placement).
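The check described above can be modeled with a small userspace simulation. This is only a sketch of the idea, not the kernel code: the function names, the two-group layout, and the rule of counting the task being placed as one extra running task are all assumptions made for illustration.

```python
def spare_allowed_cpus(group_cpus, allowed_cpus, running):
    """Allowed CPUs in the group minus the tasks already running there."""
    return len(set(group_cpus) & set(allowed_cpus)) - running


def place_tasks(groups, allowed_cpus, num_tasks, local=0):
    """Place tasks one at a time and return the task count per group.

    A task stays in the local group while that group still has an
    allowed CPU to spare; otherwise it goes to the group with the most
    spare allowed CPUs (a simplified stand-in for "the idlest group").
    """
    running = [0] * len(groups)
    for _ in range(num_tasks):
        if spare_allowed_cpus(groups[local], allowed_cpus, running[local]) >= 1:
            target = local
        else:
            target = max(
                range(len(groups)),
                key=lambda g: spare_allowed_cpus(groups[g], allowed_cpus, running[g]),
            )
        running[target] += 1
    return running


if __name__ == "__main__":
    # Hypothetical machine: two sockets with 8 CPUs each, and a task
    # pinned to 4 CPUs on each socket.
    sockets = [range(0, 8), range(8, 16)]
    pinned = {0, 2, 4, 6, 8, 10, 12, 14}
    print(place_tasks(sockets, pinned, 8))  # [4, 4]
```

Without the affinity-aware comparison, a model that only looked at the total number of CPUs in the local group would keep all eight tasks on the first socket, which is the pile-up the quoted explanation refers to.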
Results from the Stream memory benchmark show that, compared with the current Linux kernel, the patch improves Stream performance by 36 to 44%, or about 40%.
Interestingly, the AMD-led optimization benefits not only AMD Zen-based processors but also Intel CPUs in multi-socket servers. Tests show a 54-82% performance improvement for Stream on Intel Xeon Scalable “Ice Lake” servers.
And the kernel patch itself is only a few lines of code.
The patch is currently queued in sched/core and should land in Linux 5.20, barring any unforeseen problems.
More technical details are available in the patch email.