ARM64 Linux Kernel Development Faces Opposition to Microarchitecture-Specific Optimizations
The 64-bit ARM Linux kernel team opposes CPU-specific optimizations because they are challenging to maintain.
In the Linux x86_64 kernel, optimizations for specific microarchitectures are widespread, with various performance techniques applied to Intel and AMD CPU series.
However, maintainers of the ARM64 Linux kernel resist adding new microarchitecture-specific optimizations, since code tuned for one CPU is hard to maintain and may not suit newer ARM processors.

Ampere Computing has submitted a set of four patches to optimize its new AmpereOne server processor. These patches aim to benefit the performance of these high-core-count ARM server processors, particularly when using 4K page sizes. Reports indicate significant gains, up to “1.3 ~ 1.4 times” in continuous read performance tests when using HugeTLB or Tmpfs.
While these improvements are exciting for enhancing AmpereOne Linux performance, it appears that this work will not be merged into the mainline Linux kernel.
Renowned ARM Linux kernel developer Will Deacon shared his perspective on the performance-enhancing patches for the AmpereOne CPU:
“We tend to avoid microarchitecture optimizations in the arm64 kernel because they are challenging to maintain, difficult to test correctly, often result in bloat, and add extra barriers to updating our libraries.
Indeed, we’ve provided some help for Thunder-X1 in copy_page() (masquerading as ARM64_HAS_NO_HW_PREFETCH), but frankly, that machine needs all the help it can get.
So, I really don’t want to merge it; modern CPUs should do better in copying data. This is copy_to_user(), not rocket science.”
ARM’s Mark Rutland agrees with Deacon and supports the removal of targeted optimizations for Thunder-X1. Kernel developer Marc Zyngier also agrees and is working on a patch to eliminate specific code for Thunder-X1.
To keep the ARM64 Linux kernel codebase maintainable and avoid complicating it further, the maintainers are no longer pursuing optimizations targeted at a specific CPU or microarchitecture.
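To make the maintenance concern concrete, here is a minimal user-space sketch of what a per-CPU copy path looks like. It is not the kernel's code: the names `CAP_NO_HW_PREFETCH`, `copy_page_generic`, `copy_page_sw_prefetch` and `select_copy_page` are hypothetical, the flag is only loosely modelled on ARM64_HAS_NO_HW_PREFETCH, and the real arm64 copy_page() is assembly rather than a C function pointer. The sketch also assumes GCC or Clang for `__builtin_prefetch`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumes 4K pages, as in the article */
#define SKETCH_CHUNK     64u     /* assumed cache-line-sized copy step */

/* Hypothetical capability bit, loosely modelled on ARM64_HAS_NO_HW_PREFETCH. */
enum { CAP_NO_HW_PREFETCH = 1u << 0 };

typedef void (*copy_page_fn)(void *dst, const void *src);

/* Default path: rely on the CPU's own prefetcher. */
static void copy_page_generic(void *dst, const void *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, SKETCH_PAGE_SIZE);
}

/* Special-case path for a core without a hardware prefetcher:
 * issue software prefetch hints a few chunks ahead of the copy. */
static void copy_page_sw_prefetch(void *dst, const void *src)
{
    const unsigned char *s = src;
    unsigned char *d = dst;

    assert(dst != NULL && src != NULL);
    for (size_t off = 0; off < SKETCH_PAGE_SIZE; off += SKETCH_CHUNK) {
        if (off + 4 * SKETCH_CHUNK < SKETCH_PAGE_SIZE)
            __builtin_prefetch(s + off + 4 * SKETCH_CHUNK);
        memcpy(d + off, s + off, SKETCH_CHUNK);
    }
}

/* Every extra microarchitecture adds another branch here and another
 * routine that has to be tested on hardware few developers own. */
static copy_page_fn select_copy_page(unsigned int caps)
{
    return (caps & CAP_NO_HW_PREFETCH) ? copy_page_sw_prefetch
                                       : copy_page_generic;
}
```

Both routines must produce identical results on every CPU, yet each can only be benchmarked meaningfully on the hardware it targets. That is the testing and bloat burden the maintainers describe.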
It remains to be seen whether any ARM Linux distributions will carry such patches themselves, or whether AmpereOne-specific tuning will continue to develop outside the mainline kernel.
This is particularly noteworthy considering Ampere’s focus on high-performance and energy-efficient ARM Linux servers, which likely aim to compete with AMD EPYC and Intel Xeon servers, now without CPU-specific optimizations in the mainline kernel.