Huge Page support for ARM

On ARMv7 Linux, the standard page size is 4KB. Workloads that make scattered memory accesses over a large range can benefit from a larger (denoted in the kernel as "huge") page size: each TLB entry then covers more memory, which reduces the number of page table lookups required and thus the number of TLB misses. The kernel provides two mechanisms to support huge pages: HugeTLB support and Transparent Hugepage (THP) support.
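As a back-of-the-envelope illustration of why this matters, the sketch below compares how many pages (and hence, in the worst case, TLB entries) are needed to map a fixed working set at the 4KB base page size versus a 2MB huge page size (2MB is the huge page size reported in the test summary later in this page):

```shell
# Pages (and hence worst-case TLB entries) needed to map a
# 1 GiB working set at each page size.
WORKING_SET=$((1024 * 1024 * 1024))   # 1 GiB
SMALL_PAGE=$((4 * 1024))              # 4KB base page
HUGE_PAGE=$((2 * 1024 * 1024))        # 2MB huge page

echo "4KB pages: $((WORKING_SET / SMALL_PAGE))"
echo "2MB pages: $((WORKING_SET / HUGE_PAGE))"
```

A 1 GiB working set needs 262144 entries at 4KB but only 512 at 2MB, a 512x reduction in TLB pressure.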

HugeTLB pages are reserved in a special pre-allocated pool and are made available to software that requests them explicitly. A user space library, libhugetlbfs, provides access to huge pages for applications. A good introduction to libhugetlbfs can be found at: http://www.ibm.com/developerworks/systems/library/es-lop-leveragepages/
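On a running system, the state of the HugeTLB pool can be inspected through /proc/meminfo; a quick sketch (the HugePages_* counts stay at zero until pages are actually reserved):

```shell
# Show the HugeTLB pool state (total/free/reserved pages)
# and the huge page size the kernel is using.
grep '^HugePages' /proc/meminfo
grep '^Hugepagesize' /proc/meminfo
```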

Transparent hugepages work slightly differently. Where possible, a kernel thread, khugepaged, periodically scans for large contiguous virtual memory regions in a process and "collapses" them into physically contiguous huge pages. Large memory allocations can also sometimes be backed by huge pages directly when they are allocated. khugepaged can be configured never to collapse certain memory areas, or to collapse only memory areas that an application has explicitly approved (via madvise()). A nice introduction to transparent huge pages can be found at: http://lwn.net/Articles/423584/.
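The current THP policy and activity can be observed from userspace; the sketch below reads the standard sysfs and procfs locations (which may be absent on kernels built without THP, hence the readability check):

```shell
# System-wide THP policy: always / madvise / never
# (the active setting is shown in [brackets]).
[ -r /sys/kernel/mm/transparent_hugepage/enabled ] && \
    cat /sys/kernel/mm/transparent_hugepage/enabled

# Anonymous memory currently backed by transparent huge pages.
grep AnonHugePages /proc/meminfo
```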

Upstream kernel support

HugeTLBFS and THP are supported in 32-bit and 64-bit ARM kernels since Linux 3.11.

Enabling huge pages

To enable HugeTLB in the kernel, one must select:

  • File systems --> Pseudo filesystems --> HugeTLB file system support

To enable Transparent HugePages (THP), one must select:

  • Kernel Features --> Transparent Hugepage Support

Userspace tools - libhugetlbfs

There are userspace libraries for accessing huge pages, including a very convenient LD_PRELOAD shim that hooks malloc to back allocations with huge pages without having to recompile software.

http://libhugetlbfs.sourceforge.net/

ARM support can be found in the next branch: git://libhugetlbfs.git.sourceforge.net/gitroot/libhugetlbfs/libhugetlbfs next

Automatic testing of Huge Pages

Kernel configuration

The following kernel options need to be compiled in:

CONFIG_HUGETLBFS=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_ARM_LPAE=y

and,

CONFIG_HUGETLBFS=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y
CONFIG_ARM_LPAE=n

i.e. this is for two kernels, one with LPAE support and one without.

Test suite (hugetlbfs - next branch)

To get hold of libhugetlbfs I did the following:

git clone git://libhugetlbfs.git.sourceforge.net/gitroot/libhugetlbfs/libhugetlbfs -b next

(ARM support is in the next branch only as of 24/02/2013)

To run the test suite one can issue the following:

# mkdir /dev/hugepages
# mount -t hugetlbfs hugetlbfs /dev/hugepages/
# echo 200 > /proc/sys/vm/nr_hugepages

Then one can run make check to launch the test suite.
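Because the kernel may be unable to find enough contiguous memory, it can grant fewer pages than the 200 requested, so it is worth verifying the reservation before running the tests; a small sketch:

```shell
# How many huge pages were actually reserved, and how many
# are currently free for the test suite to use.
cat /proc/sys/vm/nr_hugepages
grep '^HugePages_Free' /proc/meminfo
```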

I get the following results:

********** TEST SUMMARY
*                      2M            
*                      32-bit 64-bit 
*     Total testcases:    87      0   
*             Skipped:     0      0   
*                PASS:    85      0   
*                FAIL:     1      0   
*    Killed by signal:     0      0   
*   Bad configuration:     1      0   
*       Expected FAIL:     0      0   
*     Unexpected PASS:     0      0   
* Strange test result:     0      0   
**********

The map_high_truncate_2 unit test fails, and this is expected as it maps very high memory addresses. From what I can tell, it exercises huge_pmd_share, which is not in the current ARM patches.

The direct unit test gives a bad configuration result. This was because my /tmp filesystem (tmpfs) does not support O_DIRECT. If direct.c is updated to use a path on a filesystem that supports O_DIRECT, the test should then PASS.
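To check what actually backs /tmp before picking a replacement path, one can query the filesystem type with GNU stat; a quick sketch:

```shell
# Print the filesystem type backing /tmp; "tmpfs" means
# O_DIRECT opens will fail there.
stat -f -c %T /tmp
```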

Test Transparent HugePages

To give transparent huge pages a good test one should activate them for all circumstances:

# echo always > /sys/kernel/mm/transparent_hugepage/enabled
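While the workload runs, the THP counters in /proc/vmstat show whether huge pages are being allocated directly at fault time and whether khugepaged is collapsing existing mappings; a sketch (these counters only exist on kernels built with THP):

```shell
# thp_fault_alloc: huge pages allocated directly at fault time.
# thp_collapse_alloc: huge pages created by khugepaged collapsing
# existing small-page mappings.
grep -E '^thp_(fault|collapse)_alloc ' /proc/vmstat
```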

I ran the following very simple workload with the output piped to a text file. (There is a subdirectory, linux, containing the Linux sources, already configured with make menuconfig and ready to build.)

cd linux
ITERATION=1

while true; do
        echo Iteration $ITERATION $(date)

        echo Memory stats
        cat /proc/vmstat
        free -m

        echo

        make clean
        time make -j10

        yum clean all
        yum check-update

        let ITERATION=ITERATION+1
done

Although the above looks a little silly, it exercises the following:

  • Lots of very memory-sensitive processes (gcc will crash very quickly if its memory gets corrupted).
  • A long-running process, ld, whose memory khugepaged can collapse into huge pages.
  • yum, which I found had an uncanny ability to surface memory problems when updating its repositories over the network.

LEG/ServerArchitecture/HugePages (last modified 2017-08-18 07:32:49)