NUMA Considerations

Modern multi-socket servers use a memory architecture called NUMA (Non-Uniform Memory Access). In a NUMA system, each CPU socket has its own local memory and I/O paths. Accessing local resources is faster than reaching across sockets to remote memory or devices. Simplyblock is fully NUMA-aware.

On a host with more than one CPU socket, simplyblock deploys one or two storage nodes per socket by default.

Two storage nodes per socket are deployed if:

- more than 32 vCPUs (cores) are dedicated to simplyblock per NUMA socket
- more than 10 NVMe devices are connected to the NUMA socket

Users can change this behavior, either by setting the appropriate Helm chart parameters (for Kubernetes-based storage node deployments) or by manually modifying the initially created configuration file on the storage node (after running sbctl sn configure).

It is critical for performance that all NVMe devices of a storage node are directly connected to the NUMA socket to which the storage node is deployed.
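The NUMA node of each NVMe controller can be read directly from sysfs. A minimal check, assuming the standard Linux sysfs layout under /sys/class/nvme:

How to list the NUMA node of each NVMe controller
# Print the NUMA node of every NVMe controller (reads the standard sysfs attribute)
for d in /sys/class/nvme/nvme*; do
    echo "$(basename "$d"): NUMA node $(cat "$d/device/numa_node")"
done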

If a socket has no NVMe devices connected, it will not qualify to run a simplyblock storage node.

It is also important that the NIC(s) used by simplyblock for storage traffic are connected to the same NUMA socket. However, simplyblock does not auto-assign a NIC; users must take care of this manually.
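The same sysfs attribute exists for network interfaces. A minimal sketch to list the NUMA node of every physical NIC (virtual interfaces such as lo have no device entry and are skipped by the glob):

How to list the NUMA node of each physical NIC
# Print the NUMA node of every physical network interface
for n in /sys/class/net/*/device/numa_node; do
    echo "$(echo "$n" | cut -d/ -f5): NUMA node $(cat "$n")"
done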

Checking NUMA Configuration

Before configuring simplyblock, check whether the system has multiple NUMA nodes. This can be done with the lscpu tool.

How to check the NUMA configuration
lscpu | grep -i numa
Example output of the NUMA configuration
root@demo:~# lscpu | grep -i numa
NUMA node(s):                         2
NUMA node0 CPU(s):                    0-31
NUMA node1 CPU(s):                    32-63

In the example above, the system has two NUMA nodes.
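Alternatively, if the numactl package is installed, numactl --hardware prints the CPUs, the memory size, and the inter-node distances for each NUMA node:

Alternative check using numactl
numactl --hardware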

Recommendation

If the system consists of multiple NUMA nodes, it is recommended to configure simplyblock with multiple storage nodes per storage host. The number of storage nodes should match the number of NUMA nodes.

Ensuring NUMA-Aware Devices

For optimal performance, there should be a similar number of NVMe devices per NUMA node. Additionally, it is recommended to provide one Ethernet NIC per NUMA node.
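Whether the NVMe devices are evenly distributed can be verified by aggregating the per-device sysfs attribute into a count per NUMA node. A minimal sketch:

Count NVMe controllers per NUMA node
cat /sys/class/nvme/nvme*/device/numa_node | sort | uniq -c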

To check the NUMA assignment of PCI-e devices, the lspci tool and a small script can be used.

Install pciutils, which includes lspci
yum install pciutils
Small script to list all PCI-e devices and their NUMA nodes
#!/bin/bash

# List all PCI-e devices together with their NUMA node assignment
for i in /sys/class/*/*/device; do
    pci=$(basename "$(readlink "$i")")
    if [ -e "$i/numa_node" ]; then
        echo "NUMA Node: $(cat "$i/numa_node") ($i): $(lspci -s "$pci")"
    fi
done | sort
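Note that numa_node may report -1 for a device if the platform firmware does not expose NUMA locality information. In that case, the device's socket affinity cannot be determined from sysfs.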