Shared Memory Communications (SMC) diagnostics

This topic describes how to diagnose Shared Memory Communications (SMC) faults.

Prerequisites

The smc-tools O&M toolset provided by Alibaba Cloud Linux 3 is installed.

If the smc-tools toolset is not installed, you can run the following command to install it.

sudo yum install -y smc-tools
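To confirm that the commands used later in this topic are available after installation, a small check such as the following can help. This is a minimal sketch: the function name missing_tools is hypothetical, and it only tests whether the named commands are on PATH.

```shell
# Print the name of every command in the argument list that is not on PATH.
# Prints nothing when all of the commands are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, `missing_tools smcss smcdump-ex` prints nothing if both tools described in this topic are installed.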

Fallback diagnostics

The SMC protocol stack automatically negotiates with the peer during connection establishment to decide whether to use SMC. If the conditions for SMC are not met, the connection safely falls back to TCP. You can use the smcss command to view the fallback reason for each SMC connection. For more information, see Connection monitoring.
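Fallback reasons are reported as hexadecimal decline codes. The lookup below is a minimal sketch that translates a few common codes into text. The function name smc_fallback_reason is hypothetical, and the code values are assumed to match the decline codes defined in the upstream Linux kernel (net/smc/smc_clc.h), so verify them against your kernel version. It does not parse smcss output; pass it a code that you have read from that output.

```shell
# Map an SMC fallback (decline) reason code to a short description.
# Assumption: values follow the upstream kernel's SMC_CLC_DECL_* constants.
smc_fallback_reason() {
    case "$1" in
        0x01010000) echo "insufficient memory resources" ;;
        0x03010000) echo "peer did not indicate SMC support" ;;
        0x03030000) echo "no SMC device found" ;;
        0x05000000) echo "peer declined during handshake" ;;
        *)          echo "unknown reason code: $1" ;;
    esac
}
```

For example, `smc_fallback_reason 0x03010000` prints "peer did not indicate SMC support", which usually points to SMC not being enabled on the peer.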

Packet capture diagnostics

The SMC protocol stack uses technologies such as elastic Remote Direct Memory Access (eRDMA) for data transmission. This process bypasses the traditional network protocol stack. Therefore, traditional sniffing points cannot directly capture data and control packets.

Starting from version ANCK 5.10.134-18, the Alibaba Cloud Linux 3 kernel provides a solution to capture data and control packets from the SMC protocol stack:

  • When the SMC kernel stack sends or receives connection control messages and data blocks in shared memory, it constructs a UDP socket buffer (skb). The skb points to the target data memory and is sent to a specific virtual network device (dummy device). The virtual network device silently consumes the skb, which never actually enters the network. During this process, the skb passes through existing kernel sniffing points, such as dev_queue_xmit_nit(). This allows tcpdump and libpcap to capture the data and control packets that the SMC stack sends or receives on the virtual network device.

  • The TCP network packets from the SMC handshake process are still captured on the original Ethernet network interface controller (NIC).

The following table summarizes the packet capture methods for these two stages.

| Packet type | Traffic-bearing device | Capture device | Captured packet format |
| --- | --- | --- | --- |
| SMC handshake negotiation TCP packets, or packets sent after a fallback to the TCP protocol stack | Ethernet NIC (for example, eth0) | Ethernet NIC (for example, eth0) | TCP message |
| SMC data or control packets | RDMA NIC (for example, erdma_0) | Virtual NIC (for example, dummy0) | UDP datagram |
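Because this capture solution requires ANCK 5.10.134-18 or later, it can be useful to check the kernel release before you start a capture. The following is a minimal sketch, not an official check: the function and variable names are hypothetical, the release suffix in the comment (for example, .al8.x86_64) is an assumed format, and the comparison relies on the -V option of GNU sort.

```shell
# Minimum ANCK release that provides the SMC capture path, per this topic.
SMC_DUMP_MIN_KERNEL="5.10.134-18"

# Succeed if the given kernel release (default: the running kernel) is at
# least SMC_DUMP_MIN_KERNEL. Relies on GNU sort -V for version ordering.
kernel_supports_smc_dump() {
    release="${1:-$(uname -r)}"
    # Keep only the numeric part, e.g. 5.10.134-18.al8.x86_64 -> 5.10.134-18
    version="$(printf '%s\n' "$release" |
        sed -E 's/^([0-9]+(\.[0-9]+)*-[0-9]+(\.[0-9]+)*).*/\1/')"
    # The lower of the two versions must be the minimum itself.
    lowest="$(printf '%s\n%s\n' "$SMC_DUMP_MIN_KERNEL" "$version" |
        sort -V | head -n 1)"
    [ "$lowest" = "$SMC_DUMP_MIN_KERNEL" ]
}
```

Called without arguments, the function checks the running kernel. Pass a release string to check a different host.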

smc-tools provides the smcdump-ex packet capture tool based on this solution. smcdump-ex is a wrapper script for the tcpdump tool that captures the entire SMC communication process. The script creates a virtual network device named smc-dummy{four-random-characters} in the current net namespace and enables the capture of SMC data and control packets. It uses tcpdump to capture SMC handshake negotiation packets, data packets, and control packets on Ethernet or virtual network devices as required. When it receives a SIGINT signal (Ctrl+C), it stops the capture, disables the capture feature, and destroys the virtual network device.

  • Tool usage

    You can run the following command to view the usage of smcdump-ex.

    Warning

    smcdump-ex is an experimental tool. Its usage may change in the future.

    smcdump-ex -h
    usage: smcdump-ex [-h] [-m {all,smc,smcd,smcr}] [-t {all,raw,cdc}]
                      [--param PARAM] [--filter FILTER] [--legacy]
    SMC Dump - SMC Traffic Capture (Experimental)
    optional arguments:
      -h, --help            show this help message and exit
      -m {all,smc,smcd,smcr}, --mode {all,smc,smcd,smcr}
                            Select the mode (default: smc)
      -t {all,raw,cdc}, --type {all,raw,cdc}
                            Select the packet type (default: all)
      --param PARAM         Additional parameters for tcpdump. e.g. --param '-w
                            packets.pcap'
      --filter FILTER       Additional filter expressions for tcpdump. e.g.
                            --filter 'host xxx.xxx.x.x and port 8080'
      --legacy              Use the legacy SMC dump header format

    • Use -m to set the data capture mode

      The following capture modes are available:

      • all: Captures packets on all network interfaces, which is equivalent to calling tcpdump -i any. This captures SMC handshake negotiation packets on the Ethernet interface and SMC data and control packets on the new virtual network interface. In this mode, you can use --filter to add tcpdump filter expressions that narrow the capture to the SMC negotiation packets you are interested in. For more information, see the description of --filter.

      • smc, smcr, and smcd: Captures only SMC, SMC-R, or SMC-D data and control packets on the new virtual network interface. This is equivalent to calling tcpdump -i smc-dummy{four-random-characters}.

    • Use -t to set the type of network packets to capture

      The following network packet types can be captured:

      • all: SMC data and control packets.

      • raw: SMC data packets only.

      • cdc: SMC control packets only.

    • Use --param to set other tcpdump parameters

      For example, you can use --param '-w smcdata.pcap' to dump the captured content to a file.

    • Use --filter to set other tcpdump filter rules

      For example, you can use --filter 'host <ip> and port <port>' to set the IP address and port of the packets to capture. This ensures that SMC negotiation packets are accurately captured and prevents other TCP network packets from interfering with the analysis.

    • Use --legacy to parse the old SMC dump header format

      This parameter is only for compatibility with kernel version ANCK 5.10.134-17.3. For kernel versions ANCK 5.10.134-18 and later, you do not need to set this parameter.

  • Usage example

    For example, to capture all packets of an SMC connection that uses the IP address 192.168.2.5 and port 5201, and write them to the smc.pcap file, run the following command:

    smcdump-ex -m all -t all --param '-w smc.pcap' --filter 'host 192.168.2.5 and port 5201'

    After you save the capture to a .pcap file by using --param '-w <pcap file>', you can analyze it in Wireshark with a Lua plugin:

    1. Download and install Wireshark.

    2. Install the Lua plugin that allows Wireshark to parse the captured SMC packets in the .pcap file.

      1. Download the Lua plugin: https://os-smc-new.oss-cn-hangzhou.aliyuncs.com/smc_dump.lua.

        Note

        To use this Lua plugin, ensure that the Lua interpreter in Wireshark is version 5.3 or later.

      2. Find the Wireshark Lua plugin folder and place smc_dump.lua in it. Wireshark lists the folder as Personal Lua Plugins or Global Lua Plugins on its Folders tab.

        • To find the Lua plugin folder on macOS:

          1. Run Wireshark.

          2. From the menu bar, click Wireshark > About Wireshark > Folders.

        • To find the Lua plugin folder on Windows:

          1. Run Wireshark.

          2. From the main Wireshark interface, click Help > Folders.

    3. Restart Wireshark or reload the Lua plugins to apply the script. After this, all UDP packets that have an SMC dump header in their payload will be parsed as the SMC protocol, as shown in the following figure.

      image
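The capture command from the usage example can be parameterized so that the same options are reused for other connections. The following is a minimal sketch: the function name smcdump_capture_cmd is hypothetical, and it only prints the command line (built from the -m, -t, --param, and --filter options described above) so that you can review it before you run it.

```shell
# Print an smcdump-ex command line that captures one connection to a file.
# Arguments: IP address, port, output .pcap path.
smcdump_capture_cmd() {
    ip="$1" port="$2" pcap="$3"
    # Reject an empty or non-numeric port before building the filter.
    case "$port" in
        ''|*[!0-9]*) echo "port must be numeric: $port" >&2; return 1 ;;
    esac
    printf "smcdump-ex -m all -t all --param '-w %s' --filter 'host %s and port %s'\n" \
        "$pcap" "$ip" "$port"
}
```

Running `smcdump_capture_cmd 192.168.2.5 5201 smc.pcap` prints the same command that is shown in the usage example.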

Performance diagnostics

If you encounter issues such as SMC performance degradation during comparative performance tests, you can use basic performance testing tools to perform an initial diagnosis.

  • For SMC, you can use basic network performance testing tools such as sockperf, qperf, iperf3, and netperf to run bandwidth and latency benchmark tests.

  • For eRDMA, you can use the perftest basic performance testing tool to run bandwidth and latency benchmark tests. For more information, see eRDMA network performance tests.
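When you compare benchmark results, for example the same test run over TCP and over SMC, expressing the difference as a percentage makes a regression easier to spot. The helper below is a minimal sketch. The function name pct_change is hypothetical, and you supply the numbers yourself from the output of whichever benchmark tool you use.

```shell
# Print the percentage change of a measured value relative to a baseline.
# For throughput, a negative result means the measured run was slower.
pct_change() {
    awk -v base="$1" -v meas="$2" 'BEGIN {
        if (base == 0) {
            print "baseline must be non-zero" > "/dev/stderr"
            exit 1
        }
        printf "%.1f\n", (meas - base) / base * 100
    }'
}
```

For example, with a made-up baseline of 10 Gbit/s and a measured value of 8 Gbit/s, `pct_change 10 8` prints -20.0.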

If SMC or eRDMA shows performance degradation during the benchmark tests, submit a ticket for further assistance. If the benchmark tests are normal, check your SMC configuration as described in Enable and configure Shared Memory Communications (SMC).