Linux Kernel Debug Workshop

This page demonstrates the Linux kernel debug workshop on the Hikey board. It goes through Linux kernel debugging techniques one by one and, where a real case can be found, demonstrates the technique with it.

Enable debugging environment on Hikey

Build mainline kernel

git clone https://git.linaro.org/people/leo.yan/linux-debug-workshop.git/
cd linux-debug-workshop
make ARCH=arm64 defconfig
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- Image dtbs modules

Flash boot image

Please refer to 96boards.org

Enable ssh connection

A network connection between the PC and the target board is convenient for debugging. There are two options:

  • Option 1: plug a USB-to-Ethernet adapter into the USB host port.

  • Option 2: enable Ethernet over USB on the USB OTG port, so that the board exposes a network card on its USB slave port.

Enable dependent configurations in kernel menuconfig

  • Device Drivers --->

    • USB support --->

      • <*> USB Gadget Support --->

        • <*> USB functions configurable through configfs

        • [*] Ethernet Control Model (CDC ECM)
        • [*] Ethernet Control Model (CDC ECM) subset
        • [*] RNDIS
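The menu entries above can be cross-checked against the generated .config. The sketch below assumes that they map to the Kconfig symbols USB_GADGET, USB_CONFIGFS, USB_CONFIGFS_ECM, USB_CONFIGFS_ECM_SUBSET and USB_CONFIGFS_RNDIS; the helper name check_gadget_config is hypothetical and not part of the workshop tree.

```shell
# Report which USB gadget options are missing from a kernel .config file.
# Usage: check_gadget_config [path-to-.config]   (defaults to ./.config)
# The symbol list is an assumption about how the menu entries are named.
check_gadget_config() {
        config="${1:-.config}"
        missing=0
        for opt in USB_GADGET USB_CONFIGFS USB_CONFIGFS_ECM \
                   USB_CONFIGFS_ECM_SUBSET USB_CONFIGFS_RNDIS; do
                # Accept both built-in (=y) and modular (=m) settings.
                if ! grep -q "^CONFIG_${opt}=[ym]\$" "$config"; then
                        echo "missing: CONFIG_${opt}"
                        missing=1
                fi
        done
        return $missing
}
```

Running check_gadget_config .config in the kernel build directory prints one line per missing option and returns non-zero if any option is missing.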

Set up the USB gadget through configfs after the board boots up

if [ ! -e /sys/kernel/config/usb_gadget/g1 ]; then
        mount -t configfs none /sys/kernel/config
        mkdir /sys/kernel/config/usb_gadget/g1
        echo "0x1d6b" > /sys/kernel/config/usb_gadget/g1/idVendor
        echo "0x0104" > /sys/kernel/config/usb_gadget/g1/idProduct
        mkdir /sys/kernel/config/usb_gadget/g1/strings/0x409
        echo "0123456789" > /sys/kernel/config/usb_gadget/g1/strings/0x409/serialnumber
        echo "HISILICON" > /sys/kernel/config/usb_gadget/g1/strings/0x409/manufacturer
        echo "Ethernet Gadget" > /sys/kernel/config/usb_gadget/g1/strings/0x409/product
        mkdir /sys/kernel/config/usb_gadget/g1/functions/ecm.usb0
        mkdir /sys/kernel/config/usb_gadget/g1/configs/c.1
        mkdir /sys/kernel/config/usb_gadget/g1/configs/c.1/strings/0x409
        echo "CDC" > /sys/kernel/config/usb_gadget/g1/configs/c.1/strings/0x409/configuration
        echo '6a:7d:db:b7:38:28' > /sys/kernel/config/usb_gadget/g1/functions/ecm.usb0/host_addr
        ln -s /sys/kernel/config/usb_gadget/g1/functions/ecm.usb0 /sys/kernel/config/usb_gadget/g1/configs/c.1
        echo "f72c0000.usb" > /sys/kernel/config/usb_gadget/g1/UDC
        echo "disconnect" > /sys/class/udc/f72c0000.usb/soft_connect
        echo "connect" > /sys/class/udc/f72c0000.usb/soft_connect
fi

ifconfig usb0 down
ifconfig usb0 hw ether 02:11:22:33:44:55
ifconfig usb0 192.168.1.20
ifconfig usb0 up

Test cases

Stack detecting

Static checking

Static checking failed on my side with aarch64-linux-gnu-gcc 4.9.2, although FRAME_WARN is declared as supported from gcc 4.4 onwards. The toolchain needs to be updated to aarch64-linux-gnu-gcc 6.2.1.
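When FRAME_WARN takes effect, gcc reports oversized stack frames as "the frame size of N bytes is larger than M bytes" warnings during the build. A minimal sketch for collecting them from a saved build log follows; the helper name list_frame_warnings and the log file name build.log are assumptions.

```shell
# Print the frame-size warnings found in a saved kernel build log,
# largest frame first.
# Usage: list_frame_warnings build.log
list_frame_warnings() {
        grep 'the frame size of [0-9]* bytes is larger than' "$1" |
                # Prefix each line with its frame size so sort can order by it.
                sed 's/^\(.*the frame size of \([0-9]*\) bytes.*\)$/\2 \1/' |
                sort -rn |
                # Drop the numeric prefix again.
                cut -d' ' -f2-
}
```

The log can be captured with, for example, make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- Image 2>&1 | tee build.log.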

Dynamic checking

The test case

Memory leakage detecting

The test case

# echo clear > /sys/kernel/debug/kmemleak
# insmod kmemleak-test.ko
# echo scan > /sys/kernel/debug/kmemleak
# cat /sys/kernel/debug/kmemleak
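The report read from /sys/kernel/debug/kmemleak starts each leaked allocation with an "unreferenced object 0x... (size N):" header line. The sketch below summarises a saved copy of that report; the helper name kmemleak_summary and the file name kmemleak.txt are assumptions.

```shell
# Summarise a saved kmemleak report: number of leaked objects and total bytes.
# Usage: kmemleak_summary kmemleak.txt
kmemleak_summary() {
        awk '/^unreferenced object/ {
                     n++
                     # The header line ends with "(size N):"; keep only N.
                     size = $NF
                     sub(/\):$/, "", size)
                     total += size
             }
             END { printf "%d objects, %d bytes\n", n, total }' "$1"
}
```

On the target the report can be saved with cat /sys/kernel/debug/kmemleak > kmemleak.txt before calling kmemleak_summary kmemleak.txt.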

Ftrace

Manually capture ftrace data:

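# Disable pathname expansion so that "sched*" reaches trace-cmd unchanged
# even if the current directory contains a matching file name.
set -f

TRACEPOINTS="-e cpu_idle -e cpu_frequency -e sched_wakeup \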
             -e sched_wakeup_new -e sched_switch \
             -e sched_migrate_task -e irq_handler_entry \
             -e softirq_raise -e sched*"

TRACECMD=/root/trace-cmd

if [ "$1" = "record" ]; then
        $TRACECMD record -o /root/trace.dat $TRACEPOINTS
elif [ "$1" = "start" ]; then
        echo 10240 > /sys/kernel/debug/tracing/buffer_size_kb
        cat /sys/kernel/debug/tracing/buffer_size_kb
        $TRACECMD reset
        $TRACECMD start $TRACEPOINTS
        echo TRACE_MARKER_START > /sys/kernel/debug/tracing/trace_marker
elif [ "$1" = "stop" ]; then
        # after running the test case, stop tracing
        echo TRACE_MARKER_STOP > /sys/kernel/debug/tracing/trace_marker
        $TRACECMD stop
        $TRACECMD extract -o /root/trace.dat
fi
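The two markers written to trace_marker bracket the interesting part of the trace. As a rough sketch, the text produced by trace-cmd report can be cut down to the lines between them; the helper name between_markers is hypothetical.

```shell
# Print only the trace lines between the TRACE_MARKER_START and
# TRACE_MARKER_STOP markers. Reads "trace-cmd report" text on stdin.
between_markers() {
        awk '/TRACE_MARKER_STOP/  { inside = 0 }
             inside               { print }
             /TRACE_MARKER_START/ { inside = 1 }'
}
```

A possible invocation is /root/trace-cmd report -i /root/trace.dat | between_markers, which leaves out the marker lines themselves.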

LISA will be used to demonstrate "dynamic" debugging with ftrace.

Perf: hotspot

The test case

  • Device Drivers --->

    • Misc devices --->

      • [*] Debug workshop hotspot

./perf record -e cycles -p XXX -- sleep 20
./perf report -i perf.data

Perf: cachemiss

The test case

  • Device Drivers --->

    • Misc devices --->

      • [*] Debug workshop cache miss

./perf stat -a -e cache-references,cache-misses -- sleep 10
./perf record -a -e cache-references,cache-misses -- sleep 10
./perf report -i perf.data

Note: the PMU is missing on Hikey, so PMU-based profiling of cache misses is not supported for now. This needs further investigation.

Boot hang

The test case

  • Device Drivers --->

    • Misc devices --->

      • [*] Debug workshop boot hang

Solution: add "initcall_debug" to the kernel command line
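With initcall_debug the kernel logs a "calling  <function> @ <pid>" line when an initcall starts and an "initcall <function> returned ..." line when it finishes, so a hang shows up as a call without a matching return. A minimal sketch for spotting it in a captured console log follows; the helper name find_pending_initcall and the file name boot.log are assumptions.

```shell
# Print the initcalls that were entered but never returned in a boot log
# captured with initcall_debug.
# Usage: find_pending_initcall boot.log
# Caveat: any other log line containing the word "calling" or "initcall"
# followed by another word is matched as well.
find_pending_initcall() {
        awk '{
                for (i = 1; i < NF; i++) {
                        # Remember the function when it is entered ...
                        if ($i == "calling")  pending[$(i + 1)] = 1
                        # ... and forget it once it has returned.
                        if ($i == "initcall") delete pending[$(i + 1)]
                }
             }
             END { for (fn in pending) print fn }' "$1"
}
```

For a boot that hangs in a single initcall, the output is expected to be that one function name.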

CPU hard lockup

The test case

  • Device Drivers --->

    • Misc devices --->

      • [*] Debug workshop CPU lockup

Trigger lockup: cat /proc/cpu_hard_lock

Solution:

Kdump

Build kernel

Kdump needs two kernels: the first (normally booted) kernel and the crash dump kernel. The two kernels can be the same image. Before building the kernel, enable the configuration below:

  • Kernel Features --->

    • [*] Build kdump crash kernel

The test case

  • Device Drivers --->

    • Misc devices --->

      • [*] Debug workshop kdump

Build kexec

git clone http://git.linaro.org/people/takahiro.akashi/kexec-tools.git
cd kexec-tools
git checkout origin/arm64/kdump

./configure --build=x86_64-linux --host=aarch64-linux-gnu --target=aarch64-linux-gnu --without-xen
make

Load crash dump kernel

Copy the vmlinux and dtb files onto the target board, for example with scp:

scp vmlinux root@192.168.1.20:/root/
scp arch/arm64/boot/dts/hisilicon/hi6220-hikey.dtb root@192.168.1.20:/root/

./kexec -p vmlinux --dtb=hi6220-hikey.dtb --append="root=/dev/mmcblk0p9 rw  maxcpus=1 reset_devices earlycon=pl011,0xf7113000 nohlt initcall_debug console=tty0 console=ttyAMA3,115200 clk_ignore_unused"
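kexec -p can only load the crash dump kernel if the first kernel reserved memory for it with a crashkernel= parameter on its command line, and the kernel reports a loaded crash kernel as 1 in /sys/kernel/kexec_crash_loaded. The sketch below checks both conditions; the helper name check_kdump_ready is hypothetical, and its two optional arguments exist only so the check can be tried against ordinary files.

```shell
# Check the two kdump prerequisites on the first kernel.
# Usage: check_kdump_ready [cmdline-file] [loaded-flag-file]
# Both arguments default to the real kernel interfaces.
check_kdump_ready() {
        cmdline="${1:-/proc/cmdline}"
        loaded="${2:-/sys/kernel/kexec_crash_loaded}"
        # Memory for the crash kernel must be reserved at boot time.
        if ! grep -q 'crashkernel=' "$cmdline"; then
                echo "no crashkernel= reservation on the command line"
                return 1
        fi
        # The flag reads 1 once "kexec -p" has loaded a crash kernel.
        if [ "$(cat "$loaded" 2>/dev/null)" != "1" ]; then
                echo "crash kernel not loaded"
                return 1
        fi
        echo "kdump ready"
}
```

Calling check_kdump_ready without arguments on the target after the kexec command shows whether a later crash would boot into the dump kernel.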

Save core dump

cp /proc/vmcore ./vmcore
scp ./vmcore leoy@hostpc:~/
./crash vmlinux vmcore

Pstore

The test case

Enable pstore for function tracing

echo 1 > /sys/kernel/debug/pstore/record_ftrace

Dump the log after the second boot

mount -t pstore pstore /mnt
cat /mnt/xxx > pstore.log
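Since the record names under the pstore mount point depend on the backend and on what was captured, it can be simpler to copy everything that is there. A minimal sketch, with the hypothetical helper name save_pstore_logs and an arbitrary destination directory:

```shell
# Copy every record from a mounted pstore directory into a destination
# directory, without assuming any particular record name.
# Usage: save_pstore_logs /mnt ./pstore-logs
save_pstore_logs() {
        src="$1"
        dst="$2"
        mkdir -p "$dst" || return 1
        for f in "$src"/*; do
                # Skip the unexpanded pattern when the directory is empty.
                [ -f "$f" ] || continue
                cp "$f" "$dst/" && echo "saved $(basename "$f")"
        done
}
```

The copied files can then be transferred to the host PC in the same way as the other logs on this page.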

Ktest

The test case

Use this Ktest script to bisect boot failures: ./ktest.pl examples/hikey_boot_bisect.conf. The script uses relay1 to switch the board power supply and relay5 to close fastboot pins 5-6.

Linux-kernel-debug-workshop-hikey (last modified 2018-02-23 08:26:16)