来自 /dev/kmsg 的内核调试

Kernel debugging from /dev/kmsg

我在我的嵌入式系统上运行的(定制)驱动程序 (smsc95xx) 遇到了一些问题,我需要了解问题的确切来源。

例如,这是来自 /dev/kmsg 报告问题的内核错误消息:

1,737,1433656890,-;Unable to handle kernel NULL pointer dereference at virtual address 000001a0
1,738,1433665618,-;pgd = daafc000
1,739,1433668609,-;[000001a0] *pgd=9d5dd831, *pte=00000000, *ppte=00000000
0,740,1433675720,-;Internal error: Oops: 17 [#2] SMP ARM
4,741,1433680664,-;Modules linked in: ctr ccm ecb hci_uart rfcomm bnep bluetooth arc4 usb_trimble(O) wl18xx wlcore mac80211 cfg80211 rfkill wlcore_sdio twl4030_madc industrialio ftdi_sio smsc95xx(O) usbserial(O) ipv6
4,742,1433700378,-;CPU: 0 PID: 17418 Comm: sh Tainted: G      D    O   3.18.18-custom #20
4,743,1433708343,-;task: de30cd40 ti: da9b8000 task.ti: da9b8000
4,744,1433714050,-;PC is at __pm_runtime_resume+0x1c/0x64
4,745,1433719085,-;LR is at usb_autopm_get_interface+0x18/0x5c
4,746,1433724578,-;pc : [<c03cb590>]    lr : [<c04677d4>]    psr: 20000013\x0asp : da9b9ea8  ip : da9b9f14  fp : 00000000
4,747,1433736633,-;r10: daa22a4c  r9 : 00000024  r8 : 00000004
4,748,1433742126,-;r7 : 000000a0  r6 : 00000004  r5 : 00000000  r4 : 00000020
4,749,1433748992,-;r3 : 000001a0  r2 : 00000040  r1 : 00000004  r0 : 00000020
4,750,1433755859,-;Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
4,751,1433763366,-;Control: 10c5387d  Table: 9aafc019  DAC: 00000015
0,752,1433769378,-;Process sh (pid: 17418, stack limit = 0xda9b8240)
0,753,1433775421,-;Stack: (0xda9b9ea8 to 0xda9ba000)
0,754,1433779998,-;9ea0:                   00000000 00000000 00000000 00000020 000000a0 c04677d4
0,755,1433788604,-;9ec0: dd31f680 00000000 00000040 c04574c8 c01ae218 c0085f58 00000001 00000000
0,756,1433797210,-;9ee0: 00000000 00000024 c04574a0 dd31f680 c0457510 de687a00 da9b9f88 bf0d44e4
0,757,1433805816,-;9f00: 00000024 da9b9f14 00000004 de687a00 da9b9f88 01110000 00000000 bf0d7990
0,758,1433814422,-;9f20: bf0d7cbc 00000000 00000000 bf0d4554 00000002 00000002 daa22a40 c01ae24c
0,759,1433823028,-;9f40: 00000000 00000000 dd3721c0 00000002 000eb408 da9b9f88 c000e824 da9b8000
0,760,1433831634,-;9f60: 00000000 c0145fd8 de30cd40 c08f20d4 dd3721c0 dd3721c0 00000002 000eb408
0,761,1433840240,-;9f80: c000e824 c01464e0 00000000 00000000 00000000 00000002 000eb408 b6ee1d60
0,762,1433848815,-;9fa0: 00000004 c000e660 00000002 000eb408 00000001 000eb408 00000002 00000000
0,763,1433857421,-;9fc0: 00000002 000eb408 b6ee1d60 00000004 00000000 000e515c 00000001 00000000
0,764,1433865997,-;9fe0: 00000000 beaef904 b6e1946c b6e7139c 60000010 00000001 00000000 00000000
4,765,1433874603,-;[<c03cb590>] (__pm_runtime_resume) from [<c04677d4>] (usb_autopm_get_interface+0x18/0x5c)
4,766,1433884307,-;[<c04677d4>] (usb_autopm_get_interface) from [<c04574c8>] (usbnet_write_cmd+0x28/0x70)
4,767,1433893737,-;[<c04574c8>] (usbnet_write_cmd) from [<bf0d44e4>] (__smsc95xx_write_reg+0x50/0x8c [smsc95xx])
4,768,1433903839,-;[<bf0d44e4>] (__smsc95xx_write_reg [smsc95xx]) from [<bf0d4554>] (smsc95xx_store+0x34/0x218 [smsc95xx])
4,769,1433914794,-;[<bf0d4554>] (smsc95xx_store [smsc95xx]) from [<c01ae24c>] (kernfs_fop_write+0xc0/0x184)
4,770,1433924438,-;[<c01ae24c>] (kernfs_fop_write) from [<c0145fd8>] (vfs_write+0xa0/0x1ac)
4,771,1433932586,-;[<c0145fd8>] (vfs_write) from [<c01464e0>] (SyS_write+0x44/0x9c)
4,772,1433940002,-;[<c01464e0>] (SyS_write) from [<c000e660>] (ret_fast_syscall+0x0/0x50)
0,773,1433947967,-;Code: e1a04000 0a000006 e2803d06 f5d3f000 (e1932f9f) 
4,774,1433954650,-;---[ end trace bdd277dec40e1d5c ]---

我想最重要的部分是最后几行:

4,765,1433874603,-;[<c03cb590>] (__pm_runtime_resume) from [<c04677d4>] (usb_autopm_get_interface+0x18/0x5c)
4,766,1433884307,-;[<c04677d4>] (usb_autopm_get_interface) from [<c04574c8>] (usbnet_write_cmd+0x28/0x70)
4,767,1433893737,-;[<c04574c8>] (usbnet_write_cmd) from [<bf0d44e4>] (__smsc95xx_write_reg+0x50/0x8c [smsc95xx])
4,768,1433903839,-;[<bf0d44e4>] (__smsc95xx_write_reg [smsc95xx]) from [<bf0d4554>] (smsc95xx_store+0x34/0x218 [smsc95xx])
4,769,1433914794,-;[<bf0d4554>] (smsc95xx_store [smsc95xx]) from [<c01ae24c>] (kernfs_fop_write+0xc0/0x184)
4,770,1433924438,-;[<c01ae24c>] (kernfs_fop_write) from [<c0145fd8>] (vfs_write+0xa0/0x1ac)
4,771,1433932586,-;[<c0145fd8>] (vfs_write) from [<c01464e0>] (SyS_write+0x44/0x9c)
4,772,1433940002,-;[<c01464e0>] (SyS_write) from [<c000e660>] (ret_fast_syscall+0x0/0x50)

但也许有比检查 /dev/kmsg 更好的方法来理解此输出?

问题已解决。

已修改驱动程序以将文件创建到

/sys/class/dirnamae/files 

目录(目录名和文件在驱动程序代码中命名)。 问题是驱动程序没有删除之前创建的目录,所以拔下并重新插入设备然后写入文件导致之前显示的内核错误,因为它就像写入一个不再被引用的内存区域。

解决方法是删除

/sys/class/dirnamae 

并在每次拔下设备时重新创建它。