如果设备是 ext4,如何从 vm.block_dump=1 产生的 dmesg 输出将块号映射到文件名?
How to map block numbers to file names from the dmesg output resultant from vm.block_dump=1, if device is ext4?
tl;dr: 我想 see/get dmesg 上报告的块编号所属的文件名,当 vm.block_dump=1
示例 dmesg: bash(13515): READ block 5434824 on xvda3 (32 sectors)
当例如。 sudo sysctl -w vm.block_dump=1
或例如。 echo '1' | sudo tee /proc/sys/vm/block_dump
然后 "Linux reports all disk read and write operations that take place, and all block dirtyings done to files. [...] The output of block_dump is written to the kernel output, and it can be retrieved using "dmesg"。当你使用 block_dump 并且你的内核日志记录级别还包括内核调试消息时,你可能想要关闭 klogd,否则 block_dump 的输出将被记录,导致通常不存在的磁盘 activity。” (引自 here)
"block dirtyings" 不是问题,例如。
[ 3140.559675] systemd-journal(291): dirtied inode 399135 (system.journal) on xvda3
我可以看到它的文件名是这样的:
$ echo -e 'open /dev/xvda3\n ncheck 399135' | sudo debugfs -f -
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs: ncheck 399135
Inode Pathname
399135 /var/log/journal/12c5e521101c444594b96b53751551a8/system.journal
问题出在 "Linux reports all disk read and write operations that take place"(引用上面的内容),因为它们是按块报告的,例如
[ 3140.376827] kworker/u24:3(21616): WRITE block 11037768 on xvda3 (8 sectors)
[ 3140.724725] bash(13515): READ block 5434824 on xvda3 (32 sectors)
[ 3140.725483] date(13515): READ block 5434896 on xvda3 (160 sectors)
[ 3140.728946] sed(13519): READ block 5143680 on xvda3 (32 sectors)
[ 3140.736022] sleep(13522): READ block 5379184 on xvda3 (24 sectors)
[ 3140.804803] qubes-gui(522): READ block 5179952 on xvda3 (16 sectors)
[ 3140.806519] Xorg(599): READ block 7420192 on xvda3 (176 sectors)
[ 3140.810348] InputThread(613): READ block 7418560 on xvda3 (112 sectors)
[ 3140.815866] at-spi2-registr(812): READ block 5654512 on xvda3 (8 sectors)
[ 3140.816860] xdg-desktop-por(888): READ block 5795168 on xvda3 (8 sectors)
[ 3140.818716] gnome-terminal-(865): READ block 5804672 on xvda3 (16 sectors)
[ 3141.064524] sed(13531): READ block 3446048 on xvda3 (16 sectors)
[ 3141.130808] systemd(571): READ block 4744136 on xvda3 (184 sectors)
可以在这里看到负责显示此类消息的内核代码:
https://elixir.bootlin.com/linux/v4.18.5/source/block/blk-core.c#L2542
None 这些块产生任何 inode 号,使用这个:
$ echo -e 'open /dev/xvda3\n icheck 11037768' |sudo debugfs -f -
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs: icheck 11037768
Block Inode number
11037768 <block not found>
而不是上面的 <block not found>
我应该得到一个 inode 编号,然后我可以将它与前面的 echo 一起使用来查看文件名。
我试过的:以防块号是扇区大小(512 字节)的倍数并且我知道 ext4 块大小是 4096 字节(sudo blockdev --getbsz /dev/xvda
),我也尝试过(使用乘法的结果for): 11037768*2, 11037768*4 and 11037768*8 结果相同: <block not found>
我在这里错过了什么?这些块是指向目录条目还是不是文件名的东西?即便如此,debugfs 不应该仍然产生 inode 编号吗?
有没有更好的方法来获取文件名?
编辑: icheck
(在 debugfs
内)期望的块号与 debugfs stat
在 [=34 下报告的一样=],例如 2172716-2172721
范围内的任何数字,如下所示:
$ sudo debugfs -R "stat /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py" /dev/xvda3
debugfs 1.44.2 (14-May-2018)
Inode: 550529 Type: regular Mode: 0644 Flags: 0x80000
Generation: 1781055959 Version: 0x00000000:00000001
User: 0 Group: 0 Project: 0 Size: 22179
File ACL: 0
Links: 1 Blockcount: 48
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5b6ec29d:1b2e0200 -- Sat Aug 11 13:03:57 2018
atime: 0x5b33b5a9:00000000 -- Wed Jun 27 18:04:57 2018
mtime: 0x5b33b5a9:00000000 -- Wed Jun 27 18:04:57 2018
crtime: 0x5b6ec29d:1af0f900 -- Sat Aug 11 13:03:57 2018
Size of extra inode fields: 32
EXTENTS:
(0-5):2172716-2172721
另一种方式:
$ filefrag -s -v /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py
Filesystem type is: ef53
File size of /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py is 22179 (6 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 5: 2172716.. 2172721: 6: last,eof
/usr/lib/python2.7/site-packages/salt/modules/zonecfg.py: 1 extent found
现在问题仍然存在:这些设备块号(在 dmesg 中为 xvda3 报告的块号)与这些物理偏移量有何关系?
EDIT2: 我刚刚确认这些物理偏移号与块设备号相同(只是,显然与 dmesg 上报告的不一样);下面显示了上述文件的最后一个块,我可以确认它与使用 vim
:
查看文件时相同
$ sudo dd bs=4096 skip=2172721 count=1 if=/dev/xvda3 | hexdump -C
我已经在 Qubes OS R4.0 Fedora 28 AppVM 中的 4.18.5 内核下对此进行了测试。 (如有必要,我可以使用自定义 .config
/补丁重新编译自定义内核 - 欢迎提出建议)
当 vm.block_dump=1
时 dmesg 上报告的块号是扇区号 - 每个块 512
字节 - 但是 debugfs
命令期望其 icheck
的块号对于 ext4,命令通常每个块 4096
字节,因此所需的只是除以 8
。
虽然块大小通常为 4096 字节,但您必须确定,所以这样获取块大小:
$ sudo blockdev --getbsz /dev/xvda3
4096
虽然此值通常为 4096,但在某些情况下可能会有所不同,例如。 mkfs.ext4
和 -b
可以创建具有不同块大小的 ext4 文件系统; vfat 的块大小为 512 字节。
要从块大小中一次性获得除法器(在bash中):
$ echo $(( `sudo blockdev --getbsz /dev/xvda3` / 512 ))
8
让我们使用这一行作为输入:
[ 3140.736022] sleep(13522): READ block 5379184 on xvda3 (24 sectors)
将上述除以8
得到icheck
的块号:
$ bc -l
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
5379184/8
672398.00000000000000000000
我使用 bc -l
来确保我没有打错数字(如果它以 .00000000000000000000
结尾,我很可能没有)
使用debugfs
获取路径名:
$ sudo debugfs
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs: icheck 672398
Block Inode number
672398 134571
debugfs: ncheck 134571
Inode Pathname
134571 /usr/bin/sleep
debugfs: close
debugfs: quit
我不确定 dmesg 中报告的块号是否可以是每个扇区 512 字节以外的值的倍数。例如,如果底层磁盘有 4096 bytes per sector physically,它们是否仍然是每个块 512 字节?如果我猜的话,我会说假设它们总是每个块 512 字节是安全的。
这是上面的脚本版本:
#!/bin/bash
#./showblock rev.02 rewritten for question/answer from:
#----
bytes_per_sector=512 #assumed that dmesg block numbers are 512 bytes each (ie. 512 bytes per sector; aka block size is 512)!
#----
#use `sudo` only when not already root
if test "`id -u`" != "0"; then
sudo='sudo'
else
sudo=''
fi
if ! test "$#" -ge "2"; then
echo "Usage: [=14=] <device> <dmesgblocknumber> [dmesgblocknumber ...]"
echo "Examples:"
echo "[=14=] /dev/xvda3 5379184"
echo "[=14=] /dev/xvda3 5379184 5129952 7420192"
exit 1
fi
within_exit() {
echo -e "\nSkipped current instruction within on_exit()"
}
on_exit() {
#trap - EXIT SIGINT SIGQUIT SIGHUP #will exit by skipping the rest of all instrunction from on_exit() eg. if C-c
trap within_exit EXIT SIGINT SIGQUIT SIGHUP #skip only current instruction from on_exit() eg. when C-c is pressed
#echo "first sleep"
#sleep 10
#echo "second sleep"
#sleep 10
if test "${#remaining_args[@]}" -gt 0; then
echo -n "WARNING: There are '${#remaining_args[@]}' remaining args not processed, they are: " >&2
for i in `seq 0 1 "$(( "${#remaining_args[@]}" - 1 ))"`; do #seq is part of coreutils package
echo -n "'${remaining_args[${i}]}' " >&2
done
echo >&2
fi
}
trap on_exit EXIT SIGINT SIGQUIT SIGHUP
dev=""
shift 1
if test -z "$dev" -o ! -b "$dev"; then
echo "Bad device name or not a device: '$dev'" >&2
exit 1
fi
blocksize="`$sudo blockdev --getbsz "$dev"`"
if test "${blocksize:-0}" -le "0"; then #handles empty arg too
echo "Failed getting block size for '$dev', got '$blocksize'" >&2
exit 1
fi
#TODO: check and fail if not a multiplier
divider="$(( $blocksize / $bytes_per_sector ))"
if ! test "${divider:-0}" -gt "0"; then
echo "Failed computing divider from: '$blocksize' / '$bytes_per_sector', got '$divider'" >&2
exit 1
fi
# for each passed-in dmesg block number do
while test "$#" -gt "0"; do
dmesgblock=""
shift
remaining_args=("$@") #for on_exit() above
echo '--------'
echo "Passed-in dmesg block($bytes_per_sector bytes per block) number: '$dmesgblock'"
#have to handle the case when $dmesgblock is empty and when it's negative eg. "-1" so using a default value(of 0) when unset in the below 'if' block will help not break the 'test' expecting an integer while also allowing negative numbers ("0$dmesgblock" would otherwise yield "0-1" a non-integer):
if test "${dmesgblock:-0}" -le "0"; then
echo "Bad passed-in dmesg block number: '$dmesgblock'" >&2
exit 1
fi
#TODO: check and fail(or warn? nah, it should be fail!) if not a multiplier (eg. use modullo? is it "%" ?)
block=$(( $dmesgblock / 8 ))
if ! test "${block:--1}" -ge "0"; then
echo "Badly computed device block number: '$block'" >&2
exit 1
fi
echo "Actual block number(of $blocksize bytes per block): $block"
inode="$(echo "open ${dev}"$'\n'"icheck ${block}"$'\n'"close" | $sudo debugfs -f - 2>/dev/null | tail -n2|head -1|cut -f2 -d$'\t')"
if test "<block not found>" == "$inode"; then
echo "No inode was found for the provided dmesg block number '$dmesgblock' which mapped to dev block number '$block'" >&2
exit 1
else
#assuming number TODO: check for this!
echo "Found inode: $inode"
fpath="$(echo "open ${dev}"$'\n'"ncheck ${inode}"$'\n'"close" | $sudo debugfs -f - 2>/dev/null | tail -n2|head -1|cut -f2- -d$'\t')"
#fpath always begins with '/' right?
if test "$fpath" != "Pathname"; then
echo "Found path : $fpath"
else
echo "not found"
fi
fi
done
示例用法:
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 "" -1 "" 1200 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: ''
Bad passed-in dmesg block number: ''
WARNING: There are '3' remaining args not processed, they are: '-1' '' '1200'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 -1 "" "" 1200 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: '-1'
Bad passed-in dmesg block number: '-1'
WARNING: There are '3' remaining args not processed, they are: '' '' '1200'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 0 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: '0'
Bad passed-in dmesg block number: '0'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 3 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: '3'
Actual block number(of 4096 bytes per block): 0
No inode was found for the provided dmesg block number '3' which mapped to dev block number '0'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
exit code: 0
这是一个处理整个 dmesg 日志的脚本:
#!/bin/bash
#./showallblocks rev.01 rewritten for question/answer from:
if test "`id -u`" != "0"; then
sudo='sudo'
else
sudo=''
fi
dmesglog=""
if test -z "$dmesglog"; then
echo "Usage: '[=16=]' <dmesglogfile>"
echo "Examples:"
echo "sudo dmesg > dmesg1.log && '[=16=]' dmesg1.log"
echo "'[=16=]' <(sudo dmesg)"
#Note: '[=16=]' used for the case when [=16=] has spaces or other things in its path names, and user wants to copy paste, for whatever reason, the output of the above into the command line.
exit 1
fi
#(optional) Stop logging if already in progress:
$sudo sysctl -w vm.block_dump=0
#Using the answer from here(thanks to glenn jackman): https://unix.stackexchange.com/a/467377/306023
#grep --color=never -E -- 'READ block [0-9]+ on xvda3' "$dmesglog" |
#cat "$dmesglog" |
$sudo perl -pe '
if (! /READ block [0-9]+ on [A-Za-z0-9]+ .*$/) {
s{.*}{}s
}
s{(READ block) (\d+) (on) ([A-Za-z0-9]+) ([^$]*)\n$}
{join " ",, , , , , qx(./showblock "/dev/" "" | grep -F -- "Found path :" | cut -f4- -d" ")}es
' -- "$dmesglog"
#Note: the output of qx(...) above is purposely allowed to have trailing newline! (I did wonder if purposely is correct here or it should be purposefully, https://www.merriam-webster.com/words-at-play/purposely-purposefully-usage )
#To find out what "}es"(above) is, see perlre modifiers: https://perldoc.perl.org/perlre.html#Modifiers
#FIXME: noobish try to exclude lines not matching the lines that need to be replaced, from output! used 'if' above
示例输出:
[user@dev01-w-s-f-fdr28 ~]$ ./showallblocks sample.dmesg.log
vm.block_dump = 0
[ 6031.953619] sysctl(20774): READ block 5285528 on xvda3 (32 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.954317] sysctl(20774): READ block 5285768 on xvda3 (8 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.954598] sysctl(20774): READ block 5285648 on xvda3 (120 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.954617] sysctl(20774): READ block 5285776 on xvda3 (24 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.955482] sysctl(20774): READ block 5285560 on xvda3 (88 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.955699] sysctl(20774): READ block 4473568 on xvda3 (8 sectors) /usr/lib64/libuuid.so.1.3.0
[ 6031.955730] sysctl(20774): READ block 4473584 on xvda3 (16 sectors) /usr/lib64/libuuid.so.1.3.0
[ 6031.955787] sysctl(20774): READ block 4749496 on xvda3 (224 sectors) /usr/lib64/libblkid.so.1.1.0
[ 6030.638492] bash(20757): READ block 4543096 on xvda3 (32 sectors) /usr/bin/gawk
[ 6030.639133] awk(20757): READ block 4544280 on xvda3 (176 sectors) /usr/bin/gawk
tl;dr: 我想 see/get dmesg 上报告的块编号所属的文件名,当 vm.block_dump=1
示例 dmesg: bash(13515): READ block 5434824 on xvda3 (32 sectors)
当例如。 sudo sysctl -w vm.block_dump=1
或例如。 echo '1' | sudo tee /proc/sys/vm/block_dump
然后 "Linux reports all disk read and write operations that take place, and all block dirtyings done to files. [...] The output of block_dump is written to the kernel output, and it can be retrieved using "dmesg"。当你使用 block_dump 并且你的内核日志记录级别还包括内核调试消息时,你可能想要关闭 klogd,否则 block_dump 的输出将被记录,导致通常不存在的磁盘 activity。” (引自 here)
"block dirtyings" 不是问题,例如。
[ 3140.559675] systemd-journal(291): dirtied inode 399135 (system.journal) on xvda3
我可以看到它的文件名是这样的:
$ echo -e 'open /dev/xvda3\n ncheck 399135' | sudo debugfs -f -
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs: ncheck 399135
Inode Pathname
399135 /var/log/journal/12c5e521101c444594b96b53751551a8/system.journal
问题出在 "Linux reports all disk read and write operations that take place"(引用上面的内容),因为它们是按块报告的,例如
[ 3140.376827] kworker/u24:3(21616): WRITE block 11037768 on xvda3 (8 sectors)
[ 3140.724725] bash(13515): READ block 5434824 on xvda3 (32 sectors)
[ 3140.725483] date(13515): READ block 5434896 on xvda3 (160 sectors)
[ 3140.728946] sed(13519): READ block 5143680 on xvda3 (32 sectors)
[ 3140.736022] sleep(13522): READ block 5379184 on xvda3 (24 sectors)
[ 3140.804803] qubes-gui(522): READ block 5179952 on xvda3 (16 sectors)
[ 3140.806519] Xorg(599): READ block 7420192 on xvda3 (176 sectors)
[ 3140.810348] InputThread(613): READ block 7418560 on xvda3 (112 sectors)
[ 3140.815866] at-spi2-registr(812): READ block 5654512 on xvda3 (8 sectors)
[ 3140.816860] xdg-desktop-por(888): READ block 5795168 on xvda3 (8 sectors)
[ 3140.818716] gnome-terminal-(865): READ block 5804672 on xvda3 (16 sectors)
[ 3141.064524] sed(13531): READ block 3446048 on xvda3 (16 sectors)
[ 3141.130808] systemd(571): READ block 4744136 on xvda3 (184 sectors)
可以在这里看到负责显示此类消息的内核代码: https://elixir.bootlin.com/linux/v4.18.5/source/block/blk-core.c#L2542
None 这些块产生任何 inode 号,使用这个:
$ echo -e 'open /dev/xvda3\n icheck 11037768' |sudo debugfs -f -
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs: icheck 11037768
Block Inode number
11037768 <block not found>
而不是上面的 <block not found>
我应该得到一个 inode 编号,然后我可以将它与前面的 echo 一起使用来查看文件名。
我试过的:以防块号是扇区大小(512 字节)的倍数并且我知道 ext4 块大小是 4096 字节(sudo blockdev --getbsz /dev/xvda
),我也尝试过(使用乘法的结果for): 11037768*2, 11037768*4 and 11037768*8 结果相同: <block not found>
我在这里错过了什么?这些块是指向目录条目还是不是文件名的东西?即便如此,debugfs 不应该仍然产生 inode 编号吗?
有没有更好的方法来获取文件名?
编辑: icheck
(在 debugfs
内)期望的块号与 debugfs stat
在 [=34 下报告的一样=],例如 2172716-2172721
范围内的任何数字,如下所示:
$ sudo debugfs -R "stat /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py" /dev/xvda3
debugfs 1.44.2 (14-May-2018)
Inode: 550529 Type: regular Mode: 0644 Flags: 0x80000
Generation: 1781055959 Version: 0x00000000:00000001
User: 0 Group: 0 Project: 0 Size: 22179
File ACL: 0
Links: 1 Blockcount: 48
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5b6ec29d:1b2e0200 -- Sat Aug 11 13:03:57 2018
atime: 0x5b33b5a9:00000000 -- Wed Jun 27 18:04:57 2018
mtime: 0x5b33b5a9:00000000 -- Wed Jun 27 18:04:57 2018
crtime: 0x5b6ec29d:1af0f900 -- Sat Aug 11 13:03:57 2018
Size of extra inode fields: 32
EXTENTS:
(0-5):2172716-2172721
另一种方式:
$ filefrag -s -v /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py
Filesystem type is: ef53
File size of /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py is 22179 (6 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 5: 2172716.. 2172721: 6: last,eof
/usr/lib/python2.7/site-packages/salt/modules/zonecfg.py: 1 extent found
现在问题仍然存在:这些设备块号(在 dmesg 中为 xvda3 报告的块号)与这些物理偏移量有何关系?
EDIT2: 我刚刚确认这些物理偏移号与块设备号相同(只是,显然与 dmesg 上报告的不一样);下面显示了上述文件的最后一个块,我可以确认它与使用 vim
:
$ sudo dd bs=4096 skip=2172721 count=1 if=/dev/xvda3 | hexdump -C
我已经在 Qubes OS R4.0 Fedora 28 AppVM 中的 4.18.5 内核下对此进行了测试。 (如有必要,我可以使用自定义 .config
/补丁重新编译自定义内核 - 欢迎提出建议)
当 vm.block_dump=1
时 dmesg 上报告的块号是扇区号 - 每个块 512
字节 - 但是 debugfs
命令期望其 icheck
的块号对于 ext4,命令通常每个块 4096
字节,因此所需的只是除以 8
。
虽然块大小通常为 4096 字节,但您必须确定,所以这样获取块大小:
$ sudo blockdev --getbsz /dev/xvda3
4096
虽然此值通常为 4096,但在某些情况下可能会有所不同,例如。 mkfs.ext4
和 -b
可以创建具有不同块大小的 ext4 文件系统; vfat 的块大小为 512 字节。
要从块大小中一次性获得除法器(在bash中):
$ echo $(( `sudo blockdev --getbsz /dev/xvda3` / 512 ))
8
让我们使用这一行作为输入:
[ 3140.736022] sleep(13522): READ block 5379184 on xvda3 (24 sectors)
将上述除以8
得到icheck
的块号:
$ bc -l
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
5379184/8
672398.00000000000000000000
我使用 bc -l
来确保我没有打错数字(如果它以 .00000000000000000000
结尾,我很可能没有)
使用debugfs
获取路径名:
$ sudo debugfs
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs: icheck 672398
Block Inode number
672398 134571
debugfs: ncheck 134571
Inode Pathname
134571 /usr/bin/sleep
debugfs: close
debugfs: quit
我不确定 dmesg 中报告的块号是否可以是每个扇区 512 字节以外的值的倍数。例如,如果底层磁盘有 4096 bytes per sector physically,它们是否仍然是每个块 512 字节?如果我猜的话,我会说假设它们总是每个块 512 字节是安全的。
这是上面的脚本版本:
#!/bin/bash
#./showblock rev.02 rewritten for question/answer from:
#----
bytes_per_sector=512 #assumed that dmesg block numbers are 512 bytes each (ie. 512 bytes per sector; aka block size is 512)!
#----
#use `sudo` only when not already root
if test "`id -u`" != "0"; then
sudo='sudo'
else
sudo=''
fi
if ! test "$#" -ge "2"; then
echo "Usage: [=14=] <device> <dmesgblocknumber> [dmesgblocknumber ...]"
echo "Examples:"
echo "[=14=] /dev/xvda3 5379184"
echo "[=14=] /dev/xvda3 5379184 5129952 7420192"
exit 1
fi
within_exit() {
echo -e "\nSkipped current instruction within on_exit()"
}
on_exit() {
#trap - EXIT SIGINT SIGQUIT SIGHUP #will exit by skipping the rest of all instrunction from on_exit() eg. if C-c
trap within_exit EXIT SIGINT SIGQUIT SIGHUP #skip only current instruction from on_exit() eg. when C-c is pressed
#echo "first sleep"
#sleep 10
#echo "second sleep"
#sleep 10
if test "${#remaining_args[@]}" -gt 0; then
echo -n "WARNING: There are '${#remaining_args[@]}' remaining args not processed, they are: " >&2
for i in `seq 0 1 "$(( "${#remaining_args[@]}" - 1 ))"`; do #seq is part of coreutils package
echo -n "'${remaining_args[${i}]}' " >&2
done
echo >&2
fi
}
trap on_exit EXIT SIGINT SIGQUIT SIGHUP
dev=""
shift 1
if test -z "$dev" -o ! -b "$dev"; then
echo "Bad device name or not a device: '$dev'" >&2
exit 1
fi
blocksize="`$sudo blockdev --getbsz "$dev"`"
if test "${blocksize:-0}" -le "0"; then #handles empty arg too
echo "Failed getting block size for '$dev', got '$blocksize'" >&2
exit 1
fi
#TODO: check and fail if not a multiplier
divider="$(( $blocksize / $bytes_per_sector ))"
if ! test "${divider:-0}" -gt "0"; then
echo "Failed computing divider from: '$blocksize' / '$bytes_per_sector', got '$divider'" >&2
exit 1
fi
# for each passed-in dmesg block number do
while test "$#" -gt "0"; do
dmesgblock=""
shift
remaining_args=("$@") #for on_exit() above
echo '--------'
echo "Passed-in dmesg block($bytes_per_sector bytes per block) number: '$dmesgblock'"
#have to handle the case when $dmesgblock is empty and when it's negative eg. "-1" so using a default value(of 0) when unset in the below 'if' block will help not break the 'test' expecting an integer while also allowing negative numbers ("0$dmesgblock" would otherwise yield "0-1" a non-integer):
if test "${dmesgblock:-0}" -le "0"; then
echo "Bad passed-in dmesg block number: '$dmesgblock'" >&2
exit 1
fi
#TODO: check and fail(or warn? nah, it should be fail!) if not a multiplier (eg. use modullo? is it "%" ?)
block=$(( $dmesgblock / 8 ))
if ! test "${block:--1}" -ge "0"; then
echo "Badly computed device block number: '$block'" >&2
exit 1
fi
echo "Actual block number(of $blocksize bytes per block): $block"
inode="$(echo "open ${dev}"$'\n'"icheck ${block}"$'\n'"close" | $sudo debugfs -f - 2>/dev/null | tail -n2|head -1|cut -f2 -d$'\t')"
if test "<block not found>" == "$inode"; then
echo "No inode was found for the provided dmesg block number '$dmesgblock' which mapped to dev block number '$block'" >&2
exit 1
else
#assuming number TODO: check for this!
echo "Found inode: $inode"
fpath="$(echo "open ${dev}"$'\n'"ncheck ${inode}"$'\n'"close" | $sudo debugfs -f - 2>/dev/null | tail -n2|head -1|cut -f2- -d$'\t')"
#fpath always begins with '/' right?
if test "$fpath" != "Pathname"; then
echo "Found path : $fpath"
else
echo "not found"
fi
fi
done
示例用法:
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 "" -1 "" 1200 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: ''
Bad passed-in dmesg block number: ''
WARNING: There are '3' remaining args not processed, they are: '-1' '' '1200'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 -1 "" "" 1200 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: '-1'
Bad passed-in dmesg block number: '-1'
WARNING: There are '3' remaining args not processed, they are: '' '' '1200'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 0 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: '0'
Bad passed-in dmesg block number: '0'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 3 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: '3'
Actual block number(of 4096 bytes per block): 0
No inode was found for the provided dmesg block number '3' which mapped to dev block number '0'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
exit code: 0
这是一个处理整个 dmesg 日志的脚本:
#!/bin/bash
#./showallblocks rev.01 rewritten for question/answer from:
if test "`id -u`" != "0"; then
sudo='sudo'
else
sudo=''
fi
dmesglog=""
if test -z "$dmesglog"; then
echo "Usage: '[=16=]' <dmesglogfile>"
echo "Examples:"
echo "sudo dmesg > dmesg1.log && '[=16=]' dmesg1.log"
echo "'[=16=]' <(sudo dmesg)"
#Note: '[=16=]' used for the case when [=16=] has spaces or other things in its path names, and user wants to copy paste, for whatever reason, the output of the above into the command line.
exit 1
fi
#(optional) Stop logging if already in progress:
$sudo sysctl -w vm.block_dump=0
#Using the answer from here(thanks to glenn jackman): https://unix.stackexchange.com/a/467377/306023
#grep --color=never -E -- 'READ block [0-9]+ on xvda3' "$dmesglog" |
#cat "$dmesglog" |
$sudo perl -pe '
if (! /READ block [0-9]+ on [A-Za-z0-9]+ .*$/) {
s{.*}{}s
}
s{(READ block) (\d+) (on) ([A-Za-z0-9]+) ([^$]*)\n$}
{join " ",, , , , , qx(./showblock "/dev/" "" | grep -F -- "Found path :" | cut -f4- -d" ")}es
' -- "$dmesglog"
#Note: the output of qx(...) above is purposely allowed to have trailing newline! (I did wonder if purposely is correct here or it should be purposefully, https://www.merriam-webster.com/words-at-play/purposely-purposefully-usage )
#To find out what "}es"(above) is, see perlre modifiers: https://perldoc.perl.org/perlre.html#Modifiers
#FIXME: noobish try to exclude lines not matching the lines that need to be replaced, from output! used 'if' above
示例输出:
[user@dev01-w-s-f-fdr28 ~]$ ./showallblocks sample.dmesg.log
vm.block_dump = 0
[ 6031.953619] sysctl(20774): READ block 5285528 on xvda3 (32 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.954317] sysctl(20774): READ block 5285768 on xvda3 (8 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.954598] sysctl(20774): READ block 5285648 on xvda3 (120 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.954617] sysctl(20774): READ block 5285776 on xvda3 (24 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.955482] sysctl(20774): READ block 5285560 on xvda3 (88 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.955699] sysctl(20774): READ block 4473568 on xvda3 (8 sectors) /usr/lib64/libuuid.so.1.3.0
[ 6031.955730] sysctl(20774): READ block 4473584 on xvda3 (16 sectors) /usr/lib64/libuuid.so.1.3.0
[ 6031.955787] sysctl(20774): READ block 4749496 on xvda3 (224 sectors) /usr/lib64/libblkid.so.1.1.0
[ 6030.638492] bash(20757): READ block 4543096 on xvda3 (32 sectors) /usr/bin/gawk
[ 6030.639133] awk(20757): READ block 4544280 on xvda3 (176 sectors) /usr/bin/gawk