Why is Intel Pin not able to instrument the open syscall?
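I am trying to build a pintool that detects open() system calls on a specific file or directory and replaces the file path argument with another value.
For example, this is the very simple code I want to instrument: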
#include <iostream>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
using namespace std;

int main(int argc, char **argv)
{
    int i = open("/home/preet_derasari/important.txt", O_RDONLY);
    cout << "fid: " << i << endl;
}
In this example, I want Pin to change the file path from /home/preet_derasari/important.txt to /home/preet_derasari/dummy.txt.
To do this, I wrote a very simple pintool after going through some of the example pintools and the Pin API:
#include "pin.H"
#include <iostream>
#include <fstream>
#include <cstring>      // strcmp
#include <syscall.h>
#include <string>
using namespace std;

INT32 Usage()
{
    cout << "This tool prints out the number of dynamically executed " << endl
         << "instructions, basic blocks and threads in the application." << endl
         << endl;
    cout << KNOB_BASE::StringKnobSummary() << endl;
    return -1;
}

// Called by Pin before every system call the application makes.
void SyscallEntry(THREADID threadIndex, CONTEXT *ctxt, SYSCALL_STANDARD std, void *v)
{
    ADDRINT sysNum = PIN_GetSyscallNumber(ctxt, std);
    cout << "entered syscall: " << sysNum << endl;
    if (sysNum == SYS_open)
    {
        cout << "open encountered!" << endl;
        char *path = (char *)PIN_GetSyscallArgument(ctxt, std, 0);
        cout << "Original File Path: " << path << endl;
        if (strcmp(path, "/home/preet_derasari/important.txt") == 0)
        {
            // static: the kernel reads this string after the callback has returned,
            // so it must not live in a local std::string.
            static const char pathDummy[] = "/home/preet_derasari/dummy.txt";
            PIN_SetSyscallArgument(ctxt, std, 0, (ADDRINT)pathDummy);
            cout << "Dummy File Path: " << pathDummy << endl;
        }
    }
}

int main(int argc, char *argv[])
{
    cout << "Open Syscall Value: " << SYS_open << endl;
    if (PIN_Init(argc, argv))
    {
        return Usage();
    }
    cout << "===============================================" << endl;
    cout << "This application is instrumented by MyPinTool" << endl;
    cout << "===============================================" << endl;
    PIN_AddSyscallEntryFunction(SyscallEntry, 0);
    // Start the program, never returns
    PIN_StartProgram();
    return 0;
}
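I run the pintool with this command: ../../../pin -t obj-intel64/MY_pin.so -- test
where MY_pin.so is the pintool shared object and test is the sample program shown above.
The output confuses me, because Pin is instrumenting every system call except open: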
Open Syscall Value: 2
===============================================
This application is instrumented by MyPinTool
===============================================
entered syscall: 12
entered syscall: 158
entered syscall: 21
entered syscall: 257
entered syscall: 5
entered syscall: 9
entered syscall: 3
entered syscall: 257
entered syscall: 0
entered syscall: 17
entered syscall: 17
entered syscall: 17
entered syscall: 5
entered syscall: 9
entered syscall: 17
entered syscall: 17
entered syscall: 17
entered syscall: 9
entered syscall: 9
entered syscall: 9
entered syscall: 9
entered syscall: 9
entered syscall: 3
entered syscall: 158
entered syscall: 10
entered syscall: 10
entered syscall: 10
entered syscall: 11
entered syscall: 12
entered syscall: 12
entered syscall: 257
entered syscall: 5
entered syscall: 9
entered syscall: 3
entered syscall: 3
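As you can see, the Pin tool sees every system call except open, i.e. syscall number 2 (per the x86_64 ISA).
An interesting observation is that the cout from my test program (cout << "fid: " << i << endl;) is never printed, which makes me wonder whether Pin is doing something strange with the open system call?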
Specs:
- Pin version: pin-3.21-98484-e7cd811fd
- gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
- ISA: x86_64
- CPU: AMD Ryzen 7 1700X Eight-Core Processor
Can someone help me understand why this is happening?
strace cat foo shows that programs no longer use the old open(2) system call:
...
openat(AT_FDCWD, "foo", O_RDONLY) = 3
...
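__NR_openat is 257, which your Pin tool observed 3 times. Apparently even the open() libc wrapper function uses the openat Linux system call internally. (The __NR_open = 2 system call still works; the kernel has code that passes its args on to the current implementation. I don't know which is more efficient: the kernel might just set up an AT_FDCWD arg and call sys_openat(), which then has to decode it again, much as it does when glibc sets it up in user space.)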
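For the pintool, this means the entry callback has to recognise both syscall numbers and remember that the pathname is argument 0 for open but argument 1 for openat. A minimal sketch of that decision logic, kept free of Pin headers so it compiles on its own; PathArgIndex and RedirectedPath are hypothetical helper names, and the two paths are simply the ones from the question:

```cpp
#include <cstring>         // strcmp
#include <sys/syscall.h>   // SYS_open, SYS_openat

// Index of the pathname argument for the open-like syscalls this sketch
// handles, or -1 for any other syscall number.
int PathArgIndex(long sysNum)
{
#ifdef SYS_open            // not defined on every architecture
    if (sysNum == SYS_open) return 0;     // open(path, flags, mode)
#endif
#ifdef SYS_openat
    if (sysNum == SYS_openat) return 1;   // openat(dirfd, path, flags, mode)
#endif
    return -1;
}

// Replacement path for `path`, or nullptr if it should be left alone.
// The replacement has static storage so the pointer stays valid after the
// callback returns, which is when the kernel actually reads it.
const char *RedirectedPath(const char *path)
{
    static const char kOriginal[] = "/home/preet_derasari/important.txt";
    static const char kDummy[]    = "/home/preet_derasari/dummy.txt";
    return std::strcmp(path, kOriginal) == 0 ? kDummy : nullptr;
}

// Intended use inside SyscallEntry, with the Pin calls from the question:
//   int idx = PathArgIndex(PIN_GetSyscallNumber(ctxt, std));
//   if (idx >= 0) {
//       const char *p = (const char *)PIN_GetSyscallArgument(ctxt, std, idx);
//       if (const char *r = RedirectedPath(p))
//           PIN_SetSyscallArgument(ctxt, std, idx, (ADDRINT)r);
//   }
```

Because both paths here are absolute, the dirfd argument of openat does not need to be touched.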
The open(2) man page also documents openat(2):
The dirfd argument is used in conjunction with the pathname argument as follows:
If the pathname given in pathname is absolute, then dirfd is ignored.
If the pathname given in pathname is relative and dirfd is the special value AT_FDCWD, then pathname is interpreted relative to the current working directory of the calling process (like open()).
...
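The two rules quoted above can be checked with the raw system call. This is a small sketch, assuming a Linux system where /dev/null exists; CanOpen is a hypothetical helper, not part of any library:

```cpp
#include <fcntl.h>         // AT_FDCWD, O_RDONLY, O_DIRECTORY
#include <sys/syscall.h>   // SYS_openat
#include <unistd.h>        // syscall, close

// Issues the raw openat system call and reports whether it succeeded.
// Only flags that need no mode argument (no O_CREAT) are expected here.
bool CanOpen(int dirfd, const char *path, int flags)
{
    long fd = syscall(SYS_openat, dirfd, path, flags);
    if (fd < 0)
        return false;
    close(static_cast<int>(fd));
    return true;
}
```

With an absolute path such as "/dev/null", the call should succeed even when dirfd is an invalid descriptor like -1, because dirfd is ignored. With a relative path such as ".", it should succeed for AT_FDCWD and fail for -1.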
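openat / linkat and so on, when used with an fd from open(O_DIRECTORY), let programs like find avoid TOCTOU races, and/or let multi-threaded programs avoid having to actually chdir (because there is only one CWD per process, not one per thread).
Using them with AT_FDCWD has no advantage or disadvantage compared with old-style open(2).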
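To illustrate the directory-fd case, here is a rough sketch that opens a directory once with O_DIRECTORY and then resolves a name relative to that descriptor, without calling chdir. OpenInDir is a hypothetical helper name, and the usage assumes /dev/null exists:

```cpp
#include <fcntl.h>    // open, openat, O_RDONLY, O_DIRECTORY
#include <unistd.h>   // close

// Opens `name` relative to the directory `dir` without changing the
// process-wide current working directory. Returns the new file
// descriptor, or -1 on failure.
int OpenInDir(const char *dir, const char *name)
{
    int dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0)
        return -1;
    // Resolution starts from dirfd, so it is unaffected if the path that
    // `dir` named is renamed or replaced after the open above.
    int fd = openat(dirfd, name, O_RDONLY);
    close(dirfd);
    return fd;
}
```

For example, OpenInDir("/dev", "null") should return a valid descriptor, which the caller is responsible for closing.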