1 file descriptor for multiple threads, shows multiple open files on lsof
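I have a program with about 200 threads active at a time. When I open an fd, I understand that it is shared between the threads.
In /proc/[pid]/fd I do see only 1 fd, but when I list all open files with lsof, I can see the file opened by every thread (i.e. the same file appears 200 times, with the same pid and different tids).
What is the reason for that?
Also, I need different threads to write to the same file (at different locations). Is it thread-safe to use this 1 fd? (It doesn't make sense to me, but if the file is already opened once per thread, as lsof shows, it might be safe.)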
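lsof lists the file once for each "thread" because, owing to the underlying OS design, Linux threads are not true threads.
The first threading implementation on Linux was "LinuxThreads":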
In the Linux operating system, LinuxThreads was a partial
implementation of POSIX Threads. It has since been superseded by the
Native POSIX Thread Library (NPTL). The main developer of
LinuxThreads was Xavier Leroy.
LinuxThreads had a number of problems, mainly owing to the
implementation, which used the clone system call to create a new
process sharing the parent's address space. For example, threads had
distinct process identifiers, causing problems for signal handling;
LinuxThreads used the signals SIGUSR1 and SIGUSR2 for inter-thread
coordination, meaning these signals could not be used by programs.
To improve the situation, two competing projects were started to
develop a replacement; NGPT (Next Generation POSIX Threads) and NPTL.
NPTL won out and is today shipped with the vast majority of Linux
systems.
LinuxThreads has since been replaced by NPTL, the Native POSIX Thread Library, but Linux still lacks true, full kernel-level threads:
Design
NPTL uses a similar approach to LinuxThreads, in that the primary abstraction known by the kernel is still a process, and new
threads are created with the clone() system call (called from the NPTL
library).
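Most of the time, the fact that Linux lacks full kernel-level threads isn't noticeable.
How the OS implements concurrent execution usually doesn't matter.
But that is why lsof lists the file as being opened by multiple "processes". Because it is: those "processes" just happen to share the same address space, along with a lot of other resources.
Note that one of those "shared resources" is the current offset of the open file descriptor. If you change the offset in one thread, it changes for every thread in the process.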
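To make the shared offset concrete, here is a minimal sketch in C. The function name and the scratch-file path passed to it are hypothetical, chosen only for illustration. One thread calls lseek() on the descriptor, and the calling thread then asks where the offset is.

```c
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

/* Thread body: move the offset of the shared descriptor to byte 100. */
static void *seek_to_100(void *arg)
{
    int fd = *(int *)arg;
    lseek(fd, 100, SEEK_SET);
    return NULL;
}

/*
 * Opens `path` (a hypothetical scratch file), lets a second thread call
 * lseek() on the same descriptor, and returns the offset that the calling
 * thread observes afterwards, or -1 on error.
 */
off_t offset_seen_after_other_thread_seeks(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    pthread_t t;
    if (pthread_create(&t, NULL, seek_to_100, &fd) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    pthread_join(t, NULL);

    /* Query the current offset without moving it. */
    off_t seen = lseek(fd, 0, SEEK_CUR);

    close(fd);
    unlink(path);
    return seen;
}
```

The calling thread never seeks, yet it should get back 100, because both threads use the same open file description. Depending on the toolchain, the code may need to be built with `-pthread`.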
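If you need to write from multiple threads to a file opened through a single file descriptor, you can use the pwrite() function to atomically write to an arbitrary location in the file, regardless of the descriptor's current offset: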
#include <unistd.h>
ssize_t pwrite(int fildes, const void *buf, size_t nbyte,
off_t offset);
...
The pwrite() function shall be equivalent to write(), except that
it writes into a given position and does not change the file offset
(regardless of whether O_APPEND is set). The first three arguments to
pwrite() are the same as write() with the addition of a fourth
argument offset for the desired position inside the file. An attempt
to perform a pwrite() on a file that is incapable of seeking shall
result in an error.
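As a rough sketch of how this might look for the situation in the question, the following C code has several threads share one descriptor, each writing its own fixed-size slot with pwrite(). The names `write_slots_concurrently`, `NUM_WRITERS` and `SLOT_SIZE` are assumptions made for this example, as is the idea that every thread owns a fixed-size region. The file is opened without O_APPEND.

```c
#include <fcntl.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>

#define NUM_WRITERS 4  /* illustrative; the question has about 200 threads */
#define SLOT_SIZE   8  /* bytes owned by each thread (assumption) */

struct writer_arg {
    int fd;     /* the single descriptor shared by every thread */
    int index;  /* which slot of the file this thread owns */
    int ok;     /* set to 1 if the whole slot was written */
};

/* Fill this thread's slot with one letter, without touching the file offset. */
static void *write_slot(void *p)
{
    struct writer_arg *w = p;
    char buf[SLOT_SIZE];

    memset(buf, 'A' + w->index, sizeof buf);
    ssize_t n = pwrite(w->fd, buf, sizeof buf, (off_t)w->index * SLOT_SIZE);
    w->ok = (n == (ssize_t)sizeof buf);  /* short writes are not retried here */
    return NULL;
}

/*
 * Writes NUM_WRITERS slots to `path` from concurrent threads, then reads the
 * file back into `out` (at least NUM_WRITERS * SLOT_SIZE bytes).
 * Returns 0 on success, -1 on any failure.
 */
int write_slots_concurrently(const char *path, char *out)
{
    struct writer_arg args[NUM_WRITERS];
    pthread_t threads[NUM_WRITERS];
    int started = 0, result = 0;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    for (int i = 0; i < NUM_WRITERS; i++) {
        args[i] = (struct writer_arg){ .fd = fd, .index = i, .ok = 0 };
        if (pthread_create(&threads[i], NULL, write_slot, &args[i]) != 0) {
            result = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        if (!args[i].ok)
            result = -1;
    }

    /* Read everything back with pread(), which also ignores the offset. */
    if (result == 0 &&
        pread(fd, out, NUM_WRITERS * SLOT_SIZE, 0) != NUM_WRITERS * SLOT_SIZE)
        result = -1;

    close(fd);
    unlink(path);
    return result;
}
```

Since no thread calls lseek() or write(), no thread depends on the shared offset, and the slots do not overlap. Real code would also want to retry short writes and handle EINTR, which this sketch leaves out.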
Note that on Linux, pwrite() is broken if you open the file with O_APPEND:
BUGS
POSIX requires that opening a file with the O_APPEND flag should
have no effect on the location at which pwrite() writes data.
However, on Linux, if a file is opened with O_APPEND, pwrite()
appends data to the end of the file, regardless of the value of
offset.
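One way to check which behaviour a given system shows is a small probe like the C sketch below. The function name and the scratch path passed to it are hypothetical. It writes 10 bytes to a file opened with O_APPEND, calls pwrite() with 2 bytes at offset 0, and reports the resulting file size.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns the size of `path` after a 10-byte write() followed by a 2-byte
 * pwrite() at offset 0 on a descriptor opened with O_APPEND, or -1 on error.
 */
off_t size_after_pwrite_with_append(const char *path)
{
    struct stat st;
    off_t size = -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, "0123456789", 10) == 10 &&  /* file is now 10 bytes long */
        pwrite(fd, "XY", 2, 0) == 2 &&        /* asks for offset 0 */
        fstat(fd, &st) == 0)
        size = st.st_size;

    close(fd);
    unlink(path);
    return size;
}
```

A result of 10 means "XY" overwrote the first two bytes, as POSIX specifies. A result of 12 means the data was appended, as the quoted BUGS section describes for Linux. If every thread has to choose its own offset, opening the file without O_APPEND sidesteps the problem.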