What could cause a non-blocking socket to block on `recv`?
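I have a TCP/IP socket set to non-blocking that is blocking anyway. The socket is only referenced in one thread. This code works on Windows (with a few call replacements) but not on Linux. My code looks like this (don't mind the C-style casts; this was written a long time ago. Also, I trimmed it down a bit, so let me know if I accidentally trimmed off a step. Chances are that I'm actually doing that step. The actual code is on another computer, so I can't copy-paste.):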
    // In the real code, these are class members. I'm not bonkers
    int mSocket;
    sockaddr_in mAddress;

    void CreateSocket(
        unsigned int ipAddress,
        unsigned short port)
    {
        // Omitting my error checking in this question for brevity because everything comes back valid
        mSocket = socket(AF_INET, SOCK_STREAM, 0); // Not -1
        int oldFlags = fcntl(mSocket, F_GETFL, 0); // Not -1
        fcntl(mSocket, F_SETFL, oldFlags | O_NONBLOCK); // Not -1
        mAddress.sin_family = AF_INET;
        mAddress.sin_addr.s_addr = ipAddress; // address is valid
        mAddress.sin_port = htons((u_short)port); // port is not 0 and allowed on firewall
        memset(mAddress.sin_zero, 0, sizeof(mAddress.sin_zero));
        // <Connect attempt loop starts here>
        connect(mSocket, (sockaddr*)&mAddress, sizeof(mAddress)); // Not -1 to exit loop
        // <Connect attempt loop ends here>
        // Connection is now successful ('connect' returned a value other than -1)
    }

    // ... Stuff happens ...
    // ... Then this is called because 'select' call shows read data available ...
    void AttemptReceive(
        MyReturnBufferTypeThatsNotImportant &returnedBytes)
    {
        // Read socket
        const size_t bufferSize = 4096;
        char buffer[bufferSize];
        int result = 0;
        do {
            // Debugging code: sanity checks
            int socketFlags = fcntl(mSocket, F_GETFL, 0); // Not -1
            printf("result=%d\n", result);
            printf("O_NONBLOCK? %d\n", socketFlags & O_NONBLOCK); // Always prints "O_NONBLOCK? 2048"
            result = recv(mSocket, buffer, bufferSize, 0); // NEVER -1 or 0 after hundreds to thousands of calls, then suddenly blocks
            // ... Save off and package read data into user format for output to caller ...
        } while (result == bufferSize);
    }
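I believe that, because AttemptReceive is called in response to select, the socket just happens to contain a number of bytes that is an exact multiple of the buffer size (4096). I've confirmed this with the printf statements, so it never blocks on the first pass through the loop. Every time this bug occurs, the last two lines printed before the thread blocks are: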
    result=4096
    O_NONBLOCK? 2048
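Changing the recv line to `recv(mSocket, buffer, bufferSize, MSG_DONTWAIT);` actually "fixes" the issue (suddenly, recv occasionally returns -1 with errno EWOULDBLOCK/EAGAIN, which are equal to each other on my OS), but I'm afraid I'm just putting a band-aid on a gushing wound, so to speak. Any ideas?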
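For reference, here is a minimal sketch of how the read loop could treat EAGAIN/EWOULDBLOCK as the normal "nothing left right now" signal, instead of relying on a short read to end the loop (which is exactly what fails when the queued data is a multiple of 4096). The names `DrainSocket` and `DrainStatus` are made up for this example, and `std::string` stands in for `MyReturnBufferTypeThatsNotImportant`. It assumes a POSIX system where `MSG_DONTWAIT` is available, and it is not the original code.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical outcome codes for this sketch; not part of the original code.
enum DrainStatus { DRAINED, PEER_CLOSED, RECV_ERROR };

// Appends everything currently queued on fd to out without ever waiting.
// MSG_DONTWAIT makes each recv call non-blocking regardless of O_NONBLOCK.
DrainStatus DrainSocket(int fd, std::string &out)
{
    assert(fd >= 0);
    char buffer[4096];
    for (;;) {
        ssize_t n = recv(fd, buffer, sizeof(buffer), MSG_DONTWAIT);
        if (n > 0) {
            out.append(buffer, static_cast<std::size_t>(n));
            continue;           // more data may be queued; try again
        }
        if (n == 0) {
            return PEER_CLOSED; // orderly shutdown by the peer
        }
        if (errno == EINTR) {
            continue;           // interrupted by a signal; retry
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return DRAINED;     // queue is empty for now; go back to select
        }
        return RECV_ERROR;      // genuine error; errno says which one
    }
}
```

The caller would go back to `select` after `DRAINED`, and close the socket after `PEER_CLOSED` or `RECV_ERROR`.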
P.S. The address is "localhost", but I don't think that matters.
Note: I'm using an old compiler (not by choice), g++ 4.4.7-23 from 2010. That may have something to do with the issue.
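`socket()` automatically sets `O_RDWR` on the socket with my OS and compiler, but it appears that `O_RDWR` had accidentally been unset on the socket in question at the start of the program (which somehow allowed it to read fine when there was data to read, but block otherwise). Fixing that bug caused the socket to stop blocking. Apparently, both `O_RDWR` and `O_NONBLOCK` are required to avoid the socket blocking, at least on my OS and compiler.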
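To illustrate the kind of flag handling that avoids clobbering other bits, here is a small sketch of the usual read-modify-write pattern with `fcntl`, plus a debugging helper that prints the access mode alongside `O_NONBLOCK`. The names `EnableNonBlocking` and `DumpSocketFlags` are hypothetical, not from the original program, and the sketch only assumes the standard POSIX `F_GETFL`/`F_SETFL` commands and the `O_ACCMODE` mask.

```cpp
#include <cassert>
#include <cstdio>
#include <fcntl.h>

// Hypothetical helper: turns on O_NONBLOCK while passing every other
// status flag back unchanged. Returns 0 on success, -1 on failure
// (errno is set by fcntl).
int EnableNonBlocking(int fd)
{
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) {
        return -1;
    }
    // OR the new bit into the existing flags instead of overwriting them.
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

// Hypothetical debugging aid: prints the access mode and whether
// O_NONBLOCK is set. Returns the raw flags, or -1 on failure.
int DumpSocketFlags(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) {
        return -1;
    }
    int access = flags & O_ACCMODE;
    const char *accessName = access == O_RDWR   ? "O_RDWR"
                           : access == O_WRONLY ? "O_WRONLY"
                                                : "O_RDONLY";
    std::printf("access=%s O_NONBLOCK=%s\n",
                accessName, (flags & O_NONBLOCK) ? "set" : "clear");
    return flags;
}
```

Calling `DumpSocketFlags(mSocket)` right after `socket()` and again just before `recv` would show whether anything changed the flags in between.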