Using libpcap to gather statistics on connections

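I have a threaded HTTP proxy, i.e. I spawn a thread for every request coming from a client. Now I would like to gather some statistics, such as bits per second (bps) and packets per second (pps).

I like my code to do one thing only, so a thread that handles a connection does not also compute bps and pps for each packet; I leave that to another thread.

I create a thread for each HTTP request from a client. If the proxy successfully connects to the requested remote server, it sends the actual HTTP request to the server and, before routing any data, the connection thread creates a logging thread. The logging thread computes bps and pps for as long as the connection is open. The connection thread gives the logging thread the information it needs to filter packets (local IP address, local port, remote IP address, remote port), so each logging thread only sees packets belonging to its parent connection thread.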
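As an illustration of the filtering step, here is a minimal sketch of how the four values handed to the logging thread could be turned into a pcap filter expression. The function name `buildConnectionFilter` and its parameters are hypothetical and not taken from the proxy's code; the sketch also assumes the connection is TCP over IPv4.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the original proxy): builds a pcap filter
// expression matching a single TCP connection in both directions.
std::string buildConnectionFilter(const std::string& localIp, std::uint16_t localPort,
                                  const std::string& remoteIp, std::uint16_t remotePort) {
    const std::string lp = std::to_string(localPort);
    const std::string rp = std::to_string(remotePort);
    // Upload direction: packets going from the local endpoint to the remote one.
    const std::string up = "(src host " + localIp + " and src port " + lp +
                           " and dst host " + remoteIp + " and dst port " + rp + ")";
    // Download direction: packets coming back from the remote endpoint.
    const std::string down = "(src host " + remoteIp + " and src port " + rp +
                             " and dst host " + localIp + " and dst port " + lp + ")";
    return "tcp and (" + up + " or " + down + ")";
}
```

The resulting string would typically be compiled with pcap_compile and installed with pcap_setfilter on the logging thread's handle. Whether a packet is an upload or a download can then be decided by comparing its source address and port with the local ones.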

I am having trouble computing bps and pps for each packet.

Here is the pseudocode of the packet-capture loop in my logging thread:

// pcap variables
pcap_t *handle;
struct pcap_pkthdr *header;
const u_char *pkt_data;
// timevals used to calculate delay from last filtered packet
struct timeval oldTimevalUpload;
struct timeval oldTimevalDownload;      
memset(&oldTimevalUpload, 0, sizeof(oldTimevalUpload));
memset(&oldTimevalDownload, 0, sizeof(oldTimevalDownload));

// stopLogging is a boolean flag declared in the connection "parent" thread:
// it is set to true when the connection thread has finished sending and
// receiving data and the connection is about to be closed
while (((res = pcap_next_ex(handle, &header, &pkt_data)) >= 0) && (!stopLogging)) {
    // check res
    if (packet is upload) {
        struct timeval difference;
        timeval_subtract(&difference, &(header->ts), &oldTimevalUpload);
        long long delay = (difference.tv_sec * 1000000) + difference.tv_usec;
        long long acceptedPackets = ((long long)(pkt_data)) * 1000000;
        long long acceptedBits = ((long long)(pkt_data+8)) * 8 * 1000000;
        long long pps = acceptedPackets / delay;
        long long bps = acceptedBits / delay;
        debugRed(host << ", UPLOAD DIVIDE ACCEPTED PKTS " << acceptedPackets << 
            " AND ACCEPTED BITS " << acceptedBits << " PER DELAY " << delay << 
            " IS PPS " << pps << " AND BPS " << bps);
        oldTimevalUpload.tv_sec = header->ts.tv_sec;
        oldTimevalUpload.tv_usec = header->ts.tv_usec;
    } else if (packet is download) {
        // basically the same as above
    }
}
debug("Quit logging connection " << localIPaddr << ":" << localPort << " and "
    << remoteIPaddr << ":" << remotePort);
pcap_close(handle);

Here is a sample output:

www.netflix.com UPLOAD, DIVIDE ACCEPTED PKTS 99239440000000 AND ACCEPTED BITS 793915584000000 PER DELAY 1479811349890053 IS PPS 0 AND BPS 0
www.netflix.com DOWNLOAD, DIVIDE ACCEPTED PKTS 99239440000000 AND ACCEPTED BITS 793915584000000 PER DELAY 1479811350032141 IS PPS 0 AND BPS 0
www.netflix.com DOWNLOAD, DIVIDE ACCEPTED PKTS 99239440000000 AND ACCEPTED BITS 793915584000000 PER DELAY 4344 IS PPS 22845174953 AND BPS 182761414364
www.netflix.com UPLOAD, DIVIDE ACCEPTED PKTS 99239440000000 AND ACCEPTED BITS 793915584000000 PER DELAY 146464 IS PPS 677568822 AND BPS 5420551015
www.netflix.com DOWNLOAD, DIVIDE ACCEPTED PKTS 99239440000000 AND ACCEPTED BITS 793915584000000 PER DELAY 2815 IS PPS 35253797513 AND BPS 282030402841
www.netflix.com UPLOAD, DIVIDE ACCEPTED PKTS 99239440000000 AND ACCEPTED BITS 793915584000000 PER DELAY 2808 IS PPS 35341680911 AND BPS 282733470085
www.netflix.com DOWNLOAD, DIVIDE ACCEPTED PKTS 99239440000000 AND ACCEPTED BITS 793915584000000 PER DELAY 1120 IS PPS 88606642857 AND BPS 708853200000
www.netflix.com UPLOAD, DIVIDE ACCEPTED PKTS 99239440000000 AND ACCEPTED BITS 793915584000000 PER DELAY 1134 IS PPS 87512733686 AND BPS 700101925925
www.netflix.com UPLOAD, DIVIDE ACCEPTED PKTS 99239440000000 AND ACCEPTED BITS 793915584000000 PER DELAY 39658 IS PPS 2502381360 AND BPS 20019052498
www.netflix.com DOWNLOAD, DIVIDE ACCEPTED PKTS 99239440000000 AND ACCEPTED BITS 793915584000000 PER DELAY 176317 IS PPS 562846690 AND BPS 4502773890
www.netflix.com UPLOAD, DIVIDE ACCEPTED PKTS 99239440000000 AND ACCEPTED BITS 793915584000000 PER DELAY 136687 IS PPS 726034224 AND BPS 5808274261

I never knew my home network could sustain more than 500 MBps, so something must be wrong.

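This page shows how to compute bps and pps and explains the 8-char offset used for acceptedBits, but I will report it here anyway. Here you can see the second and third arguments of the pcap_next_ex function:

I basically did everything he says! Why are my bps and pps so large and so strange?

I am working on Ubuntu 14.04. I do not know how to check the libpcap version, but locate libpcap gives this: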

/home/dexter/Desktop/wireshark-1.99.9/wiretap/libpcap.c
/home/dexter/Desktop/wireshark-1.99.9/wiretap/libpcap.h
/usr/lib/x86_64-linux-gnu/libpcap.a
/usr/lib/x86_64-linux-gnu/libpcap.so
/usr/lib/x86_64-linux-gnu/libpcap.so.0.8
/usr/lib/x86_64-linux-gnu/libpcap.so.1.5.3
/usr/share/doc/libpcap-dev
/usr/share/doc/libpcap0.8
/usr/share/doc/libpcap0.8-dev
/usr/share/doc/libpcap-dev/changelog.Debian.gz
/usr/share/doc/libpcap-dev/copyright
/usr/share/doc/libpcap0.8/CREDITS.gz
/usr/share/doc/libpcap0.8/README.Debian
/usr/share/doc/libpcap0.8/README.gz
/usr/share/doc/libpcap0.8/changelog.Debian.gz
/usr/share/doc/libpcap0.8/copyright
/usr/share/doc/libpcap0.8-dev/changelog.Debian.gz
/usr/share/doc/libpcap0.8-dev/copyright
/var/lib/dpkg/info/libpcap-dev.[list,md5sums]
/var/lib/dpkg/info/libpcap0.8-dev.[list,md5sums,preinst] 
/var/lib/dpkg/info/libpcap0.8:amd64.[list,md5sums,postinst,postrm,shlibs,symbols]

In your code:

long long acceptedPackets = ((long long)(pkt_data)) * 1000000;
long long acceptedBits = ((long long)(pkt_data+8)) * 8 * 1000000;

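pkt_data is a pointer.

What you are doing is taking the address of the packet data (plus 8 for the second line), converting that address to long long, multiplying it by a constant and treating the result as your value, which is semantically incorrect. This is why ACCEPTED PKTS and ACCEPTED BITS are identical on every line of your output: they are derived from the buffer addresses 99239440 and 99239448, not from anything in the packet. You should dereference the pointer, taking your data type into account (cast pkt_data to a pointer to long long).

In code: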

long long acceptedPackets = (*(long long*)(pkt_data)) * 1000000;
long long acceptedBits = (*(long long*)(pkt_data+8)) * 8 * 1000000; 
// this also works:
//long long acceptedPackets = *(long long*)pkt_data * 1000000;
//long long acceptedBits = *((long long*)pkt_data + 1) * 8 * 1000000;

For an example, see http://ideone.com/JqmRre
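To make the difference between a pointer's address and the bytes it points to concrete, here is a small sketch. The helper names `addressOf` and `readU64` are hypothetical. It copies the bytes with memcpy instead of dereferencing a cast pointer, on the assumption that the buffer may not be suitably aligned for a 64-bit load.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// What a cast such as (long long)(pkt_data) yields: the numeric address of
// the buffer, which has nothing to do with the buffer's contents.
std::uintptr_t addressOf(const unsigned char* data) {
    return reinterpret_cast<std::uintptr_t>(data);
}

// Reads 8 bytes starting at the given offset, in host byte order.
// memcpy sidesteps the alignment and aliasing concerns that a direct
// *(long long*)(data + offset) may raise.
std::uint64_t readU64(const unsigned char* data, std::size_t offset) {
    std::uint64_t value = 0;
    std::memcpy(&value, data + offset, sizeof(value));
    return value;
}
```

With a 16-byte buffer holding two 64-bit values, readU64(buf, 0) and readU64(buf, 8) return those values, while addressOf(buf + 8) is simply addressOf(buf) plus 8.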

Edit

From this guide:

The last argument is the most interesting of them all, and the most confusing to the average novice pcap programmer. It is another pointer to a u_char, and it points to the first byte of a chunk of data containing the entire packet

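This means that pkt_data is the packet content itself. Unless the first 16 bytes of your packets contain the information you need (they do not: the buffer holds the raw packet, so it starts with the ETH, IP and TCP/UDP headers), you cannot use that data. To get the PPS metric you have to implement a simple counter in the loop. Since you print the metric on every frame, a simple long long pps = (long long)(1.0 / delay); is enough, with delay expressed in seconds; note that the division is done in floating point. For the BPS metric you should use the frame header information, so long long bps = (long long)(header->caplen * 8.0 / delay); should do it.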
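The per-packet arithmetic can be sketched as follows, assuming the delay is taken from the capture timestamps (the ts field of pcap_pkthdr, a struct timeval) and converted to seconds. The `Rates` struct and `instantRates` function are hypothetical names introduced for illustration. The caller chooses which length to pass: header->caplen is the number of bytes actually captured, while header->len is the length on the wire, which may be the better choice if the snapshot length truncates packets.

```cpp
#include <cstdint>
#include <sys/time.h>  // struct timeval (POSIX), the type of pcap_pkthdr::ts

struct Rates {
    double pps;  // packets per second
    double bps;  // bits per second
};

// Hypothetical helper: instantaneous rates for a single packet, based on the
// time elapsed since the previous packet in the same direction.
Rates instantRates(const timeval& prev, const timeval& cur, std::uint32_t packetBytes) {
    // Elapsed time in microseconds between the two timestamps.
    const std::int64_t delayUs =
        (static_cast<std::int64_t>(cur.tv_sec) - prev.tv_sec) * 1000000LL +
        (static_cast<std::int64_t>(cur.tv_usec) - prev.tv_usec);
    // No usable interval (identical or out-of-order timestamps): report zero
    // rather than dividing by zero.
    if (delayUs <= 0) return {0.0, 0.0};
    const double delaySec = delayUs / 1e6;
    return {1.0 / delaySec, packetBytes * 8.0 / delaySec};
}
```

For the very first packet of a connection there is no previous timestamp, so it is probably best to record its timestamp and skip the calculation. Starting from a zeroed timeval makes the first delay equal to the time since the Unix epoch.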

As a side note, for the time measurements, since you are using C++11, try chrono. It is clearer and safer than timeval.

Add an #include <chrono>.

Your final code should look like this:

// pcap variables
pcap_t *handle;
struct pcap_pkthdr *header;
const u_char *pkt_data;
int res;
// time points of the last filtered packet in each direction
std::chrono::high_resolution_clock::time_point oldTimeUpload = std::chrono::high_resolution_clock::now();
std::chrono::high_resolution_clock::time_point oldTimeDownload = std::chrono::high_resolution_clock::now();

// stopLogging is a boolean flag declared in the connection "parent" thread:
// it is set to true when the connection thread has finished sending and
// receiving data and the connection is about to be closed
while (((res = pcap_next_ex(handle, &header, &pkt_data)) >= 0) && (!stopLogging)) {
    // check res: 0 means the read timeout expired and no packet was returned
    if (packet is upload) {
        std::chrono::high_resolution_clock::time_point now = std::chrono::high_resolution_clock::now();
        // delay since the previous upload packet, in nanoseconds
        long long delay = std::chrono::duration_cast<std::chrono::nanoseconds>(now - oldTimeUpload).count();
        long long pps = (long long)(1000000000.0 / delay);
        long long bps = (long long)(header->caplen * 8 * 1000000000.0 / delay);
        debugRed(host << ", UPLOAD DIVIDE PER DELAY " << delay << 
            " IS PPS " << pps << " AND BPS " << bps);
        // remember this packet's time for the next iteration
        oldTimeUpload = now;
    } else if (packet is download) {
        // basically the same as above, using oldTimeDownload
    }
}
debug("Quit logging connection " << localIPaddr << ":" << localPort << " and "
    << remoteIPaddr << ":" << remotePort);
pcap_close(handle);
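Rates computed from the gap between two consecutive packets tend to swing widely from one packet to the next. A different technique from the per-packet calculation above is to count packets and bytes over a fixed window and report the average when the window closes. Here is a minimal sketch of that idea; the `RateWindow` class is hypothetical, and it assumes the caller supplies packet timestamps in microseconds and a one-second default window.

```cpp
#include <cstdint>

// Hypothetical accumulator: averages pps and bps over a fixed time window
// instead of computing them from a single inter-packet gap.
class RateWindow {
public:
    explicit RateWindow(std::int64_t windowUs = 1000000) : windowUs_(windowUs) {}

    // Feeds one packet. Returns true when a window has just closed, in which
    // case pps() and bps() hold the averages for that window. The packet that
    // closes a window is counted in the next one.
    bool add(std::int64_t timestampUs, std::uint32_t packetBytes) {
        if (!started_) {
            windowStartUs_ = timestampUs;
            started_ = true;
        }
        bool closed = false;
        const std::int64_t elapsedUs = timestampUs - windowStartUs_;
        if (elapsedUs >= windowUs_) {
            const double seconds = elapsedUs / 1e6;
            pps_ = packets_ / seconds;
            bps_ = bytes_ * 8.0 / seconds;
            packets_ = 0;
            bytes_ = 0;
            windowStartUs_ = timestampUs;
            closed = true;
        }
        ++packets_;
        bytes_ += packetBytes;
        return closed;
    }

    double pps() const { return pps_; }
    double bps() const { return bps_; }

private:
    std::int64_t windowUs_;
    std::int64_t windowStartUs_ = 0;
    std::uint64_t packets_ = 0;
    std::uint64_t bytes_ = 0;
    double pps_ = 0.0;
    double bps_ = 0.0;
    bool started_ = false;
};
```

One instance per direction (upload and download) would be fed from the capture loop, with output printed only when add returns true.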

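How about computing pps and bps in another process rather than in another thread? I can recommend the HttpAnalyzer utility, which captures HTTP packets directly from a network interface and computes pps, bps and many more statistics.

Since it is open source, you can change the code to suit your purposes or use it as is.

Here is a sample output of the utility: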

STATS SUMMARY
=============

General stats
--------------------

Sample time:                                     18.374 [Seconds]
Number of HTTP packets:                            5662 [Packets]
Rate of HTTP packets:                           291.910 [Packets/sec]
Number of HTTP flows:                                55 [Flows]
Rate of HTTP flows:                               2.836 [Flows/sec]
Number of HTTP pipelining flows:                      0 [Flows]
Number of HTTP transactions:                        322 [Transactions]
Rate of HTTP transactions:                       16.601 [Transactions/sec]
Total HTTP data:                                5916120 [Bytes]
Rate of HTTP data:                           305011.600 [Bytes/sec]
Average packets per flow:                       102.945 [Packets]
Average transactions per flow:                    5.963 [Transactions]
Average data per flow:                       107565.818 [Bytes]

HTTP request stats
--------------------

Number of HTTP requests:                            323 [Requests]
Rate of HTTP requests:                           16.653 [Requests/sec]
Total data in headers:                           188596 [Bytes]
Average header size:                            583.889 [Bytes]

HTTP response stats
--------------------

Number of HTTP responses:                           332 [Responses]
Rate of HTTP responses:                          17.117 [Responses/sec]
Total data in headers:                           119577 [Bytes]
Average header size:                            360.172 [Bytes]
Num of responses with content-length:               320 [Responses]
Total body size (may be compressed):            5409410 [Bytes]
Average body size:                            16904.406 [Bytes]

HTTP request methods
--------------------

| Method    | Count |
---------------------
| GET       | 321   |
| POST      | 2     |
---------------------

Hostnames count
--------------------

| Hostname                                 | Count |
----------------------------------------------------
| images1.teny.co.qq                       | 180   |
| www.teny.co.qq                           | 82    |
| go.teny.co.qq                            | 14    |
| www.niwwin.co.qq                         | 8     |
| az835984.vo.msecnd.net                   | 5     |
| asset.pagefair.com                       | 3     |
| b.scorecardresearch.com                  | 3     |
| cdn.oolala.com                           | 3     |
| asset.pagefair.net                       | 2     |
| dy2.teny.co.qq                           | 2     |
| ecdn.firstimpression.io                  | 2     |
| pagead2.googlesyndication.com            | 2     |
| server.exposebox.com                     | 2     |
| totalmedia2.teny.co.qq                   | 2     |
| vrp.mybrain.com                          | 1     |
| trc.oolala.com                           | 1     |
| zdwidget3-bs.sphereup.com                | 1     |
| vrt.mybrain.com                          | 1     |
| www.googletagmanager.com                 | 1     |
| a.visualrevenue.com                      | 1     |
| tpc.googlesyndication.com                | 1     |
| static.dynamicyield.com                  | 1     |
| st.dynamicyield.com                      | 1     |
| sf.exposebox.com                         | 1     |
| mediadownload.teny.co.qq                 | 1     |
| cdn.firstimpression.io                   | 1     |
| ajax.googleapis.com                      | 1     |
----------------------------------------------------

Status code count
--------------------

| Status Code                  | Count |
----------------------------------------
| 200 OK                       | 327   |
| 204 No Content               | 1     |
| 301 Moved Permanently        | 1     |
| 302 Moved Temporarily        | 1     |
| 304 Not Modified             | 2     |
----------------------------------------

Content-type count
--------------------

| Content-type                   | Count |
------------------------------------------
| application/javascript         | 11    |
| application/json               | 1     |
| application/x-javascript       | 23    |
| image/gif                      | 22    |
| image/jpeg                     | 157   |
| image/png                      | 85    |
| text/css                       | 9     |
| text/html                      | 8     |
| text/javascript                | 13    |
------------------------------------------