A few related questions regarding traceroutes in C:

Question

According to Wikipedia, the traceroute program works as follows:

Traceroute, by default, sends a sequence of User Datagram Protocol (UDP) packets addressed to a destination host[...] The time-to-live (TTL) value, also known as hop limit, is used in determining the intermediate routers being traversed towards the destination. Routers decrement packets' TTL value by 1 when routing and discard packets whose TTL value has reached zero, returning the ICMP error message ICMP Time Exceeded.[..]

I started writing a program (using a sample UDP program as a guide) to follow this specification:

#include <sys/socket.h>
#include <assert.h>
#include <netinet/udp.h>     //Provides declarations for udp header
#include <netinet/ip.h>      //Provides declarations for ip header
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <stdlib.h>          //malloc, calloc, free, atoi

#define DATAGRAM_LEN (sizeof(struct iphdr) + sizeof(struct udphdr))

unsigned short csum(unsigned short *ptr,int nbytes) {
    register long sum;
    unsigned short oddbyte;
    register short answer;

    sum=0;
    while(nbytes>1) {
        sum+=*ptr++;
        nbytes-=2;
    }
    if(nbytes==1) {
        oddbyte=0;
        *((u_char*)&oddbyte)=*(u_char*)ptr;
        sum+=oddbyte;
    }

    sum = (sum>>16)+(sum & 0xffff);
    sum = sum + (sum>>16);
    answer=(short)~sum;

    return(answer);
}

char *new_packet(int ttl, struct sockaddr_in sin) {
    static int id = 0;
    char *datagram = calloc(1, DATAGRAM_LEN); //zeroed, so iph->check starts at 0
    struct iphdr *iph = (struct iphdr*) datagram;
    struct udphdr *udph = (struct udphdr*)(datagram + sizeof (struct iphdr));

    if (datagram == NULL)
        return NULL;

    iph->ihl = 5;
    iph->version = 4;
    iph->tos = 0;
    iph->tot_len = htons(DATAGRAM_LEN); //multi-byte header fields are big-endian
    iph->id = htons(++id); //Id of this packet (16-bit field, so htons, not htonl)
    iph->frag_off = 0;
    iph->ttl = ttl;
    iph->protocol = IPPROTO_UDP;
    iph->saddr = inet_addr("127.0.0.1");//Spoof the source ip address
    iph->daddr = sin.sin_addr.s_addr;
    iph->check = csum((unsigned short*)iph, sizeof(struct iphdr)); //covers the IP header only

    udph->source = htons(6666);
    udph->dest = htons(8622);
    udph->len = htons(sizeof(struct udphdr)); //udp header size, no payload
    udph->check = 0; //0 means "checksum not computed" for UDP over IPv4

    return datagram;
}

int main(int argc, char **argv) {
    int s, ttl, repeat;
    struct sockaddr_in sin;
    char *data;

    printf("\n");

    if (argc != 3) {
        printf("usage: %s <host> <port>", argv[0]);
        return __LINE__;
    }

    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = inet_addr(argv[1]);
    sin.sin_port = htons(atoi(argv[2]));

    if ((s = socket(AF_PACKET, SOCK_RAW, 0)) < 0) {
        printf("Failed to create socket.\n");
        return __LINE__;
    }

    ttl = 1, repeat = 0;
    while (ttl < 2) {
        data = new_packet(ttl, sin);
        if (write(s, data, DATAGRAM_LEN) != DATAGRAM_LEN) {
            printf("Socket failed to send packet.\n");
            return __LINE__;
        }
        read(s, data, DATAGRAM_LEN);
        free(data);
        if (++repeat > 2) {
            repeat = 0;
            ttl++;
        }
    }
    return 0;
}

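...but at this point I have a few questions.

Does read(s, data, ... read a whole packet at a time, or do I need to parse the data read from the socket, looking for markers specific to IP packets?
What is the best way to uniquely mark my packets as they return to my box as expired?
Should I set up a second socket with the IPPROTO_ICMP flag, or is it easier to write a filter that accepts everything?
Are there any other common mistakes, or any common obstacles I should expect?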

Answer 1

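A common pitfall is that programming at this level requires very careful use of the proper include files. For example, your program as-is won't compile on NetBSD, which is usually quite strict in following the relevant standards. Even when I add some includes, there is no struct iphdr, but there is a struct udpiphdr instead.

So for now the rest of my answer is not based on trying your program in practice.

read(2) can be used to read single packets, one at a time. For packet-oriented protocols such as UDP, you will never get more data from it than a single packet. However, you can just as well use recvfrom(2), recv(2) or recvmsg(2) to receive the packets.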

If fildes refers to a socket, read() shall be equivalent to recv() with no flags set.

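To identify the packets, I believe the id field is typically used, as you are already doing. I am not sure what you mean by "mark my packets as they return to my box as expired", since your packets do not return to you. What you may get back are ICMP Time Exceeded messages. These usually arrive within a few seconds, if they arrive at all. Sometimes they are not sent, and sometimes they are blocked by misconfigured routers between you and their sender. Note that this assumes the IP ID you set in your packet is respected by the network stack you are using. It is possible that it is not, and that the stack replaces your chosen ID with a different one. Van Jacobson, the original author of the traceroute command as found in NetBSD, therefore used a different method: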

 * The udp port usage may appear bizarre (well, ok, it is bizarre).
 * The problem is that an icmp message only contains 8 bytes of
 * data from the original datagram.  8 bytes is the size of a udp
 * header so, if we want to associate replies with the original
 * datagram, the necessary information must be encoded into the
 * udp header (the ip id could be used but there's no way to
 * interlock with the kernel's assignment of ip id's and, anyway,
 * it would have taken a lot more kernel hacking to allow this
 * code to set the ip id).  So, to allow two or more users to
 * use traceroute simultaneously, we use this task's pid as the
 * source port (the high bit is set to move the port number out
 * of the "likely" range).  To keep track of which probe is being
 * replied to (so times and/or hop counts don't get confused by a
 * reply that was delayed in transit), we increment the destination
 * port number before each probe.
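
The scheme in that comment can be sketched roughly as follows. This is an illustration only, not the NetBSD code: the function names are hypothetical, and the base port 33434 is an assumption here (it is the traditional traceroute default). Parsing is done by byte offset on the 8 UDP header bytes echoed back inside the ICMP message, so the sketch does not depend on any platform's struct udphdr layout.

```c
#include <stdint.h>

#define BASE_PORT 33434u   /* assumed starting destination port */

/* Source port derived from the process id, with the high bit set to
 * move it out of the "likely" range, as the comment describes. */
uint16_t probe_src_port(uint32_t pid)
{
    return (uint16_t)((pid & 0xffffu) | 0x8000u);
}

/* Destination port for the seq-th probe (seq counts from 0). */
uint16_t probe_dst_port(uint16_t base, unsigned seq)
{
    return (uint16_t)(base + seq);
}

/* Given the 8 bytes of UDP header quoted in an ICMP error, return the
 * probe's sequence number, or -1 if the header is not one of ours.
 * sent is the number of probes sent so far. */
int match_probe(const uint8_t udp_hdr[8], uint16_t my_src,
                uint16_t base, unsigned sent)
{
    uint16_t src = (uint16_t)((udp_hdr[0] << 8) | udp_hdr[1]);
    uint16_t dst = (uint16_t)((udp_hdr[2] << 8) | udp_hdr[3]);
    uint16_t seq = (uint16_t)(dst - base);

    if (src != my_src)      /* belongs to another traceroute process */
        return -1;
    if (seq >= sent)        /* a port we never probed */
        return -1;
    return seq;
}
```

With this, a late reply to probe 3 is attributed to probe 3 even if it arrives after probe 5 has gone out, which is exactly the confusion the comment wants to avoid.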

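Using an IPPROTO_ICMP socket to receive the replies is more efficient than trying to receive all packets. It also requires fewer privileges. Of course, sending raw packets normally already requires root, but it could make a difference if a more fine-grained permission system is in use.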
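
What comes back on such a socket still has to be picked apart. Below is a minimal sketch of that parsing, assuming the buffer starts with the outer IPv4 header (as it does for data read from an IPv4 raw socket on Linux) and that the probes were UDP. The struct icmp_reply and parse_icmp_reply names are hypothetical. It accepts ICMP type 11 (Time Exceeded, sent by intermediate routers) and type 3 (Destination Unreachable, which the final host typically sends for a closed UDP port), and works by byte offset to avoid the header portability problem mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

struct icmp_reply {            /* hypothetical result type */
    uint8_t  type, code;       /* ICMP type and code */
    uint16_t ip_id;            /* IP id of the quoted (original) packet */
    uint16_t src_port, dst_port; /* UDP ports of the quoted packet */
};

static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 and fills *out if buf holds an ICMP error quoting a UDP
 * datagram, -1 otherwise. */
int parse_icmp_reply(const uint8_t *buf, size_t len, struct icmp_reply *out)
{
    if (len < 20 || (buf[0] >> 4) != 4 || buf[9] != 1)
        return -1;                          /* not IPv4 carrying ICMP */
    size_t ihl = (size_t)(buf[0] & 0x0f) * 4;
    if (ihl < 20 || len < ihl + 8)
        return -1;                          /* no room for ICMP header */

    const uint8_t *icmp = buf + ihl;
    if (icmp[0] != 11 && icmp[0] != 3)
        return -1;                          /* not an error we care about */

    const uint8_t *inner = icmp + 8;        /* quoted original IP header */
    size_t left = len - ihl - 8;
    if (left < 20 || (inner[0] >> 4) != 4)
        return -1;
    size_t inner_ihl = (size_t)(inner[0] & 0x0f) * 4;
    if (inner_ihl < 20 || left < inner_ihl + 8 || inner[9] != 17)
        return -1;                          /* quoted packet is not UDP */

    const uint8_t *udp = inner + inner_ihl; /* first 8 bytes: UDP header */
    out->type = icmp[0];
    out->code = icmp[1];
    out->ip_id = be16(inner + 4);
    out->src_port = be16(udp);
    out->dst_port = be16(udp + 2);
    return 0;
}
```

Only the quoted IP header plus the first 8 bytes after it are guaranteed to be present, so the sketch reads nothing beyond the UDP header.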

Answer 2

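Here are some of my suggestions (based on the assumption that it is a Linux machine).

Reading packets: you might want to read whole 1500-byte packets (an entire Ethernet frame). Don't worry, smaller frames are still read completely, with read returning the length of the data read.
The best way to add a marker is to carry some UDP payload; a simple unsigned int should be good enough. Increase it on every packet sent. (I just ran tcpdump on a traceroute: the ICMP error does return an entire IP frame, so you can look at the returned IP frame, parse the UDP payload, and so on. Note that your DATAGRAM_LEN would change accordingly.) Of course you can use the ID, but be aware that the ID is mainly used for fragmentation. You should be okay with that, because with these packet sizes you will not approach the fragmentation limit of any intermediate router. Generally, it is not a good idea to 'steal' protocol fields that are meant for something else for our own custom purpose.
A cleaner way could be to actually use IPPROTO_ICMP on raw sockets (see man 7 raw and man 7 icmp, if the manual pages are installed on your machine). You do not want to receive a copy of all packets on your device and ignore those that are not ICMP.
If you are using type SOCK_RAW on AF_PACKET, you will have to attach a link-layer header manually, or you can use SOCK_DGRAM and check. man 7 packet also has lots of details.

Hope that helps, or are you looking for some actual code?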
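
The payload marker suggested above could look roughly like this. It is a sketch under stated assumptions: the names build_udp_probe and read_probe_seq are hypothetical, the marker is a 32-bit sequence number in network byte order placed right after the UDP header, and the UDP checksum is left at 0, which is permitted over IPv4. One caveat: an ICMP error is only guaranteed to quote the first 8 bytes after the IP header, so the reader below reports failure when the payload was not echoed back.

```c
#include <stddef.h>
#include <stdint.h>

#define UDP_HDR_LEN 8
#define PROBE_LEN   (UDP_HDR_LEN + 4)   /* header + 32-bit sequence number */

static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

/* Writes a UDP header followed by seq into buf.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t build_udp_probe(uint8_t *buf, size_t cap,
                       uint16_t sport, uint16_t dport, uint32_t seq)
{
    if (cap < PROBE_LEN)
        return 0;
    put_be16(buf, sport);
    put_be16(buf + 2, dport);
    put_be16(buf + 4, PROBE_LEN);  /* UDP length covers header and payload */
    put_be16(buf + 6, 0);          /* checksum 0: not computed */
    buf[8]  = (uint8_t)(seq >> 24);
    buf[9]  = (uint8_t)(seq >> 16);
    buf[10] = (uint8_t)(seq >> 8);
    buf[11] = (uint8_t)seq;
    return PROBE_LEN;
}

/* Extracts seq from a quoted UDP datagram of len bytes.
 * Returns 0 on success, -1 if the payload was truncated away. */
int read_probe_seq(const uint8_t *udp, size_t len, uint32_t *seq)
{
    if (len < PROBE_LEN)
        return -1;
    *seq = ((uint32_t)udp[8] << 24) | ((uint32_t)udp[9] << 16) |
           ((uint32_t)udp[10] << 8) | (uint32_t)udp[11];
    return 0;
}
```

If this is combined with the question's program, DATAGRAM_LEN and the IP total length would have to grow by the 4 payload bytes, as noted above.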
