tutorialspoint's simple web browser using TCPSocket
This code is supposed to fetch the content of any web page:
require 'socket'
host = 'www.tutorialspoint.com' # The web server
port = 80 # Default HTTP port
path = "/index.htm" # The file we want
# This is the HTTP request we send to fetch a file
request = "GET #{path} HTTP/1.0\r\n\r\n"
socket = TCPSocket.open(host,port) # Connect to server
socket.print(request) # Send request
response = socket.read # Read complete response
# Split response at first blank line into headers and body
headers,body = response.split("\r\n\r\n", 2)
puts headers
puts body
When I run it from the command line, I get a 404 error, but when I open www.tutorialspoint.com/index.htm in a browser, the page is there. What is the reason?
404 Error Information
I have no trouble fetching the page content with the open-uri library, but I would like to know how to make this approach work.
Your request is missing the Host header:
host = 'www.tutorialspoint.com' # The web server
port = 80 # Default HTTP port
path = "/index.htm" # The file we want
# This is the HTTP request we send to fetch a file
request = "GET #{path} HTTP/1.0\r\nHost: #{host}\r\n\r\n"
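Note that apparently not all web servers require the "Host:" line (but see the comments). A server that hosts several sites on one IP address uses this line to decide which site the request is for, which would explain the 404 here.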
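For reference, here is a minimal sketch of the whole script with the Host line included. The helper names (build_request, split_response, status_code, fetch) are my own and not part of the tutorial, and it assumes a plain HTTP server on port 80 that closes the connection after responding, as an HTTP/1.0 server normally does.

```ruby
require 'socket'

# Build a minimal HTTP/1.0 GET request that includes the Host header.
def build_request(host, path)
  "GET #{path} HTTP/1.0\r\nHost: #{host}\r\n\r\n"
end

# Split a raw HTTP response at the first blank line into [headers, body].
# Later blank lines stay inside the body because of the limit of 2.
def split_response(response)
  headers, body = response.split("\r\n\r\n", 2)
  [headers.to_s, body.to_s]
end

# Pull the numeric status code out of the status line,
# e.g. "HTTP/1.1 404 Not Found" gives 404. Returns 0 if there is none.
def status_code(headers)
  headers.lines.first.to_s.split(' ', 3)[1].to_i
end

# Hypothetical helper: connect, send the request, read until the server
# closes the connection, and return [headers, body].
def fetch(host, path, port = 80)
  socket = TCPSocket.open(host, port)
  socket.print(build_request(host, path))
  split_response(socket.read)
ensure
  socket&.close # socket is nil if the connection itself failed
end
```

Calling fetch('www.tutorialspoint.com', '/index.htm') and passing the returned headers to status_code lets you check the status before printing the body. What the server answers today may differ from what it did when this was asked, for example a redirect to HTTPS instead of a 200.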