How to send crawler data to PHP via command line?
Instead of storing the results in a JSON file, can I send them to PHP?
I have these two files:
settings.json
{
"outputFile" : "C:\\wamp\\www\\drestip\\admin\\crawls\\mimshoes.json",
"logFile" : "C:\\wamp\\www\\drestip\\admin\\crawls\\mimshoes.tsv",
"pause" : 1,
"local" : false,
"connections" : 3,
"cookiesEnabled" : false,
"robotsDisabled" : false,
"advancedMode" : true,
"crawlTemplate" : [ "www.mimshoes.com/" ],
"startUrls" : [ PAGES ],
"maxDepth" : 10,
"dataTemplate" : [ "www.mimshoes.com/{alpha}-{alpha}_{alpha}-{alpha}$" ],
"destination" : "JSON",
"connectorGuid" : "xxxxxxxxxxxxxxxxxxxxxxxx",
"canonicalDisabled" : false
}
user.json
{
"userGuid": "xxxxxxxxxxxxxxxxxxxx",
"apiKey": "xxxxxxxxxxxxxxx"
}
Command line:
C:\Users\creatingweb03\AppData\Roaming\import.io\import.ioc.exe -crawl settings.json user.json
If you use the "target" parameter in settings.json, the results are sent by HTTP POST directly to an API endpoint instead.
// The url that crawled data will be HTTP POSTed to
"target" : "http://localhost:9200/index/datatype",
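To see what actually arrives at the "target" URL before writing the PHP side, a small throwaway receiver can help. The sketch below is in Python and is only an illustration: it assumes the crawler POSTs a UTF-8 JSON body (the source does not confirm the payload format), and the names parse_crawl_payload, CrawlReceiver, run and the port 8000 are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_crawl_payload(body: bytes):
    """Decode a POSTed body.

    Assumption (not confirmed by the source): the body is UTF-8 encoded JSON.
    Returns the decoded object, or None if the body cannot be decoded.
    """
    try:
        return json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None


class CrawlReceiver(BaseHTTPRequestHandler):
    """Hypothetical endpoint that prints whatever is POSTed to it."""

    def do_POST(self):
        # Read exactly as many bytes as the client announced.
        length = int(self.headers.get("Content-Length", 0))
        payload = parse_crawl_payload(self.rfile.read(length))
        if payload is None:
            # Body was not valid UTF-8 JSON.
            self.send_response(400)
            self.end_headers()
            return
        print(payload)  # replace with real storage or processing
        self.send_response(200)
        self.end_headers()


def run(port: int = 8000) -> None:
    """Serve on localhost until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("localhost", port), CrawlReceiver).serve_forever()
```

Calling run() and setting "target" to "http://localhost:8000/" would then print each POSTed result. In a PHP script the equivalent step is reading the raw request body with file_get_contents('php://input') and passing it to json_decode, again assuming the body is JSON.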
More information is available here: