Getting and posting an HTML form over HTTPS using RCurl

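I am trying to fetch a login form from an HTTPS site so that I can post it.

The form takes a username (RamiLevi) and no password. Once I have the form, I can post it with rvest and get the HTML I actually need.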

GET("https://url.retail.publishedprices.co.il/login")
Error in function (type, msg, asError = TRUE)  : 
SSL certificate problem: unable to get local issuer certificate

Trying with ssl.verifypeer = FALSE:

GET("https://url.retail.publishedprices.co.il/login",.opts = list(ssl.verifypeer = FALSE))
Error in function (type, msg, asError = TRUE)  : 
SSL certificate problem: unable to get local issuer certificate

Trying with the required form username (RamiLevi) and no password:

GET("https://url.retail.publishedprices.co.il/login",
    .opts = list(ssl.verifypeer = FALSE, userpwd = "RamiLevi:"))
Error in function (type, msg, asError = TRUE)  : 
SSL certificate problem: unable to get local issuer certificate

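Usually, the simplest thing to do is switch from RCurl to httr::GET(), which handles SSL automatically. Alternatively, you can point the RCurl call at a cacert.pem file (for example, with the argument cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")). For some reason, that does not work here.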
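For reference, the two options mentioned in this answer could be written roughly as follows. This is only a sketch: the function names fetch_with_httr and fetch_with_ca_bundle are made up for illustration, nothing is called at the top level, and the cainfo variant is the one reported not to work for this particular site.

```r
# Sketch only: each function needs the named package and network access.

# Option 1: httr::GET(), which the answer says handles SSL automatically.
fetch_with_httr <- function(url) {
  resp <- httr::GET(url)
  httr::stop_for_status(resp)  # turn HTTP errors into R errors
  httr::content(resp, as = "text", encoding = "UTF-8")
}

# Option 2: RCurl with the CA bundle that ships inside the RCurl package.
fetch_with_ca_bundle <- function(url,
                                 ca_bundle = system.file("CurlSSL", "cacert.pem",
                                                         package = "RCurl")) {
  RCurl::getURL(url, cainfo = ca_bundle)
}
```

Either function would be called with the login URL, e.g. fetch_with_httr("https://url.retail.publishedprices.co.il/login").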

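One thing you can do is switch off SSL certificate verification. That works for you, but for other (security) reasons it is probably not a good idea: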

RCurl::getURL("https://url.retail.publishedprices.co.il/login", ssl.verifyhost = 0L, ssl.verifypeer = 0L)
[1] "<!DOCTYPE html>\r\n<html lang=\"en\">\r\n\t<head>\r\n\t\t<meta charset=\"utf-8\" />\r\n\t\t<meta http-equiv=\"X-UA-Compatible\" content=\"IE=edge\" />\r\n\t\t<meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0, maximum-scale=1.0\" />\r\n        <meta name=\"mobile-web-app-capable\" content=\"yes\">\r\n\t\t<title>Cerberus Web Client</title>\r\n\t\t\t\t\r\n\t\t<link rel=\"shortcut icon\" href=\"/favicon.ico\" />\r\n\t\t<link rel=\"icon\" sizes=\"196x196\" href=\"/images/android-icon-196x196.png\" />\r\n        <link rel=\"apple-touch-icon\" sizes=\"114x114\" href=\"/images/apple-icon-114x114.png\" />\r\n        <link rel=\"apple-touch-icon\" sizes=\"144x144\" href=\"/images/apple-icon-144x144.png\" /> \r\n               \r\n\t\t<!-- Bootstrap core CSS -->\r\n\t\t<link rel=\"stylesheet\" href=\"/css/bootstrap.min.css\" />\r\n\t\t<link rel=\"stylesheet\" href=\"/css/bootstrap-datetimepicker.min.css\" />\r\n\t\t\r\n\t\t<link rel=\"stylesheet\" href=\"/custom/css/default-theme.css\" />\r\n        \r\n\t\t<link rel=\"stylesheet\" href=\"/css/common-3.0.css\" />\r\n\t\t\r\n\t\t\r\n\t\t<link rel=\"stylesheet\" href=\"/css/login-3.0.css\" />\r\n\r\n\t\t<!--[if gte IE 9]>\r\n\t\t\t<style type=\"text/css\">\r\n\t\t\t\t.gradient { filter: none; }\r\n\t\t\t</style>\r\n\t\t<![endif]-->\r\n\r\n\t\t<!--[if lte IE 9]>\r\n\t\t\t<link rel=\"stylesheet\" href=\"/css/common-3.0-ie8.css\" />\r\n\t\t<![endif]-->\t\r\n\r\n\t\t<!-- HTML5 shim and Respond.js IE8 support of HTML5 elements and media queries -->\r\n\t\t<!--[if lt IE 9]>\r\n\t\t\t<script src=\"/js/html5shiv.min.js\"></script>\r\n\t\t\t<script src=\"/js/respond.min.js\"></script>\r\n\t\t<![endif]-->\t\r\n\t\r\n\t\t<script src=\"/js/jquery-1.11.2.min.js\"></script> \r\n\t\t<script src=\"/js/bootstrap.min.js\"></script> \r\n\t\t<script src=\"/js/functions-3.0.js\"></script>\r\n\t\t\r\n\t\t\r\n\t\t\r\n\t\t<!-- Javascript language file -->\r\n\t\t<script src=\"/js/lang-1/en-us.js\"></script> 
\r\n\t\t\r\n\t</head> \r\n\t<body>\r\n   \r\n    <!-- Wrap all page content here -->\r\n\t<div id=\"wrap\">\r\n    \r\n\t\t<!-- Fixed navbar -->\r\n\t\t<div class=\"navbar navbar-default navbar-static-top\">\r\n            <div class=\"container-fluid\">\r\n              <div class=\"navbar-header\">\r\n                <button type=\"button\" class=\"navbar-toggle\" data-toggle=\"collapse\" data-target=\".navbar-collapse\">\r\n\t\t\t\t\t<span class=\"sr-only\">Toggle navigation</span>\r\n\t\t\t\t\t<span class=\"icon-bar\"></span>\r\n\t\t\t\t\t<span class=\"icon-bar\"></span>\r\n\t\t\t\t\t<span class=\"icon-bar\"></span>\r\n                </button>\r\n                <a class=\"navbar-brand\" href=\"/\"><img class=\"hidden-xs\" src=\"/images/logo@2x.png\" width=\"230\" height=\"70\" alt=\"logo\"><span class=\"visible-xs-inline\">Cerberus</span></a>\r\n              </div>\r\n              <div class=\"collapse navbar-collapse\">\r\n\t\t\t\t<div class=\"navbar-nav navbar-right\">\r\n                    <div class=\"navbar-text navbar-right\" style=\"margin: 10px 10px 10px 5px\">\r\n                            <small>Not currently logged in\r\n                            </small>\r\n                    </div>\r\n                    <div class=\"clearfix\"></div>\r\n                    <ul class=\"nav nav-tabs navbar-right\" style=\"border-bottom: none\">\r\n                        <li class=\"nav_home_item active\"><a href=\"/\">Home</a></li>\r\n                        <li class=\"\"><a href=\"/account?action=display\">Account</a></li>\r\n                        \r\n                    </ul>\r\n                    <div class=\"clearfix\"></div>\r\n                </div>\r\n\t\t\t\t<div class=\"clearfix\"></div>\r\n              </div><!--/.nav-collapse -->\r\n            </div>\r\n\t\t</div>\r\n\r\n        \r\n        <div class=\"page-header\">\r\n            <div class=\"container-fluid\">\r\n                <h1 class=\"page-title\"><span class=\"glyphicon 
glyphicon-log-in\"></span>&nbsp;&nbsp;Sign in</h1>\r\n            </div>\r\n        </div>\r\n        \r\n\t\t<!-- Begin page content -->\r\n\t\t<div class=\"container-fluid\">\r\n            <div id=\"status-container\">\r\n                \r\n                \r\n                \r\n                \r\n            </div>\r\n\t\r\n\t\r\n\t\r\n\t\r\n\t\t<div class=\"row\">\r\n\t\r\n\t        <div class=\"col-lg-offset-2 col-lg-4 col-md-offset-1 col-md-5 col-sm-6\">\r\n\t            <div id=\"welcome\" class=\"panel panel-default\">\r\n\t                <div class=\"panel-body\">\r\n\t\t\t\t\t\t<div><p>Welcome to Public Published Prices Server <br />            Created by NCR L.T.D <br /> <br /> <br />** The site is open! Have a good day. </p></div>\r\n\t        \t\t</div>\r\n\t            </div>\r\n\t        </div>\r\n\t\t\t<div class=\"col-lg-4 col-md-5 col-sm-6\">\r\n\t\r\n\t\r\n\t\t\t\t<div id=\"login\" class=\"panel panel-default\">\r\n\t\t\t\t\t<div class=\"panel-body\">\r\n\t\t\t\t\t\t<form id=\"login-form\" action=\"/login/user\" method=\"post\">\r\n\t                    \t<input type=\"hidden\" name=\"csrftoken\" id=\"csrftoken\" value=\"w5hEwotawr5KwplPasK1wq3DnMO4w6gvNT/CqCwq\" />\r\n\t                        <div class=\"pull-left\"><img src=\"/images/login-icon.png\" width=\"70\" height=\"70\" alt=\"Login icon\" /></div>\r\n\t                        <div class=\"pull-left login-text\">Client Login</div>\r\n\t                        <div class=\"clearfix\"></div>\r\n\t                        <div class=\"form-group\" style=\"margin-top: 1.5em\">\r\n\t                            <label class=\"control-label\" for=\"username\">Username:</label>\r\n\t                            <input type=\"text\" name=\"username\" id=\"username\" value=\"\" class=\"form-control\" maxlength=\"255\" required />\r\n\t                        </div>\r\n\t                        <div class=\"form-group\">\r\n\t                            <label class=\"control-label\" 
for=\"password\">Password:</label>\r\n\t                            <input type=\"password\" name=\"password\" id=\"password\" value=\"\" autocomplete=\"off\" class=\"form-control\" maxlength=\"255\" />\r\n\t                        </div>\r\n\t\r\n\t                        <div class=\"login-spacer\">\r\n\t                        </div>\r\n\t                        <div class=\"row\">\r\n\t                            <div class=\"col-sm-6\">\r\n\t                                <div id=\"reqacct-link\">\r\n\t\r\n\t                            \t\t<small><a href=\"/request/\" class=\"tip\" data-placement=\"bottom\" data-content=\"You can use this link to submit a new account request to this server's administrator\" title=\"New Account Request Form\">Request an Account</a></small>\r\n\t\r\n\t                                </div>\r\n\t\t\t\t\t\t\t\t</div>\r\n\t\r\n\t                            <div class=\"col-sm-6\">\r\n\t                                <button class=\"btn btn-primary btn-block\" type=\"submit\" id=\"login-button\" name=\"Submit\" data-loading-text=\"Please wait\" value=\"Sign in\">Sign in</button>\r\n\t                            </div>\r\n\t                        </div>\r\n\t                    </form>\r\n\t                </div>\r\n\t            </div>\r\n\t        </div>\r\n\t\t</div>\r\n\t\r\n\t\r\n\t\r\n\t\t<script type=\"text/javascript\">\r\n\t\t//<![CDATA[\r\n\t\t\t$(document).ready(function() {\r\n\t\r\n\t\t\t\t$(\"#username\").focus();\r\n\t\r\n\t\t\t\t$(\"#login-button\").button(cftp_msg.login_btn);\r\n\t\t\t\t$(\"#login-button\").attr(\"data-loading-text\", cftp_msg.please_wait);\r\n\t\t\t\t\r\n\t\t\t\t$(\"#login-form\").submit(function() {\r\n\t\t\t \r\n\t\t\t\t\t$(\"#login-button\").button(\"loading\");\r\n\t\t\t\t});\t\r\n\t\r\n\t\t\t});\r\n\t\r\n\t\t//]]>\r\n\t\t</script>\t\r\n\r\n\t\t</div>\r\n\t</div>\r\n\r\n\t<div id=\"footer\">\r\n\t\t<div class=\"container-fluid\">\r\n\t\t\t<p class=\"text-muted credit\">\r\n\t\t\t\t<a 
href=\"/\">Home</a>\r\n\t\t\t\t&nbsp;<span class=\"sep_footer_nav\">|</span><a href=\"/account?action=display\">Account</a>\r\n\t\t\t\t\r\n                &nbsp;<span class=\"sep_footer_nav\">|</span><a href=\"#\" id=\"_context_help_\"><span class=\"glyphicon glyphicon-question-sign\"></span> Help</a>\r\n\t\t\t</p>\r\n\t\t</div>\r\n\t</div>\r\n\r\n\t</body>\r\n</html>"
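The returned page contains what the POST needs: the form's action (/login/user), a hidden csrftoken field, and the username and password fields. As a minimal sketch of pulling those out with base R, the snippet below works on a trimmed copy of the form from the output above. The helpers input_value, form_action and post_login are hypothetical names, the regular expressions assume the attribute order seen in this page, and post_login uses RCurl::postForm rather than rvest. It also assumes the server expects the cookies from the GET to accompany the POST, which the output above does not confirm.

```r
# A trimmed copy of the <form> from the output above, standing in for the
# string returned by RCurl::getURL().
login_html <- paste0(
  '<form id="login-form" action="/login/user" method="post">',
  '<input type="hidden" name="csrftoken" id="csrftoken" ',
  'value="w5hEwotawr5KwplPasK1wq3DnMO4w6gvNT/CqCwq" />',
  '<input type="text" name="username" id="username" value="" />',
  '<input type="password" name="password" id="password" value="" />',
  '</form>'
)

# Hypothetical helper: value attribute of the <input> with the given name.
# Assumes name= comes before value= inside the tag, as it does in this page.
input_value <- function(html, name) {
  pattern <- sprintf('<input[^>]*name="%s"[^>]*value="([^"]*)"', name)
  hit <- regmatches(html, regexec(pattern, html))[[1]]
  if (length(hit) < 2) NA_character_ else hit[2]
}

# Hypothetical helper: action attribute of the <form> with the given id.
# Assumes id= comes before action= inside the tag.
form_action <- function(html, id) {
  pattern <- sprintf('<form[^>]*id="%s"[^>]*action="([^"]*)"', id)
  hit <- regmatches(html, regexec(pattern, html))[[1]]
  if (length(hit) < 2) NA_character_ else hit[2]
}

token  <- input_value(login_html, "csrftoken")
action <- form_action(login_html, "login-form")

# Sketch of the follow-up POST (not run here; needs RCurl and network access).
# One curl handle is reused so that cookies set by the GET are sent with the
# POST; whether the server requires that is an assumption.
post_login <- function(base_url = "https://url.retail.publishedprices.co.il") {
  handle <- RCurl::getCurlHandle(cookiefile = "", followlocation = TRUE,
                                 ssl.verifyhost = 0L, ssl.verifypeer = 0L)
  page <- RCurl::getURL(paste0(base_url, "/login"), curl = handle)
  RCurl::postForm(
    paste0(base_url, form_action(page, "login-form")),
    .params = list(csrftoken = input_value(page, "csrftoken"),
                   username  = "RamiLevi",
                   password  = ""),
    curl = handle, style = "POST"
  )
}
```

The token is read from a freshly fetched page inside post_login because the value shown in the output above is presumably issued per request and should not be hard-coded. For anything beyond this one form, an HTML parser is a safer choice than regular expressions.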