Using Python Requests With Siteminder
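I'm having trouble using requests to retrieve some data from a web page. The site uses Siteminder, and the initial form has only three fields, but when I submit it, my password is changed to hex and other fields are added. I can't seem to get it to work at all; I keep getting an error message back.
Any help is appreciated, and I apologize for the long post!
Edit: Included the two JavaScript functions, as they change the data.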
Python:
from requests import session
with session() as s:
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36'
    }
    payload = {
        'USER': 'username',
        'PASSWORD': 'pw',
        'TARGET': 'https://www.THISSITE.com/pg'
    }
    resp = s.post('https://www.THISSITE.com/THIS/fcc/THISSITE.fcc', headers=headers, data=payload)
    html = resp.text
    print(html)
Form:
<form
id="login"
method="post"
name="Login"
action="https://www.THISSITE.com/THIS/fcc/THISSITE.fcc">
<input
type="hidden"
name="TARGET"
value="https://www.THISSITE.com/pg"
></input>
<div class="form-group">
<input
type="text"
id="USER"
name="USER"
value=""
></input>
<div class="form-group">
<input
type="password"
id="PASSWORD"
name="PASSWORD"
value=""
></input>
</div>
<input
type="submit"
name="OK"
value="Login"
onclick="submitAuthForm(this.form);"
></input>
submitAuthForm(form):
function submitAuthForm(form) {
    var strval = form.PASSWORD.value;
    if(!isJson(strval)){
        var info = {};
        info["Password"] = hexEncode(strval);
        form.PASSWORD.value = JSON.stringify(info);
    }
}
hexEncode(str):
function hexEncode(s){
    var chrsz = 8;
    var hexcase = 0;
    function str2binb (str) {
        var bin = Array();
        var mask = (1 << chrsz) - 1;
        for(var i = 0; i < str.length * chrsz; i += chrsz) {
            bin[i>>5] |= (str.charCodeAt(i / chrsz) & mask) << (24 - i%32);
        }
        return bin;
    }
    function Utf8Encode(string) {
        string = string.replace(/\r\n/g,"\n");
        var utftext = "";
        for (var n = 0; n < string.length; n++) {
            var c = string.charCodeAt(n);
            if (c < 128) {
                utftext += String.fromCharCode(c);
            }
            else if((c > 127) && (c < 2048)) {
                utftext += String.fromCharCode((c >> 6) | 192);
                utftext += String.fromCharCode((c & 63) | 128);
            }
            else {
                utftext += String.fromCharCode((c >> 12) | 224);
                utftext += String.fromCharCode(((c >> 6) & 63) | 128);
                utftext += String.fromCharCode((c & 63) | 128);
            }
        }
        return utftext;
    }
    function binb2hex (binarray) {
        var hex_tab = hexcase ? "0123456789ABCDEF" : "0123456789abcdef";
        var str = "";
        for(var i = 0; i < binarray.length * 4; i++) {
            str += hex_tab.charAt((binarray[i>>2] >> ((3 - i%4)*8+4)) & 0xF) +
                   hex_tab.charAt((binarray[i>>2] >> ((3 - i%4)*8 )) & 0xF);
        }
        return str;
    }
    s = Utf8Encode(s);
    return binb2hex(str2binb(s));
}
Parameters sent when the web page submits:
SMENC: UTF-8
SMLOCALE: US-EN
target: https://www.THISSITE.com/pg
smauthreason: 27
smagentname: mR2h1e4BPUPZ5eTpyZckvJXpXO1mE5RpNTYtnh9C8sMfqiHlbrnBjW2SNjbwIRz+
type:
realmoid:
smusermsg:
USER: username
PASSWORD: {"TokenId":"longstringoflettersandnumbersHEX???","Password":""}
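The hexEncode function takes a string and converts it into the hexadecimal representation of the bytes that make up its UTF-8 encoding. The Python equivalent is to encode the input unicode string as UTF-8 and then hex-encode the result, e.g.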
>>> import binascii
>>> binascii.hexlify('d'.encode('utf-8'))
b'64'
>>> binascii.hexlify('¡¢£¤¥'.encode('utf-8'))
b'c2a1c2a2c2a3c2a4c2a5'
Note: in Python 2.7 this would be:
>>> 'd'.encode('utf-8').encode('hex')
'64'
>>> u'¡¢£¤¥'.encode('utf-8').encode('hex')
'c2a1c2a2c2a3c2a4c2a5'
If you test this with a sample password, it should produce the same output as the website, with one caveat.
hexEncode('d')
"64000000"
Note that the JavaScript adds trailing 0s so that the length of the string is a multiple of 8. We need to pad our result to get the same output.
>>> s = binascii.hexlify('d'.encode('utf-8'))
>>> n = len(s)
>>> from math import ceil
>>> next_8_multiple = int(ceil(n/8.0) * 8)
>>> s.ljust(next_8_multiple, b'0')
b'64000000'
We can wrap this in a complete function:
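from math import ceil
import binascii
def hex_encode_and_pad(s):
    # Hex-encode the UTF-8 bytes of the input string
    hex_bytes = binascii.hexlify(s.encode('utf-8'))
    n = len(hex_bytes)
    next_8_multiple = int(ceil(n / 8.0) * 8)
    # Pad with trailing zeros, then decode so the result is text rather than bytes
    return hex_bytes.ljust(next_8_multiple, b'0').decode('ascii')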
This now gives the same result as the JavaScript function:
>>> hex_encode_and_pad('d')
'64000000'
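As a cross-check on where the padding comes from, the sketch below pads at the byte level instead. The JavaScript packs the encoded bytes into 32-bit words, so its output always covers a whole number of 4-byte words, which is 8 hex digits per word. The function name `hex_encode_words` is made up for illustration, and `bytes.hex()` assumes Python 3.5 or later.

```python
def hex_encode_words(s):
    """Hex-encode the UTF-8 bytes of s, zero-padded to whole 4-byte words."""
    # The JavaScript normalizes Windows line endings before encoding.
    data = s.replace('\r\n', '\n').encode('utf-8')
    # -len(data) % 4 is the number of bytes missing from the last 32-bit word.
    data += b'\x00' * (-len(data) % 4)
    return data.hex()


print(hex_encode_words('d'))      # 64000000
print(hex_encode_words('abcde'))  # 6162636465000000
```

One caveat: the JavaScript works on UTF-16 code units, so for characters outside the Basic Multilingual Plane (most emoji, for example) it emits two 3-byte sequences where Python's UTF-8 encoder emits one 4-byte sequence. A password containing such characters would need extra handling that this sketch does not attempt.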
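The next step is to wrap it in the string representation of the JSON. You can do this by writing the string by hand and inserting only the token, e.g.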
value = '{"TokenId":"%s","Password":""}' % token
Or create the JSON string from a Python dictionary:
import json
data = {'TokenId': token, 'Password': ''}
value = json.dumps(data)
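One detail that may or may not matter to the server: `JSON.stringify` in the browser emits no whitespace, while `json.dumps` by default puts a space after each comma and colon. If a byte-for-byte match with the browser's value is wanted, the `separators` argument produces the compact form. The helper name `build_password_field` is hypothetical, and relying on dict insertion order for the key order assumes Python 3.7 or later.

```python
import json


def build_password_field(token):
    """Return the PASSWORD form value as compact JSON, like JSON.stringify."""
    data = {'TokenId': token, 'Password': ''}
    # separators=(',', ':') drops the spaces json.dumps adds by default.
    return json.dumps(data, separators=(',', ':'))


print(build_password_field('64000000'))
# {"TokenId":"64000000","Password":""}
```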
The complete code, based on the sample request shown above, is:
import binascii
import json
from math import ceil
from requests import session
def hex_encode_and_pad(s):
    # Hex-encode the UTF-8 bytes of the input string
    hex_bytes = binascii.hexlify(s.encode('utf-8'))
    n = len(hex_bytes)
    next_8_multiple = int(ceil(n / 8.0) * 8)
    # Pad with trailing zeros, then decode so json.dumps receives text, not bytes
    return hex_bytes.ljust(next_8_multiple, b'0').decode('ascii')
with session() as s:
    password = u'your_password'
    token = hex_encode_and_pad(password)
    data = {'TokenId': token, 'Password': ''}
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36'
    }
    payload = {
        'USER': 'username',
        'PASSWORD': json.dumps(data),
        'TARGET': 'https://www.THISSITE.com/pg'
    }
    resp = s.post('https://www.THISSITE.com/THIS/fcc/THISSITE.fcc', headers=headers, data=payload)
    html = resp.text
    print(html)
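The browser request shown in the question also carries several extra fields (`SMENC`, `SMLOCALE`, `smagentname`, and so on) that the payload above leaves out. If the login still fails, one thing to try is to fetch the login page first and copy its hidden inputs into the payload. The following is a minimal sketch using only the standard library parser; the class and function names are hypothetical, and it assumes those fields appear as `<input type="hidden">` elements in the page's HTML rather than being added by script.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attr = dict(attrs)
        if (attr.get('type') or '').lower() == 'hidden' and attr.get('name'):
            # A missing value attribute is submitted as an empty string.
            self.fields[attr['name']] = attr.get('value') or ''


def hidden_fields(html):
    """Return a dict of the hidden form fields found in html."""
    collector = HiddenInputCollector()
    collector.feed(html)
    return collector.fields


sample = '''
<form method="post">
<input type="hidden" name="TARGET" value="https://www.THISSITE.com/pg"></input>
<input type="hidden" name="SMENC" value="UTF-8"></input>
<input type="text" name="USER" value=""></input>
</form>
'''
print(hidden_fields(sample))
# {'TARGET': 'https://www.THISSITE.com/pg', 'SMENC': 'UTF-8'}
```

With a real session, the payload could start from `hidden_fields(s.get(login_url).text)` and then have `USER` and `PASSWORD` set on it, where `login_url` stands for whatever page serves the form.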