How to Access Microsoft cognitive API (HTTPError: HTTP Error 400: Bad Request)
I am trying to build a sentiment analysis model on a CSV file using the Text Analytics API on Azure.
This is the code I used:
for j in range(0,num_of_batches): # this loop will add num_of_batches strings to input_texts
    input_texts.set_value(j,"") # initialize input_texts string j
    for i in range(j*l//num_of_batches,(j+1)*l//num_of_batches): #loop through a window of rows from the dataset
        comment = str(mydata["tweet"][i]) #grab the comment from the current row
        comment = comment.replace("\"", "'") #remove backslashes (why? I don’t remember. #honestblogger)
        #add the current comment to the end of the string we’re building in input_texts string j
        input_texts.set_value(j, input_texts[j] + '{"language":"' + "pt"',"id":"' + str(i) + '","text":"'+ comment + '"},')
    #after we’ve looped through this window of the input dataset to build this series, add the request head and tail
    input_texts.set_value(j, '{"documents":[' + input_texts[j] + ']}')
headers = {'Content-Type':'application/json', 'Ocp-Apim-Subscription-Key':account_key}
Sentiment = pd.Series()
batch_sentiment_url = "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"
Everything works up to this point, but when I try to fetch data from the API, I get an error in the last part:
for j in range(0,num_of_batches):
    # Detect sentiment for the each batch.
    req = urllib2.Request(batch_sentiment_url, input_texts[j], headers)
    response = urllib2.urlopen(req)
    result = response.read()
    obj = json.loads(result.decode('utf-8'))
    #loop through each result string, extracting the sentiment associated with each id
    for sentiment_analysis in obj['documents']:
        Sentiment.set_value(sentiment_analysis['id'], sentiment_analysis['score'])
#tack our new sentiment series onto our original dataframe
mydata.insert(len(mydata.columns),'Sentiment',Sentiment.values)
This is the error:
HTTPError: HTTP Error 400: Bad Request
Always verify the API call with curl first, then plug it into your code. This curl line works for me:
curl -k -X POST -H "Ocp-Apim-Subscription-Key: <your ocp-apim-subscription-key>" -H "Content-Type: application/json" --data "{ 'documents': [ { 'id': '12345', 'text': 'now is the time for all good men to come to the aid of their party.' } ] }" "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment"
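If curl is not at hand, roughly the same check can be made from Python. The sketch below assumes Python 3 (`urllib.request` rather than the `urllib2` used in the question); `send_json` and its `opener` parameter are hypothetical names for illustration, not part of any library. On failure it includes the response body in the exception, because the bare `HTTP Error 400` does not say what the service objected to.

```python
import json
import urllib.error
import urllib.request


def send_json(url, payload, key, opener=urllib.request.urlopen):
    """POST payload as JSON; on an HTTP error, put the response body in the exception.

    `opener` is a hypothetical hook so the function can be exercised without a network.
    """
    body = json.dumps(payload).encode("utf-8")  # Python 3 requires bytes for the request body
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,
    }
    req = urllib.request.Request(url, data=body, headers=headers)
    try:
        with opener(req) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # The error object is file-like; its body may explain the rejection.
        detail = err.read().decode("utf-8", errors="replace")
        raise RuntimeError("HTTP %d %s: %s" % (err.code, err.reason, detail))


# Usage, mirroring the curl call (the key is a placeholder):
# send_json(
#     "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment",
#     {"documents": [{"id": "12345", "text": "now is the time for all good men to come to the aid of their party."}]},
#     "<your ocp-apim-subscription-key>",
# )
```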
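You are getting the 400 error because your JSON is malformed (the quotes around 'pt' are mismatched). I don't think you are doing yourself any favors by using the pandas module to send the request or by hand-crafting the JSON. In particular, you are vulnerable to stray quotes and escape characters.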
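To see the problem without calling the service, the concatenation from the question can be run for a single row and fed to a strict parser. This is a minimal sketch in Python 3; the sample comment and the helper name `is_valid_json` are made up for illustration.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


comment = "bom dia"  # made-up sample row
i = 0

# Same concatenation as in the question, for one document.
# Note that "pt"',"id":"' is two adjacent string literals, so no quote closes pt.
hand_built = ('{"documents":['
              + '{"language":"' + "pt"',"id":"' + str(i) + '","text":"' + comment + '"},'
              + ']}')

# Same document, serialized by the json module instead.
generated = json.dumps({"documents": [{"language": "pt", "id": str(i), "text": comment}]})

print(hand_built)
print(is_valid_json(hand_built))
print(generated)
print(is_valid_json(generated))
```

The hand-built string comes out as `{"documents":[{"language":"pt,"id":"0","text":"bom dia"},]}`: the closing quote after `pt` is missing. The trailing comma before `]` is also something a strict JSON parser rejects, although whether the service tolerates it is not something I have checked.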
Here is what you could use instead:
input_texts = []
for j in range(0,num_of_batches): # this loop will add num_of_batches strings to input_texts
    documents = []
    for i in range(j*l//num_of_batches,(j+1)*l//num_of_batches): #loop through a window of rows from the dataset
        documents.append({
            'language':'pt',
            'id': str(i),
            'text': str(mydata["tweet"][i])})
    input_texts.append({'documents':documents})
...
req = urllib2.Request(batch_sentiment_url, json.dumps(input_texts[j]), headers)
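For anyone on Python 3, the same idea might look like the sketch below. It is an assumption-laden rewrite, not a tested replacement: `build_batches` and `collect_scores` are hypothetical helper names, the texts are passed in as a plain list rather than read from `mydata`, and the response shape (`documents`, `id`, `score`) is taken from the question's own parsing loop. The bodies are encoded to bytes because `urllib.request` expects bytes for POST data.

```python
import json


def build_batches(texts, num_of_batches, language="pt"):
    """Split texts into num_of_batches request bodies, each UTF-8 encoded JSON."""
    total = len(texts)
    batches = []
    for j in range(num_of_batches):
        # Same windowing arithmetic as in the question.
        start = j * total // num_of_batches
        stop = (j + 1) * total // num_of_batches
        documents = [
            {"language": language, "id": str(i), "text": str(texts[i])}
            for i in range(start, stop)
        ]
        # json.dumps escapes quotes and backslashes, so no manual replace() is needed.
        batches.append(json.dumps({"documents": documents}).encode("utf-8"))
    return batches


def collect_scores(response_obj):
    """Map each document id in a parsed response to its score."""
    return {doc["id"]: doc["score"] for doc in response_obj["documents"]}


tweets = ['ele disse "oi"', "bom dia", "que pena"]  # made-up sample data
for body in build_batches(tweets, 2):
    print(body.decode("utf-8"))
```

Each element of the returned list can be passed as the `data` argument of `urllib.request.Request`, together with the same `headers` dictionary as in the question.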