Watson Discovery Service Python add document Error: Invalid Content-Type. Expected 'multipart/form-data'

Import the Watson Developer Cloud Python SDK

import logging
import os

import requests
from watson_developer_cloud import DiscoveryV1

Fetch the PDF from the Slack doc_url, which is a private URL

r = requests.get(doc_url, headers={'Authorization': 'Bearer {}'.format(slack_token)})
logging.debug("read_pdf headers %s ", r.headers)
logging.debug("read_pdf content-type %s ", r.headers['content-type'])

Save the file temporarily on the cloud file system

with open(doc_name, 'wb' ) as f:
  f.write(r.content)
filepath = os.path.join(os.getcwd(), doc_name)
logging.debug('filepath %s', filepath)
logging.debug('filepath assertion %s', os.path.isfile(filepath))

Create the Discovery instance

discovery = DiscoveryV1(
    username=DS_USERNAME,
    password=DS_PASSWORD,
    version="2017-10-16"
)

Add the PDF document to the Discovery instance

with open(filepath, 'rb') as fileinfo:
  add_doc = discovery.add_document(ENVIRONMENT_ID, COLLECTION_ID, file_content_type=r.headers['content-type'])

Log output

read_pdf headers {'Content-Type': 'application/pdf', 'Content-Length': '149814'
WatsonApiException: Error: Invalid Content-Type. Expected 'multipart/form-data', got 'application/octet-stream', Code: 400 , X-dp-watson-tran-id: gateway02-732476861 , X-global-transaction-id: ffea405d5ba1ad632ba8b5bd

The developer code sample on GitHub is commented out.

https://github.com/watson-developer-cloud/python-sdk/blob/master/examples/discovery_v1.py

Oh dear. That is a miserable error message.

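What is missing from the call to discovery.add_document() is the file parameter. Without it, the open file handle is never passed to the SDK. Can you try adding file=fileinfo, like this: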

with open(filepath, 'rb') as fileinfo:
  add_doc = discovery.add_document(ENVIRONMENT_ID,
                                   COLLECTION_ID,
                                   file=fileinfo,
                                   file_content_type=r.headers['content-type'])
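
If you want to make it harder to forget the file argument again, the call above can be wrapped in a small helper. This is only a rough sketch: add_file_to_discovery is a hypothetical name of my own, not part of the SDK, and it assumes the discovery object accepts the same add_document(environment_id, collection_id, file=..., file_content_type=...) call shown above.

```python
import os


def add_file_to_discovery(discovery, environment_id, collection_id,
                          filepath, content_type):
    """Upload one local file, always passing the open handle as `file`.

    Hypothetical helper: `discovery` is assumed to behave like the
    DiscoveryV1 client created earlier in the question.
    """
    # Fail early with a clear error instead of a confusing API response.
    if not os.path.isfile(filepath):
        raise FileNotFoundError(filepath)

    # Keep the call inside the `with` block so the handle is still open
    # while the client reads from it.
    with open(filepath, 'rb') as fileinfo:
        return discovery.add_document(environment_id,
                                      collection_id,
                                      file=fileinfo,
                                      file_content_type=content_type)
```

With the variables from the question it would be called as add_doc = add_file_to_discovery(discovery, ENVIRONMENT_ID, COLLECTION_ID, filepath, r.headers['content-type']).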

For reference, here is some Python code that works and does something very similar to what you are trying to do.