Document Conversion Watson service not working?
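I've been trying to use the IBM Watson Document Conversion service with the demo PDF, but it doesn't split the document into small chunks. All it does is create a single answer unit, which is really long: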
"text": "Watson is an artificially intelligent computer system capable of answering questions posed in natural language,[2] developed in IBM's DeepQA project by a research team led by principal investigator David Ferrucci. Watson was named after IBM's first CEO and industrialist Thomas J. Watson.[3][4] The computer system was specifically developed to answer questions on the quiz show Jeopardy![5] In 2011, Watson competed on Jeopardy! against former winners Brad Rutter and Ken Jennings.[3][6] Watson received the first place prize of million.[7] Watson had access to 200 million pages of structured and unstructured content consuming four terabytes of disk storage[8] including the full text of Wikipedia,[9] but was not connected to the Internet during the game.[10][11] For each clue, Watson's three most probable responses were displayed on the television screen. Watson consistently outperformed its human opponents on the game's signaling device, but had trouble responding to a few categories, notably those having short clues containing only a few words. In February 2013, IBM announced that Watson software system's first commercial application would be for utilization management decisions in lung cancer treatment at Memorial Sloan- Kettering Cancer Center in conjunction with health insurance company WellPoint.[12] IBM Watson's former business chief Manoj Saxena says that 90% of nurses in the field who use Watson now follow its guidance.[13]"
Thanks in advance!
Unfortunately, that demo PDF is not the best document to use: currently, answer units are split on heading tags (h1 - h6), and that PDF doesn't contain any headings. =(
If you set conversion_target to NORMALIZED_HTML, you can see the converted PDF as it looks before it is split into answer units. It will contain paragraphs but no headings.
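To illustrate why HTML with paragraphs but no headings ends up as a single answer unit, here is a minimal Python sketch of heading-based splitting. It is not the service's actual implementation; the names HeadingSplitter and split_on_headings, the unit layout with "title" and "text" keys, and the sample HTML strings are all hypothetical and only meant to show the general idea.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingSplitter(HTMLParser):
    """Hypothetical splitter: starts a new unit at every h1-h6 tag."""

    def __init__(self):
        super().__init__()
        self.units = []           # list of {"title": ..., "text": ...}
        self._in_heading = False  # True while inside an h1-h6 element

    def _current(self):
        # Text that appears before any heading goes into an untitled unit.
        if not self.units:
            self.units.append({"title": "", "text": ""})
        return self.units[-1]

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.units.append({"title": "", "text": ""})
            self._in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            self._in_heading = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        unit = self._current()
        key = "title" if self._in_heading else "text"
        unit[key] = (unit[key] + " " + text).strip()


def split_on_headings(html):
    """Return the list of units found in an HTML string."""
    parser = HeadingSplitter()
    parser.feed(html)
    parser.close()
    return parser.units


# Assumed sample inputs, not taken from the demo PDF.
no_headings = "<p>First paragraph.</p><p>Second paragraph.</p>"
with_headings = (
    "<h1>Intro</h1><p>First paragraph.</p>"
    "<h2>Details</h2><p>Second paragraph.</p>"
)

print(len(split_on_headings(no_headings)))    # 1
print(len(split_on_headings(with_headings)))  # 2
```

With no heading tags there is nothing to split on, so every paragraph lands in the same unit, which matches the single long answer unit in the question.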
In the future, we hope to also allow answer units to be split by paragraph, but that has not been released yet.
Update: We have replaced the PDF on the demo site with a better example.