How to get the probability of an answer using a BERT model, and is there a way to ask multiple questions for one context?

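I am new to AI models and am currently experimenting with question-answering (Q&A) models. I am particularly interested in the following two:
1. from transformers import BertForQuestionAnswering
2. from simpletransformers.question_answering import QuestionAnsweringModel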

With option 1, BertForQuestionAnswering, I get the desired answer. However, I can only ask one question at a time, and I do not get the probability of the answer.

Below is the code for BertForQuestionAnswering from transformers.

from transformers import BertTokenizer, BertForQuestionAnswering
import torch

tokenizer = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
model = BertForQuestionAnswering.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')


input_ids = tokenizer.encode('Sky color?', 'Today the sky is blue. and it is cold out there')
tokens = tokenizer.convert_ids_to_tokens(input_ids)
sep_index = input_ids.index(tokenizer.sep_token_id)
num_seg_a = sep_index + 1
num_seg_b = len(input_ids) - num_seg_a
segment_ids = [0]*num_seg_a + [1]*num_seg_b
assert len(segment_ids) == len(input_ids)
outputs = model(torch.tensor([input_ids]),
                token_type_ids=torch.tensor([segment_ids]),
                return_dict=True)
start_scores = outputs.start_logits
end_scores = outputs.end_logits
answer_start = torch.argmax(start_scores)
answer_end = torch.argmax(end_scores)
answer = ' '.join(tokens[answer_start:answer_end+1])
print(answer)

This is the output: blue

With option 2, QuestionAnsweringModel from simpletransformers, I can ask multiple questions at once and get the probability of each answer.

Below is the code for QuestionAnsweringModel from simpletransformers.

from simpletransformers.question_answering import QuestionAnsweringModel
model = QuestionAnsweringModel('distilbert', 'distilbert-base-uncased-distilled-squad', use_cuda=False)

question_data = {
        'qas': [{
            'question': 'Sky color?',
            'id': 0,
        },
        {
            'question': 'weather?',
            'id': 1,
        }
        ],
        'context': 'Today the sky is blue. and it is cold out there'
    }

prediction = model.predict([question_data])
output = {'result': list(prediction)}
print(output)

This is the output:

{
   "result":[
      [
         {
            "id":0,
            "answer":["blue", "the sky is blue", "blue."]
         },
         {
            "id":1,
            "answer":["cold", "it is cold", "cold out there"]
         }
      ],
      [
         {
            "id":0,
            "probability":[0.8834650211919095,0.0653234009794176,0.031404456093241565]
         },
         {
            "id":1,
            "probability":[0.6851319220199236,0.18145769901523698,0.05004994980074798]
         }
      ]
   ]
}

As you can see, for the same context I can ask multiple questions at once and get the probability of each answer.

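Is there a way to get similar output from the BERT model in option 1? I need a way to ask multiple questions against one context, and I also need the probability of each answer in the response.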

Any help would be appreciated.

You can use the Hugging Face question answering pipeline to do this:

from transformers import pipeline

model_checkpoint = 'bert-large-uncased-whole-word-masking-finetuned-squad'
qa_pipeline = pipeline('question-answering', model=model_checkpoint, tokenizer=model_checkpoint)

qa_pipeline(question=['Sky color?', 'weather?'], context='Today the sky is blue. and it is cold out there')

Output:

[{'score': 0.9102755188941956, 'start': 17, 'end': 21, 'answer': 'blue'},
 {'score': 0.49744659662246704, 'start': 33, 'end': 37, 'answer': 'cold'}]
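
If you would rather keep the manual BertForQuestionAnswering code from the question, a score can also be derived from the logits directly. The sketch below is only an illustration of the general idea (softmax over the start logits and over the end logits, then multiply the two probabilities for each candidate span); it is not the pipeline's exact implementation. The helper names `softmax` and `top_spans`, the `max_answer_len` limit, and the logit values are all made up for this example, and pure Python is used so it runs without downloading a model.

```python
import math


def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_spans(start_logits, end_logits, k=3, max_answer_len=15):
    """Return the k most probable (start, end, probability) spans.

    Hypothetical helper: a span's probability is taken to be
    P(start) * P(end), and spans longer than max_answer_len tokens
    or with end < start are never considered.
    """
    p_start = softmax(start_logits)
    p_end = softmax(end_logits)
    candidates = []
    for s, ps in enumerate(p_start):
        last = min(s + max_answer_len, len(p_end))
        for e in range(s, last):
            candidates.append((s, e, ps * p_end[e]))
    # Highest probability first
    candidates.sort(key=lambda c: c[2], reverse=True)
    return candidates[:k]


# Made-up logits for a 5-token input, for illustration only.
start_logits = [0.1, 0.2, 5.0, 0.3, 0.1]
end_logits = [0.1, 0.1, 4.0, 1.0, 0.2]

best = top_spans(start_logits, end_logits, k=2)
for start, end, prob in best:
    print(start, end, round(prob, 4))
```

With the code from the question, the inputs would presumably be `outputs.start_logits[0].tolist()` and `outputs.end_logits[0].tolist()`, and the returned indices would be used to slice `tokens`. Note that this sketch does not mask out the question tokens, so a real implementation should restrict candidates to the context segment. To ask several questions, you would repeat this per question. As far as I know, the pipeline also accepts a `top_k` argument if you want several candidate answers per question.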