How to get probability of an answer using BERT model and is there a way to ask multiple questions for a context
I am new to AI models and am currently experimenting with Q&A models. I am particularly interested in the following two:
1. BertForQuestionAnswering from transformers
2. QuestionAnsweringModel from simpletransformers.question_answering
With option 1, BertForQuestionAnswering, I get the desired result, but I can only ask one question at a time, and I do not get the probability of the answer.
Below is the code using BertForQuestionAnswering from transformers:
from transformers import BertTokenizer, BertForQuestionAnswering
import torch

tokenizer = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
model = BertForQuestionAnswering.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

# Encode the question and the context as a single sequence.
input_ids = tokenizer.encode('Sky color?', 'Today the sky is blue. and it is cold out there')
tokens = tokenizer.convert_ids_to_tokens(input_ids)

# Segment ids: 0 for the question (up to and including the first [SEP]), 1 for the context.
sep_index = input_ids.index(tokenizer.sep_token_id)
num_seg_a = sep_index + 1
num_seg_b = len(input_ids) - num_seg_a
segment_ids = [0]*num_seg_a + [1]*num_seg_b
assert len(segment_ids) == len(input_ids)

outputs = model(torch.tensor([input_ids]),
                token_type_ids=torch.tensor([segment_ids]),
                return_dict=True)

# Pick the most likely start and end token positions and join the tokens in between.
start_scores = outputs.start_logits
end_scores = outputs.end_logits
answer_start = torch.argmax(start_scores)
answer_end = torch.argmax(end_scores)
answer = ' '.join(tokens[answer_start:answer_end+1])
print(answer)
Here is the output: blue
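As far as I can tell, a softmax over start_scores and end_scores should turn the logits into per-token probabilities, and the product for the chosen span would give some kind of confidence, but I am not sure whether that is the same probability that simpletransformers reports. Roughly what I mean (reusing the variables from the code above):

# Hypothetical: convert the raw logits from above into probabilities with a softmax.
# Not sure this is equivalent to the probability that simpletransformers returns.
start_probs = torch.softmax(start_scores, dim=-1)
end_probs = torch.softmax(end_scores, dim=-1)
# Score of the selected span, assuming start and end positions are independent.
span_probability = float(start_probs[0, answer_start] * end_probs[0, answer_end])
print(span_probability)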
With option 2, QuestionAnsweringModel from simpletransformers, I can ask multiple questions at once and also get the probability of each answer.
Below is the code using QuestionAnsweringModel from simpletransformers:
from simpletransformers.question_answering import QuestionAnsweringModel
model = QuestionAnsweringModel('distilbert', 'distilbert-base-uncased-distilled-squad', use_cuda=False)
question_data = {
    'qas': [
        {
            'question': 'Sky color?',
            'id': 0,
        },
        {
            'question': 'weather?',
            'id': 1,
        }
    ],
    'context': 'Today the sky is blue. and it is cold out there'
}
prediction = model.predict([question_data])
output = {'result': list(prediction)}
print(output)
Here is the output:
{
  "result": [
    [
      {"id": 0, "answer": ["blue", "the sky is blue", "blue."]},
      {"id": 1, "answer": ["cold", "it is cold", "cold out there"]}
    ],
    [
      {"id": 0, "probability": [0.8834650211919095, 0.0653234009794176, 0.031404456093241565]},
      {"id": 1, "probability": [0.6851319220199236, 0.18145769901523698, 0.05004994980074798]}
    ]
  ]
}
As you can see, for the same context I can ask multiple questions at once and get the probability of each answer.
Is there a way to get a similar output with the BERT model from option #1? I need a way to ask multiple questions against a context, and I also need the probability of each answer in the response.
Any help would be appreciated.
You can use the Hugging Face question answering pipeline for this:
from transformers import pipeline
model_checkpoint = 'bert-large-uncased-whole-word-masking-finetuned-squad'
qa_pipeline = pipeline('question-answering', model=model_checkpoint, tokenizer=model_checkpoint)
qa_pipeline(question=['Sky color?', 'weather?'], context='Today the sky is blue. and it is cold out there')
Output:
[{'score': 0.9102755188941956, 'start': 17, 'end': 21, 'answer': 'blue'},
{'score': 0.49744659662246704, 'start': 33, 'end': 37, 'answer': 'cold'}]
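If you specifically want to stay with BertForQuestionAnswering from your option #1, you can also loop over the questions yourself and turn the logits into a score with a softmax. The sketch below assumes that the product of the start and end probabilities is an acceptable confidence value; the pipeline's own scoring masks the question tokens and ranks span candidates, so the numbers will not match it exactly:

import torch
from transformers import BertTokenizer, BertForQuestionAnswering

tokenizer = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
model = BertForQuestionAnswering.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

context = 'Today the sky is blue. and it is cold out there'
questions = ['Sky color?', 'weather?']

for question in questions:
    # Tokenize question + context together; token_type_ids are built automatically.
    inputs = tokenizer(question, context, return_tensors='pt')
    with torch.no_grad():
        outputs = model(**inputs, return_dict=True)
    # Softmax over the sequence dimension turns the logits into per-token probabilities.
    start_probs = torch.softmax(outputs.start_logits, dim=-1)[0]
    end_probs = torch.softmax(outputs.end_logits, dim=-1)[0]
    start_idx = int(torch.argmax(start_probs))
    end_idx = int(torch.argmax(end_probs))
    # Assumption: use the product of the two probabilities as the answer score.
    score = float(start_probs[start_idx] * end_probs[end_idx])
    answer = tokenizer.decode(inputs['input_ids'][0][start_idx:end_idx + 1])
    print(question, '->', answer, score)

Either way, the pipeline shown above is the simplest route if you just need scores for several questions against one context.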