如何提取段落中的某个句子? Python
How to extract a certain sentence in a paragraph? Python
我想从查看一组特定单词的段落中提取某些句子 Object C Statement:
。段落如下:
Object A Statement: There was a cat with a bag full of meat. It was a
red cat with a blue hat. Object B Statement: There was a dog with a
bag full of toys. It was a blue dog with a green hat. Object C
Statement: There was a dolphin with a bag full of bubbles. It was a
purple dolphin with an orange hat. Object D Statement: There was a
zebra with a bag full of grass. It was a white zebra with a blue hat.
Object E Statement: There was a bear with a bag full of wood. It was a
brown bear with a black hat.
我要提取Object C语句:如下:
There was a dolphin with a bag full of bubbles. It was a purple
dolphin with an orange hat.
我遇到的所有例子都是拆分特定单词等
我试过了,但对我不起作用:
word="Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat."
a, b, c, d, e = re.split(r"\B\s(?=[^\s:]+:)", word)
regex = re.compile(r"""Object A Statement\s(.*?)Object B Statement\s(.*?)Object C Statement\s(.*?)Object D Statement\s(.*?)Object E Statement\s(.*)""", re.S|re.X)
a, b, c, d, e = regex.match(word).groups()
您可以将字符串拆分为 "\s*Object . Statement:\s*"
import re
word="Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat."
result = re.split(r"\s*Object . Statement:\s*", word)
result = [r for r in result if len(r) > 0]
print("\n".join(result))
我得到以下结果。
There was a cat with a bag full of meat. It was a red cat with a blue hat.
There was a dog with a bag full of toys. It was a blue dog with a green hat.
There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat.
There was a zebra with a bag full of grass. It was a white zebra with a blue hat.
There was a bear with a bag full of wood. It was a brown bear with a black hat.
我想从查看一组特定单词的段落中提取某些句子 Object C Statement:
。段落如下:
Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat.
我要提取Object C语句:如下:
There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat.
我遇到的所有例子都是拆分特定单词等
我试过了,但对我不起作用:
word="Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat."
a, b, c, d, e = re.split(r"\B\s(?=[^\s:]+:)", word)
regex = re.compile(r"""Object A Statement\s(.*?)Object B Statement\s(.*?)Object C Statement\s(.*?)Object D Statement\s(.*?)Object E Statement\s(.*)""", re.S|re.X)
a, b, c, d, e = regex.match(word).groups()
您可以将字符串拆分为 "\s*Object . Statement:\s*"
import re
word="Object A Statement: There was a cat with a bag full of meat. It was a red cat with a blue hat. Object B Statement: There was a dog with a bag full of toys. It was a blue dog with a green hat. Object C Statement: There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat. Object D Statement: There was a zebra with a bag full of grass. It was a white zebra with a blue hat. Object E Statement: There was a bear with a bag full of wood. It was a brown bear with a black hat."
result = re.split(r"\s*Object . Statement:\s*", word)
result = [r for r in result if len(r) > 0]
print("\n".join(result))
我得到以下结果。
There was a cat with a bag full of meat. It was a red cat with a blue hat.
There was a dog with a bag full of toys. It was a blue dog with a green hat.
There was a dolphin with a bag full of bubbles. It was a purple dolphin with an orange hat.
There was a zebra with a bag full of grass. It was a white zebra with a blue hat.
There was a bear with a bag full of wood. It was a brown bear with a black hat.