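How to extract a specific paragraph from a text file and save it in a CSV file using Python?
I have a text file that contains the title, authors, abstract, DOI and so on for several papers. I only want to extract the abstracts and store them in a data frame. I tried the code below, but it also returns the author information and the DOI. I only want the paragraph that sits between "Author information:" and "DOI:". How can I get just that paragraph and store it in a data frame?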
extracted_lines = []
extract = False
for line in open("abstract.txt"):
    if extract == False and "Author information:" in line.strip():
        extract = True
    if extract:
        extracted_lines.append(line)
    if "DOI:" in line.strip():
        extract = False
print("".join(extracted_lines))
**Output**
Author information:
(1)Carol Davila University of Medicine and Pharmacy, 37, Dionisie Lupu St,
Bucharest, Romania 020021.
(2)National Institute of Public Health, 1-3 Doctor Leonte Anastasievici St,
Bucharest, Romania 050463.
Dark chocolate is not the most popular chocolate; the higher concentration in
antioxidants pays tribute to the increment in bitterness. The caloric density of
dark chocolate is potentially lower but has a large variability according to
recipes and ingredients. Nevertheless, in the last decade, the interest in dark
chocolate as a potential functional food has constantly increased. In this
review, we present the nutritional composition, factors influencing the
bioavailability, and health outcomes of dark chocolate intake. We have extracted
pro- and counter-arguments to illustrate these effects from both experimental
and clinical studies in an attempt to solve the dilemma. The antioxidative and
anti-inflammatory abilities, the cardiovascular and metabolic effects, and
influences on central neural functions were selected to substantiate the main
positive consequences. Beside the caloric density, we have included reports
placing responsibility on chocolate as a migraine trigger or as an inducer of
the gastroesophagial reflux in the negative effects section. Despite an
extensive literature review, there are not large enough studies specifically
dedicated to dark chocolate that took into consideration possible confounders on
the health-related effects. Therefore, a definite answer on our initial question
is, currently, not available.
DOI: 10.5740/jaoacint.19-0132
Author information:
(1)School of Food Science and Nutrition, Faculty of Maths and Physical Sciences,
University of Leeds, Leeds LS2 9JT, UK.
(2)School of Food Science and Nutrition, Faculty of Maths and Physical Sciences,
University of Leeds, Leeds LS2 9JT, UK. Electronic address:
g.williamson@leeds.ac.uk.
Dark chocolate contains many biologically active components, such as catechins,
procyanidins and theobromine from cocoa, together with added sucrose and lipids.
All of these can directly or indirectly affect the cardiovascular system by
multiple mechanisms. Intervention studies on healthy and
metabolically-dysfunctional volunteers have suggested that cocoa improves blood
pressure, platelet aggregation and endothelial function. The effect of chocolate
is more convoluted since the sucrose and lipid may transiently and negatively
impact on endothelial function, partly through insulin signalling and nitric
oxide bioavailability. However, few studies have attempted to dissect out the
role of the individual components and have not explored their possible
interactions. For intervention studies, the situation is complex since suitable
placebos are often not available, and some benefits may only be observed in
individuals showing mild metabolic dysfunction. For chocolate, the effects of
some of the components, such as sugar and epicatechin on FMD, may oppose each
other, or alternatively in some cases may act together, such as theobromine and
epicatechin. Although clearly cocoa provides some cardiovascular benefits
according to many human intervention studies, the exact components, their
interactions and molecular mechanisms are still under debate.
Copyright © 2015 Elsevier Inc. All rights reserved.
DOI: 10.1016/j.vph.2015.05.011
Expected Output
Index Abstract
0 Dark chocolate is not the most popular chocola...
1 Dark chocolate contains many biologically acti...
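You can try this:
- Read the entire contents of the file as a single string
- Split it on 'Author information:\n' to get one chunk per paper
- Split each chunk on blank lines and take index 1, which is the abstract (index 0 is the list of affiliations)
Here is the code: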
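with open("abstract.txt") as f:
    contents = f.read()
# Every chunk after the first begins with one paper's affiliations
papers = contents.split('Author information:\n')
# Blocks are separated by blank lines; block 1 is the abstract
abstracts = [p.split("\n\n")[1] for p in papers[1:]]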
Does that work for you?
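To get from the list of abstracts to a CSV file, something along these lines might work. This is only a sketch: it assumes the file separates the affiliations, the abstract and the DOI with blank lines, as the split on "\n\n" above requires. The helper names `extract_abstracts` and `write_csv` and the shortened `sample` text are made up for illustration and are not taken from your file. It uses only the standard library.

```python
import csv
import io


def extract_abstracts(text):
    """Return the paragraph that follows each 'Author information:' block."""
    abstracts = []
    # The chunk before the first marker (journal, title, authors) is skipped.
    for paper in text.split("Author information:\n")[1:]:
        # Assumption: blocks inside a paper are separated by blank lines.
        blocks = [b for b in paper.split("\n\n") if b.strip()]
        if len(blocks) < 2:
            continue  # nothing follows the affiliations, so no abstract
        # blocks[0] holds the affiliations, blocks[1] the abstract.
        # Collapse the hard line wraps into single spaces.
        abstracts.append(" ".join(blocks[1].split()))
    return abstracts


def write_csv(abstracts, fileobj):
    """Write an Index/Abstract table to an already opened text file."""
    writer = csv.writer(fileobj)
    writer.writerow(["Index", "Abstract"])
    writer.writerows(enumerate(abstracts))


# Hypothetical, shortened stand-in for abstract.txt.
sample = (
    "1. Some Journal. 2019.\n\n"
    "A title.\n\n"
    "Author information:\n"
    "(1)First University.\n"
    "(2)Second Institute.\n\n"
    "Dark chocolate is not the most\n"
    "popular chocolate.\n\n"
    "DOI: 10.5740/jaoacint.19-0132\n\n"
    "Author information:\n"
    "(1)University of Leeds.\n\n"
    "Dark chocolate contains many\n"
    "biologically active components.\n\n"
    "Copyright (c) 2015 Elsevier Inc. All rights reserved.\n\n"
    "DOI: 10.1016/j.vph.2015.05.011\n"
)

abstracts = extract_abstracts(sample)

# An in-memory buffer keeps the demo free of side effects; for a real file
# use open("abstracts.csv", "w", newline="", encoding="utf-8") instead.
buffer = io.StringIO()
write_csv(abstracts, buffer)
print(buffer.getvalue())
```

If you want an actual data frame rather than a plain CSV, `pd.DataFrame({"Abstract": abstracts}).to_csv("abstracts.csv")` should give the same table, with the index written as an unnamed first column. If your file has no blank lines between the blocks, this approach will not find the abstract and you would need another rule to tell the affiliations from the text.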