Extract paragraphs containing specific words between two similar titles
My text file contains paragraphs like this:
summary
A result oriented and dedicated professional with three years’ experience in Software Development. A proactive individual with a logical approach to challenges, performs effectively even within a highly pressurised working environment.
summary
Oct 28th, 2010 – Till date Cognizant Technology Solutions
Project #1
Title Wealth Passport – R7.3
Client Northern Trust
Operating System Windows XP
Technologies J2EE, JSP, Struts, Oracle, PL/SQL
Team Size 3
Role Team Member
Period 22nd Aug’ 2013 - Till Date
Project Description
Wealth Passport R7.3 release aims at enhancements in four projects SGY, PMM, WPA and WPX. This primarily involves analysing existing issues in the four applications and enhancements to some of the functionalities.
Role and Responsibilities
Handled dockets in SGY and PMM applications.
Done root cause analysis to existing issues in a short span of time.
Designed and developed enhancements in PMM application.
Preparing Unit Test cases for the developed Java modules and executing them.
Project #2
Title PFS Development – WP Filecabinet and R7.2
Client Northern Trust
Operating System Windows XP
Technologies J2EE, JSP, Struts, Weblogic Portal, Oracle, PL/SQL, UNIX, Hibernate, Spring, DOJO
Team Size 1
Role Team Member – JavaEE Developer
Period 18th June’ 2013 – 21st Aug’ 2013
Project Description
PFS Development project is to provide the development services for PFS capital projects: Wealth Passport, Private Passport 6.0 and Private Passport 7.0
Wealth Passport Filecabinet provides functionality for users to store their files on our system. This enables users to create folders, upload files and view the uploaded files. Batch upload/delete option is also available. Deleted files will be moved to Waste Bucket, from where users can restore should they wish. This project aims at improving the performance of Filecabinet which was mandated by increasing customer base and files handled by the system.
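Now I want to extract only the summary section that contains words such as "Project" and "Team Size", and not the other summary sections.
I tried the code below, but it extracts the contents of both summaries: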
import re
import os
with open ('9.txt', encoding='latin-1') as infile, open ('d.txt','w',encoding='latin-1') as outfile :
    copy = False
    for line in infile:
        if line.strip() == 'summary':
            re.compile('\r\nproject*\r\n')
            copy = True
        elif line.strip() == "summary":
            copy =False
        elif copy:
            outfile.write(line)
#fh = open("d.txt",'r')
contents = fh.read()
len(contents)
I want to save a text file, d.txt, with the following contents:
summary
Oct 28th, 2010 – Till date Cognizant Technology Solutions
Project #1
Title Wealth Passport – R7.3
Client Northern Trust
Operating System Windows XP
Technologies J2EE, JSP, Struts, Oracle, PL/SQL
Team Size 3
Role Team Member
Period 22nd Aug’ 2013 - Till Date
Project Description
Wealth Passport R7.3 release aims at enhancements in four projects SGY, PMM, WPA and WPX. This primarily involves analysing existing issues in the four applications and enhancements to some of the functionalities.
Role and Responsibilities
Handled dockets in SGY and PMM applications.
Done root cause analysis to existing issues in a short span of time.
Designed and developed enhancements in PMM application.
Preparing Unit Test cases for the developed Java modules and executing them.
Project #2
Title PFS Development – WP Filecabinet and R7.2
Client Northern Trust
Operating System Windows XP
Technologies J2EE, JSP, Struts, Weblogic Portal, Oracle, PL/SQL, UNIX, Hibernate, Spring, DOJO
Team Size 1
Role Team Member – JavaEE Developer
Period 18th June’ 2013 – 21st Aug’ 2013
Project Description
PFS Development project is to provide the development services for PFS capital projects: Wealth Passport, Private Passport 6.0 and Private Passport 7.0
Wealth Passport Filecabinet provides functionality for users to store their files on our system. This enables users to create folders, upload files and view the uploaded files. Batch upload/delete option is also available. Deleted files will be moved to Waste Bucket, from where users can restore should they wish. This project aims at improving the performance of Filecabinet which was mandated by increasing customer base and files handled by the system.
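The second conditional here will never run, because it tests the same condition as the first. This means that after the first instance of summary, copy will always be True.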
if line.strip() == 'summary':
    re.compile('\r\nproject*\r\n')
    copy = True
elif line.strip() == "summary":
    copy =False
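What I suggest is using a single statement to detect the "summary" tag (I am assuming these are start/end markers of comment blocks) and toggle copy.
To toggle a boolean, you can simply set it to its own negation: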
a = True
a = not a
# a is now False
For example:
if line.strip() == 'summary':
    copy = not copy
elif copy:
    outfile.write(line)
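As a rough illustration of what the toggle does, here is a small self-contained sketch that runs on an in-memory list of lines instead of a file. The helper name copy_between_markers and the sample lines are made up for this example and are not from the question.

```python
def copy_between_markers(lines, marker="summary"):
    """Return the lines that sit between alternating marker lines."""
    copy = False
    kept = []
    for line in lines:
        if line.strip() == marker:
            copy = not copy  # flip the flag on every marker line
        elif copy:
            kept.append(line)
    return kept


# Hypothetical sample: two sections, each introduced by its own marker line.
sample = ["summary\n", "first block\n", "summary\n", "second block\n"]
print(copy_between_markers(sample))
```

Note that the toggle treats marker lines as open/close pairs. If every section starts with its own summary line, as the sample in the question appears to, only the text between the first and second marker is copied, so this alone does not pick out the project section.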
To extract all summary sections that contain the words you are interested in:
split_on = 'summary\n\n'
must_contain = ['Project', 'Team Size']
with open('9.txt') as f_input, open('d.txt', 'w') as f_output:
    for part in f_input.read().split(split_on):
        if all(text in part for text in must_contain):
            f_output.write(split_on + part)
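Splitting on 'summary\n\n' relies on every summary line being followed by a blank line. If that is not guaranteed, a line-based variant may be more forgiving. The following is only a sketch under that assumption: the function name sections_with_keywords and the shortened sample text are invented for illustration.

```python
def sections_with_keywords(text, keywords, marker="summary"):
    """Split text into sections that start at a marker line and keep
    only the sections whose body contains every keyword."""
    sections = []   # one list of body lines per marker seen
    current = None  # stays None until the first marker line
    for line in text.splitlines():
        if line.strip() == marker:
            current = []
            sections.append(current)
        elif current is not None:
            current.append(line)
    bodies = ["\n".join(lines) for lines in sections]
    return [body for body in bodies if all(word in body for word in keywords)]


# Hypothetical, shortened stand-in for the contents of 9.txt.
sample = (
    "summary\n"
    "A dedicated professional.\n"
    "summary\n"
    "Project #1\n"
    "Team Size 3\n"
)
for body in sections_with_keywords(sample, ["Project", "Team Size"]):
    print("summary")
    print(body)
```

To produce d.txt, the same loop could write to an output file instead of printing. Because the text is split with splitlines(), the check works the same way whether the file uses \n or \r\n line endings.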