Transform a txt file to a pandas dataframe
Hello, I have the following txt file:
December
line: 285 - event ID: 67511
line: 296 - event ID: 67512
November
line: 305 - event ID: 67515
line: 300 - event ID: 67517
I would like to transform it into the following dataframe:
df1 = pd.DataFrame(
    {
        "index": ["December", "December", "November", "November"],
        "index1": ["285", "296", "305", "300"],
        "eventid": ["67511", "67512", "67515", "67517"],
    }
)

      index index1 eventid
0  December    285   67511
1  December    296   67512
2  November    305   67515
3  November    300   67517
Any ideas?
I have used pattern matching to achieve what you need:
import re
import pandas as pd

res = []
# A month line holds a single word, e.g. "December".
month_pattern = re.compile(r"^\w+$")
# Each run of digits on a data line: first the line number, then the event ID.
line_pattern = re.compile(r"\d+")
current_month = ""
with open("FILE_PATH_TO_YOUR_DATA", "r") as f:
    for line in f:
        m = month_pattern.findall(line)
        if len(m) > 0:
            # Remember the month so it can be attached to the rows that follow.
            current_month = m[0]
        m = line_pattern.findall(line)
        if len(m) > 0:
            res.append([current_month] + m)
df = pd.DataFrame(res, columns=["index", "index1", "eventid"])
print(df)
Output:

      index index1 eventid
0  December    285   67511
1  December    296   67512
2  November    305   67515
3  November    300   67517
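
As an alternative to the line-by-line loop, the same result can probably be reached with pandas string methods alone. The sketch below is not part of the answer above; it assumes every data line has exactly the form "line: N - event ID: M" and that any other non-blank line is a month name. The names sample, raw, parsed and is_month are made up for illustration, and io.StringIO only stands in for the real file so the snippet runs on its own.

```python
import io
import pandas as pd

# Stand-in for the txt file from the question; with a real file,
# iterate over open(path) instead of this io.StringIO object.
sample = io.StringIO(
    "December\n"
    "line: 285 - event ID: 67511\n"
    "line: 296 - event ID: 67512\n"
    "November\n"
    "line: 305 - event ID: 67515\n"
    "line: 300 - event ID: 67517\n"
)

# One stripped line per row, skipping blank lines.
raw = pd.Series([ln.strip() for ln in sample if ln.strip()])

# Named groups become column names; lines that do not match
# (assumed to be month lines) produce NaN in both columns.
parsed = raw.str.extract(
    r"line:\s*(?P<index1>\d+)\s*-\s*event ID:\s*(?P<eventid>\d+)"
)

# Treat every non-matching line as a month and carry it forward
# over the data rows below it.
is_month = parsed["index1"].isna()
parsed.insert(0, "index", raw.where(is_month).ffill())

# Drop the month rows themselves and renumber from 0.
df = parsed[~is_month].reset_index(drop=True)
print(df)
```

With the sample text this should yield the same four rows as the output above, with index1 and eventid kept as strings. The explicit pattern is stricter than matching any run of digits, so a malformed line would be treated as a month name rather than silently producing a row with the wrong number of columns.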