Create dataframe from txt file
I have a text file with the following structure:
"ts": "2021-01-29T00:06:46.929363"
"from": "text"
"to": "text"
"body": "text"
The txt file is large.
How can I create a dataframe with the following structure?
ts | from | to | body |
---|---|---|---|
timestamp | text | text | text |
timestamp | text | text | text |
timestamp | text | text | text |
timestamp | text | text | text |
timestamp | text | text | text |
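Any help is much appreciated!
Read the file line by line and update a dict with each line. Once the dict has 4 keys, save it and start a new dict. At the end, build the DataFrame from the saved dicts.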
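import pandas as pd

with open("data.txt") as f:
    batch = {}    # fields of the record currently being read
    result = []   # one dict per complete record
    for line in f:
        if not line.strip():
            continue  # skip blank lines, which would otherwise break the split
        # Split on the first colon only, so the colons inside the timestamp survive.
        key, value = line.rstrip().split(":", maxsplit=1)
        batch[key.strip('" ')] = value.strip('" ')
        if len(batch) == 4:  # ts, from, to and body have all been seen
            result.append(batch)
            batch = {}

df = pd.DataFrame(result)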
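If you also want `ts` as a real timestamp column rather than a string, a variant along these lines might help. It is only a sketch: the `iter_records` helper and the `FIELDS` tuple are names I made up for illustration, the sample data is invented, and it assumes every record contains exactly the four keys shown in the question. An in-memory `io.StringIO` stands in for the file so the snippet runs on its own.

```python
import io

import pandas as pd

# Assumed field names, taken from the sample in the question.
FIELDS = ("ts", "from", "to", "body")


def iter_records(lines, fields=FIELDS):
    """Yield one dict for each complete group of `fields` (hypothetical helper)."""
    batch = {}
    for line in lines:
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only, so colons inside the value are kept.
        key, value = line.split(":", maxsplit=1)
        batch[key.strip('" ')] = value.strip('" ')
        if all(name in batch for name in fields):
            yield batch
            batch = {}


# Invented sample data standing in for open("data.txt").
sample = io.StringIO(
    '"ts": "2021-01-29T00:06:46.929363"\n'
    '"from": "alice"\n'
    '"to": "bob"\n'
    '"body": "hello: world"\n'
    "\n"
    '"ts": "2021-01-29T00:07:10.000001"\n'
    '"from": "bob"\n'
    '"to": "alice"\n'
    '"body": "hi"\n'
)

df = pd.DataFrame(list(iter_records(sample)), columns=list(FIELDS))
df["ts"] = pd.to_datetime(df["ts"])  # convert ISO 8601 strings to datetime64
```

With a real file you would pass the open file object instead of `sample`. Because the helper is a generator, only one record is held in the dict at a time while reading, although the final list of records still has to fit in memory.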