How to split data from a text file and create separate columns at a specific position

Question

Here is some sample data:

ao112   qwertyuiopasdfgh
ao12234 isbcbcobcwocbwowd
ao12    lscnldcnoodqhiod

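I want to create a dataframe from a .txt file. In this example the dataframe needs two separate columns, laid out as follows:

col name    | position
code        | 1-7
blank       | 8
description | 9-end

I need to create the dataframe columns by splitting at a specific position, which in the sample is where the blank falls.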

I tried this code but could not work out which argument I should use for sep:

data=pd.read_csv('filepath',sep=' ',name=['code','description'])

Answer 1

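You can use pandas.read_fwf, which reads fixed-width formatted lines.

You can pass colspecs='infer' to have the format inferred, or provide the full list of column extents. From the documentation: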

colspecs : list of tuple (int, int) or ‘infer’, optional

A list of tuples giving the extents of the fixed-width fields of each line as half-open intervals (i.e., [from, to[ ). String value ‘infer’ can be used to instruct the parser to try detecting the column specifications from the first 100 rows of the data which are not being skipped via skiprows (default=’infer’).

(pd.read_fwf('filepath', colspecs='infer', header=None)
   .set_axis(['code','description'], axis=1)
)
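If inference is not reliable enough, the extents can be spelled out. The following is a minimal sketch that assumes the layout from the question (code in characters 1-7, a blank at 8, description from 9 to the end of the line). It reads the sample from an in-memory string instead of a file so that it runs on its own; the variable names are only illustrative.

```python
import io

import pandas as pd

# The three sample rows from the question, held in memory for the demo.
sample = (
    "ao112   qwertyuiopasdfgh\n"
    "ao12234 isbcbcobcwocbwowd\n"
    "ao12    lscnldcnoodqhiod\n"
)

# colspecs are 0-based, half-open intervals:
# (0, 7) covers characters 1-7, (8, None) runs from character 9 to the end.
df = pd.read_fwf(
    io.StringIO(sample),
    colspecs=[(0, 7), (8, None)],
    names=["code", "description"],
    header=None,
)
print(df)
```

With a real file, replace io.StringIO(sample) with the file path. Padding spaces around each field are stripped by default.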

Alternatively, use a regular-expression separator with pandas.read_csv (if your delimiter is one or more spaces):

pd.read_csv('filepath', sep=r'\s+', names=['code','description'])

Output:

      code        description
0    ao112   qwertyuiopasdfgh
1  ao12234  isbcbcobcwocbwowd
2     ao12   lscnldcnoodqhiod
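One caveat with a whitespace separator: it splits on every run of whitespace, so a description that itself contains spaces would be broken into extra fields. The sample data has no such row, so the sketch below uses a hypothetical one to show a plain-Python workaround that splits each line only once, at the first run of whitespace. It assumes the code itself never contains a space.

```python
import io

import pandas as pd

# Hypothetical sample: the first description contains a space.
sample = (
    "ao112   qwertyuiop asdfgh\n"
    "ao12234 isbcbcobcwocbwowd\n"
)

# maxsplit=1 splits at the first whitespace run only,
# so the rest of the line stays together as the description.
rows = [line.split(maxsplit=1) for line in io.StringIO(sample).read().splitlines()]
df = pd.DataFrame(rows, columns=["code", "description"])
print(df)
```

When the columns are truly fixed-width, read_fwf with explicit colspecs handles the same case without manual splitting.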
