Creating an empty dataframe with 50 columns with only 5 specific columns filled

Question

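I have a pandas dataframe A with 5 columns and several hundred thousand rows. I need to create a dataframe B with 50 columns, of which 45 are empty and the other 5 are filled with the data from dataframe A.

The reason I need this format is that I eventually want to convert it to a CSV file with a comma (,) delimiter in which most columns are empty.

My dataframe A looks like this: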

id	order	first	last	type
1	111	Johnny	Depp	type1
2	222	Amber	Heard	type2

My dataframe B should look like this, with more empty columns at the end:

x	order	first	last	x	x	x	x	x	x	x	type	x	x	x	x
empty	111	Johnny	Depp	empty	empty	empty	empty	empty	empty	empty	type1	empty	empty	empty	empty
empty	222	Amber	Heard	empty	empty	empty	empty	empty	empty	empty	type2	empty	empty	empty	empty

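As you can see, I need to specify the position of the type column. This is because I eventually want to convert to CSV with to_csv(sep=',') so that it ends up looking like this: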

,111,Johnny,Depp,,,,,,,,,type1,,,,,
,222,Amber,Heard,,,,,,,,,type2,,,,,

Answer 1

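OK, so I'm assuming the first 5 columns of dataframe B are already filled with the data you need.

You can then add as many blank columns as you like in a loop:

i = len(df.columns)  # however many columns the df started with (5 here)

while i < 50:  # total number of columns you want to end up with
    df[f'column_{i}'] = ''  # new column filled with empty strings
    i += 1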
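For anyone who wants to try this, here is a minimal self-contained sketch of the loop applied to the sample data from the question. Taking the starting counter from len(df.columns) is my assumption about the intent, so treat it as an illustration rather than the definitive version.

```python
import pandas as pd

# Sample data copied from the question (dataframe A).
df = pd.DataFrame({
    "id": [1, 2],
    "order": [111, 222],
    "first": ["Johnny", "Amber"],
    "last": ["Depp", "Heard"],
    "type": ["type1", "type2"],
})

total_columns = 50  # target width taken from the question

# Start counting from the number of columns already present.
i = len(df.columns)
while i < total_columns:
    df[f"column_{i}"] = ""  # append a column of empty strings
    i += 1

print(df.shape)
```

Note that this only appends blank columns to the right of the existing ones, so it does not by itself move type to a specific position.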

Answer 2

import pandas as pd

a = pd.DataFrame({"id": [1, 2], "order": [111, 222], "first": ["Johnny", "Amber"], "last": ["Depp", "Heard"], "type": ["type1", "type2"]})
push = ["x", "order", "first", "last"] + list("x" * 7) + ["type"] + list("x" * 4)
cols = [f"x{num}" if value == "x" else value for num, value in enumerate(push)]
b = pd.DataFrame({col: a[col] if col in a.columns.to_list() else None for col in cols})
print(b)

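This seems like a fairly arbitrary question, but I think this solves your specific requirement. Feel free to change the "x" * 7 values to suit your needs. If you import numpy as np, you can also replace None with np.nan. Or you can replace None with "" to insert empty strings. Your question is a bit vague about what "empty" means.

Output: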

     x0  order   first   last    x4    x5    x6    x7    x8    x9   x10   type   x12   x13   x14   x15
0  None    111  Johnny   Depp  None  None  None  None  None  None  None  type1  None  None  None  None
1  None    222   Amber  Heard  None  None  None  None  None  None  None  type2  None  None  None  None
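To check what this gives once written out, here is a rough sketch that rebuilds b from the sample data and converts it to CSV text. The header=False and index=False arguments are my assumption, based on the desired output in the question showing neither a header row nor an index. By default to_csv writes missing values (None or NaN) as empty fields.

```python
import pandas as pd

# Sample data copied from the question (dataframe A).
a = pd.DataFrame({
    "id": [1, 2],
    "order": [111, 222],
    "first": ["Johnny", "Amber"],
    "last": ["Depp", "Heard"],
    "type": ["type1", "type2"],
})

# "x" marks an empty placeholder column; real names are pulled from a.
layout = ["x", "order", "first", "last"] + ["x"] * 7 + ["type"] + ["x"] * 4
cols = [f"x{i}" if name == "x" else name for i, name in enumerate(layout)]
b = pd.DataFrame({col: a[col] if col in a.columns else None for col in cols})

# With no path given, to_csv returns the CSV as a string.
csv_text = b.to_csv(sep=",", header=False, index=False)
print(csv_text)
```

Passing a file path as the first argument to to_csv writes the same content to disk instead of returning it.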

Tags: python, dataframe, pandas