How to get this single column data into data frame with appropriate columns

Question

I am a beginner learning pandas and data science. I have data like this:

Rahul
1
2
5
Suresh
4
2
1
Dharm
1
3
4

I want it in my DataFrame as:

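How can I achieve this without iterating over every row? I have hundreds of thousands of records. I searched a lot but could not find anything other than iteration. Is there a better way?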

Thank you for your kindness and patience.

Answer 1

How best to format this depends on what you intend to do with it, but a good starting point is the following.

Given:

Rahul
1
2
5
Suresh
4
2
1
Dharm
1
3
4

Doing:

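import numpy as np
import pandas as pd

# Read in the file (filepath is the path to your data file) and call the column 'values'.
# dtype=str keeps every line as text so the .str accessor works below:
df = pd.read_table(filepath, header=None, names=['values'], dtype=str)

# Create a new column with names filled in: blank out the purely numeric rows, then forward-fill:
df['names'] = df['values'].replace(r'^\d+$', np.nan, regex=True).ffill()

# Drop the extra rows (the ones that only held a name):
df = df[df['values'].str.isnumeric()].reset_index(drop=True)

print(df[['names', 'values']])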

Output:

    names values
0   Rahul      1
1   Rahul      2
2   Rahul      5
3  Suresh      4
4  Suresh      2
5  Suresh      1
6   Dharm      1
7   Dharm      3
8   Dharm      4
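
Since the best layout depends on what you want to do next, here is a self-contained sketch of the same forward-fill idea that also converts the values to integers and, optionally, pivots to one column per name. It is an illustration under assumptions, not the only way to do it: the inline `raw` text stands in for your file, `pd.to_numeric` replaces the regex as the test for "is this line a number", and the names `long_df`, `wide_df` and `position` are made up for this example.

```python
import io

import pandas as pd

# Inline stand-in for the file (assumption); with real data, pass the file path instead.
raw = io.StringIO("Rahul\n1\n2\n5\nSuresh\n4\n2\n1\nDharm\n1\n3\n4\n")

# dtype=str keeps every line as text so names and numbers can share one column.
s = pd.read_csv(raw, header=None, names=["raw"], dtype=str)["raw"]

# Lines that parse as numbers keep their value; name lines become NaN.
numbers = pd.to_numeric(s, errors="coerce")
is_name = numbers.isna()

# Keep only the name lines, then carry each name down over the numbers below it.
names = s.where(is_name).ffill()

# Long layout: one row per (name, value) pair, with integer values.
long_df = pd.DataFrame(
    {"names": names[~is_name], "values": numbers[~is_name].astype(int)}
).reset_index(drop=True)

# Optional wide layout: one column per name, one row per position within the group.
position = long_df.groupby("names").cumcount()
wide_df = long_df.assign(position=position).pivot(
    index="position", columns="names", values="values"
)

print(long_df)
print(wide_df)
```

Everything here is vectorized, so no Python-level loop over rows is involved. Note that this sketch assumes no name is itself a number; a name that parses as a number would be treated as a value.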

python

dataframe

pandas

data-science