Duplicate column A element if designated column is not null

Question

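My DataFrame has four columns. Column A holds names, and columns B, C, and D hold the language codes assigned to the names in column A. I want to merge columns B, C, and D into a single column, with the corresponding name in the adjacent column. The example DataFrames below should illustrate the operation more clearly. Can anyone help me with this? Thanks for your help!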

Current df

Name     Language 1     Language 2     Language 3
one         en             NaN            NaN
two         ko             ja             zh-CN
three       fr             de             NaN
four        nl             ml             NaN
five        kh             NaN            NaN
six         hi             en             es

I think this would be a wide-to-long operation of some sort.

Desired output

Name     Language
one         en
two         ko
two         ja
two       zh-CN
three       fr
three       de
four        nl
four        ml
five        kh
six         hi
six         en
six         es

Thanks again!

Answer 1

Set the Name column as the index, then stack the remaining columns, which are the languages, into one. This produces a Series with an extra index level (the original column names), with all the null values excluded. The extra index level is not relevant, so drop it with droplevel. Finally, call reset_index to get a DataFrame back, passing "Language" as the name argument so the values column gets that name.

df.set_index("Name").stack().droplevel(-1).reset_index(name="Language")

    Name    Language
0   one      en
1   two      ko
2   two      ja
3   two      zh-CN
4   three    fr
5   three    de
6   four     nl
7   four     ml
8   five     kh
9   six      hi
10  six      en
11  six      es
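
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the question's sample data, and the variable name long_df is only illustrative. One assumption worth flagging: whether stack drops nulls by default depends on the pandas version, so an explicit dropna() is added after stack. It does nothing when the nulls are already gone and keeps the result the same when they are not.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (NaN marks a missing language).
df = pd.DataFrame({
    "Name": ["one", "two", "three", "four", "five", "six"],
    "Language 1": ["en", "ko", "fr", "nl", "kh", "hi"],
    "Language 2": [np.nan, "ja", "de", "ml", np.nan, "en"],
    "Language 3": [np.nan, "zh-CN", np.nan, np.nan, np.nan, "es"],
})

long_df = (
    df.set_index("Name")               # Name becomes the index
      .stack()                         # language columns -> one Series, row by row
      .dropna()                        # explicit, in case stack kept the nulls
      .droplevel(-1)                   # drop the "Language 1/2/3" index level
      .reset_index(name="Language")    # back to a DataFrame with a named value column
)

print(long_df)
```

Because stack works row by row, the languages for each name stay together and in their original column order.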

python

merge

concat

pandas