Split text into individual rows in Python

Question

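I am struggling to convert an array into individual tokens. I am currently using the code below, but it does not give me the exact output I want, because I want the numbers to be part of it as well.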

text = df.head(3)[['processed_arti', 'cluster']].values  # where df is a pandas DataFrame

terms = [b for l in text for b in zip (l[0].split(" "))]

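I have added another picture below that shows in more detail what the data looks like once it is read into a pandas DataFrame.

I would really appreciate any help with this. Thanks in advance.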

Answer 1

Isn't this what you need? You just have to put the number next to each of your words:

terms = [(b, n) for l, n in text for b in l.split(" ")]
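
To make the comprehension above easier to try out, here is a minimal sketch. The sample rows are hypothetical and only stand in for what `df[['processed_arti', 'cluster']].values` would yield; the variable names `article`, `cluster` and `word` are also just illustrative.

```python
# Hypothetical rows standing in for df[['processed_arti', 'cluster']].values;
# iterating that array yields [article_text, cluster] pairs like these.
text = [
    ["trump wins vote", 4],
    ["market falls 3 percent", 7],
]

# Unpack each row into (article, cluster), split the article on spaces,
# and pair every resulting word with the cluster number of its row.
terms = [(word, cluster) for article, cluster in text for word in article.split(" ")]

print(terms)
```

Note that a digit inside the text (the "3" here) stays a string token, while the cluster number keeps its original type.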

Answer 2

First, build a list of lists containing your tuples:

[[(word, l[1]) for word in l[0].split(' ')] for l in a]  # a being your array.

Then flatten the list of lists: see "How to make a flat list out of list of lists?"
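
The two steps can be sketched as follows. This is only one way to flatten, using `itertools.chain.from_iterable` from the standard library; the sample rows in `a` are hypothetical.

```python
from itertools import chain

# Hypothetical rows standing in for the array `a`: [text, cluster] pairs.
a = [
    ["stocks fell 3 percent", 7],
    ["rain expected", 2],
]

# Step 1: one inner list of (word, cluster) tuples per row.
nested = [[(word, row[1]) for word in row[0].split(" ")] for row in a]

# Step 2: flatten the list of lists into a single list of tuples.
flat = list(chain.from_iterable(nested))

print(flat)
```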

Or better, as Yevhen Kuzmovych suggested:

[(word, l[1]) for l in a for word in l[0].split(' ')]

Note: not verified. Typed on my phone.
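
If the goal is one row per token inside the DataFrame itself rather than a Python list of tuples, a pandas-native route might also work. This is not what either answer proposes. It is a sketch that assumes pandas 0.25 or later (for `DataFrame.explode`), uses hypothetical sample data, and introduces a new column named `token` purely for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "processed_arti": ["a b", "c d 3"],
    "cluster": [1, 2],
})

tokens = (
    # Split each article into a list of words, stored in a new "token" column.
    df.assign(token=df["processed_arti"].str.split(" "))
      # Expand each list so that every word gets its own row.
      .explode("token")[["token", "cluster"]]
      .reset_index(drop=True)
)

print(tokens)
```

Each token ends up on its own row next to the cluster number of the article it came from.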

python

loops

token