How to add another column to my dataframe, that is a count of my other column "tags"

Question

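I'm new to Python and I'm trying to count the number of tags in a string.

I found advice to count the commas and then add 1, which makes sense. What doesn't make sense to me is how to turn that into a column that is computed for every row.
My dataframe is called data and is laid out like this: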

product_id  sku       total_sold  tags           total_images 
grgeggre    rgerg     456         Up1_, Up2      5

I would like it to look like this:

product_id  sku       total_sold  tags           total_images  total tags
grgeggre    rgerg     456         Up1_, Up2      5             2

I tried:

tgs = data['tags']
tgsc = tgs.count("," in data["tags"] + str(1))
print(tgsc)

This doesn't work. Any ideas?

Answer 1

I think a simple lambda function with apply should do the trick:

data["total_tags"] = data["tags"].apply(lambda x : len(x.split(',')))

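Explanation, quoting the documentation for DataFrame.apply() (data["tags"] is a single column, i.e. a Series, so the call above is actually Series.apply(), which invokes the function once per value):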

Apply a function along an axis of the DataFrame.
Objects passed to the function are Series objects whose index is either the DataFrame’s index (axis=0) or the DataFrame’s columns (axis=1). By default (result_type=None), the final return type is inferred from the return type of the applied function. Otherwise, it depends on the result_type argument.

See the pandas documentation.

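So we apply a function (the lambda) to every row of the dataframe's "tags" column.
Here the lambda is an anonymous function with x as its input argument and len(x.split(',')) as its body, so that expression is evaluated for each value in the "tags" column.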
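To make the lambda concrete, here is a minimal sketch that rebuilds the one-row sample from the question and uses a named function in place of the lambda. The name count_tags is hypothetical and only for illustration; the column values are copied from the question.

```python
import pandas as pd

# One-row sample copied from the question.
data = pd.DataFrame({
    "product_id": ["grgeggre"],
    "sku": ["rgerg"],
    "total_sold": [456],
    "tags": ["Up1_, Up2"],
    "total_images": [5],
})


def count_tags(x):
    """Hypothetical named equivalent of the lambda; x is one cell of "tags"."""
    return len(x.split(","))


# Series.apply calls count_tags once for every value in the column.
data["total_tags"] = data["tags"].apply(count_tags)
print(data[["tags", "total_tags"]])
```

Writing the function out by name changes nothing about the result; it is just easier to step through or extend than the one-line lambda.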
For split(), see the str.split documentation: it splits the string at the given separator and returns a list. The length of that list is the number of comma-separated tags.
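As a rough illustration of what split() returns, and of the "count the commas, then add 1" idea from the question, the sketch below compares both approaches on a few made-up tag strings (the Series values are assumptions, not data from the question). It uses the pandas string method Series.str.count, which is a different technique from the apply call in this answer.

```python
import pandas as pd

# str.split returns a plain Python list of the pieces between separators.
parts = "Up1_, Up2".split(",")
print(parts)  # ['Up1_', ' Up2']

# Made-up tag strings, used only for this comparison.
tags = pd.Series(["Up1_, Up2", "Up1_", "a,b,c"])

# Approach from this answer: length of the split list.
via_split = tags.apply(lambda x: len(x.split(",")))

# Approach hinted at in the question: number of commas plus one.
via_count = tags.str.count(",") + 1

print(via_split.tolist(), via_count.tolist())
```

Both give the same numbers for non-empty strings. Note that an empty string would still be counted as one tag by either approach, and a missing value (NaN) would make the lambda fail, so real data may need cleaning first.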

Hope this explanation helps.

python

count

calculated-columns