Slugify a column and parse a CSV into a new CSV output in Python Pandas
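I am new to Python and Pandas.
I am not sure what I am doing wrong in my code. I just want to convert the product name values in a CSV column to their slug values and write them to a new output CSV.
The input is product-feed.csv: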
product_name
V-Neck T-Shirt
Hoodie with Logo
Long Sleeve T-Shirt
Hoodie with Pocket
Hoodie with Zipper
Long Sleeve Tee
Polo Neck Tee
V-Neck T-Shirt - Red
V-Neck T-Shirt - Green
V-Neck T-Shirt - Blue
When I run the .py file in the VS Code terminal, the expected output (slugged-output.csv) should look like this:
product_name
v-neck-t-shirt
hoodie-with-logo
long-sleeve-t-shirt
hoodie-with-pocket
hoodie-with-zipper
long-sleeve-tee
polo-neck-tee
v-neck-t-shirt-red
v-neck-t-shirt-green
v-neck-t-shirt-blue
parse_code.py looks like this. Note: I am using the https://pypi.org/project/python-slugify/ module to convert the values to slugs:
import pandas as pd
from slugify import slugify
df = pd.read_csv("product-feed.csv", dtype="str")
df["product_name"] = slugify(str(df["product_name"]))
# VIEW TO DEBUG ONLY
print(df["product_name"])
df["product_name"].to_csv(path_or_buf="slugged-output.csv", index=False, sep=";", quoting=1, encoding="UTF-8")
Problem: my output looks like this:
slugged-output.csv (console print)
"product_name"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
"0-v-neck-t-shirt-1-hoodie-with-logo-2-long-sleeve-t-shirt-3-hoodie-with-pocket-4-hoodie-with-zipper-5-long-sleeve-tee-6-polo-neck-tee-7-v-neck-t-shirt-red-8-v-neck-t-shirt-green-9-v-neck-t-shirt-blue-name-product-name-dtype-object"
Please tell me what I am missing in my code... Thanks :)
After some more searching online, I came across this page, and this one line of code solved everything!
df["product_name"] = df["product_name"].fillna('').apply(lambda x: slugify(x))
In my original code, I removed the following line and replaced it with the line above:
df["product_name"] = slugify(str(df["product_name"]))
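For anyone wondering why the original line produced one long slug repeated on every row: str() applied to a whole Series returns its printed representation (index labels, values, the Name and dtype footer), slugify then turns that text into a single slug, and assigning a single string to a column copies it to every row. Below is a minimal sketch of both behaviours. The small in-memory DataFrame stands in for product-feed.csv, and the fallback slugify is only a hypothetical stand-in for the case where python-slugify is not installed; neither is part of the original code.

```python
import re

import pandas as pd

try:
    from slugify import slugify  # python-slugify, as used in the question
except ImportError:
    def slugify(text):
        # Hypothetical, much simpler stand-in so the sketch still runs
        # without python-slugify; it only handles plain ASCII text.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

# Assumed sample data standing in for product-feed.csv (one missing value).
df = pd.DataFrame(
    {"product_name": ["V-Neck T-Shirt", "Hoodie with Logo", "V-Neck T-Shirt - Red", None]}
)

# What the original line did: str() renders the whole Series as one block of
# text (index, values, footer), and slugify turns that block into ONE slug.
broken = slugify(str(df["product_name"]))

# Assigning that single string to a column repeats it on every row.
df["broken"] = broken

# The fix: apply slugify to each value separately. fillna("") replaces
# missing values with an empty string so slugify always receives a str.
df["fixed"] = df["product_name"].fillna("").apply(slugify)

print(df["fixed"].tolist())
```

Passing slugify directly to apply is equivalent to the lambda x: slugify(x) form in the line above; the lambda is only needed when extra arguments have to be passed along.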