Python: What is the best way to iterate through a dataset to pass the values into a dict?
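I have a function that sends data to my ad server through the Google DFP API. The function works when the data is hard-coded into my variables (order_id, targeted_placement_id, etc.).
My data comes from 'ad_data.csv', where each column is a key and the data in the associated rows are the values. I want to iterate through this dataset and pass the values from each row of the csv file into the correct values in the line_item dict. Below is my pandas DataFrame.head():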
     order_id  targeted_placement_id       campaign
0  3494982232             5555666677  Ad Campaign 1
1  8494984434             1112666177  Ad Campaign 2
3  4494922232             0992666677  Ad Campaign 3
4  1494984234             9494939499  Ad Campaign 4
However, in the for loop, I want to pass in each row of 'ad_data.csv':
from googleads import dfp
import pandas as pd

df = pd.read_csv('ad_data.csv')
order_id = df['order'].tolist()
targeted_placement_id = df['placement_id'].tolist()
campaign_name = df['campaign'].tolist()


def main(client, order_id, targeted_placement_ids, campaign_name):
    line_item_service = client.GetService('LineItemService')

    # Create line item objects.
    line_items = []
    for _ in range(1):
        line_item = {
            'orderId': order_id,
            'name': campaign_name,
            'targeting': {
                'inventoryTargeting':
                    {'targetedPlacementIds': targeted_placement_ids},
            }
        }
        line_items.append(line_item)
    line_items = line_item_service.createLineItems(line_items)

    for line_item in line_items:
        print('Target id "%s", in order id "%s", named"%s" was created'
              % (line_item['targetedPlacementId'], line_item['orderId'], line_item['name']))


if __name__ == '__main__':
    dfp_client = dfp.DfpClient.LoadFromStorage()
    main(dfp_client, order_id, targeted_placement_id, campaign_name)
If done correctly, line_item should print:
Target id 5555666677 in order id 3494982232, named Ad Campaign 1 was created
Target id 1112666177 in order id 8494984434, named Ad Campaign 2 was created
Target id 0992666677 in order id 4494922232, named Ad Campaign 3 was created
Target id 9494939499 in order id 1494984234, named Ad Campaign 4 was created
What is the best way to accomplish this task?
If you want to work with .csv and .json files, you should use the pandas library.
To read the file you can use read_csv(). It returns a pandas DataFrame object that you can manipulate, and if you then want to save it as a .csv file, just use to_csv().
You can also use tolist() to convert a Series into a Python list.
For example:
import pandas
DF = pandas.read_csv('filename.csv')
orders = DF['Orders'].tolist()
orders is a Python list whose values come from the column named Orders in your .csv file.
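The read / convert / write round trip described above can be sketched as follows. The CSV text and the in-memory buffer are assumptions made for illustration so the example runs without a real file; they are not the asker's data.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for 'filename.csv'.
csv_text = "Orders,campaign\n101,Ad Campaign 1\n102,Ad Campaign 2\n"

# Parse the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Convert one column (a Series) into a plain Python list.
orders = df['Orders'].tolist()
print(orders)

# With no path argument, to_csv returns the CSV as a string.
# index=False leaves the row index out of the output.
out = df.to_csv(index=False)
print(out)
```

With a real file you would pass the path to read_csv and to_csv instead of using the buffer.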
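Edit:
As discussed in the comments, you should figure out which tool best fits your problem. However, if you plan to work with large datasets, I recommend reading about pandas memory usage in the docs.
Interesting article: reducing memory usage with pandas for large datasets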
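As a rough sketch of what checking memory usage might look like, the snippet below uses DataFrame.memory_usage and pd.to_numeric with downcast. The sample frames and the 'clicks' column are hypothetical, and whether downcasting helps depends on your actual data.

```python
import pandas as pd

# Hypothetical sample mirroring two columns from the question.
df = pd.DataFrame({
    'order_id': [3494982232, 8494984434],
    'campaign': ['Ad Campaign 1', 'Ad Campaign 2'],
})

# Bytes used per column; deep=True also counts the Python string objects.
usage = df.memory_usage(deep=True)
print(usage)

# Downcasting is only safe when every value fits the smaller type.
# 'clicks' is a made-up column with small values to show the effect.
small = pd.DataFrame({'clicks': [1, 2, 3]})
small['clicks'] = pd.to_numeric(small['clicks'], downcast='unsigned')
print(small['clicks'].dtype)
```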
Edit 2:
To get each column of the DataFrame as a list, you should do this:
orders = DF['order_id'].tolist()
targets = DF['targeted_placement_id'].tolist()
campaigns = DF['campaign'].tolist()
# print(orders, targets, campaigns)
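The ValueError you are getting occurs because you are passing these whole lists as the values of the dictionary keys orderId, name, and targetedPlacementIds. One way to iterate over these lists is to use enumerate(orders), which returns an index and the order_id at each position.
For example: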
0 3494982232
1 8494984434
2 4494922232
Then, to get the campaigns and targets for each order, you just index those lists with the order's index, so your loop will look like this:
# Create line item objects.
line_items = []
for index, order in enumerate(orders):
    line_item = {
        'orderId': order,
        'name': campaigns[index],
        'targeting': {
            'inventoryTargeting': {
                'targetedPlacementIds': targets[index]
            }
        }
    }
    line_items.append(line_item)
print(line_items)
In the end, your line_items will be a list in which each position is a dictionary.
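An alternative to indexing three parallel lists is to walk the DataFrame rows directly. The sketch below uses DataFrame.itertuples, which is not what the answer above uses. The sample rows are hypothetical copies of the values shown in the question, and no DFP call is made.

```python
import pandas as pd

# Hypothetical rows mirroring the question's DataFrame.head().
df = pd.DataFrame({
    'order_id': [3494982232, 8494984434],
    'targeted_placement_id': [5555666677, 1112666177],
    'campaign': ['Ad Campaign 1', 'Ad Campaign 2'],
})

line_items = []
# itertuples yields one namedtuple per row, so no index bookkeeping is needed.
for row in df.itertuples(index=False):
    line_items.append({
        'orderId': row.order_id,
        'name': row.campaign,
        'targeting': {
            'inventoryTargeting': {
                'targetedPlacementIds': row.targeted_placement_id,
            }
        },
    })

print(line_items)
```

This keeps each row's values together, which avoids mismatches if one of the lists were ever filtered or reordered separately.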
PS:
Your print loop has an error: instead of line_item['targetedPlacementId'], it should be line_item['targeting']['inventoryTargeting']['targetedPlacementIds'].
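A minimal sketch of the corrected print loop, run against a hand-built list instead of an API response. It assumes the objects returned by createLineItems can be indexed with the same nested keys as the dicts that were sent, which the source does not confirm.

```python
# Hypothetical line item shaped like the dicts built in the loop above;
# no API call is made here.
line_items = [
    {
        'orderId': 3494982232,
        'name': 'Ad Campaign 1',
        'targeting': {'inventoryTargeting': {'targetedPlacementIds': 5555666677}},
    },
]

messages = []
for line_item in line_items:
    # Follow the nested keys down to the placement id.
    placement = line_item['targeting']['inventoryTargeting']['targetedPlacementIds']
    messages.append('Target id "%s", in order id "%s", named "%s" was created'
                    % (placement, line_item['orderId'], line_item['name']))

print('\n'.join(messages))
```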
You can also check whether your DataFrame has null values:
if DF.isnull().values.any():
    raise Exception('Null values')
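If the check fails, it can help to see which columns hold the gaps. The sketch below counts nulls per column and shows one possible way to handle them, dropping incomplete rows. The sample data is hypothetical, and dropping rows is only one option rather than a recommendation from the answer.

```python
import pandas as pd

# Hypothetical data with one missing campaign name.
df = pd.DataFrame({
    'order_id': [3494982232, 8494984434],
    'campaign': ['Ad Campaign 1', None],
})

# Count missing values per column to see where the gaps are.
null_counts = df.isnull().sum()
print(null_counts)

# One option: drop incomplete rows instead of raising an exception.
clean = df.dropna()
print(len(clean))
```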