How to convert values in a table using petl
I'm running into a strange issue when converting multiple values in a table.
I have a table whose data looks like this:
+-------------+--------------+-------------+
| id | name | category_id |
+=============+==============+=============+
| 1 | Horse | 5 |
+-------------+--------------+-------------+
| 2 | Cow | 5 |
+-------------+--------------+-------------+
| 3 | Pig | 2 |
+-------------+--------------+-------------+
| 4 | Chicken | 3 |
+-------------+--------------+-------------+
Then I collect the category_id values like this:
# find the item category id
items_cat_id = etl.values(table, 'category_id')
Then I loop over those values so that I can convert each category_id above to the target category ID:
for item in items_cat_id:
    """
    fetch target mongo collection.
    source_name is the category name to look up in the target,
    if we get a match convert the table category_id value to
    the mongo id.
    """
    target_category_id = target_db.collection.find_one({ 'name': source_name })
    converted_table = etl.convert(table, 'category_id',
                                  lambda _: target_category_id.get('_id'),
                                  where=lambda x: x.category_id == item)
But I only seem to get this:
+-------------+--------------+-----------------------------+
| id | name | category_id |
+=============+==============+=============================+
| 1 | Horse | 5 |
+-------------+--------------+-----------------------------+
| 2 | Cow | 5 |
+-------------+--------------+-----------------------------+
| 3 | Pig | 2 |
+-------------+--------------+-----------------------------+
| 4 | Chicken | QnicP3f4njL54HRqu |
+-------------+--------------+-----------------------------+
when it should be:
+-------------+--------------+-----------------------------+
| id | name | category_id |
+=============+==============+=============================+
| 1 | Horse | 5 |
+-------------+--------------+-----------------------------+
| 2 | Cow | 5 |
+-------------+--------------+-----------------------------+
| 3 | Pig | yrDku5Yqkc2MKZZkD |
+-------------+--------------+-----------------------------+
| 4 | Chicken | QnicP3f4njL54HRqu |
+-------------+--------------+-----------------------------+
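Any suggestions?

etl.convert() creates a new table from the same, unchanged source table on every iteration over items_cat_id. Each result therefore contains only that one iteration's change, and converted_table ends up holding just the last one. Change your code like this: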
for x in y:
    table = etl.convert(table, ...)
Now each iteration works on the result of the previous one.