Python DataFrame chunk extraction issue
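I want to split a DataFrame into chunks (for example, 100 rows split into 20 chunks of 5 rows each), and for each chunk I need to run 5 UPDATE queries (against 5 different tables) using the data in that chunk.
I'm new to this and learning on the job. How can I accomplish this, and could you recommend an approach?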
for item in np.array_split(df1, 10):
    print(item)  # I was able to divide into chunks
    for i, j in item.iterrows():
        print(item.iloc[i]['ColumnName'])
My idea is to add the update query lines after this print statement.
However, this code raises an exception:
Traceback (most recent call last):
  File "/Users/gd/Documents/myproj/test.py", line 63, in <module>
    func()
  File "/Users/gd/Documents/myproj/test.py", line 45, in dedupe_pe
    print(item.iloc[i]['ColumnName'])
  File "/Users/gd/Documents/myproj/lib/python3.9/site-packages/pandas/core/indexing.py", line 931, in __getitem__
    return self._getitem_axis(maybe_callable, axis=axis)
  File "/Users/gd/Documents/myproj/lib/python3.9/site-packages/pandas/core/indexing.py", line 1566, in _getitem_axis
    self._validate_integer(key, axis)
  File "/Users/gd/Documents/myproj/lib/python3.9/site-packages/pandas/core/indexing.py", line 1500, in _validate_integer
    raise IndexError("single positional indexer is out-of-bounds")
IndexError: single positional indexer is out-of-bounds
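item.iterrows() yields each row's index label together with the row itself. Every chunk keeps the index labels of the original DataFrame, so with a default integer index the labels in the second and later chunks are larger than the chunk's length. Passing such a label to item.iloc, which expects a position within the chunk, is what raises the out-of-bounds error. Use the row object directly instead, for example: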
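for item in np.array_split(df1, 10):
    item = item.copy()  # work on a copy so adding a column does not trigger SettingWithCopyWarning
    print(item)  # I was able to divide into chunks
    # "WHERE condition" is a placeholder; replace it with the real filter
    item["sql"] = "UPDATE " + item["table_name"] + " SET column1 = '" + item["ColumnName_DATA"] + "' WHERE condition"
    for i, j in item.iterrows():
        # j is the row itself, so no second lookup through iloc is needed
        print(j['ColumnName'])
        print(j['sql'])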
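If the queries are going to be executed rather than just printed, it may be safer to bind the values as parameters instead of concatenating them into the SQL string. The following is only a sketch under assumptions: it uses an in-memory sqlite3 database, and the tables t1 and t2, the id and row_id columns, and the helpers iter_chunks and apply_updates are all hypothetical names that do not come from the question. It also slices the frame by position with iloc, which is one alternative to np.array_split when a fixed chunk size is wanted.

```python
import sqlite3

import pandas as pd

# Hypothetical data: each row names the table to update, the target row id,
# and the new value. None of these names come from the original question.
df1 = pd.DataFrame({
    "table_name": ["t1", "t2", "t1", "t2", "t1", "t2"],
    "row_id": [1, 1, 2, 2, 3, 3],
    "ColumnName_DATA": ["a", "b", "c", "d", "e", "f"],
})

# Table names cannot be bound as SQL parameters, so check them against a whitelist.
ALLOWED_TABLES = {"t1", "t2"}


def iter_chunks(df, chunk_size):
    """Yield consecutive positional slices of df with at most chunk_size rows."""
    for start in range(0, len(df), chunk_size):
        yield df.iloc[start:start + chunk_size]


def apply_updates(conn, df, chunk_size):
    """Run one parameterized UPDATE per row, committing once per chunk."""
    updated = 0
    for chunk in iter_chunks(df, chunk_size):
        for row in chunk.itertuples(index=False):
            if row.table_name not in ALLOWED_TABLES:
                raise ValueError(f"unexpected table: {row.table_name}")
            cur = conn.execute(
                f"UPDATE {row.table_name} SET column1 = ? WHERE id = ?",
                (str(row.ColumnName_DATA), int(row.row_id)),
            )
            updated += cur.rowcount
        conn.commit()
    return updated


# Set up throwaway tables so the sketch can run end to end.
conn = sqlite3.connect(":memory:")
for table in sorted(ALLOWED_TABLES):
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, column1 TEXT)")
    conn.executemany(f"INSERT INTO {table} (id) VALUES (?)", [(1,), (2,), (3,)])

total_updated = apply_updates(conn, df1, chunk_size=2)
t1_values = [r[0] for r in conn.execute("SELECT column1 FROM t1 ORDER BY id")]
print(total_updated, t1_values)
```

Committing once per chunk keeps each transaction small. With a different database driver the placeholder syntax may differ (for example %s instead of ?), so check the driver's documentation before adapting this.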