Apply values to Dask dataframe mapping with function
In the Dask code below, I am trying to set the value of a dataframe column based on the logic in the function apply_masks:
import numpy as np
import pandas as pd
import dask.dataframe as daskDataFrame

def apply_masks(df):
    if df['Age'] > 14:
        df['outcol'] = 6
    else:
        df['outcol'] = 5
    return df

data = [[1, 100, 12, 6], [1, 200, 18, 5], [1, 170, 22, 4]]
df = pd.DataFrame(data, columns=['outcol', 'Weight', 'Age', 'Height'])
ddf = daskDataFrame.from_pandas(df, npartitions=100)
ddf = ddf.map_partitions(apply_masks)
print(ddf.compute())
The problem is that this raises an exception:
ValueError: Metadata inference failed in apply_masks.
You have supplied a custom function and Dask is unable to determine
the type of output that that function returns.
To resolve this please provide a meta= keyword. The docstring of the
Dask function you ran should have more information.
Original error is below:
------------------------
ValueError('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().')
How can I fix this?