Python apriori returning a generator instead of a DataFrame
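I'm writing code that takes a small portion of a dataset (shopping baskets) and converts it into a one-hot encoded DataFrame. I then want to run mlxtend's apriori algorithm on it to get the frequent itemsets.
However, whenever I run the apriori algorithm, it seems to finish instantly and returns a generator object rather than a DataFrame. I followed the instructions in the documentation, and in their example apriori returns a DataFrame. What am I doing wrong?
Here is my code: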
import numpy as np
import pandas as pd
import csv
from mlxtend.frequent_patterns import apriori
from mlxtend.frequent_patterns import association_rules
from mlxtend.preprocessing import TransactionEncoder
from apyori import apriori

def simpleRandomisedSample(filename, support_frac, sample_frac):
    df1 = pd.read_csv("%s.csv" % filename, header=None)  # Saving csv file into a dataframe in memory
    size = len(df1)
    support = support_frac * len(df1)  # Sets the original support value to x% of the original dataset
    sample_support = support * sample_frac  # Support for our reduced sample as a fraction of the original support
    sample = df1.sample(frac=sample_frac)  # Saving x% (randomised) of the dataset as our sample
    sample = sample.reset_index(drop=True)  # Resetting indexes (which previously got randomised along with the data)
    del df1  # Deleting original dataframe from memory to clear up space
    sample_size = len(sample)
    return size, support, sample_size, sample_support, sample

def main():
    size, support, sample_size, sample_support, sample = simpleRandomisedSample("chess", 0.01, 0.1)
    print("The original dataset had %d rows and a support of %.2f" % (size, support))
    print("The dataset was reduced to %d rows and the sample has a support of %.2f" % (sample_size, sample_support))
    sample_list = sample.values.tolist()  # Converting Dataframe to list of lists for use with Apriori
    te = TransactionEncoder()
    te_ary = te.fit(sample_list).transform(sample_list)  # Preprocessing our sample to work with Apriori algorithm
    df = pd.DataFrame(te_ary, columns=te.columns_)
    print(df)
    frequent_itemsets = apriori(df, min_support=0.6, use_colnames=True)
    print(frequent_itemsets)

if __name__ == "__main__":
    main()
There is a name clash in your imports:
from mlxtend.frequent_patterns import apriori
[...]
from apyori import apriori
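Your code is not using the mlxtend algorithm but the one provided by apyori: both imports bind the same name, apriori, so the later import overrides the earlier one.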
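One way to check which function a name is currently bound to is to look at its `__module__` attribute. The sketch below is only an illustration: it uses two standard-library functions that happen to share the name `join` as stand-ins for the two `apriori` functions (they are not part of the original code), so it runs without mlxtend or apyori installed.

```python
from os.path import join  # first binding of the name `join`
from shlex import join    # rebinds `join`; the os.path version is no longer reachable by this name

# __module__ reports where the function currently bound to the name was defined.
bound_module = join.__module__
print(bound_module)  # shlex

# The name now behaves like shlex.join: it takes a list of tokens
# and returns a single shell-escaped string.
quoted = join(["echo", "hello world"])
print(quoted)
```

In the question's script, adding `print(apriori.__module__)` right after the imports should, by the same logic, name apyori rather than mlxtend.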
You can remove the one you don't use or, if you want access to both later on, give one (or each) of them a different name:
from mlxtend.frequent_patterns import apriori as mlx_apriori
from apyori import apriori as apy_apriori
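The aliasing pattern can be tried without either library installed. As a rough sketch, the two standard-library `join` functions below stand in for the two `apriori` functions; the alias names `path_join` and `shell_join` and the file path are made up for this illustration.

```python
import os.path
import shlex
from os.path import join as path_join  # alias keeps the os.path version reachable
from shlex import join as shell_join   # alias keeps the shlex version reachable

# Both functions stay available because each has its own name.
csv_path = path_join("data", "chess.csv")
command = shell_join(["cat", csv_path])

print(csv_path)
print(command)
```

With the aliases from the answer, the call in the question would become `frequent_itemsets = mlx_apriori(df, min_support=0.6, use_colnames=True)`, which makes it explicit at the call site which implementation is meant.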