For-loop logistic regression in Python
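I want to build a for loop over a DataFrame, with the goal of creating a df that contains the accuracy score for each stock.
The model works fine for a single stock, but the for loop doesn't do anything. Below is the output of the df; this is not the complete df.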
Date        Close  ticker  rating  price  returns  direction  long direction
2021-02-06  21.8   AD.AS   1       21.8   -0.02    -1         1
2021-02-06  21.8   AD.AS   1       21.8   -0.02    -1         1
2021-02-06  21.8   APPL    1       153    -0.02    -1         1
2021-02-06  21.8   APPL    1       153    -0.02    -1         1
stock_df['ticker'].unique()
array(['CSCO', 'IBM', 'AMZN', 'AD.AS'], dtype=object)
We finally figured out how to do this; see the code we now use below.
# for loop test
import pandas as pd
import sklearn.model_selection
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

stock_df = stock_df.dropna()
preds_sell = {}
temp = {}
# loop over every ticker
for ticker in stock_df['ticker'].unique():
    # slice the rows that belong to this ticker
    stock_slice = stock_df[stock_df['ticker'] == ticker]
    X = stock_slice.drop(['long direction', 'BuyFlag', 'SellFlag', 'Date', 'ticker'], axis=1)
    y = stock_slice['SellFlag']
    # split data into training and test sets (shuffle=False keeps the row order)
    X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, test_size=0.25, random_state=5, shuffle=False)
    # creating a duplicate for back testing
    X_test_2 = X_test.copy()
    # build the logistic regression model on the training data
    # (penalty=None needs scikit-learn >= 1.2; older versions spell it penalty='none')
    model1 = LogisticRegression(random_state=0, multi_class='ovr', penalty=None, solver='newton-cg', class_weight={0: 0.6, 1: 0.4}).fit(X_train, y_train)
    preds_sell[ticker] = model1.predict(X_test)
    # accuracy statistics, stored per ticker so nothing is overwritten
    temp[ticker] = metrics.accuracy_score(y_test, preds_sell[ticker])
    print('Accuracy Score:', temp[ticker])
    # create classification report
    class_report = classification_report(y_test, preds_sell[ticker])
    print(class_report)
# after the loop: build one dataframe with all the results
final_df = pd.DataFrame.from_dict(temp, orient='index').reset_index()
final_df = final_df.rename(columns={'index': 'ticker', 0: 'Accuracy Score'})
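A note on the split used above: with shuffle=False, train_test_split does not randomize the rows, so the test set is simply the last 25% of each ticker's slice. That only amounts to a chronological split if the rows are already sorted by date, which is assumed here. A minimal sketch on made-up data (the arrays X and y are hypothetical stand-ins, not the asker's data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for one ticker's rows, assumed to be sorted by date.
X = np.arange(8).reshape(-1, 1)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])

# shuffle=False keeps the original order: first 75% train, last 25% test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, shuffle=False
)

train_rows = X_train.ravel().tolist()
test_rows = X_test.ravel().tolist()
print(train_rows)  # [0, 1, 2, 3, 4, 5]
print(test_rows)   # [6, 7]
```

If the frame is not sorted, sorting by Date before slicing would be needed for the test rows to be the most recent ones.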
You are not updating the value of the variable named result inside the for loop.
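The general pattern can be sketched like this: create the container once before the loop, add one entry per ticker inside the loop, and build the DataFrame only after the loop ends. Everything here is illustrative and assumed, not taken from the question: the tickers "AAA" and "BBB", the synthetic returns feature, the names demo_df and results, and the default LogisticRegression settings.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic data: two made-up tickers, one feature, one binary label.
frames = []
for shift, name in enumerate(["AAA", "BBB"]):
    feature = np.sin(np.arange(40) + shift)
    label = (feature > 0).astype(int)
    frames.append(pd.DataFrame({"ticker": name, "returns": feature, "SellFlag": label}))
demo_df = pd.concat(frames, ignore_index=True)

results = []  # created once, before the loop
for ticker, group in demo_df.groupby("ticker"):
    X = group[["returns"]]
    y = group["SellFlag"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, shuffle=False
    )
    model = LogisticRegression().fit(X_train, y_train)
    score = accuracy_score(y_test, model.predict(X_test))
    # updated on every iteration, so each ticker's score is kept
    results.append({"ticker": ticker, "Accuracy Score": score})

# built once, after the loop
final_df = pd.DataFrame(results)
print(final_df)
```

Collecting plain dicts in a list and calling pd.DataFrame once at the end also avoids DataFrame.append, which was removed in pandas 2.0.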