Unable to join a dataframe even after following an example
Importing the modules:
import Quandl
import pandas as pd
from pandas.tools.plotting import df_unique
Reading the API key:
api_key = open('quandlapikey.txt','r').read()
This function currently reads a CSV file to get the ticker codes, but I plan to change it to SQLite.
def stock_list():
    #stocks = pd.read_csv('TID.csv'.rstrip())
    stocks = open('TID.csv').readlines()
    return stocks[0:]
Getting the stock codes from Quandl; this part works.
def getStockValues():
    stocks = stock_list()
    main_df = pd.DataFrame()
    for abbrv in stocks:
        query = "LSE/" + str(abbrv).strip()
        df = Quandl.get(query, authtoken=api_key, start_date='2016-04-05', end_date='2016-04-10')
        df = df['Price']
        df.columns = [abbrv]
        print(query)
        print(df)
The statement below causes a problem for some reason: inside the loop it fails to join the other stock prices.
        #This statement Prints as
        print(df.tail(5))
        #causes error
        if main_df.empty:
            main_df = df
        else:
            main_df = main_df.join(df)
    # exit
    print('Task done!')
getStockValues()
Here is the output of the print statements, followed by the error from the join.
Result:
LSE/VOD
Date
2016-04-14 226.80
2016-04-15 229.75
<ETC for all stocks>
Traceback (most recent call last):
File "H:\Workarea\DataB\SkyDriveP\OneDrive\PyProjects\Learning myPprojects\stockPrices.py", line 49, in <module>
getStockValues()
File "H:\Workarea\DataB\SkyDriveP\OneDrive\PyProjects\Learning myPprojects\stockPrices.py", line 43, in getStockValues
main_df = main_df.join(df)
File "H:\APPS\Python35-32\lib\site-packages\pandas\core\generic.py", line 2669, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'join'
Further testing suggests the problem is related to the scope of the pandas data object. This version causes the error:
main_df = pd.DataFrame()
for abbrv in stocks:
    query = "LSE/" + str(abbrv).strip()
    df = Quandl.get(query, authtoken=api_key, start_date='2016-03-05', end_date='2016-04-10')
    df = df['Price']
    df.columns = [abbrv]
    #causes error
    if main_df.empty:
        main_df = df
    else:
        main_df = main_df.join(df)
This version does not raise an error, but it returns only one dataset:
for abbrv in stocks:
    main_df = pd.DataFrame()
    query = "LSE/" + str(abbrv).strip()
    df = Quandl.get(query, authtoken=api_key, start_date='2016-03-05', end_date='2016-04-10')
    df = df['Price']
    df.columns = [abbrv]
    if main_df.empty:
        main_df = df
    else:
        main_df = main_df.join(df)
As I see it, the problem in your code is here:
...
df = df['Price']      ## <- you are turning the DataFrame into a Series here
df.columns = [abbrv]  ## <- no effect whatsoever on a Series
print(query)
print(df)
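To make the Series-versus-DataFrame point concrete, here is a minimal sketch. The frame is made up and only stands in for what Quandl.get returns; the "Volume" column and its values are my own assumption, added so there is something besides "Price" to select from.

```python
import pandas as pd

# Made-up frame standing in for the Quandl result (the Volume numbers are invented).
dates = pd.to_datetime(["2016-04-14", "2016-04-15"])
df = pd.DataFrame({"Price": [226.80, 229.75], "Volume": [100, 200]}, index=dates)

price = df["Price"]                  # single brackets give back a Series
print(type(price).__name__)          # the type is Series, not DataFrame
print(hasattr(price, "join"))        # a Series has no .join method

# A Series carries a single name rather than a list of columns.
named = price.rename("VOD")
print(named.name)

# If a DataFrame is really needed, convert the Series back explicitly.
frame = price.to_frame(name="VOD")
print(list(frame.columns))
```

So on the first pass through your loop main_df becomes a Series, and the second pass fails on main_df.join.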
What I would do is simply add the new column to your existing DataFrame.
## if main_df.empty:              ## <- remove this line
##     main_df = df               ## This should be changed to the line below
main_df[abbrv] = df               ## This will just add the new column to your df and use the Series as data
## else:                          ## <- remove this line
##     main_df = main_df.join(df) ## <- remove this line
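Put together, the loop could look roughly like the sketch below. I can't call Quandl from here, so fetch_prices is a hypothetical stand-in for Quandl.get, and the BARC ticker and its prices are invented sample data (the VOD figures are the ones from your output). I also strip the ticker before using it as the column name, since readlines() keeps the trailing newline.

```python
import pandas as pd


def fetch_prices(ticker):
    """Hypothetical stand-in for Quandl.get: returns a frame with a 'Price' column."""
    sample = {
        "VOD": [226.80, 229.75],   # values taken from the question's output
        "BARC": [160.10, 162.35],  # invented values, for illustration only
    }
    dates = pd.to_datetime(["2016-04-14", "2016-04-15"])
    return pd.DataFrame({"Price": sample[ticker]}, index=pd.Index(dates, name="Date"))


def get_stock_values(tickers):
    main_df = pd.DataFrame()
    for raw in tickers:
        abbrv = raw.strip()  # drop the newline left over from readlines()
        # Assign the Price Series as a new column named after the ticker.
        main_df[abbrv] = fetch_prices(abbrv)["Price"]
    return main_df


prices = get_stock_values(["VOD\n", "BARC\n"])
print(prices)
```

One thing to keep in mind: column assignment aligns every later Series to the index of the first ticker, so dates that only appear for later tickers are dropped. If the tickers can have different trading dates, collecting the Series in a dict and calling pd.concat(..., axis=1) keeps the union of dates instead.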