How do I populate rows of a DataFrame with values from a dictionary?
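I am downloading metadata for financial datasets from quandl.com. The data from quandl.com is already in dictionary format. I want to take this data and organize it into a DataFrame, then import it into Excel.
Here is the text file ('Indicator_list.txt') containing the list of financial datasets I am downloading from quandl.com. I would like the metadata for each of these symbols organized into one DataFrame.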
COM/OIL_WTI
BOE/XUDLADS
BOE/XUDLADD
BOE/XUDLB8KL
BOE/XUDLCDS
BOE/XUDLCDD
Here is the code I am running
import quandl
import pandas as pd

# This adjusts the layout in the command
# prompt to have columns displayed side by side
pd.set_option('expand_frame_repr', False)

# This "with open" statement opens a text file that
# has the symbols I want to get the metadata on
with open('Indicator_list.txt') as file_object:
    Current_indicators = file_object.read()
    tickers = Current_indicators.split('\n')

# quandlmetadata is a blank dictionary that I am
# appending the metadata to
quandlmetadata = {}

# This loops through all the values in
# Indicator_list.txt
for i in tickers:
    # metadata represents one set of metadata
    metadata = quandl.Dataset(i).data().meta
This is the metadata output from quandl.com
{'start_date': datetime.date(1975, 1, 2), 'column_names': ['Date', 'Value'], 'limit': None, 'collapse': None, 'order': 'asc', 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'column_index': None, 'frequency': 'daily'}
Next, I add this to the quandlmetadata dictionary, using the current symbol from Indicator_list.txt, "i", to name the dictionary key.
    quandlmetadata[i] = metadata
This is the output of quandlmetadata
{'BOE/XUDLADS': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLCDD': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLB8KL': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(2011, 8, 1), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'COM/OIL_WTI': {'column_names': ['date', 'value'], 'end_date': datetime.date(2016, 11, 4), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1983, 3, 30), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLADD': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}, 'BOE/XUDLCDS': {'column_names': ['Date', 'Value'], 'end_date': datetime.date(2016, 11, 3), 'transform': None, 'collapse': None, 'order': 'asc', 'start_date': datetime.date(1975, 1, 2), 'limit': None, 'column_index': None, 'frequency': 'daily'}}
Finally, I want to turn the quandlmetadata dictionary into a DataFrame (or something else, if there is a better way).
Here is the last part of the code
df = pd.DataFrame(index=quandlmetadata.keys(), columns=['transform', 'frequency', 'limit', 'end_date', 'collapse', 'column_names', 'start_date', 'order', 'column_index'])
Output of df
              transform  frequency  limit  end_date  collapse  column_names  start_date  order  column_index
BOE/XUDLB8KL        NaN        NaN    NaN       NaN       NaN           NaN         NaN    NaN           NaN
BOE/XUDLADS         NaN        NaN    NaN       NaN       NaN           NaN         NaN    NaN           NaN
BOE/XUDLADD         NaN        NaN    NaN       NaN       NaN           NaN         NaN    NaN           NaN
BOE/XUDLCDS         NaN        NaN    NaN       NaN       NaN           NaN         NaN    NaN           NaN
COM/OIL_WTI         NaN        NaN    NaN       NaN       NaN           NaN         NaN    NaN           NaN
BOE/XUDLCDD         NaN        NaN    NaN       NaN       NaN           NaN         NaN    NaN           NaN
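The layout of df is exactly what I want: the symbols from Indicator_list.txt are my index and the columns are metadata.keys(). The only thing I can't get to work is filling the rows of the DataFrame with the values from the quandlmetadata dictionary. The end goal is to be able to import this list into Excel, so if there is a way to do that without using a DataFrame, I am definitely open to it.
Maybe you can use DataFrame.from_dict?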
In [15]: pd.DataFrame.from_dict(quandlmetadata, orient='index')
Out[15]:
              column_index    end_date  order   column_names  start_date  collapse  transform  limit  frequency
BOE/XUDLADD           None  2016-11-03    asc  [Date, Value]  1975-01-02      None       None   None      daily
BOE/XUDLADS           None  2016-11-03    asc  [Date, Value]  1975-01-02      None       None   None      daily
BOE/XUDLB8KL          None  2016-11-03    asc  [Date, Value]  2011-08-01      None       None   None      daily
BOE/XUDLCDD           None  2016-11-03    asc  [Date, Value]  1975-01-02      None       None   None      daily
BOE/XUDLCDS           None  2016-11-03    asc  [Date, Value]  1975-01-02      None       None   None      daily
COM/OIL_WTI           None  2016-11-04    asc  [date, value]  1983-03-30      None       None   None      daily
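I don't think the column_names column will be very useful, though. You will also want to manually call pd.to_datetime on the date columns, so that they are datetime64 columns rather than object columns holding datetime.date values.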
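Putting those suggestions together, here is a minimal sketch rather than a definitive implementation. It assumes a quandlmetadata dict shaped like the one in the question (two entries are copied here as sample data, so no quandl call is needed). The helper name metadata_to_frame and the file name quandl_metadata.xlsx are made up for illustration.

```python
import datetime

import pandas as pd

# Sample data: two entries copied from the quandlmetadata dict in the question.
quandlmetadata = {
    'BOE/XUDLADS': {
        'column_names': ['Date', 'Value'],
        'end_date': datetime.date(2016, 11, 3),
        'transform': None, 'collapse': None, 'order': 'asc',
        'start_date': datetime.date(1975, 1, 2),
        'limit': None, 'column_index': None, 'frequency': 'daily',
    },
    'COM/OIL_WTI': {
        'column_names': ['date', 'value'],
        'end_date': datetime.date(2016, 11, 4),
        'transform': None, 'collapse': None, 'order': 'asc',
        'start_date': datetime.date(1983, 3, 30),
        'limit': None, 'column_index': None, 'frequency': 'daily',
    },
}


def metadata_to_frame(meta):
    """Hypothetical helper: one row per symbol, one column per metadata key."""
    # orient='index' makes the outer dict keys (the symbols) the row index.
    df = pd.DataFrame.from_dict(meta, orient='index')
    # column_names holds Python lists, which do not map well to a spreadsheet cell.
    df = df.drop(columns=['column_names'])
    # Convert the datetime.date objects into proper datetime64 columns.
    for col in ('start_date', 'end_date'):
        df[col] = pd.to_datetime(df[col])
    return df


df = metadata_to_frame(quandlmetadata)
print(df.dtypes)

# To export, uncomment the next line. It assumes an Excel writer engine
# such as openpyxl is installed; the file name is only an example.
# df.to_excel('quandl_metadata.xlsx')
```

If installing an Excel engine is not an option, df.to_csv writes a file that Excel can also open.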