Multi-header pandas DataFrame
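Is it possible to replicate the structure of this csv using pandas.DataFrame? All of the data is extracted from a single HDF5 file, and the attributes are then parsed into a pd.DataFrame. My concern is that the meta header and meta data (rows 1 and 2 of the csv) differ in length or shape from the attribute header and attribute data.
This is how I am calling pd.DataFrame: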
import pandas as pd

# Meta Pandas DataFrame
meta_df = pd.DataFrame(index=range(0, 8760, 24), columns=['source', 'location_id', 'state', 'country', 'latitude',
                                                          'longitude', 'time_zone', 'elevation', 'clearsky_dhi',
                                                          'clearsky_dni', 'clearsky_ghi', 'dewpoint_unit',
                                                          'temperature_unit'])
# Meta Header & Data
meta_df['source'] = source
meta_df['location_id'] = location_id
meta_df['state'] = state
meta_df['country'] = country
meta_df['latitude'] = latitude
meta_df['longitude'] = longitude
meta_df['time_zone'] = local_time
meta_df['elevation'] = elevation
meta_df['clearsky_dhi'] = clearsky_dhi
meta_df['clearsky_dni'] = clearsky_dni
meta_df['clearsky_ghi'] = clearsky_ghi
meta_df['dewpoint_unit'] = dewpoint_unit
meta_df['temperature_unit'] = temperature_unit
# Attribute Pandas DataFrame
att_df = pd.DataFrame(index=range(0, 8760, 24), columns=['dhi', 'dni', 'ghi', 'source', 'dew_point', 'temperature'])
# Attribute Header & Data
att_df['year'] = year
att_df['month'] = month
att_df['day'] = day
att_df['hour'] = hour
att_df['minute'] = minute
att_df['dhi'] = dhi
att_df['dni'] = dni
att_df['ghi'] = ghi
att_df['dew_point'] = dew_point
att_df['temperature'] = temperature
# Make one DataFrame with multiple headers?
# Do something, then export to csv.
# (df is the combined DataFrame that this question asks how to build)
df.to_csv(ndir_root + ndir + '/' + fname + '.csv', index=False)
Would it be best to create two separate DataFrames, stack them vertically to create a third, and export that last DataFrame to csv?
Bueller?
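I think you can do this with .to_csv(), since the method accepts either a file path (as you are doing) or a buffer. I assume you know the order of the meta header, meta data, and attribute header strings, so you can choose how they are written to the file. The part you're missing looks like this: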
with open('output.csv', 'w') as fid:
    # write your meta header etc., here assumed to be lists of strings
    fid.write(','.join(meta_header) + '\n')
    fid.write(','.join(meta_data) + '\n')
    fid.write(','.join(attribute_header) + '\n')
    # now write att_df to the csv by passing your fid buffer to to_csv
    att_df.to_csv(fid, sep=',', header=False, index=False)
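To make the idea concrete, here is a minimal, self-contained sketch of the same approach that also reads the file back. It writes to an in-memory buffer rather than to disk, and the meta values, the shortened column lists, and the write_with_meta helper are all made up for illustration; they are not taken from your HDF5 file. Unlike the snippet above, it lets to_csv write the attribute header itself and uses the csv module for the two meta rows so that values containing commas get quoted.

```python
import csv
import io

import pandas as pd

# Hypothetical stand-ins for the values pulled out of the HDF5 file.
meta = {
    "source": "example_source",
    "location_id": 12345,
    "latitude": 39.74,
    "longitude": -105.18,
    "time_zone": -7,
}
att_df = pd.DataFrame(
    {
        "year": [2015, 2015],
        "month": [1, 1],
        "day": [1, 2],
        "dhi": [0, 12],
    }
)


def write_with_meta(buf, meta, df):
    """Write a meta header row, a meta data row, then df with its own header."""
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(meta.keys())    # row 1: meta header
    writer.writerow(meta.values())  # row 2: meta data
    df.to_csv(buf, index=False)     # row 3 onward: attribute header and data


buf = io.StringIO()
write_with_meta(buf, meta, att_df)
text = buf.getvalue()

# Read the two parts back separately: the first two rows with the csv
# module, the rest with pandas after skipping those two rows.
buf.seek(0)
reader = csv.reader(buf)
meta_back = dict(zip(next(reader), next(reader)))  # all values come back as str

buf.seek(0)
att_back = pd.read_csv(buf, skiprows=2)
```

To write to disk instead, replace the StringIO buffer with an open file handle, as in the snippet above. Note that the meta values come back as strings, so convert them yourself if you need numbers.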