Python code to convert DataFrame table (Panel data?) to time-series?
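I am using the eurostat package in Python to download Eurostat datasets, but the DataFrame format is hard to work with. I have been trying to convert the panel data into a time series, without success.
I have already filtered and cleaned the data, but I could not convert the table into a time series (I am still new to Python). My code is below: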
#pip install eurostat
import pandas as pd
import eurostat
# Commercial flights by reporting country – monthly data (source: Eurocontrol)
df_eurostat = eurostat.get_data_df('avia_tf_cm')
# Raw string: in a normal string literal '\t' would be read as a tab character
df_eurostat = df_eurostat.rename(columns={r'geo\time':'Region'})
# To exclude: 'EU27_2020', 'EU28'
# df_eurostat = df_eurostat.drop(columns='unit').T
country_list = ['AL', 'AT', 'BE', 'BG', 'CH', 'CY', 'CZ', 'DE', 'DK', 'EE', 'EL',
                'ES', 'FI', 'FR', 'HR', 'HU', 'IE', 'IS', 'IT', 'LT', 'LU', 'LV',
                'ME', 'MK', 'MT', 'NL', 'NO', 'PL', 'PT', 'RO', 'RS', 'SE', 'SI',
                'SK', 'TR', 'UK']
df_eurostat = df_eurostat[df_eurostat['Region'].isin(country_list)]
df_eurostat = df_eurostat.loc[(df_eurostat['unit']=='NR')]
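Before:
After (what I want to achieve):
I would appreciate it if anyone could help. Thanks in advance!
One more step: drop the unit column, index by Region, transpose, and parse the period labels (e.g. 2021M12) as dates: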
to_date = lambda x: pd.to_datetime(x['Date'], format='%YM%m')
df_eurostat = df_eurostat.drop(columns='unit').set_index('Region').T \
.rename_axis(index='Date', columns=None) \
.reset_index().assign(Date=to_date)
Output:
>>> df_eurostat
Date AL AT BE BG CH CY CZ DE DK ... NO PL PT RO RS SE SI SK TR UK
0 2021-12-01 2265.0 15224.0 20055.0 4188.0 24102.0 3851.0 6690.0 94592.0 17277.0 ... 32284.0 23299.0 23977.0 10653.0 3804.0 19148.0 1038.0 1224.0 55338.0 96922.0
1 2021-11-01 1953.0 15513.0 20445.0 3694.0 21180.0 4452.0 6549.0 96853.0 17630.0 ... 33727.0 22105.0 23334.0 9294.0 3578.0 19088.0 993.0 1040.0 57975.0 90265.0
2 2021-10-01 2358.0 18314.0 21520.0 4945.0 26289.0 7118.0 7019.0 115037.0 18805.0 ... 33051.0 23325.0 27620.0 11708.0 4017.0 19070.0 1137.0 1178.0 81820.0 103358.0
3 2021-09-01 2998.0 18856.0 21834.0 6853.0 24979.0 6488.0 7785.0 107754.0 17609.0 ... 31901.0 25523.0 26989.0 13370.0 4691.0 18503.0 1155.0 1453.0 81744.0 98183.0
4 2021-08-01 3705.0 19579.0 22261.0 8807.0 26451.0 6873.0 7815.0 106657.0 16538.0 ... 28870.0 26381.0 29506.0 14416.0 5761.0 17061.0 1268.0 1695.0 90404.0 92697.0
5 2021-07-01 2973.0 17697.0 21617.0 7663.0 24531.0 6418.0 7291.0 99334.0 15357.0 ... 26152.0 24355.0 26176.0 13446.0 5831.0 15591.0 1210.0 1608.0 87664.0 72389.0
6 2021-06-01 2173.0 11225.0 15313.0 4441.0 15021.0 4328.0 5151.0 68482.0 8958.0 ... 21798.0 17129.0 19879.0 10222.0 3955.0 11832.0 788.0 992.0 58319.0 50648.0
7 2021-05-01 1452.0 7783.0 11247.0 2796.0 11619.0 3016.0 3051.0 51870.0 5993.0 ... 19007.0 8933.0 13758.0 6936.0 2736.0 8661.0 592.0 436.0 36572.0 35027.0
8 2021-04-01 1039.0 6632.0 9537.0 2457.0 10199.0 1872.0 2310.0 45712.0 4994.0 ... 18183.0 7256.0 10086.0 5720.0 2203.0 7683.0 455.0 280.0 39540.0 27739.0
9 2021-03-01 935.0 5327.0 8454.0 2071.0 8431.0 1334.0 2174.0 39463.0 4615.0 ... 19120.0 6120.0 6216.0 4212.0 1829.0 7502.0 479.0 377.0 38896.0 25305.0
10 2021-02-01 751.0 3976.0 7836.0 1756.0 7116.0 992.0 1889.0 30330.0 3522.0 ... 16159.0 4553.0 5134.0 3543.0 1527.0 6274.0 391.0 418.0 30167.0 20496.0
11 2021-01-01 881.0 4801.0 9481.0 2229.0 9262.0 1064.0 2208.0 36932.0 4937.0 ... 18953.0 6943.0 9227.0 4555.0 1741.0 7203.0 402.0 444.0 32167.0 28100.0
12 2020-12-01 880.0 5271.0 10360.0 2577.0 9804.0 1316.0 2572.0 39709.0 6030.0 ... 18913.0 7898.0 10387.0 4463.0 1887.0 8003.0 416.0 521.0 29614.0 38484.0
13 2020-11-01 872.0 5409.0 9787.0 2265.0 7667.0 1528.0 2248.0 40854.0 6328.0 ... 21194.0 8035.0 9738.0 3661.0 2130.0 8903.0 404.0 362.0 36441.0 34516.0
14 2020-10-01 1227.0 9237.0 11507.0 3392.0 12132.0 3185.0 3271.0 64376.0 9356.0 ... 24317.0 13245.0 15886.0 6179.0 2817.0 11103.0 577.0 653.0 49092.0 61735.0
15 2020-09-01 1513.0 11990.0 12241.0 4429.0 14364.0 3464.0 4749.0 69292.0 10604.0 ... 24939.0 15927.0 17980.0 7112.0 2845.0 10819.0 664.0 901.0 51449.0 72451.0
16 2020-08-01 2087.0 13469.0 14772.0 5396.0 18023.0 3770.0 5157.0 73205.0 10657.0 ... 24069.0 18681.0 20945.0 8059.0 2898.0 9963.0 796.0 1114.0 51758.0 79123.0
17 2020-07-01 1754.0 10377.0 13294.0 5026.0 15326.0 2914.0 4441.0 62889.0 9168.0 ... 23057.0 14361.0 14599.0 6925.0 2846.0 8154.0 703.0 783.0 36743.0 52547.0
18 2020-06-01 400.0 3901.0 6902.0 2495.0 6319.0 996.0 1715.0 31467.0 4085.0 ... 17126.0 3120.0 4340.0 2386.0 1570.0 5025.0 513.0 382.0 18020.0 21071.0
19 2020-05-01 186.0 1628.0 5626.0 1521.0 2841.0 457.0 979.0 20787.0 2245.0 ... 13377.0 1106.0 2208.0 1391.0 494.0 3716.0 340.0 191.0 4703.0 16397.0
20 2020-04-01 134.0 1297.0 4708.0 931.0 1936.0 355.0 823.0 17894.0 1974.0 ... 13114.0 1059.0 1600.0 1393.0 295.0 3422.0 369.0 207.0 3726.0 13634.0
21 2020-03-01 862.0 13690.0 16101.0 3551.0 20060.0 2749.0 5807.0 84579.0 14416.0 ... 28254.0 14506.0 17820.0 8349.0 2529.0 19940.0 903.0 811.0 40122.0 96914.0
22 2020-02-01 1667.0 24837.0 22531.0 4923.0 33073.0 4030.0 9417.0 128115.0 22684.0 ... 36181.0 27688.0 25712.0 12360.0 4278.0 27289.0 1256.0 1511.0 60161.0 137542.0
23 2020-01-01 1984.0 25526.0 23595.0 5261.0 34628.0 4422.0 10130.0 132506.0 23224.0 ... 38375.0 29776.0 26492.0 13357.0 4614.0 27758.0 1325.0 1580.0 66067.0 141097.0
24 2019-12-01 2204.0 25704.0 24205.0 5187.0 33464.0 4233.0 11243.0 134607.0 22640.0 ... 35866.0 29886.0 27860.0 13759.0 4792.0 27187.0 1409.0 1747.0 65175.0 148395.0
25 2019-11-01 1983.0 24584.0 24661.0 4931.0 30263.0 4886.0 11019.0 139360.0 24478.0 ... 39602.0 29281.0 27347.0 13367.0 4675.0 29459.0 1375.0 1641.0 68215.0 143007.0
26 2019-10-01 2173.0 28210.0 28315.0 6027.0 36833.0 7826.0 13484.0 175844.0 28961.0 ... 44260.0 33407.0 35370.0 15213.0 5655.0 34250.0 1274.0 1934.0 92012.0 179242.0
27 2019-09-01 2572.0 29329.0 29049.0 9908.0 37735.0 8426.0 15865.0 176614.0 29324.0 ... 43968.0 36534.0 37728.0 16539.0 6418.0 35217.0 2242.0 2917.0 99239.0 186990.0
28 2019-08-01 3012.0 30197.0 29686.0 12911.0 38535.0 9024.0 16373.0 174218.0 29149.0 ... 43548.0 37931.0 40481.0 17583.0 7150.0 32993.0 2726.0 3444.0 110635.0 196632.0
29 2019-07-01 2954.0 30638.0 30711.0 12911.0 39715.0 8895.0 16339.0 178418.0 28525.0 ... 42885.0 37728.0 40453.0 17591.0 7041.0 31426.0 2757.0 3535.0 108069.0 196964.0
30 2019-06-01 2479.0 29954.0 28327.0 10775.0 37872.0 8428.0 15645.0 171786.0 29099.0 ... 43533.0 35891.0 37269.0 16189.0 6103.0 33742.0 2508.0 2937.0 98885.0 189383.0
31 2019-05-01 2262.0 28262.0 28503.0 7053.0 37384.0 7597.0 13281.0 171324.0 28684.0 ... 43880.0 34017.0 36306.0 15470.0 5319.0 34604.0 2555.0 2104.0 86267.0 187445.0
32 2019-04-01 2110.0 27218.0 27080.0 5539.0 36308.0 6305.0 11985.0 158711.0 25866.0 ... 39057.0 30874.0 34229.0 14432.0 4958.0 31597.0 2426.0 1955.0 73548.0 169391.0
33 2019-03-01 1775.0 27362.0 24518.0 5108.0 37157.0 4415.0 11213.0 150008.0 26319.0 ... 41485.0 28338.0 28165.0 13041.0 4299.0 33232.0 2285.0 1848.0 67939.0 157772.0
34 2019-02-01 1625.0 23368.0 21019.0 4529.0 33206.0 3526.0 9256.0 131628.0 22559.0 ... 36782.0 25442.0 24069.0 11906.0 3824.0 28637.0 2022.0 1614.0 59619.0 139353.0
35 2019-01-01 1925.0 24110.0 23694.0 4990.0 35228.0 3751.0 10059.0 138258.0 23211.0 ... 38933.0 27756.0 26258.0 13292.0 4237.0 30192.0 2226.0 1723.0 66304.0 145002.0
[36 rows x 37 columns]
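The result above keeps Date as an ordinary column and lists the newest month first. If a date-indexed series in chronological order is wanted, a minimal sketch could look like the following. It does not call the eurostat package; the small `wide` frame is a hypothetical stand-in for the filtered `df_eurostat` (same layout, values copied from the first rows of the output above), and the names `wide`, `ts`, `monthly_change` and `al_q4` are illustrative only.

```python
import pandas as pd

# Hypothetical miniature of the filtered Eurostat table:
# one row per region, one column per period label such as '2021M12'.
wide = pd.DataFrame({
    'unit': ['NR', 'NR'],
    'Region': ['AL', 'AT'],
    '2021M12': [2265.0, 15224.0],
    '2021M11': [1953.0, 15513.0],
    '2021M10': [2358.0, 18314.0],
})

# Same reshaping idea as above: regions become columns, periods become rows.
ts = wide.drop(columns='unit').set_index('Region').T

# Parse the period labels into timestamps and use them as the index.
ts.index = pd.to_datetime(ts.index, format='%YM%m')

# Name the index, drop the leftover 'Region' axis label, sort oldest first.
ts = ts.rename_axis(index='Date', columns=None).sort_index()

# With a sorted DatetimeIndex, date-based operations work directly.
monthly_change = ts.pct_change()            # month-over-month change per country
al_q4 = ts.loc['2021-10':'2021-12', 'AL']   # slice one country by date range

print(ts)
```

Keeping the dates in the index rather than in a column is a design choice: it allows label-based date slicing as shown with `al_q4`, and methods such as `resample` expect a datetime-like index by default.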