How to slice a DataFrame in Python and separate each row from the DataFrame as a list
I am new to Python.
I have a DataFrame with 105120 rows and 33 columns, as shown below:
n1 n4 n31 n54 n105 n114 n163 n188 ... n636 n644 n679 n722 n726 n740 n752 n769
0 28.92 33.87 37.13 37.13 50.52 53.99 52.56 55.32 ... 45.53 47.62 47.33 46.14 47.12 43.81 49.17 48.50
1 28.94 33.89 37.16 37.23 50.60 54.09 52.67 55.42 ... 45.61 47.71 47.39 46.19 47.17 43.83 49.22 48.54
2 28.96 33.91 37.18 37.21 50.57 54.05 52.64 55.39 ... 45.61 47.71 47.41 46.20 47.19 43.84 49.24 48.56
3 28.98 33.93 37.19 37.27 50.60 54.08 52.70 55.45 ... 45.65 47.75 47.43 46.21 47.21 43.84 49.26 48.57
4 28.98 33.93 37.19 37.14 50.53 54.00 52.57 55.32 ... 45.54 47.64 47.34 46.15 47.13 43.81 49.17 48.50
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
105115 28.55 33.61 36.93 36.88 50.42 53.88 52.34 55.09 ... 45.34 47.40 47.12 45.98 46.90 43.73 48.94 48.33
105116 28.56 33.61 36.94 36.82 50.38 53.82 52.28 55.03 ... 45.31 47.37 47.11 45.97 46.89 43.72 48.93 48.33
105117 28.58 33.64 36.96 36.90 50.43 53.88 52.35 55.11 ... 45.36 47.44 47.16 46.01 46.95 43.74 48.98 48.36
105118 28.58 33.64 36.96 36.85 50.40 53.86 52.31 55.06 ... 45.31 47.37 47.10 45.97 46.89 43.73 48.93 48.32
105119 28.63 33.68 36.99 37.00 50.49 53.96 52.45 55.20 ... 45.44 47.51 47.22 46.06 47.01 43.77 49.05 48.42
[105120 rows x 33 columns]
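I want to separate each row from the DataFrame, with each row represented as a list.
For example, the first row separated from the DataFrame would look like this: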
[28.92, 33.87, 37.13, 37.13, 50.52, 53.99, 52.56, 55.32, 39.09, 52.53, 52.81, 42.45, 56.43, 46.75, 31.07, 45.62, 36.79, 43.52, 47.54, 51.7, 53.54, 54.85, 47.58, 54.8, 56.16, 45.53, 47.62, 47.33, 46.14, 47.12, 43.81, 49.17, 48.5]
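Each row, in list form, should then be compared with the first row (that is, find the difference between each element of a row and the element at the same position in the first row).
Could you tell me how to do both of these things?
Thanks in advance!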
Input data:
>>> df
0 1 2 3 4 5 6 7 8 9
0 72.76 47.54 57.37 40.03 75.24 46.05 51.76 33.94 56.97 78.77
1 52.54 79.47 42.52 31.36 40.38 47.65 34.89 57.98 40.27 59.65
2 72.61 48.26 35.68 43.54 32.07 60.47 71.90 85.41 40.66 48.70
3 62.56 72.92 50.90 38.44 48.64 47.35 69.55 65.00 30.74 63.12
4 27.59 45.57 55.75 66.59 60.69 60.99 49.23 45.16 33.03 29.38
Difference between each row and the first row:
>>> df - df.iloc[0]
0 1 2 3 4 5 6 7 8 9
0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1 -20.22 31.93 -14.85 -8.67 -34.86 1.60 -16.87 24.04 -16.70 -19.12
2 -0.15 0.72 -21.69 3.51 -43.17 14.42 20.14 51.47 -16.31 -30.07
3 -10.20 25.38 -6.47 -1.59 -26.60 1.30 17.79 31.06 -26.23 -15.65
4 -45.17 -1.97 -1.62 26.56 -14.55 14.94 -2.53 11.22 -23.94 -49.39
Convert to a list of lists:
>>> df.values.tolist()
[[72.76, 47.54, 57.37, 40.03, 75.24, 46.05, 51.76, 33.94, 56.97, 78.77],
[52.54, 79.47, 42.52, 31.36, 40.38, 47.65, 34.89, 57.98, 40.27, 59.65],
[72.61, 48.26, 35.68, 43.54, 32.07, 60.47, 71.9, 85.41, 40.66, 48.7],
[62.56, 72.92, 50.9, 38.44, 48.64, 47.35, 69.55, 65.0, 30.74, 63.12],
[27.59, 45.57, 55.75, 66.59, 60.69, 60.99, 49.23, 45.16, 33.03, 29.38]]
Differences as a list of lists: df.sub(df.iloc[0]).values.tolist()
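The two steps above can be combined into one short sketch. It uses only the first three columns of the sample data to keep it small, and it calls `to_numpy()` where the answer uses `.values` (the two are interchangeable for this purpose). The names `rows`, `diff` and `diff_rows` are illustrative, and the `round(2)` is an addition here, only to hide floating-point noise in the printed differences.

```python
import pandas as pd

# First three columns of the sample data from the answer.
df = pd.DataFrame(
    [
        [72.76, 47.54, 57.37],
        [52.54, 79.47, 42.52],
        [72.61, 48.26, 35.68],
    ]
)

# Every row as a plain Python list (one inner list per row).
rows = df.to_numpy().tolist()

# Subtract the first row from every row; the Series df.iloc[0] is
# aligned with the DataFrame's column labels.
diff = df.sub(df.iloc[0], axis="columns").round(2)

# The same differences as a list of lists.
diff_rows = diff.to_numpy().tolist()

print(rows[0])
print(diff_rows)
```

On the full 105120 x 33 frame, the subtraction is vectorized, so it is usually better to keep the result as a DataFrame and call `tolist()` only if plain lists are really needed.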
Update
If each row should be compared with a list that is not a row of the DataFrame, how should that be done?
l = [55.89, 43.93, 32.39, 28.36, 39.98, 66.26, 51.56, 48.34, 37.33, 60.15]
>>> df - l
0 1 2 3 4 5 6 7 8 9
0 16.87 3.61 24.98 11.67 35.26 -20.21 0.20 -14.40 19.64 18.62
1 -3.35 35.54 10.13 3.00 0.40 -18.61 -16.67 9.64 2.94 -0.50
2 16.72 4.33 3.29 15.18 -7.91 -5.79 20.34 37.07 3.33 -11.45
3 6.67 28.99 18.51 10.08 8.66 -18.91 17.99 16.66 -6.59 2.97
4 -28.30 1.64 23.36 38.23 20.71 -5.27 -2.33 -3.18 -4.30 -30.77
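`df - l` works because the list has exactly one value per column. A slightly more explicit variant, sketched below under the assumption that the list is ordered like the columns, wraps the list in a Series labeled with `df.columns` first. The name `ref` is illustrative, and only the first three columns of the sample data are used.

```python
import pandas as pd

# First three columns of two sample rows, and the matching first
# three values of the reference list l from the answer.
df = pd.DataFrame(
    [
        [72.76, 47.54, 57.37],
        [52.54, 79.47, 42.52],
    ]
)
l = [55.89, 43.93, 32.39]

# Label the reference values with the column names. This raises a
# ValueError if the list length does not match the number of columns.
ref = pd.Series(l, index=df.columns)

# Subtract the reference from every row, then convert to lists.
diff = df.sub(ref, axis="columns").round(2)
diff_rows = diff.to_numpy().tolist()

print(diff_rows)
```

Building the Series up front makes a length mismatch fail at a clearly named line instead of inside the arithmetic.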