How to select a value from a specific column in a pandas DataFrame and use it in a calculation

Running the code below gives me this:

import pandas as pd
pf=pd.read_csv("https://www.dropbox.com/s/08kuxi50d0xqnfc/demo.csv?dl=1")
x=pf[pf['fuv1'] == 0].count()*100/1892
x
id          0.528541
date        0.528541
count       0.528541
idade       0.528541
site        0.528541
baseline    0.528541
fuv1        0.528541
fuv2        0.475687
fuv3        0.528541
fuv4        0.475687
dtype: float64

All I want is the single result 0.528541, without all the other rows shown above.

How can I do this? Thanks.

In [282]: pf.loc[pf['fuv1'] == 0, 'id'].count()*100/1892
Out[282]: 0.5285412262156448

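If you want to count the number of 0 values in the fuv1 column, compare the column to 0 and use sum on the resulting boolean mask; each True is treated as 1: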

print ((pf['fuv1'] == 0).sum())
10

x = (pf['fuv1'] == 0).sum()*100/1892
print (x)
0.528541226216
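
As a related sketch: if 1892 is simply the total number of rows in the file (an assumption, the question does not say so), the mean of the boolean mask gives the same percentage without hard-coding the denominator. The small `toy` frame below is made up for illustration and is not the real demo.csv data.

```python
import pandas as pd

# Hypothetical stand-in for the fuv1 column; not the real demo.csv data.
toy = pd.DataFrame({"fuv1": [0.0, 1.0, 0.0, float("nan")]})

mask = toy["fuv1"] == 0        # boolean Series; NaN == 0 evaluates to False
n_zero = mask.sum()            # each True counts as 1
pct_zero = mask.mean() * 100   # share of all rows, NaN rows included

print(n_zero)    # 2
print(pct_zero)  # 50.0
```

Note that the denominator here is every row of the frame, including rows where fuv1 is NaN; if the percentage should be taken over some other total, keep the explicit division.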

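Why the outputs differ between columns: count excludes NaNs, so any column with a missing value in the filtered rows reports a smaller number: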

pf=pd.read_csv("https://www.dropbox.com/s/08kuxi50d0xqnfc/demo.csv?dl=1")
x=pf[pf['fuv1'] == 0]
print (x)
    id       date  count  idade site  baseline  fuv1  fuv2  fuv3  fuv4
0    0   4/1/2016     10     13    A         1   0.0   1.0   0.0   1.0
2    2   4/3/2016      9      5    C         1   0.0   NaN   0.0   1.0
3    3   4/4/2016    108     96    D         1   0.0   1.0   0.0   NaN
11  11  4/12/2016      6     13    C         1   0.0   1.0   1.0   0.0
13  13  4/14/2016     12      4    C         1   0.0   1.0   1.0   0.0
40  40  5/11/2016     14      7    C         1   0.0   1.0   1.0   1.0
41  41  5/12/2016      0     26    C         1   0.0   1.0   1.0   1.0
42  42  5/13/2016     10     15    C         1   0.0   1.0   1.0   1.0
60  60  5/31/2016     13      3    D         1   0.0   1.0   1.0   1.0
74  74  6/14/2016     15      7    B         1   0.0   1.0   1.0   1.0

print (x.count())
id          10
date        10
count       10
idade       10
site        10
baseline    10
fuv1        10
fuv2         9
fuv3        10
fuv4         9
dtype: int64
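
The same effect can be reproduced on a tiny frame. This is only an illustrative sketch with made-up data (the `toy` frame and its values are assumptions); it contrasts `count()`, which skips NaN cells column by column, with `len()`, which counts rows regardless of missing values.

```python
import pandas as pd

# Made-up data: two rows have fuv1 == 0, and one of them is missing fuv2.
toy = pd.DataFrame({
    "fuv1": [0.0, 0.0, 1.0],
    "fuv2": [1.0, float("nan"), 1.0],
})
subset = toy[toy["fuv1"] == 0]

per_column = subset.count()   # non-NaN cells in each column
n_rows = len(subset)          # number of rows, NaN or not

print(per_column["fuv1"])  # 2
print(per_column["fuv2"])  # 1
print(n_rows)              # 2
```

Selecting a single column that has no missing values in the filtered rows (as with 'id' above), or using `len`, avoids the per-column discrepancy.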

Putting it together and printing both values as a small labelled table:

import pandas as pd

pf=pd.read_csv("https://www.dropbox.com/s/08kuxi50d0xqnfc/demo.csv?dl=1")

x = (pf['fuv1'] == 0).sum()*100/1892
y=pf["idade"].mean()

l = "Performance"
k = "LTFU"


def test(l1,k1):
    return pd.DataFrame({'a':[l1, k1], 'b':[x, y]})

df1 = test(l,k)
df1.columns = [''] * len(df1.columns)   
df1.index = [''] * len(df1.index)   

print(round(df1, 2))

  Performance   0.53
         LTFU  14.13
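
A possible alternative, offered only as a sketch: a labelled `Series` holds the same label/value pairs without blanking out column and index names. The numeric values below are placeholders copied from the rounded output above, not recomputed from the file.

```python
import pandas as pd

# Placeholder values standing in for x and y from the code above.
x, y = 0.528541, 14.13

# The labels become the index, so no header row needs to be hidden.
summary = pd.Series({"Performance": x, "LTFU": y}).round(2)

# to_string() prints just the labels and values, without the dtype footer.
print(summary.to_string())
```

Whether this is preferable depends on how the result is used afterwards; the DataFrame version is easier to extend with more columns.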