Dataframe splitting in Python

I have a dataframe DF with a column named DETAILS. I want to extract the values inside double quotes ("") into a new column named VALUE.

DF=

      DETAILS                      
   Username as "JOHN"           
   verifying the name             
   click UserBox                  
   Password as "345678"          
   Click login                   
   click checkbox                 
   Phonenumber as "34512345"     
   Checking data    

I want my dataframe to look like this:

DF=

     DETAILS                      VALUE
   Username as "JOHN"           "JOHN"
   verifying the name             NA
   click UserBox                  NA
   Password as "345678"          "345678"
   Click login                    NA
   click checkbox                 NA
   Phonenumber as "34512345"     "34512345"
   Checking data                 NA

 

Here you go:

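# Split each DETAILS string on double quotes; if a quote is present,
# the text after the first quote is element 1 of the split.
df['VALUE'] = df.apply(
    lambda x: '"' + x['DETAILS'].split('"')[1] + '"'
              if len(x['DETAILS'].split('"')) > 1
              else 'NA',
    axis=1)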

This outputs:

                     DETAILS       VALUE
0         Username as "JOHN"      "JOHN"
1         verifying the name          NA
2              click UserBox          NA
3       Password as "345678"    "345678"
4                Click login          NA
5             click checkbox          NA
6  Phonenumber as "34512345"  "34512345"
7              Checking data          NA

The VALUE column contains the quoted value where one is present, and the string 'NA' otherwise.
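For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same split-based idea. The sample data is rebuilt from the question. The helper name `quoted_or_na` is hypothetical, and the check `len(parts) > 2` is an assumption that is slightly stricter than the snippet above: it only accepts a value that has both an opening and a closing quote.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "DETAILS": [
        'Username as "JOHN"',
        "verifying the name",
        "click UserBox",
        'Password as "345678"',
        "Click login",
        "click checkbox",
        'Phonenumber as "34512345"',
        "Checking data",
    ]
})

def quoted_or_na(text):
    """Hypothetical helper: return the first quoted value, or the string 'NA'."""
    parts = text.split('"')
    # A complete quoted value yields at least three parts:
    # text before the quote, text inside it, text after it.
    if len(parts) > 2:
        return '"' + parts[1] + '"'
    return "NA"

# Apply the helper to the single column instead of to whole rows
df["VALUE"] = df["DETAILS"].apply(quoted_or_na)
print(df)
```

Note that 'NA' here is an ordinary string, so checks such as `df["VALUE"].isna()` will not treat those rows as missing.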

The str accessor has an extract method for exactly this kind of use case:

DF['VALUE'] = DF['DETAILS'].str.extract(r'(".*?")', expand=False)

With your data, it gives the expected result:

                     DETAILS       VALUE
0         Username as "JOHN"      "JOHN"
1         verifying the name         NaN
2              click UserBox         NaN
3       Password as "345678"    "345678"
4                Click login         NaN
5             click checkbox         NaN
6  Phonenumber as "34512345"  "34512345"
7              Checking data         NaN
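If the surrounding quotes are not wanted in the new column, one possible variation is to move the capture group inside the quotes. The sketch below rebuilds the sample data from the question. Dropping the quotes is an assumption about what may be useful, not something the question asked for.

```python
import pandas as pd

# Sample data rebuilt from the question
DF = pd.DataFrame({
    "DETAILS": [
        'Username as "JOHN"',
        "verifying the name",
        "click UserBox",
        'Password as "345678"',
        "Click login",
        "click checkbox",
        'Phonenumber as "34512345"',
        "Checking data",
    ]
})

# The capture group sits inside the quotes, so only the inner text is kept.
# expand=False returns a Series rather than a one-column DataFrame.
DF["VALUE"] = DF["DETAILS"].str.extract(r'"(.*?)"', expand=False)

# Rows without a quoted value hold NaN, a real missing value,
# so they can be filtered with notna()/isna().
print(DF[DF["VALUE"].notna()])
```

Because the unmatched rows are NaN rather than a placeholder string, methods such as `dropna` and `fillna` work on the column directly.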