Convert fractions stored as characters to float64

Suppose we have this df:

    import pandas as pd

    df = pd.DataFrame({
        'value': ['18 4/2', '2 2/2', '8.5'],
        'country': ['USA', 'Canada', 'Switzerland']
    })

Out:

        value   country
    0   18 4/2  USA
    1   2 2/2   Canada
    2   8.5     Switzerland

Note that the 'value' column is stored with object dtype:

df.dtypes

Out:

value      object
country    object
dtype: object

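My question: how can we convert 'value' to decimal numbers and change the dtype to float64 at the same time? Note that one value (8.5) is already a decimal, so it should stay unchanged. Desired output: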

desired_output = pd.DataFrame({
        'value': [20, 3, 8.5],
        'country': ['USA', 'Canada', 'Switzerland']
})


    value   country
0   20.0    USA
1   3.0     Canada
2   8.5     Switzerland


desired_output.dtypes

value       float64
country     object
dtype: object

You can replace the space with a + sign and then apply eval:

print(df['value'].str.replace(' ', '+').apply(eval))
0    20.0
1     3.0
2     8.5
Name: value, dtype: float64
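
To see why this works, it may help to look at the intermediate strings. The following is only a small sketch in plain Python (the names `values`, `expressions` and `results` are made up for illustration): replacing the space turns each mixed number into an arithmetic expression, and since division binds tighter than addition, '18+4/2' evaluates as 18 + (4/2).

```python
values = ['18 4/2', '2 2/2', '8.5']

# Replacing the space turns a mixed number into an arithmetic expression.
expressions = [v.replace(' ', '+') for v in values]
print(expressions)  # ['18+4/2', '2+2/2', '8.5']

# Division binds tighter than addition, so '18+4/2' is 18 + (4/2).
# A plain decimal such as '8.5' has no space and is evaluated as-is.
results = [eval(e) for e in expressions]
print(results)  # [20.0, 3.0, 8.5]
```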

Or use pd.eval:

df['value'] = pd.eval(df['value'].str.replace(' ', '+')).astype(float)
print(df)
   value      country
0  20.0          USA
1   3.0       Canada
2   8.5  Switzerland
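
Because eval executes whatever Python expression the string contains, it is probably best kept to data you trust. If that is a concern, one possible alternative is the standard library's `fractions.Fraction`, which parses both 'n/d' strings and decimal strings. This is a rough sketch, not the approach above; the helper name `parse_mixed` is hypothetical, and it assumes every value is made of whitespace-separated parts that `Fraction` can read.

```python
from fractions import Fraction

import pandas as pd


def parse_mixed(text):
    """Parse '18 4/2', '4/2' or '8.5' into a float without calling eval."""
    # Fraction accepts 'n/d' strings as well as decimal strings such as '8.5',
    # so each whitespace-separated part can be parsed and summed exactly.
    return float(sum(Fraction(part) for part in text.split()))


df = pd.DataFrame({
    'value': ['18 4/2', '2 2/2', '8.5'],
    'country': ['USA', 'Canada', 'Switzerland'],
})
df['value'] = df['value'].map(parse_mixed)
print(df['value'].tolist())  # [20.0, 3.0, 8.5]
print(df['value'].dtype)     # float64
```

A part that `Fraction` cannot parse raises ValueError, so malformed input fails loudly instead of being executed.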

I would accept @Ben.T's answer, but since I had already worked on it, here is my attempt.

>>> import pandas as pd
>>> df = pd.DataFrame({
...             'value': ['18 4/2', '2 2/2', '8.5'],
...             'country': ['USA', 'Canada', 'Switzerland']
...     })
>>> df
    value      country
0  18 4/2          USA
1   2 2/2       Canada
2     8.5  Switzerland
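>>> def foo(s):
...     # Plain numbers such as '8.5' parse directly.
...     try:
...         return float(s)
...     except ValueError:
...         pass
...     # Otherwise expect a mixed number such as '18 4/2'.
...     w, f = s.split()
...     n, d = f.split('/')
...     w, n, d = map(int, (w, n, d))
...     return w + n / d
...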
>>> foo('1')
1.0
>>> foo('18 4/2')
20.0
>>> df['value'] = df['value'].apply(foo)
>>> df
   value      country
0   20.0          USA
1    3.0       Canada
2    8.5  Switzerland
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
 #   Column   Non-Null Count  Dtype
---  ------   --------------  -----
 0   value    3 non-null      float64
 1   country  3 non-null      object
dtypes: float64(1), object(1)
memory usage: 176.0+ bytes
>>>
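
For larger frames, the same split into whole part, numerator and denominator could probably be done column-wise with `Series.str.extract` instead of a per-row `apply`. The sketch below is an assumption-laden variant, not part of the attempt above: the regular expression and the names `pattern`, `parts` and `fraction` are made up for illustration, and it only handles non-negative values of the form 'w', 'w.x' or 'w n/d'.

```python
import pandas as pd

df = pd.DataFrame({
    'value': ['18 4/2', '2 2/2', '8.5'],
    'country': ['USA', 'Canada', 'Switzerland'],
})

# Hypothetical pattern: a whole (or decimal) part, then an optional ' n/d'.
pattern = r'^(?P<whole>\d+(?:\.\d+)?)(?:\s+(?P<num>\d+)/(?P<den>\d+))?$'
parts = df['value'].str.extract(pattern).astype(float)

# Rows without a fraction have NaN in num and den, so treat the fraction as 0.
fraction = (parts['num'] / parts['den']).fillna(0)
df['value'] = parts['whole'] + fraction

print(df['value'].tolist())  # [20.0, 3.0, 8.5]
print(df['value'].dtype)     # float64
```

Values that do not match the pattern at all end up as NaN rather than raising, which may or may not be what you want.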