How to normalize the date column of a pandas DataFrame (ValueError: could not convert string to float: '17-Aug-20 00:00:00')
I loaded a CSV file with a time column into a pandas DataFrame.
I then tried to normalize the data (the df variable) with the following commands:
import pandas as pd
from sklearn import preprocessing
import numpy as np
from sklearn.preprocessing import MinMaxScaler
import time
minmax = MinMaxScaler().fit(df.iloc[:].values.reshape((-1,1)))
df_log = MinMaxScaler().fit_transform(df.iloc[:].astype('float32'))
df.head()
or
df = pd.DataFrame(df.astype('float64'), columns=['Time'])
# specify your desired range (-1, 1)
scaler = MinMaxScaler(feature_range=(-1, 1))
scaled = scaler.fit_transform(df.values)
print(scaled)
Running either of the two code blocks above gives this error:
~/anaconda3/lib/python3.8/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
81
82 """
---> 83 return array(a, dtype, copy=False, order=order)
84
85
ValueError: could not convert string to float: '17-Aug-20 00:00:00'
So I would like to ask how to normalize the date column of a pandas DataFrame.
Thanks.
You can try this:
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
df = pd.DataFrame(
    {
        "time": [
            "17-Aug-20 00:00:00",
            "17-Aug-20 00:01:00",
            "17-Aug-20 00:02:00",
            "17-Aug-20 00:03:00",
            "17-Aug-20 00:04:00",
        ],
    }
)
# Convert to datetime type
df["time"] = pd.to_datetime(df["time"])
# Convert to Unix timestamp seconds
df["time"] = (df["time"] - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")
# Scale values
scaler = MinMaxScaler(feature_range=(-1, 1))
scaled = scaler.fit_transform(df["time"].values.reshape(-1, 1))
print(scaled)
# Outputs
[[-1. ]
[-0.5]
[ 0. ]
[ 0.5]
[ 1. ]]
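The error in the question comes from asking NumPy to cast date strings straight to float, so the column has to be parsed into datetimes and turned into numbers before it reaches MinMaxScaler. As a variation on the snippet above, here is a minimal sketch that passes an explicit format string to pd.to_datetime, measures seconds from the earliest timestamp instead of from the Unix epoch, and maps the scaled values back to dates with inverse_transform. The format string "%d-%b-%y %H:%M:%S" and the three sample rows are assumptions based on the value shown in the error message, and the names parsed, seconds, and restored are only illustrative.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Sample data in the same layout as the value from the error message (assumed).
df = pd.DataFrame(
    {"time": ["17-Aug-20 00:00:00", "17-Aug-20 00:01:00", "17-Aug-20 00:02:00"]}
)

# Parse the strings with an explicit format (assumed: day-month abbreviation-2-digit year).
parsed = pd.to_datetime(df["time"], format="%d-%b-%y %H:%M:%S")

# Express each timestamp as seconds elapsed since the earliest one.
start = parsed.min()
seconds = (parsed - start).dt.total_seconds()

# Scale the numeric column to the range (-1, 1).
scaler = MinMaxScaler(feature_range=(-1, 1))
scaled = scaler.fit_transform(seconds.to_numpy().reshape(-1, 1))
print(scaled)

# Undo the scaling; round to whole seconds to absorb floating-point noise.
restored_seconds = scaler.inverse_transform(scaled).ravel().round()
restored = start + pd.to_timedelta(restored_seconds, unit="s")
print(restored)
```

Min-max scaling is unaffected by a constant offset, so measuring from the first timestamp gives the same scaled values as measuring from 1970-01-01 while keeping the intermediate numbers small.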