Using reindex in pandas without changing other column data
This is my code. I am trying to change the DataFrame's index to the dates that are already present in the DataFrame.
import pandas as pd
x = {
'Dates':['24/09/1998', '26/01/1999', '28/08/1999', '11/09/1999'],
'Names': ['A', "B", 'C', 'D'],
'Marks': [5, 8, 5, 9],
'City': ['Rjy', 'Nzmbd', 'Kurnool', 'Srk']}
df = pd.DataFrame(x)
df['Dates']=pd.to_datetime(df['Dates'])
dt = df['Dates']
idx = pd.DatetimeIndex(dt)
df = df.reindex(idx)
print(df)
The output DataFrame I get is:
Dates Names Marks City
1998-01-01 NaT NaN NaN NaN
1998-01-02 NaT NaN NaN NaN
1998-01-03 NaT NaN NaN NaN
1998-01-04 NaT NaN NaN NaN
1998-01-05 NaT NaN NaN NaN
What should I change in my code so that my data is not replaced with NaN or NaT?
I would do it like this:
import pandas as pd
x = {
'Dates':['24/09/1998', '26/01/1999', '28/08/1999', '11/09/1999'],
'Names': ['A', "B", 'C', 'D'],
'Marks': [5, 8, 5, 9],
'City': ['Rjy', 'Nzmbd', 'Kurnool', 'Srk']}
df = pd.DataFrame(x)
df = df.set_index('Dates')
df.index = pd.to_datetime(df.index)
print(df)
print('\n')
df.info()
Output:
Names Marks City
Dates
1998-09-24 A 5 Rjy
1999-01-26 B 8 Nzmbd
1999-08-28 C 5 Kurnool
1999-11-09 D 9 Srk
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 4 entries, 1998-09-24 to 1999-11-09
Data columns (total 3 columns):
Names 4 non-null object
Marks 4 non-null int64
City 4 non-null object
dtypes: int64(1), object(2)
memory usage: 128.0+ bytes
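One caveat about the output above: the strings look like day/month/year, yet '11/09/1999' came out as 1999-11-09, so the ambiguous date was read month-first. How ambiguous strings are parsed can depend on the pandas version. A minimal sketch, assuming the dates really are day-first (the question does not say so explicitly), passes the format to pd.to_datetime so nothing is guessed:

```python
import pandas as pd

x = {
    'Dates': ['24/09/1998', '26/01/1999', '28/08/1999', '11/09/1999'],
    'Names': ['A', 'B', 'C', 'D'],
    'Marks': [5, 8, 5, 9],
    'City': ['Rjy', 'Nzmbd', 'Kurnool', 'Srk']}
df = pd.DataFrame(x)

# Assumption: the strings are day/month/year, so state the format explicitly
df['Dates'] = pd.to_datetime(df['Dates'], format='%d/%m/%Y')

# Move the parsed column into the index; the other columns are untouched
df = df.set_index('Dates')
print(df)
```

With an explicit format the last row is indexed as 11 September 1999 rather than 9 November 1999.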
You can try this:
In [1]:
import pandas as pd
## Set the DataFrame
x = {
'Dates':['24/09/1998', '26/01/1999', '28/08/1999', '11/09/1999'],
'Names': ['A', "B", 'C', 'D'],
'Marks': [5, 8, 5, 9],
'City': ['Rjy', 'Nzmbd', 'Kurnool', 'Srk']
}
df = pd.DataFrame(x)
df['Dates']=pd.to_datetime(df['Dates'])
# Change the index
df = df.set_index('Dates')
df
Out [1]:
Names Marks City
Dates
1998-09-24 A 5 Rjy
1999-01-26 B 8 Nzmbd
1999-08-28 C 5 Kurnool
1999-11-09 D 9 Srk
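As for why the original attempt produced NaN and NaT: reindex does not relabel rows. It looks up each new label in the existing index (here the default 0, 1, 2, ...) and fills rows it cannot find with missing values. set_index, by contrast, moves a column into the index and keeps each row's data. A small sketch of the difference, using a shortened two-row frame invented for illustration:

```python
import pandas as pd

# Hypothetical two-row frame, shortened from the question's data
df = pd.DataFrame({
    'Dates': pd.to_datetime(['1998-09-24', '1999-01-26']),
    'Marks': [5, 8],
})

# reindex searches the existing index (0, 1) for the date labels;
# none are found, so every row comes back as missing
reindexed = df.reindex(pd.DatetimeIndex(df['Dates']))
print(reindexed['Marks'].isna().all())

# set_index reuses the column as the index and keeps the row data
indexed = df.set_index('Dates')
print(indexed['Marks'].tolist())
```

If the Dates column should also stay as a regular column, set_index accepts drop=False.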