How to handle a column that contains date, number, and string values in a Python DataFrame

Question

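I have an input CSV file in which column A holds only numeric values, while column B holds a mix of numbers, strings, and dates. When I read this CSV with pd.read_csv and write the data to an Excel file with to_excel(), the dates and numbers in column B of the output file are stored as strings. (Note: in Excel, string values are left-aligned in the cell, while numbers and dates are right-aligned; in my output file, the dates and numbers are left-aligned, i.e. stored as strings.) How can I prevent this?

Input file link: file

Sample code: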

import pandas as pd
import numpy as np
data = pd.read_csv("input.csv", parse_dates=False, na_filter=False)
print(data.dtypes)
data.to_excel('output.xlsx', sheet_name='sheet1', index=False, float_format=None)

Output file problem: picture (the marked cells in column B contain dates and numbers that are stored as strings)

Expected output: expected output (the marked cells hold dates and numbers right-aligned in the cell)

dtypes: A int64, B object, dtype: object

Answer 1

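As I understand it, you need several kinds of formatting applied within the same column.

Check whether the following lines work for you:

# Try to cast the whole column to int; with errors='ignore' the column
# comes back unchanged if any value cannot be cast
df['yourcolumn'] = df['yourcolumn'].astype(int, errors='ignore')
# Try to parse the column as dates; errors='ignore' returns the input
# unchanged on failure (this option is deprecated in pandas 2.2+)
df['yourcolumn'] = pd.to_datetime(df['yourcolumn'], format='%d%b%Y', errors='ignore')
print(df.head())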
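
Since both calls in this answer act on the column as a whole, a column that really mixes numbers, dates, and text may come back unchanged. A possible alternative is to convert cell by cell. The following is only a minimal sketch under stated assumptions: the helper name convert_cell is hypothetical, the sample values are made up for illustration, and the date format %d%b%Y is taken from this answer rather than from the actual input file.

```python
from datetime import datetime

import pandas as pd


def convert_cell(value, date_format="%d%b%Y"):
    """Return value as int, float, or datetime when it parses as one, else unchanged.

    Hypothetical helper; date_format is an assumption about the input data.
    """
    if not isinstance(value, str):
        return value
    text = value.strip()
    # Try the numeric types first, from strictest to loosest.
    for caster in (int, float):
        try:
            return caster(text)
        except ValueError:
            pass
    # Fall back to a date in the assumed format.
    try:
        return datetime.strptime(text, date_format)
    except ValueError:
        return value  # plain text stays a string


# Made-up sample data standing in for the CSV from the question.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": ["10", "2.5", "01Jan2020", "hello"]})
df["B"] = df["B"].map(convert_cell)
print(df["B"].map(type).tolist())

# Writing the result is unchanged from the question (needs an Excel engine
# such as openpyxl installed):
# df.to_excel("output.xlsx", sheet_name="sheet1", index=False)
```

The column keeps the object dtype, but each cell now holds a Python int, float, datetime, or str, and to_excel should write each cell according to its own type. Note that float() also accepts strings such as "nan" or "inf", so text cells with those exact values would be turned into floats.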

Answer 2

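I solved this problem with xlwings. Example code:

    import xlwings as xw
    import pandas as pd

    # Read everything as-is: no date parsing, no NaN substitution
    df = pd.read_csv('asset_input.csv', encoding='cp1252', parse_dates=False, na_filter=False)
    app = xw.App(visible=False)
    # Open the workbook in the hidden Excel instance created above
    book = app.books.open('e2.xlsm')
    sht = book.sheets['ASSET']
    # Write the values only (no index, no header), starting at cell B9
    sht.range('B9').options(index=False, header=False).value = df
    book.save()
    book.close()
    app.quit()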

Tags: python, numpy, pandas, xlwings