函数前的变量赋值
Variables assignement before function
我创建了一个包来使用 pandas 和 xlsxwriter 快速转换数据。
这工作得很好,我成功地完成了一些功能。不过最近碰壁了:
对于一些函数,我需要先定义变量,但它们不是基本类型(列表、元组、str 等),而是例如数据框。我研究了全局变量,发现它们没有被推荐(并且不知道将它们放在哪里),我也研究了 类 但我不知道如何使用它们解决我的问题。我也试过创建一个空数据框,但在函数之后得到一个空数据框。
我正在尝试做的是使用 pandas 读取 .csv 或 .xlsx 的函数以及使用 Xlsxwriter 引擎保存的函数。
目标是尽可能少地更改代码以频繁且快速地转换数据(例如,我有像 Excel 中那样执行 LEFT,RIGHT 的功能,甚至是带有列号的 MIDDLE)并且在 main.py.
这是我的代码的精简版本,它使用 2 个 python 文件(main.py 和 format_operations.py)。我在遇到问题的地方添加了评论。
预先感谢您的帮助!
"""
main.py
"""
import format_operations as tbfrm #import another python file in the same folder
import pandas as pd
import numpy as np
import xlsxwriter.utility
#file settings
file_full_path= "C:/Tests/big_data.xlsx"
file_save_to= "C:/Tests/Xlsxwriter.xlsx"
sheet_name_save_to= "Xlswriter"
dfname = ??? #I need to create the variable but I don't know how
tbfrm.FCT_universal_read(dfname,file_full_path) #CAN'T GET IT TO WORK
#column operations and formatting
columns_numeric = [3,6] # (with pandas) list of columns with number values by iloc number, starts at 0 which is column A in Excel
tbfrm.FCT_columns_numeric(dfname,columns_numeric) #example of a WORKING function (if dfname is defined)
#write with Xlsxwriter engine
XLWRITER_DF = ??? #same problem as before, how to create the variable?
workbookvarname = ??? #same here
worksheetvarname = ??? # same here
tbfrm.FCT_df_xlsxwriter(XLWRITER_DF,dfname,file_save_to,sheet_name_save_to,workbookvarname,worksheetvarname) #CAN'T GET IT TO WORK
#### WORKING piece of code I want to execute after saving with Xlsxwriter engine ####
worksheet.set_zoom(80)
# Conditional formatting
color_range_1 = "J1:J{}".format(number_rows+1)
FORMAT1 = workbook.add_format({'bg_color': '#FFC7CE','font_color': '#9C0006'})
FORMAT2 = workbook.add_format({'bg_color': '#C6EFCE','font_color': '#006100'})
worksheet.conditional_format(color_range_1, {'type': 'bottom','value': '5','format': FORMAT1})
worksheet.conditional_format(color_range_1, {'type': 'top','value': '5','format': FORMAT2})
其他文件:
"""
format_operations.py
"""
import pandas as pd
import numpy as np
import xlsxwriter.utility
def FCT_universal_read(dfname,file_full_path):
if ".xls" in file_full_path:
dfname = pd.read_excel(file_full_path) #optional arguments:sheetname='Sheet1', header=0 , dtype=object to preserve values
if ".csv" in file_full_path:
dfname = pd.read_csv(file_full_path)
# save file with XLSXWriter engine for additional options to pandas
def FCT_df_xlsxwriter(XLWRITER_DF,dfname,file_save_to,sheet_name_save_to,workbookvarname,worksheetvarname):
XLWRITER_DF = pd.ExcelWriter(file_save_to, engine='xlsxwriter')
dfname.to_excel(XLWRITER_DF, sheet_name=sheet_name_save_to,encoding='utf-8')
workbookvarname = XLWRITER_DF.book
worksheetvarname = XLWRITER_DF.sheets[sheet_name_save_to]
#format as numbers
def FCT_columns_numeric(dfname,columns_numeric):
for x in columns_numeric:
dfname.iloc[:,x] = pd.to_numeric(dfname.iloc[:,x])
您的 FCT_universal_read 函数不应修改数据框,而应 return 修改一个新数据框:
def FCT_universal_read(file_full_path):
if file_full_path.split('.')[-1] == "xls":
df = pd.read_excel(file_full_path) #optional arguments:sheetname='Sheet1', header=0 , dtype=object to preserve values
if file_full_path.split('.')[-1] == "csv":
df = pd.read_csv(file_full_path)
return df
在你的主要任务中,做:
dfname = tbfrm.FCT_universal_read(file_full_path)
与 FCT_df_xlsxwriter 相同的答案,您应该用 return 重写它,这样您就可以:
XLWRITER_DF, workbookvarname,worksheetvarname = tbfrm.FCT_df_xlsxwriter(dfname,file_save_to,sheet_name_save_to)
要了解 python 如何处理您传递给函数的参数,您应该阅读这些博文:
https://jeffknupp.com/blog/2012/11/13/is-python-callbyvalue-or-callbyreference-neither/
https://robertheaton.com/2014/02/09/pythons-pass-by-object-reference-as-explained-by-philip-k-dick/
您需要更新 FCT_universal_read
以便它 return 成为您想要的数据框。无需在函数外定义数据框,只需创建并 return it
df = FCT_universal_read('/your/file/path')
def FCT_universal_read(file_full_path):
if ".xls" in file_full_path:
df = pd.read_excel(file_full_path) #optional arguments:sheetname='Sheet1', header=0 , dtype=object to preserve values
return df
if ".csv" in file_full_path:
df = pd.read_csv(file_full_path)
return df
非常感谢你们!我现在明白了 :)!也感谢您提供文档。
我成功地完成了这两个功能。我已经挣扎了几个小时。
我喜欢您使用的 .split 函数,它确保脚本只查看扩展名。
我按照你说的更新了 FCT_xlsxwriter 和 FCT_universal_read。以下是更正的两个函数:
'''
format_operations.py
'''
def FCT_universal_read(file_full_path):
if "xls" in file_full_path.split('.')[-1]:
dfname = pd.read_excel(file_full_path) #example: C:/Tests/Bigdata.xlsx
return dfname
if "csv" in file_full_path.split('.')[-1]:
dfname = pd.read_csv(file_full_path)
return dfname
def FCT_df_xlsxwriter(dfname,file_save_to,sheet_name_save_to):
XLWRITER_DF = pd.ExcelWriter(file_save_to, engine='xlsxwriter')
dfname.to_excel(XLWRITER_DF, sheet_name=sheet_name_save_to,encoding='utf-8')
workbook = XLWRITER_DF.book
worksheet = XLWRITER_DF.sheets[sheet_name_save_to]
return XLWRITER_DF,workbook,worksheet
下面是我调用这两个函数的方式:
'''
main.py
'''
import format_operations as tbfrm
import pandas as pd
import xlsxwriter.utility
#settings
file_full_path= "C:/Tests/big_data.xlsx"
file_save_to= "C:/Tests/Xlsxwriter.xlsx"
sheet_name_save_to= "Xlswriter"
#functions
FILE_DF = tbfrm.FCT_universal_read(file_full_path)
XLWRITER_DF,workbook,worksheet = tbfrm.FCT_df_xlsxwriter(FILE_DF,file_save_to,sheet_name_save_to)
我创建了一个包来使用 pandas 和 xlsxwriter 快速转换数据。 这工作得很好,我成功地完成了一些功能。不过最近碰壁了:
对于一些函数,我需要先定义变量,但它们不是基本类型(列表、元组、str 等),而是例如数据框。我研究了全局变量,发现它们没有被推荐(并且不知道将它们放在哪里),我也研究了 类 但我不知道如何使用它们解决我的问题。我也试过创建一个空数据框,但在函数之后得到一个空数据框。
我正在尝试做的是使用 pandas 读取 .csv 或 .xlsx 的函数以及使用 Xlsxwriter 引擎保存的函数。 目标是尽可能少地更改代码以频繁且快速地转换数据(例如,我有像 Excel 中那样执行 LEFT,RIGHT 的功能,甚至是带有列号的 MIDDLE)并且在 main.py.
这是我的代码的精简版本,它使用 2 个 python 文件(main.py 和 format_operations.py)。我在遇到问题的地方添加了评论。 预先感谢您的帮助!
"""
main.py
"""
import format_operations as tbfrm #import another python file in the same folder
import pandas as pd
import numpy as np
import xlsxwriter.utility
#file settings
file_full_path= "C:/Tests/big_data.xlsx"
file_save_to= "C:/Tests/Xlsxwriter.xlsx"
sheet_name_save_to= "Xlswriter"
dfname = ??? #I need to create the variable but I don't know how
tbfrm.FCT_universal_read(dfname,file_full_path) #CAN'T GET IT TO WORK
#column operations and formatting
columns_numeric = [3,6] # (with pandas) list of columns with number values by iloc number, starts at 0 which is column A in Excel
tbfrm.FCT_columns_numeric(dfname,columns_numeric) #example of a WORKING function (if dfname is defined)
#write with Xlsxwriter engine
XLWRITER_DF = ??? #same problem as before, how to create the variable?
workbookvarname = ??? #same here
worksheetvarname = ??? # same here
tbfrm.FCT_df_xlsxwriter(XLWRITER_DF,dfname,file_save_to,sheet_name_save_to,workbookvarname,worksheetvarname) #CAN'T GET IT TO WORK
#### WORKING piece of code I want to execute after saving with Xlsxwriter engine ####
worksheet.set_zoom(80)
# Conditional formatting
color_range_1 = "J1:J{}".format(number_rows+1)
FORMAT1 = workbook.add_format({'bg_color': '#FFC7CE','font_color': '#9C0006'})
FORMAT2 = workbook.add_format({'bg_color': '#C6EFCE','font_color': '#006100'})
worksheet.conditional_format(color_range_1, {'type': 'bottom','value': '5','format': FORMAT1})
worksheet.conditional_format(color_range_1, {'type': 'top','value': '5','format': FORMAT2})
其他文件:
"""
format_operations.py
"""
import pandas as pd
import numpy as np
import xlsxwriter.utility
def FCT_universal_read(dfname,file_full_path):
if ".xls" in file_full_path:
dfname = pd.read_excel(file_full_path) #optional arguments:sheetname='Sheet1', header=0 , dtype=object to preserve values
if ".csv" in file_full_path:
dfname = pd.read_csv(file_full_path)
# save file with XLSXWriter engine for additional options to pandas
def FCT_df_xlsxwriter(XLWRITER_DF,dfname,file_save_to,sheet_name_save_to,workbookvarname,worksheetvarname):
XLWRITER_DF = pd.ExcelWriter(file_save_to, engine='xlsxwriter')
dfname.to_excel(XLWRITER_DF, sheet_name=sheet_name_save_to,encoding='utf-8')
workbookvarname = XLWRITER_DF.book
worksheetvarname = XLWRITER_DF.sheets[sheet_name_save_to]
#format as numbers
def FCT_columns_numeric(dfname,columns_numeric):
for x in columns_numeric:
dfname.iloc[:,x] = pd.to_numeric(dfname.iloc[:,x])
您的 FCT_universal_read 函数不应修改数据框,而应 return 修改一个新数据框:
def FCT_universal_read(file_full_path):
if file_full_path.split('.')[-1] == "xls":
df = pd.read_excel(file_full_path) #optional arguments:sheetname='Sheet1', header=0 , dtype=object to preserve values
if file_full_path.split('.')[-1] == "csv":
df = pd.read_csv(file_full_path)
return df
在你的主要任务中,做:
dfname = tbfrm.FCT_universal_read(file_full_path)
与 FCT_df_xlsxwriter 相同的答案,您应该用 return 重写它,这样您就可以:
XLWRITER_DF, workbookvarname,worksheetvarname = tbfrm.FCT_df_xlsxwriter(dfname,file_save_to,sheet_name_save_to)
要了解 python 如何处理您传递给函数的参数,您应该阅读这些博文:
https://jeffknupp.com/blog/2012/11/13/is-python-callbyvalue-or-callbyreference-neither/ https://robertheaton.com/2014/02/09/pythons-pass-by-object-reference-as-explained-by-philip-k-dick/
您需要更新 FCT_universal_read
以便它 return 成为您想要的数据框。无需在函数外定义数据框,只需创建并 return it
df = FCT_universal_read('/your/file/path')
def FCT_universal_read(file_full_path):
if ".xls" in file_full_path:
df = pd.read_excel(file_full_path) #optional arguments:sheetname='Sheet1', header=0 , dtype=object to preserve values
return df
if ".csv" in file_full_path:
df = pd.read_csv(file_full_path)
return df
非常感谢你们!我现在明白了 :)!也感谢您提供文档。
我成功地完成了这两个功能。我已经挣扎了几个小时。 我喜欢您使用的 .split 函数,它确保脚本只查看扩展名。
我按照你说的更新了 FCT_xlsxwriter 和 FCT_universal_read。以下是更正的两个函数:
'''
format_operations.py
'''
def FCT_universal_read(file_full_path):
if "xls" in file_full_path.split('.')[-1]:
dfname = pd.read_excel(file_full_path) #example: C:/Tests/Bigdata.xlsx
return dfname
if "csv" in file_full_path.split('.')[-1]:
dfname = pd.read_csv(file_full_path)
return dfname
def FCT_df_xlsxwriter(dfname,file_save_to,sheet_name_save_to):
XLWRITER_DF = pd.ExcelWriter(file_save_to, engine='xlsxwriter')
dfname.to_excel(XLWRITER_DF, sheet_name=sheet_name_save_to,encoding='utf-8')
workbook = XLWRITER_DF.book
worksheet = XLWRITER_DF.sheets[sheet_name_save_to]
return XLWRITER_DF,workbook,worksheet
下面是我调用这两个函数的方式:
'''
main.py
'''
import format_operations as tbfrm
import pandas as pd
import xlsxwriter.utility
#settings
file_full_path= "C:/Tests/big_data.xlsx"
file_save_to= "C:/Tests/Xlsxwriter.xlsx"
sheet_name_save_to= "Xlswriter"
#functions
FILE_DF = tbfrm.FCT_universal_read(file_full_path)
XLWRITER_DF,workbook,worksheet = tbfrm.FCT_df_xlsxwriter(FILE_DF,file_save_to,sheet_name_save_to)