How do I variablize a SQL INSERT INTO: the columns, the %s placeholders, and the values?

Question

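The problem I'm having: the columns in the Excel file change every day. If I add a column name to the INSERT INTO part, the %s part, and the for-loop part (along with its value inside the for loop), and that column is not in the Excel file, the script throws an error saying: "first_column_name_db" referenced before assignment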

If possible, how can I variablize the INSERT INTO part, the %s part, the columns found in the Excel file, and the values inserted into the database table?

A line should be added to the INSERT INTO part, the %s part, the Excel for loop, and the values part only when the corresponding column is found/located in the Excel file.

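Or, if a column is not found/located in the Excel file, is there a better way to skip its line in each section so that whatever is found still gets inserted into the database table?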

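(I have also added comments in the code at the INSERT INTO and %s VALUES parts, at the Excel for loop, and at the values inside the for loop. The first two for loops are only used to look up the column indexes by name.)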

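What I want are variables that hold the Excel columns and the values to insert for the %s placeholders, depending on which columns are found in the Excel file. I'm not sure whether a list would work. I've been wondering how I should start/proceed with this task.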

        import psycopg2
        import xlrd

        try:
            conn = psycopg2.connect(user = 'Username', password = 'Password', host = 'Host_name', database = 'DB_name', port = Port_number)
            mycursor = conn.cursor()

            print('DB connection open')
            print('XLRD Data inserting into DB Table')

            #Open the Excel workbook
            book = xlrd.open_workbook('excel_name.xlsx')
            #Select the worksheet by index, or use book.sheet_by_name(...)
            sheet = book.sheet_by_index(0)

            #Loop over the row indexes
            for rowidx in range(sheet.nrows):
                row = sheet.row(rowidx)
                #Loop over the column indexes
                for colidx, cell in enumerate(row):
                    #Match the cell text against each header name
                    if cell.value == "Header_Name_Column_A":
                        #Store the header's column index
                        header_name_a_index = colidx

                    if cell.value == "Header_Name_Column_B":
                        #Store the header's column index
                        header_name_b_index = colidx

                    if cell.value == "Header_Name_Column_C":
                        #Store the header's column index
                        header_name_c_index = colidx

                    if cell.value == "Header_Name_Column_D":
                        #Store the header's column index
                        header_name_d_index = colidx

                    #INSERT INTO lines are only added if columns are found or located in the Excel
                    #VALUES of %s are updated based on locating/finding columns in the Excel
                    sql = """INSERT INTO db_table_name(
                    first_column_name_db,
                    second_column_name_db,
                    third_column_name_db,
                    fourth_column_name_db
                    )

                    VALUES(
                    %s,
                    %s,
                    %s,
                    %s)"""

                    #loop through all rows and cells
                    for r in range(1, sheet.nrows):
                        #The columns are added as they are located/found in the Excel
                        first_column_name_db = sheet.cell(r, header_name_a_index).value
                        second_column_name_db = sheet.cell(r, header_name_b_index).value
                        third_column_name_db = sheet.cell(r, header_name_c_index).value
                        fourth_column_name_db = sheet.cell(r, header_name_d_index).value

                        #The values are updated based on the columns located/found in the Excel
                        values = (
                        first_column_name_db,
                        second_column_name_db,
                        third_column_name_db,
                        fourth_column_name_db
                        )

                        mycursor.execute(sql, values)

                    #Commit to the DB. Close the mycursor and conn.
                    mycursor.close()
                    conn.commit()
                    conn.close()

        except Exception as e:
            #Close cursor and connection if error
            mycursor.close()
            conn.close()
            print('Error')
            print(e)

Answer 1

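I would build sql and values dynamically by determining, in your first for loop, which columns exist, and then keeping a list of (column name, index) tuples for the columns that need to go into the database.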

For example, you could initialize a new list on each iteration of the outer for loop:

sqlCols = []

Then add the last line below to each condition inside the for colidx, cell in enumerate(row): loop:

if cell.value == "Header_Name_Column_A":
    #Record the DB column name together with the header's column index
    sqlCols.append(('first_column_name_db', colidx))
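
The repeated if blocks can also be collapsed into a single lookup. This is only a minimal sketch: the HEADER_TO_DB_COLUMN dictionary and the find_columns helper are hypothetical names that are not part of the question's code, and a plain list of strings stands in for the header row that xlrd would return.

```python
# Hypothetical mapping from Excel header text to database column name.
HEADER_TO_DB_COLUMN = {
    "Header_Name_Column_A": "first_column_name_db",
    "Header_Name_Column_B": "second_column_name_db",
    "Header_Name_Column_C": "third_column_name_db",
    "Header_Name_Column_D": "fourth_column_name_db",
}


def find_columns(header_row):
    """Return (db_column_name, column_index) pairs for the known headers
    that are actually present in header_row."""
    sql_cols = []
    for colidx, header in enumerate(header_row):
        db_name = HEADER_TO_DB_COLUMN.get(header)
        if db_name is not None:
            sql_cols.append((db_name, colidx))
    return sql_cols


# Stand-in for the header cell values of today's sheet (assumed data).
header_row = ["Header_Name_Column_C", "Some_Other_Header", "Header_Name_Column_A"]
sql_cols = find_columns(header_row)
```

Headers that are missing from the sheet never reach the list, so nothing downstream refers to a variable that was never assigned.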

Then, when you get to the INSERT command, you can define the sql string and the sql values (only for the columns that exist):

sql = "INSERT INTO db_table_name("+','.join([x[0] for x in sqlCols])+") VALUES("+','.join(['%s' for _ in range(len(sqlCols))])+")"
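
To make the result of that one-liner easier to see, here is the same idea as a small function. It is a sketch under assumptions: build_insert_sql is a hypothetical helper name, and the sample list pretends that only columns A and C were found in the sheet.

```python
def build_insert_sql(table, sql_cols):
    """Build an INSERT statement with one %s placeholder per found column.

    sql_cols is a list of (db_column_name, column_index) tuples.
    """
    if not sql_cols:
        raise ValueError("no known columns were found in the sheet")
    columns = ", ".join(name for name, _ in sql_cols)
    placeholders = ", ".join(["%s"] * len(sql_cols))
    return f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"


# Assumed example: only two of the four columns are present today.
sql_cols = [("first_column_name_db", 0), ("third_column_name_db", 2)]
sql = build_insert_sql("db_table_name", sql_cols)
```

Joining names straight into the statement is reasonable here only because the column names are written in your own code; text read from the Excel file should never be pasted into the SQL string this way.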

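And because the (column_name, column_index) tuple in each element of sqlCols always pairs the name with its index, the order in which you write the values doesn't matter, so this reduces to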

for r in range(1, sheet.nrows):
    #Only the columns that were located/found in the Excel are read
    #The values follow the same order as the column names in sqlCols
    values = [sheet.cell(r,x[1]).value for x in sqlCols]

    mycursor.execute(sql, values)
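
Putting the pieces together, the flow could look roughly like the sketch below. To keep it runnable without a database or an Excel file, a hypothetical RecordingCursor class stands in for the psycopg2 cursor and plain lists stand in for the sheet rows; insert_rows is likewise an illustrative name. It uses executemany, which psycopg2 cursors also provide, instead of calling execute once per row.

```python
class RecordingCursor:
    """Stand-in for a database cursor so this sketch runs without a server."""

    def __init__(self):
        self.calls = []

    def executemany(self, sql, seq_of_params):
        # Record what would have been sent to the database.
        self.calls.append((sql, list(seq_of_params)))


def insert_rows(cursor, table, sql_cols, data_rows):
    """Insert only the columns listed in sql_cols for every data row."""
    if not sql_cols or not data_rows:
        return None  # nothing to insert
    columns = ", ".join(name for name, _ in sql_cols)
    placeholders = ", ".join(["%s"] * len(sql_cols))
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    # One parameter tuple per row, in the same order as the column names.
    params = [tuple(row[idx] for _, idx in sql_cols) for row in data_rows]
    cursor.executemany(sql, params)
    return sql


# Assumed data: columns B and D are absent today; rows exclude the header.
sql_cols = [("first_column_name_db", 0), ("third_column_name_db", 2)]
data_rows = [["a1", "ignored", "c1"], ["a2", "ignored", "c2"]]

cursor = RecordingCursor()
sql = insert_rows(cursor, "db_table_name", sql_cols, data_rows)
```

With a real connection you would pass conn.cursor() instead of the stand-in and call conn.commit() once after the insert.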


python

postgresql

xlrd

python-3.x