In Python 3.6, how could a long running loop cause a variable to become undefined?

Question

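Short version: I have a nested loop, and when it runs longer than normal (2 hours), something causes a variable to become undefined only within that inner loop. The outer loop can use it without any problem.

Long version: I have a Python 3.6 script running on a RHEL machine that pulls tables from a SQL server, 500,000 rows at a time, and writes them to .csv files, which are then processed into a different database.

I first establish connections to the source and the target in a class so that the sessions persist and are reused for every operation. I have heartbeat queries that run while my loops are running, to keep the otherwise inactive sessions alive. (Not shown, but currently only in the outer loop - this will change soon.)

The framework is fairly simple... I read the list of source tables from a file into a dataframe. The code is then as follows (some code in the inner loop has been removed, but the comments explaining what happens are kept):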

for index, row in tables_to_load.iterrows():

    #try to load all new rows into target
    try: 
        table_name = tables_to_load['Source_Table'][index]
        pk = tables_to_load['pk'][index]
        odsstage_table_name = tables_to_load['ODSSTAGE_Table'][index]
        ods_table_name = tables_to_load['ODS_Table'][index]
        last_loaded_lsn = sf_ods.get_last_loaded_lsn(table_name)
        max_lsn = source.get_max_lsn(table_name)

        # Get the number of rows to load
        incr_count = source.get_incr_count(table_name, pk, last_loaded_lsn)
        total_process_rows = total_process_rows + incr_count

        #WORKS FINE EVERY TIME
        file.write("There are  " + str(incr_count) + " rows to be loaded for the table: " + table_name + "\n\n")

        # Set up your offset limits
        offset_iterator = 0 # Start at the 1st row of the increment
        
        fetch_count = 500000 # Increment by 500k rows. 
        
        if incr_count == 0: #only run if there are new rows to load to target
            status = "No Rows To Load"

        
        else: # If there are rows to load, proceed loading them to target
            try:
                inc_write_success = 1

                # Begin looping through the increment in source until offset_iterator > the number of rows in the increment
                while offset_iterator < incr_count:
                    sf_dw.heartbeat()

                    #ERROR OCCURS HERE
                    file.write(table_name + ": Loading the next " + str(fetch_count) + " rows starting at row " + str(offset_iterator + 1) + " out of " + str(incr_count) + " rows" + "\n")
                                                             
                    # Get the incremental dataframe starting at the offset_indicator row
                    #the df is the first object returned
                    #the list of columns in the original table is the third object returned
                    
                    # Write the dataframe to a compressed file at the temporary storage
                    # Move the file to target and COPY INTO the ODSSTAGE table in target
                    
                    
                    # Merge ODSSTAGE to ODS
                                            
                    # Update the offset_iterator to fetch the next increment of rows until you've reached the max_lsn
                    offset_iterator += fetch_count
                 
            except Exception as e:
                any_errors = True
                inc_write_success = 0
         # Update log_cdc table
            try:
                if inc_write_success == 1:
                    #WORKS FINE
                    sf_ods.update_last_loaded_lsn(table_name, max_lsn, status)
                else:
                    #WORKS FINE
                    file.write("Unable to Update Max_LSN for " + table_name + " due to increment failure. \n")
            except Exception as e:
                file.write(str(e) + "\n")

    except Exception as e:
        file.write(str(e) + "\n")
        inc_write_success = 0

    stop_table_load = datetime.datetime.now()
    load_table_time = stop_table_load - table_load_start
    
    #AGAIN WORKS FINE
    file.write("Finished loading " + table_name + ". Load Time: " + str(load_table_time) + "\n")

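The problem: whenever the inner loop takes more than 4 hours to complete, the next iteration of the 'for' loop seems to handle everything fine until the first run of the inner loop, where I am greeted with "name 'table_name' is not defined". As far as I can tell, it seems to happen at the indicated spot, where I am writing a few lines to the log file. Several lines in the log file show that the table_name variable works fine both before and after the inner loop is called.

The variable is not defined as a global... it is defined within the try.

Should I create it as a global variable? Rename it within the inner loop?

I'm not at liberty to post anything more specific about the project.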

Answer 1

I don't believe the program is failing where you suggest, at least not judging by this snippet alone.

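My guess is that table_name = tables_to_load['Source_Table'][index] is what is actually failing. If it fails, table_name will never be defined; your code will jump to your outer except statement, write to the file, and set inc_write_success. It will then try to execute file.write("Finished loading " + table_name + ". Load Time: " + str(load_table_time) + "\n") and fail there.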
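A minimal sketch of that failure mode, assuming the name has not already been bound by an earlier iteration (a name assigned in a previous pass of a for loop stays bound). The dictionary `row` and the list `log` are hypothetical stand-ins for the question's dataframe and log file, not the asker's actual objects:

```python
# Hypothetical stand-in for one row of tables_to_load;
# the 'Source_Table' key is deliberately missing.
row = {"pk": "id"}
log = []  # hypothetical stand-in for the log file

try:
    # The lookup raises KeyError before the assignment completes,
    # so table_name is never bound.
    table_name = row["Source_Table"]
    log.append("Loading " + table_name)
except Exception as e:
    # Mirrors the outer except in the question: log the error and carry on.
    log.append("outer except: " + repr(e))

try:
    # Mirrors the "Finished loading" write that sits after the outer try/except.
    log.append("Finished loading " + table_name)
except NameError as e:
    log.append("NameError: " + str(e))

for line in log:
    print(line)
```

The first logged line comes from the outer except; the second shows that the later use of table_name is where the NameError actually surfaces.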

Providing a stack trace that shows where the error occurs is also very helpful in these situations.
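One way to capture such a stack trace is to log traceback.format_exc() instead of str(e) in the except blocks. This is only a sketch: `log_file` and `load_table` are hypothetical names standing in for the question's `file` object and table-loading code.

```python
import io
import traceback

log_file = io.StringIO()  # hypothetical stand-in for the question's log file


def load_table(row):
    # Hypothetical helper that fails the way the answer suspects.
    return row["Source_Table"]


try:
    load_table({"pk": "id"})
except Exception:
    # format_exc() returns the full traceback as a string, including the
    # function name and line number where the exception was raised.
    log_file.write(traceback.format_exc())

print(log_file.getvalue())
```

Unlike str(e), which only gives the message, the traceback identifies the line that raised the exception, so it would confirm or rule out the guess above.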

python

variables