Get a pandas DataFrame inside a function from a user-supplied DataFrame name string

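I have several pandas DataFrames with names such as timeslices1_df. I want to extract certain columns from one of these DataFrames, with the DataFrame and columns chosen from values the user enters.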

def user_input_dataframe():
    
    timeslices_number=int(input("timeslice_number:"))
    process_number=int(input("process_number:"))
    core_number=input("core_number:")
    #timeslices_4__profilerdataprocess_45__c0_us_ example column_name
    dataframe_name="timeslices_"+str(timeslices_number)+"_df"
    column_name="timeslices_"+str(timeslices_number) +'__'+ "profilerdataprocess_"+str(process_number)+'__'+str(core_number)+'_'+"us"
    #print(column_name)
    list_of_datasets = [timeslices0_df,timeslices1_df,timeslices2_df ,timeslices3_df,timeslices4_df,timeslices5_df,
                       timeslices6_df,timeslices7_df,timeslices8_df]
    for index, dataset in enumerate(list_of_datasets):
        if  dataframe_name in dataset:
            X_df=pd.DataFrame()
            X_df.append(dataframe_name)
            X1 = [col for col in X_df.columns if column_name in col]
            X2=pd.DataFrame()
            X2=X_df[X1]
            X2['date'] = pd.date_range(start='1/1/2020', periods=len(X1), freq='D')
            X2=X2.set_index('date')
            return X2

I can't get this to work because the user input is a string. I get this error. Is there another way to retrieve a DataFrame based on user input?

Sample input:
   timeslice_number:4
   process_number:45
   core_number:c0

Expected output: a new DataFrame with a single selected column

Actual output: an empty DataFrame

It looks like you want to access variables programmatically by name. The built-in functions locals() and globals() let you do that.

variable1 = 1
variable2 = 2
variable3 = 3

i = 2
print(locals().get(f'variable{i}'))
# prints 2

def get_variable(i):
    return globals().get(f'variable{i}')

print(get_variable(3))
# prints 3
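Applied to the naming scheme in the question, the same lookup might look like the sketch below. The two small DataFrames and the helper name get_timeslice_df are made up for illustration, and the sketch assumes the DataFrames are module-level variables named timeslices<number>_df (without the extra underscore that the question's dataframe_name string contains).

```python
import pandas as pd

# Hypothetical stand-ins for the question's module-level DataFrames
timeslices0_df = pd.DataFrame({"a": [1, 2]})
timeslices1_df = pd.DataFrame({"b": [3, 4]})

def get_timeslice_df(number):
    """Return the module-level DataFrame named timeslices<number>_df, or None."""
    # globals() is the module namespace as a dict, so .get() avoids a KeyError
    return globals().get(f"timeslices{number}_df")

print(get_timeslice_df(1))  # the DataFrame bound to timeslices1_df
print(get_timeslice_df(7))  # no such variable, so None
```

Because the lookup returns None for an unknown name, the caller should check the result before using it.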

However, wouldn't it be cleaner to keep the DataFrames in a list or dictionary? Something like:

timeslice_dfs = [
    timeslices1_df,
    timeslices2_df,
    # etc.
]

dfs = {
    'timeslice1': timeslices1_df,
    # etc.
}
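With a dictionary, the user's number can be used directly as the key, and the column filter from the question becomes a small function. This is only a sketch: the sample data, the integer keys, and the function name select_columns are assumptions for illustration, and the column names follow the example given in the question's comment.

```python
import pandas as pd

# Hypothetical sample data using the question's column naming scheme
dfs = {
    4: pd.DataFrame({
        "timeslices_4__profilerdataprocess_45__c0_us_": [10, 20, 30],
        "timeslices_4__profilerdataprocess_45__c1_us_": [11, 21, 31],
    }),
}

def select_columns(dfs, timeslice_number, process_number, core_number):
    """Return the matching columns of the chosen DataFrame, or None if the key is unknown."""
    df = dfs.get(timeslice_number)
    if df is None:
        return None
    prefix = (
        f"timeslices_{timeslice_number}"
        f"__profilerdataprocess_{process_number}"
        f"__{core_number}_us"
    )
    matching = [col for col in df.columns if prefix in col]
    # Copy so later changes to the result do not touch the stored DataFrame
    return df[matching].copy()

result = select_columns(dfs, 4, 45, "c0")
print(result)
```

Passing the values as arguments rather than calling input() inside the function also makes it easier to test; the input() calls can stay in a thin wrapper.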

Could you try this?

import pandas as pd

def user_input_dataframe():
    timeslices_number = int(input("timeslice_number:"))
    process_number = int(input("process_number:"))
    core_number = input("core_number:")
    # Example column name: timeslices_4__profilerdataprocess_45__c0_us_
    column_name = f"timeslices_{timeslices_number}__profilerdataprocess_{process_number}__{core_number}_us"
    list_of_datasets = [timeslices0_df, timeslices1_df, timeslices2_df, timeslices3_df, timeslices4_df, timeslices5_df,
                        timeslices6_df, timeslices7_df, timeslices8_df]
    # The timeslice number is the position of the DataFrame in the list
    if not 0 <= timeslices_number < len(list_of_datasets):
        return None
    dataset = list_of_datasets[timeslices_number]
    # Copy the selection so that adding a column does not trigger SettingWithCopyWarning
    X1 = dataset[[col for col in dataset.columns if column_name in col]].copy()
    # One date per row of the selected data
    X1['date'] = pd.date_range(start='1/1/2020', periods=len(X1), freq='D')
    X1 = X1.set_index('date')
    return X1
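One detail worth checking is the date index: periods has to equal the number of rows of the selected data, whereas in the question's version len(X1) was the length of the list of matching column names. A minimal sketch of that step on its own, using a made-up single-column DataFrame named selected:

```python
import pandas as pd

# Hypothetical result of the column selection: one column, three rows
selected = pd.DataFrame(
    {"timeslices_4__profilerdataprocess_45__c0_us_": [10, 20, 30]}
)

# Build one daily timestamp per row and use it as the index
selected.index = pd.date_range(
    start="2020-01-01", periods=len(selected), freq="D", name="date"
)
print(selected)
```

Assigning the index directly does the same job as adding a 'date' column and then calling set_index('date').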