Data type assigned inside nested for loop isn't as expected
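I am getting the error:
AttributeError: 'float' object has no attribute 'lower'
when I run this triple-nested for loop:
for row_data in df_row_list:
    for row_item_data in row_data:
        for param in search_params:
            if row_item_data.lower() == param.lower():
                row_index = df_row_list.index(row_data)
df_row_list is a list of 18 Series. I am trying to iterate through it and comb through the data. How do I assign the str data type to row_item_data so that I can use the .lower() method?
This is what the data I am working with looks like: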
0 NaN NaN ... NaN NaN
1 REV. : NC ... NaN NaN
2 OP.# : 0200-00-0 ... NaN NaN
3 NaN NaN ... NaN NaN
4 WI ASM # HOLDER # ... TOOL STICK OUT TOOL LIFE (MIN)
5 NaN NaN ... 0.55 120
6 NaN NaN ... 0.55 120
7 NaN NaN ... 0.55 120
8 NaN NaN ... 0.55 240
9 NaN NaN ... 0.55 300
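The search parameters look for a Series that contains the following terms: HOLDER DESCRIPTION, CUTTER #, Operation, TOOL DESCRIPTION
I created a spreadsheet that stores a few hundred options that I will compare against.
I want it to return the index of the matching Series from df_row_list (the list of Series), so that I know where the "header row" of the data I want to use is.
Or is this not even the best way to comb through a list of Series for specific keywords? I am new to Python and open to any help.
Just posting in case anyone runs into a similar problem and is looking for a different solution.
This is how I found a solution: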
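For context on the error itself: the NaN cells in the sample above are floats, which is most likely where the 'float' object complaint comes from. A minimal sketch of the same loop that skips non-string cells instead of converting them is shown here. The function name find_header_row and the small rows sample are hypothetical and only for illustration; the function accepts any iterable of rows, which should include a list of Series.

```python
def find_header_row(rows, search_params):
    """Return the position of the first row containing any search term, else None."""
    # lower-case the search terms once so each comparison is case-insensitive
    wanted = {param.lower() for param in search_params}
    for row_index, row in enumerate(rows):
        for item in row:
            # NaN and numeric cells are not str, so they are skipped
            if isinstance(item, str) and item.strip().lower() in wanted:
                return row_index
    return None


nan = float("nan")
# hypothetical stand-in for df_row_list
rows = [
    [nan, nan],
    ["REV. :", "NC"],
    ["WI ASM #", "HOLDER #", "HOLDER DESCRIPTION"],
    [nan, 0.55],
]
print(find_header_row(rows, ["holder description", "CUTTER #"]))  # prints 2
```

Using enumerate also avoids calling df_row_list.index(row_data), which compares whole rows and can be unreliable when the rows are Series.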
import os
import pandas as pd

# the directory I want to pull files from
in_path = r'W:\R1_Manufacturing\Parts List Project\Tool_scraping\Excel'
# the file where the row search items are stored
search_parameters = r'W:\R1_Manufacturing\Parts List Project\search_params.xlsx'
# the file I will write the dataframes to
outfile_path = r'W:\R1_Manufacturing\Parts List Project\xlsx_reader.xlsx'

# lists that the looped data will be stored in
file_list = []
main_header_list = []

# list the contents of the input directory
files = os.listdir(in_path)

# dataframe with the terms that I want to track
search = pd.read_excel(search_parameters)
length_search = search.index

# turn the search dataframe into a string to do a case-insensitive compare
search_string = search.to_string(header=False, index=False)


# function for case-insensitive string compare
def insensitive_compare(x1, y1):
    return x1.lower() == y1.lower()


# function to iterate through current_file for strings and compare them to
# search_parameters to grab the data column headers
# (uses the module-level df, header_list, row_list and column_list)
def filter_data(columns, rows):  # I need to fix this to stop getting that A
    for name in columns:
        for number in rows:
            cell = df.at[number, name]
            # NaN and numeric cells are not strings, so skip them (and empty strings)
            if not isinstance(cell, str) or cell == '':
                continue
            for place in length_search:
                # main compare: if the cell matches a search parameter, then...
                if insensitive_compare(search.at[place, 'Parameters'], cell):
                    # this is to prevent repeats in the header list
                    if cell in header_list:
                        continue
                    header_list.append(cell)  # store data headers
                    row_list.append(number)  # store the row label where it is in that dataframe
                    column_list.append(name)  # store the column label where it is in that dataframe


# searching only for files that end with .xlsx
for file in files:
    if file.endswith('.xlsx'):
        file_list.append(os.path.join(in_path, file))

# read the files into a dataframe; main loop where the files are manipulated
for current_file in file_list:
    df = pd.read_excel(current_file)
    header_list = []
    # get the column headers and the index covering all rows
    columns = df.columns
    total_rows = df.index
    # lists that store where the headers are located in the dataframe
    row_list = []
    column_list = []
    storage_list = []
    # add the file name to the header list so it can be separated by file
    # header_list.append(current_file)
    main_header_list.append(header_list)
    # run the function to grab the header names
    filter_data(columns, total_rows)
So now when I run it and output the data, I get:
WI ASM #
HOLDER #
HOLDER DESCRIPTION
A.63.140.1/8z
A.63.140.1/8z
A.63.140.1/8z
A.63.140.1/8z
A.63.140.1/8z
CUTTER #
Harvey 980215
Harvey 980215
Harvey 28178
Harvey 28178
Harvey 74362-C3
OPERATION
GROOVE
ROUGHING
SEMI-FINISH
FINISH
DEBURR & BLEND
TOOL DESCRIPTION
CREM_.125_.015R_1
CREM_.125_.015R_2
CREM_.0781_.015R_1
CREM_.0781_.015R_2
BEM_.0625
Starting Radial Wear
-
-
-
-0.0002
-
TOOL STICK OUT
0.55
0.55
0.55
0.55
0.55
TOOL LIFE (MIN)
120
120
120
240
300
It has been cleaned up in the order I was looking for.
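For anyone who only needs the position of the header row, a shorter pandas-based variant may be enough. This is only a sketch under assumptions: the function name header_row_index and the tiny sample frame are made up for illustration, and it is not the code that produced the output above. It converts every cell to a lower-cased string (so NaN becomes the text "nan" and never raises), then looks for the first row with a match.

```python
import pandas as pd


def header_row_index(df, search_params):
    """Return the index label of the first row containing a search term, else None."""
    wanted = [param.lower() for param in search_params]
    # astype(str) makes every cell a string, so the .str methods are safe on NaN and numbers
    lowered = df.apply(lambda col: col.astype(str).str.strip().str.lower())
    # True for each row in which at least one cell equals a search term
    hits = lowered.isin(wanted).any(axis=1)
    return hits.idxmax() if hits.any() else None


nan = float("nan")
# hypothetical sample shaped like the sheet in the question
sample = pd.DataFrame([
    [nan, nan],
    ["REV. :", "NC"],
    ["WI ASM #", "HOLDER DESCRIPTION"],
    [nan, 0.55],
])
print(header_row_index(sample, ["holder description", "CUTTER #"]))  # prints 2
```

One caveat of this approach: because NaN is turned into the text "nan", a search term that is literally "nan" would match empty cells.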