Blank rows occurring despite use of NOT NULL and <> ''
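I'm trying to remove all empty/blank cells from my table. However, I still have some blank cells after trying to remove them with the methods mentioned in the title, NOT NULL and <> ''. Neither seems to remove the blank cells. I'm not sure what other type of value they could be. The columns are varchar.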
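Nobody else seems to have run into this, as I haven't been able to find a similar article or question. The table is an incredible mess, with obvious inconsistencies all over the place.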
SELECT * FROM table WHERE column is NOT NULL AND column <> ''
Ideally, all of the blank cells would be gone, so that I can be sure my Pandas df is accurate.
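A minimal sketch of why that filter can still let "blank" rows through. It uses an in-memory SQLite database as a stand-in (the question does not name the database engine), and the table name `readings` and the sample values are hypothetical. Values that contain only whitespace are neither NULL nor equal to '', so they pass both conditions.

```python
import sqlite3

# Hypothetical table and sample values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (rfid_sent VARCHAR)")
samples = [("A1B2",), ("",), (None,), (" ",), ("\t",), ("\u00a0",)]
conn.executemany("INSERT INTO readings VALUES (?)", samples)

# The same shape of filter as in the question.
rows = conn.execute(
    "SELECT rfid_sent FROM readings "
    "WHERE rfid_sent IS NOT NULL AND rfid_sent <> ''"
).fetchall()

survivors = [value for (value,) in rows]
# repr() makes the invisible characters visible.
print([repr(value) for value in survivors])
conn.close()
```

In this sketch the NULL and the empty string are filtered out, while the single space, the tab and the non-breaking space are returned alongside the real value.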
My code in Python found about 2,000 "null" entries in the table:
import numpy as np

def enumerate_null_data(df):
    # isnull/isna do not flag blank strings, so normalize '' (and None) to
    # np.nan, which pandas handles consistently
    df['rfid_sent'] = df['rfid_sent'].replace(['', None], np.nan)
    df['rfid_received'] = df['rfid_received'].replace(['', None], np.nan)
    # dataframes that no longer contain the null values
    sent_null_removed = df.dropna(subset=['rfid_sent'])
    received_null_removed = df.dropna(subset=['rfid_received'])
    # count the rows that were dropped from sent_null_removed/received_null_removed
    num_sent_null_removed = len(df[~df.index.isin(sent_null_removed.index)].index)
    num_received_null_removed = len(df[~df.index.isin(received_null_removed.index)].index)
    # dataframe containing only the rows where either value was null/NA
    na_only = df[~df.index.isin(sent_null_removed.index) | ~df.index.isin(received_null_removed.index)]
    return (na_only, num_sent_null_removed, num_received_null_removed)
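To find out what the remaining blank cells actually contain, one option is to count every distinct value that is either missing or made up only of whitespace, and to show it through repr() so invisible characters become readable. This is only a sketch: the helper name `blank_like_values` and the sample data are hypothetical, not part of the original code.

```python
import numpy as np
import pandas as pd


def blank_like_values(series: pd.Series) -> dict:
    """Count each distinct value that is missing or consists only of whitespace."""
    counts = {}
    for value in series:
        if pd.isna(value):
            key = "<missing>"
        elif isinstance(value, str) and value.strip() == "":
            # repr() exposes invisible characters such as tabs.
            key = repr(value)
        else:
            continue
        counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical sample standing in for one of the varchar columns.
sample = pd.Series(["A1B2", "", None, " ", "\t", "\u00a0", np.nan, " "])
blank_counts = blank_like_values(sample)
print(blank_counts)
```

Any key other than the missing marker and the empty string points at a kind of "blank" that replace(['', None], np.nan) does not cover.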
Honestly, I don't know what else to try. Am I missing some "Empty" format here? Among the values Pandas reports for the blank cells is np.nan.
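Yes, blank values come in a whole variety. :S Besides NULL and the empty string, a value can consist solely of whitespace, which this condition also excludes: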
WHERE col IS NOT NULL AND NOT col ~ '^\s*$'
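The same pattern can be applied on the pandas side. The sketch below, with hypothetical sample data in the two columns from the question, turns empty and whitespace-only strings into np.nan and then drops the affected rows. Note that Python's \s may not cover exactly the same characters as the database's regex engine.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({
    "rfid_sent": ["A1B2", "", " ", None, "C3D4"],
    "rfid_received": ["E5F6", "\t", "G7H8", "I9J0", "  \n"],
})

cols = ["rfid_sent", "rfid_received"]
cleaned = df.copy()
# Treat empty and whitespace-only strings as missing, mirroring '^\s*$'.
cleaned[cols] = cleaned[cols].replace(r"^\s*$", np.nan, regex=True)
# Keep only rows where both columns hold a real value.
kept = cleaned.dropna(subset=cols)
print(kept)
```

Working on a copy leaves the original frame untouched, so the dropped rows can still be inspected afterwards.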
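However, the data (that I'm anticipating/want to keep) is highly consistent in structure, so I just used a RegEx to filter out any entries that don't match what I expect.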
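An allowlist filter of that kind might look like the sketch below. The expected format is an assumption: the post does not say what a valid value looks like, so the pattern `[0-9A-F]{8}`, the helper name `keep_expected` and the sample data are all hypothetical.

```python
import pandas as pd

# Hypothetical format: eight uppercase hex digits. Replace with the real one.
EXPECTED = r"[0-9A-F]{8}"


def keep_expected(df: pd.DataFrame, columns, pattern: str = EXPECTED) -> pd.DataFrame:
    """Keep rows in which every listed column fully matches the pattern."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        # na=False makes missing values count as non-matching.
        mask &= df[col].str.fullmatch(pattern, na=False)
    return df[mask]


# Hypothetical sample rows.
frame = pd.DataFrame({
    "rfid_sent": ["0A1B2C3D", " ", "0A1B2C3D", None],
    "rfid_received": ["FFFF0000", "FFFF0000", "bad", "FFFF0000"],
})
valid = keep_expected(frame, ["rfid_sent", "rfid_received"])
print(valid)
```

Matching what is expected, rather than listing every possible kind of blank, also rejects malformed values that are not blank at all.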