polars dataframe TypeError: must be real number, not str
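So basically I changed pandas.DataFrame to polars.DataFrame to get better speed with yolov5.
But when I run the code, it works fine for a while (I don't know at what point the error occurs) and then it gives me TypeError: must be real number, not str. Running it with pandas works fine without any errors; the error only happens with Polars. I know a wrong data type must be involved somewhere, but I really don't know where to look because I've just started with Python.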
Traceback (most recent call last):
File "C:\yolov5\test.py", line 61, in <module>
boxes = results.polars().xywh[0]
File "c:\yolov5\.\models\common.py", line 684, in polars
setattr(new, k, [pl.DataFrame(x, columns=c) for x in a])
File "c:\yolov5\.\models\common.py", line 684, in <listcomp>
setattr(new, k, [pl.DataFrame(x, columns=c) for x in a])
packages\polars\internals\frame.py", line 311, in __init__
self._df = sequence_to_pydf(data, columns=columns, orient=orient)
packages\polars\internals\construction.py", line 495, in
data_series = [
packages\polars\internals\construction.py", line 496, in
pli.Series(columns[i], data[i], dtypes.get(columns[i])).inner()
packages\polars\internals\series.py", line 227, in __init__
self._s = sequence_to_pyseries(name, values, dtype=dtype,
packages\polars\internals\construction.py", line 239, in
return constructor(name, values, strict)
TypeError: must be real number, not str
import polars as pl
import pandas as pd
class new:
    xyxy = 0
a = [[[370.01605224609375, 346.4305114746094, 398.3968811035156,
384.5684814453125, 0.9011853933334351, 0, 'corn'],
[415.436767578125, 279.4227294921875, 433.930419921875,
305.5151672363281, 0.8829901814460754, 0, 'corn'],
[383.8118896484375, 268.781494140625, 402.35479736328125,
292.4585266113281, 0.8579609394073486, 0, 'corn'],
[431.42791748046875, 570.9154663085938, 476.672119140625, 600.0,
0.810459554195404, 0, 'corn'], [414.912841796875,
257.7676086425781, 427.7708740234375, 274.69635009765625,
0.7384995818138123, 0, 'corn'], [391.22821044921875,
250.48876953125, 403.9199523925781, 268.1374816894531,
0.6828912496566772, 0, 'corn'], [414.2362060546875,
250.18174743652344, 423.82537841796875, 264.02667236328125,
0.517136812210083, 0, 'corn']]]
ca = 'xmin', 'ymin', 'xmax', 'ymax', 'confidence', 'class', 'name' # xyxy columns
cb = 'xcenter', 'ycenter', 'width', 'height', 'confidence', 'class', 'name' # xywh columns
for k, c in zip(['xyxy', 'xyxyn', 'xywh', 'xywhn'], [ca, ca, cb, cb]):
    setattr(new, k, [pl.DataFrame(x, columns=c) for x in a])
print(new.xyxy[0])
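Near the end of your code, you create a new list of DataFrames:
setattr(new, k, [polars.DataFrame(x, columns=c) for x in a])
The error is raised inside the constructor call:
polars.DataFrame(x, columns=c)
What is happening is that one of the lists inside the x you pass to a DataFrame has a mix of numbers and strings. More specifically, one of the lists starts with one or more numbers but contains a string somewhere after that. This causes an error because Polars tries to create a numeric column from that list.
Let's take a closer look. Here is an example of creating a DataFrame: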
import polars as pl
pl.DataFrame([["one", "two", "three"], [1.0, 2.0, 3.0]],
columns=["col1", "col2"])
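Note that ["one", "two", "three"] are all strings, while [1.0, 2.0, 3.0] are all numbers, so each list becomes a column with a single type:
shape: (3, 2)
┌───────┬──────┐
│ col1  ┆ col2 │
│ ---   ┆ ---  │
│ str   ┆ f64  │
╞═══════╪══════╡
│ one   ┆ 1.0  │
│ two   ┆ 2.0  │
│ three ┆ 3.0  │
└───────┴──────┘
Now let's put a string into the list of numbers:
pl.DataFrame([["one", "two", "three"], [1.0, 2.0, "Oops, this is a string mixed in with numbers"]],
             columns=["col1", "col2"])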
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/xxxx/.virtualenvs/PolarsTesting3.10/lib/python3.10/site-packages/polars/internals/frame.py", line 311, in __init__
self._df = sequence_to_pydf(data, columns=columns, orient=orient)
File "/home/xxxx/.virtualenvs/PolarsTesting3.10/lib/python3.10/site-packages/polars/internals/construction.py", line 495, in sequence_to_pydf
data_series = [
File "/home/xxxx/.virtualenvs/PolarsTesting3.10/lib/python3.10/site-packages/polars/internals/construction.py", line 496, in <listcomp>
pli.Series(columns[i], data[i], dtypes.get(columns[i])).inner()
File "/home/xxxx/.virtualenvs/PolarsTesting3.10/lib/python3.10/site-packages/polars/internals/series.py", line 227, in __init__
self._s = sequence_to_pyseries(name, values, dtype=dtype, strict=strict)
File "/home/xxxx/.virtualenvs/PolarsTesting3.10/lib/python3.10/site-packages/polars/internals/construction.py", line 239, in sequence_to_pyseries
return constructor(name, values, strict)
TypeError: must be real number, not str
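So you need to look for a list that starts with one or more numbers but also contains a string. Polars tries to create a numeric column from that list and raises the error.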
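As a rough way to hunt for such a list, the sketch below scans each inner list for a mix of numbers and strings. The helper name find_mixed_lists is hypothetical (it is not part of Polars) and it uses only the standard library.

```python
from numbers import Real


def find_mixed_lists(data):
    """Return the indices of inner lists that contain both numbers and strings."""
    mixed = []
    for i, values in enumerate(data):
        # bool is a subclass of int, so exclude it from the "number" check.
        has_number = any(
            isinstance(v, Real) and not isinstance(v, bool) for v in values
        )
        has_string = any(isinstance(v, str) for v in values)
        if has_number and has_string:
            mixed.append(i)
    return mixed


good = [["one", "two", "three"], [1.0, 2.0, 3.0]]
bad = [["one", "two", "three"], [1.0, 2.0, "Oops"]]

print(find_mixed_lists(good))  # []
print(find_mixed_lists(bad))   # [1]
```

Any index it reports is a list that would fail if Polars treated it as a single numeric column.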
What you need to do is add orient="row" to the call that creates the DataFrame:
pl.DataFrame(x, columns=c, orient="row")
Once we make that change to your code by adding the orient="row" keyword and re-run it, we get:
shape: (7, 7)
┌────────────┬────────────┬────────────┬────────────┬────────────┬───────┬──────┐
│ xmin       ┆ ymin       ┆ xmax       ┆ ymax       ┆ confidence ┆ class ┆ name │
│ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---        ┆ ---   ┆ ---  │
│ f64        ┆ f64        ┆ f64        ┆ f64        ┆ f64        ┆ i64   ┆ str  │
╞════════════╪════════════╪════════════╪════════════╪════════════╪═══════╪══════╡
│ 370.016052 ┆ 346.430511 ┆ 398.396881 ┆ 384.568481 ┆ 0.901185   ┆ 0     ┆ corn │
│ 415.436768 ┆ 279.422729 ┆ 433.9304   ┆ 305.515167 ┆ 0.8829     ┆ 0     ┆ corn │
│ 383.8118   ┆ 268.781494 ┆ 402.354797 ┆ 292.458527 ┆ 0.857961   ┆ 0     ┆ corn │
│ 431.427917 ┆ 570.915466 ┆ 476.672119 ┆ 600.0      ┆ 0.8104     ┆ 0     ┆ corn │
│ 414.912842 ┆ 257.767609 ┆ 427.770874 ┆ 274.6963   ┆ 0.7385     ┆ 0     ┆ corn │
│ 391.2282   ┆ 250.4887   ┆ 403.919952 ┆ 268.137482 ┆ 0.682891   ┆ 0     ┆ corn │
│ 414.236206 ┆ 250.181747 ┆ 423.825378 ┆ 264.026672 ┆ 0.517137   ┆ 0     ┆ corn │
└────────────┴────────────┴────────────┴────────────┴────────────┴───────┴──────┘
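Why orient is needed in this case
Let's start with a simple example. We'll supply three lists and two column names:
pl.DataFrame([[1.1, 'a'], [2.2, 'b'], [3.3, 'c']], columns=['col_1', 'col_2'])
In this example, Polars tries to infer whether each list (e.g., [1.1, 'a']) represents a row or a column. From the documentation for polars.DataFrame:
orient : {'col', 'row'}, default None
Whether to interpret two-dimensional data as columns or as rows. If None, the orientation is inferred by matching the columns and data dimensions. If this does not yield conclusive results, column orientation is used.
So in the case above, Polars tries to infer the orientation by looking at columns. There are three lists but only two column names, so each list must represent a row, not a column: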
shape: (3, 2)
┌───────┬───────┐
│ col_1 ┆ col_2 │
│ ---   ┆ ---   │
│ f64   ┆ str   │
╞═══════╪═══════╡
│ 1.1   ┆ a     │
│ 2.2   ┆ b     │
│ 3.3   ┆ c     │
└───────┴───────┘
Now let's remove one of the lists, so that there are two lists and two column names:
pl.DataFrame([[1.1, 'a'], [2.2, 'b']], columns=['col_1', 'col_2'])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/xxxx/.virtualenvs/Whosebug3.10/lib/python3.10/site-packages/polars/internals/frame.py", line 311, in __init__
self._df = sequence_to_pydf(data, columns=columns, orient=orient)
File "/home/xxxx/.virtualenvs/Whosebug3.10/lib/python3.10/site-packages/polars/internals/construction.py", line 495, in sequence_to_pydf
data_series = [
File "/home/xxxx/.virtualenvs/Whosebug3.10/lib/python3.10/site-packages/polars/internals/construction.py", line 496, in <listcomp>
pli.Series(columns[i], data[i], dtypes.get(columns[i])).inner()
File "/home/xxxx/.virtualenvs/Whosebug3.10/lib/python3.10/site-packages/polars/internals/series.py", line 227, in __init__
self._s = sequence_to_pyseries(name, values, dtype=dtype, strict=strict)
File "/home/xxx/.virtualenvs/Whosebug3.10/lib/python3.10/site-packages/polars/internals/construction.py", line 239, in sequence_to_pyseries
return constructor(name, values, strict)
TypeError: must be real number, not str
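Because there are now two lists and two column names, it is unclear whether each list represents a row or a column. So, per the documentation, Polars interprets each list as a column, not a row.
But that causes a problem, because each list (in this case [1.1, 'a']) contains both a number and a string. That is what raises the error.
So, because the number of lists equals the number of column names, we need to tell Polars that each list represents a row, not a column: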
pl.DataFrame([[1.1, 'a'], [2.2, 'b']], columns=['col_1', 'col_2'], orient='row')
shape: (2, 2)
┌───────┬───────┐
│ col_1 ┆ col_2 │
│ ---   ┆ ---   │
│ f64   ┆ str   │
╞═══════╪═══════╡
│ 1.1   ┆ a     │
│ 2.2   ┆ b     │
└───────┴───────┘
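The inference rule quoted from the documentation can be modelled in a few lines of plain Python. This is only a toy model of the documented behaviour, not Polars' actual implementation, and the function name infer_orient is made up for illustration.

```python
def infer_orient(data, columns):
    """Toy model of the documented rule; NOT Polars' real implementation."""
    n_lists = len(data)        # number of inner lists
    list_len = len(data[0])    # length of each inner list
    fits_as_columns = n_lists == len(columns)
    fits_as_rows = list_len == len(columns)
    if fits_as_rows and not fits_as_columns:
        return "row"
    # Otherwise (only the column reading fits, both fit, or neither fits)
    # fall back to column orientation, as the documentation describes.
    return "col"


names = ["col_1", "col_2"]

# Three lists, two names: only the row reading fits.
print(infer_orient([[1.1, "a"], [2.2, "b"], [3.3, "c"]], names))  # row

# Two lists, two names: both readings fit, so the column fallback applies.
print(infer_orient([[1.1, "a"], [2.2, "b"]], names))              # col
```

In the second call the data is square, which is exactly the situation where an explicit orient='row' is needed.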
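With that in mind, let's look at your code. How many lists are in each element of a (the x passed to the constructor)? Seven. How many column names are supplied? ca and cb each supply seven column names. Because the number of lists equals the number of column names, Polars interprets each list as a column, not a row. For example, Polars interprets
[370.01605224609375, 346.4305114746094, 398.3968811035156,
 384.5684814453125, 0.9011853933334351, 0, 'corn']
as a column, not a row. Polars therefore sees the string 'corn' mixed in with numbers in the same column, hence the error. This would also explain why the error appears only some of the time: it shows up when the number of detections happens to equal the number of column names (seven).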
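To make the two readings concrete, the sketch below takes the seven detections from the question and lists the Python types that end up in each column under each interpretation. It uses only the standard library; column_types is a hypothetical helper written for this illustration.

```python
# The seven detections from the question; each inner list is one detection.
x = [
    [370.01605224609375, 346.4305114746094, 398.3968811035156,
     384.5684814453125, 0.9011853933334351, 0, "corn"],
    [415.436767578125, 279.4227294921875, 433.930419921875,
     305.5151672363281, 0.8829901814460754, 0, "corn"],
    [383.8118896484375, 268.781494140625, 402.35479736328125,
     292.4585266113281, 0.8579609394073486, 0, "corn"],
    [431.42791748046875, 570.9154663085938, 476.672119140625, 600.0,
     0.810459554195404, 0, "corn"],
    [414.912841796875, 257.7676086425781, 427.7708740234375,
     274.69635009765625, 0.7384995818138123, 0, "corn"],
    [391.22821044921875, 250.48876953125, 403.9199523925781,
     268.1374816894531, 0.6828912496566772, 0, "corn"],
    [414.2362060546875, 250.18174743652344, 423.82537841796875,
     264.02667236328125, 0.517136812210083, 0, "corn"],
]


def column_types(columns):
    """Return the sorted set of Python type names found in each column."""
    return [sorted({type(v).__name__ for v in col}) for col in columns]


# Column reading: every inner list is taken as one column.
as_columns = x
# Row reading: every inner list is one row, so transpose to get the columns.
as_rows = [list(col) for col in zip(*x)]

print(column_types(as_columns))  # every column mixes float, int and str
print(column_types(as_rows))     # five float columns, one int, one str
```

Under the column reading every column mixes numbers with 'corn'; under the row reading each column holds a single type.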