Efficiently label points inside a bounding box

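I have a problem I am trying to solve. I already have code that works, but it is very inefficient given the amount of data I need to process. Here is a description of what I am trying to do: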

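I have a dataframe containing bounding boxes around products on shelves. Each row holds the limits of the bounding box, the camera that took the picture, the date and hour the picture was taken, and the center of the bounding box, which I computed. One piece of information is missing: which product it is (no ID, no barcode).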

index  boundingX0  boundingX1  boundingY0  boundingY1           cameraId  \
0      0      3167.0      3276.0      2532.0      2662.0  Z4301160003414164   
1      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164   
2      2      3278.0      3387.0      2532.0      2663.0  Z4301160003414164   
3      3      1264.0      1373.0       946.0      1097.0  Z4301160003414164   
4      4      1909.0      2002.0      1983.0      2151.0  Z4301160003414164   
5      5      1722.0      1808.0      1982.0      2150.0  Z4301160003414164   
6      6      3163.0      3281.0      2301.0      2460.0  Z4301160003414164   
7      7      2359.0      2469.0      2512.0      2629.0  Z4301160003414164   
8      8      1381.0      1496.0       947.0      1097.0  Z4301160003414164   
9      9      1053.0      1172.0      1958.0      2146.0  Z4301160003414164   

  filename        Date  Hour  facing_center_x  facing_center_y  
0        A  2022-05-17    13           3221.5           2597.0  
1        A  2022-05-17    13           1859.0           2068.5  
2        A  2022-05-17    13           3332.5           2597.5  
3        A  2022-05-17    13           1318.5           1021.5  
4        A  2022-05-17    13           1955.5           2067.0  
5        A  2022-05-17    13           1765.0           2066.0  
6        A  2022-05-17    13           3222.0           2380.5  
7        A  2022-05-17    13           2414.0           2570.5  
8        A  2022-05-17    13           1438.5           1022.0  
9        A  2022-05-17    13           1112.5           2052.0  

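However, I have a second dataframe containing the bounding box of the whole area where a product should be, along with all the necessary information about the product (id, barcode) and about the camera, date, hour and so on.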

index        Date           cameraId filename        itemId  \
0      0  2022-05-17  Z4301160003414164        A  5.903282e+07   
1      1  2022-05-17  Z4301160003414164        A  5.903282e+07   
2      2  2022-05-17  Z4301160003414164        A  8.013546e+07   
3      3  2022-05-17  Z4301160003414164        A  8.013546e+07   
4      4  2022-05-17  Z4301160003414164        A  3.760011e+10   
5      5  2022-05-17  Z4301160003414164        A  3.760011e+10   
6      6  2022-05-17  Z4301160003414164        A  3.017620e+12   
7      7  2022-05-17  Z4301160003414164        A  3.017620e+12   
8      8  2022-05-17  Z4301160003414164        A  3.017761e+12   
9      9  2022-05-17  Z4301160003414164        A  3.088541e+12   

             barcode       x       y  boundingX0  boundingX1  boundingY0  \
0  N4131466489013277  2117.0  1828.0      2117.0      3232.0      1540.0   
1  N4131466408713275  3233.0  1832.0      3233.0      3995.0      1540.0   
2  N4131466510613278  2905.0  1099.0      2905.0      4055.0       846.0   
3  N4131465123513276  2921.0   757.0      2921.0      4145.0       457.0   
4  N4131466272113278  1684.0   760.0      1684.0      2920.0       460.0   
5  N4131465122713277  1212.0   761.0      1212.0      1683.0       461.0   
6  N4131465130213271  2127.0  1461.0      2127.0      4013.0      1185.0   
7  N4131466226313279  2122.0  2158.0      2122.0      3981.0      1900.0   
8  N4141461925413272  4254.0  3081.0      4254.0      4598.0      2769.0   
9  N4131465932913278  1323.0  1817.0      1323.0      1478.0      1539.0   

   boundingY1  Hour  
0      1828.0    11  
1      1832.0    11  
2      1099.0    11  
3       757.0    11  
4       760.0    11  
5       761.0    11  
6      1461.0    11  
7      2158.0    11  
8      3081.0    11  
9      1817.0    11  

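What I want to do is locate the bounding-box centers from facing inside the product-area bounding boxes from label. If a center falls inside a given box, the barcode is attached to that row of facing.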

This is what I did:

facing_index = list(set(facing.index))
label_index = list(set(label.index))

LABEL  =[]
for i in range(len(label_index)):
    f = label[label.index == i]
    cameraId = f.cameraId.iloc[0]
    date     = f.Date.iloc[0]
    hour     = f.Hour.iloc[0]
    for j in range(len(facing_index)):
        g = facing[(facing['cameraId']==cameraId) & (facing['Date']==date) & (facing['Hour']==hour)]
        points = [(g['facing_center_x'], g['facing_center_y'])]
        pts = np.array(points)
        ll = np.array([f['boundingX0'], f['boundingY0']])  # lower-left
        ur = np.array([f['boundingX1'], f['boundingY1']])  # upper-right
        inidx = np.all(np.logical_and(ll <= pts, pts <= ur), axis=1)
        inbox = pts[inidx]
        outbox = pts[np.logical_not(inidx)] 
        if len(inbox)>0:
            g['barcode']=f.barcode
        else:
            0
        LABEL.append(g)

LABEL = pd.concat(LABEL)

The problem is that this takes a very long time, because label contains more than 125,000 rows and facing contains more than 400,000 rows.

Another approach I tried was to define a function

def BoundingBoxContains(rectangle,point):
    logic = rectangle[0] < point[0] < rectangle[0]+rectangle[2] and rectangle[1] < point[1] < rectangle[1]+rectangle[3]
    return logic

that checks whether a point is inside a rectangle. Then:

LABEL  =[]
for i in range(len(label_index)):
    f = label[label.index == i]
    BoundingBox   = (f.boundingX0[i],f.boundingX1[i],f.boundingY0[i],f.boundingY1[i])
    f = f.reset_index()
    date     = f.Date.iloc[0]
    filename     = f.filename.iloc[0]
    for j in range(len(facing_index)):
        g = facing[(facing['Date']==date) & (facing['filename']==filename)].reset_index()
        K = len(g)
        for k in range(K):
            gk = g[g.index==k]
            facingCenter = (gk['facing_center_x'][k], gk['facing_center_y'][k])
            a = BoundingBoxContains(BoundingBox, facingCenter)
            if a == True:
                gk['barcode'] = f.barcode
            else:
                0

            LABEL.append(gk)

which gives:

   level_0  index  boundingX0  boundingX1  boundingY0  boundingY1  \
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   

            cameraId filename        Date  Hour  facing_center_x  \
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   

   facing_center_y            barcode  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276 

I have not found a more efficient way yet, and I would greatly appreciate any insight.

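IIUC, I think you should first merge the facing and label dataframes on the Date, Hour and cameraId columns, then apply your BoundingBoxContains function.

If you have enough memory, you can use merge without taking any special care. The apply part is just a loop over every row, and that part can really be optimized with multiprocessing. If the first part succeeds, I can suggest how to use multiprocessing.Pool.

Implementing MP

Now the code: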

def BoundingBoxContains(rectangle, point):
    logic = rectangle[0] < point[0] < rectangle[0]+rectangle[2] and rectangle[1] < point[1] < rectangle[1]+rectangle[3]
    return logic

bbox_contains = lambda x: BoundingBoxContains((x.boundingX0, x.boundingX1, x.boundingY0, x.boundingY1),
                                              (x.facing_center_x, x.facing_center_y))

cols = ['Date', 'Hour', 'cameraId', 'barcode']
out = facing.merge(label[cols], on=cols[:-1])
out = out.loc[out.apply(bbox_contains, axis=1)]

Note: I had to change Hour (13 -> 11) to get a match.
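The merge above keeps only barcode from label, so the bounding columns in out are the ones that were already in facing. As a rough sketch of the same idea with the label boxes carried through the merge and the row-wise apply replaced by a vectorized mask: the frames below are made-up toy data, and the code assumes boundingX0/boundingX1 and boundingY0/boundingY1 are the min/max corners of the box (not x, y, width, height). If facing keeps its own bounding columns, pandas will add suffixes to the overlapping names, so they would need renaming first.

```python
import pandas as pd

# Toy frames (not the question's data) using the same column names.
facing = pd.DataFrame({
    'Date': ['2022-05-17'] * 3,
    'Hour': [11, 11, 11],
    'cameraId': ['cam1', 'cam1', 'cam2'],
    'facing_center_x': [150.0, 450.0, 150.0],
    'facing_center_y': [120.0, 120.0, 120.0],
})
label = pd.DataFrame({
    'Date': ['2022-05-17'] * 2,
    'Hour': [11, 11],
    'cameraId': ['cam1', 'cam1'],
    'barcode': ['AAA', 'BBB'],
    'boundingX0': [100.0, 400.0],
    'boundingX1': [200.0, 500.0],
    'boundingY0': [100.0, 100.0],
    'boundingY1': [200.0, 200.0],
})

keys = ['Date', 'Hour', 'cameraId']
box = ['boundingX0', 'boundingX1', 'boundingY0', 'boundingY1']

# Carry the label boxes into the merge so they can be compared with the centers.
pairs = facing.merge(label[keys + box + ['barcode']], on=keys)

# Assumption: (X0, Y0) and (X1, Y1) are opposite corners with X0 <= X1, Y0 <= Y1.
inside = (
    pairs['facing_center_x'].between(pairs['boundingX0'], pairs['boundingX1'])
    & pairs['facing_center_y'].between(pairs['boundingY0'], pairs['boundingY1'])
)
matched = pairs.loc[inside, keys + ['facing_center_x', 'facing_center_y', 'barcode']]
print(matched)
```

The mask is computed on whole columns at once, so no Python-level loop runs per row. The cost is that the merged frame has to fit in memory.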

Can you explain this?

facing_index = list(set(facing.index))
label_index = list(set(label.index))

Output:

>>> out.drop_duplicates(cols)  # if you want to keep only one instance per cols
    index  boundingX0  boundingX1  boundingY0  boundingY1           cameraId filename        Date  Hour  facing_center_x  facing_center_y            barcode
10      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131466489013277
11      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131466408713275
12      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131466510613278
13      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131465123513276
14      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131466272113278
15      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131465122713277
16      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131465130213271
17      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131466226313279
18      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4141461925413272
19      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131465932913278

Update 1

Do you have enough memory on GCP to create this dataframe?

# np.union1d accepts two arrays of different lengths and returns their sorted union
cam_cat = pd.CategoricalDtype(np.union1d(facing['cameraId'].unique(),
                                         label['cameraId'].unique()))

df1 = pd.DataFrame({
    'DateTime': pd.to_datetime(facing['Date'] + ' ' + facing['Hour'].astype(str)),
    'cameraId': facing['cameraId'].astype(cam_cat),
    'facing': facing['index']
})

df2 = pd.DataFrame({
    'DateTime': pd.to_datetime(label['Date'] + ' ' + label['Hour'].astype(str)),
    'cameraId': label['cameraId'].astype(cam_cat),
    'label': label['index']
})

# Lightweight merge to use with multiprocessing
dfm = df1.merge(df2, on=['DateTime', 'cameraId'])
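Whether that merge fits in memory depends on how many facing/label pairs share the same DateTime and cameraId. One way to get a feel for it before building dfm is to count the pairs per key. The helper below, estimate_merge_rows, is a hypothetical name and not part of pandas, and the frames are toy data.

```python
import pandas as pd

def estimate_merge_rows(left, right, keys):
    """Count the rows an inner merge on `keys` would produce, without building it."""
    n_left = left.groupby(keys, observed=True).size()
    n_right = right.groupby(keys, observed=True).size()
    # A key present on only one side contributes 0 rows to an inner merge.
    return int(n_left.mul(n_right, fill_value=0).sum())

keys = ['DateTime', 'cameraId']
df1 = pd.DataFrame({'DateTime': ['t1', 't1', 't1', 't1'],
                    'cameraId': ['cam1', 'cam1', 'cam1', 'cam2'],
                    'facing': [0, 1, 2, 3]})
df2 = pd.DataFrame({'DateTime': ['t1', 't1', 't2', 't2'],
                    'cameraId': ['cam1', 'cam1', 'cam1', 'cam1'],
                    'label': [0, 1, 2, 3]})

print(estimate_merge_rows(df1, df2, keys))
```

Multiplying the estimate by the number of bytes per row of the lightweight frames gives a rough size for dfm.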

Update 2

Before using multiprocessing, can you check the output of dfm after the 2-pass filtering:

import pandas as pd
import numpy as np

# Vectorized function
def BoundingBoxContains(df):
    m1 = df['facing_center_x'].between(df['boundingX0'], df['boundingX0'] + df['boundingY0'])
    m2 = df['facing_center_y'].between(df['boundingX1'], df['boundingX1'] + df['boundingY1'])
    return m1 & m2


# Your load routine
facing = pd.read_csv('facing.csv')
label = pd.read_csv('label.csv')

# Create a category dtype from cameraId to reduce memory footprint
# np.union1d accepts two arrays of different lengths and returns their sorted union
cam_cat = pd.CategoricalDtype(np.union1d(facing['cameraId'].unique(),
                                         label['cameraId'].unique()))

# Extract real index (not 'index' column) from each dataframes
df1 = pd.DataFrame({
    'DateTime': pd.to_datetime(facing['Date'] + ' ' + facing['Hour'].astype(str)),
    'cameraId': facing['cameraId'].astype(cam_cat),
    'facing': facing.index
})

df2 = pd.DataFrame({
    'DateTime': pd.to_datetime(label['Date'] + ' ' + label['Hour'].astype(str)),
    'cameraId': label['cameraId'].astype(cam_cat),
    'label': label.index
})

# 1st pass: lookup on DateTime and cameraId to keep only possible match
# Cross product of facing / label with valid DateTime / cameraId
dfm = df1.merge(df2, on=['DateTime', 'cameraId'])

CHUNKSIZE = 10  # Chunk size
facing_cols = ['facing_center_x', 'facing_center_y']
label_cols = ['boundingX0', 'boundingX1', 'boundingY0', 'boundingY1', 'barcode']

# 2nd pass: match facing coords on bounding box
# Filter out the dataframe
mask = []
for i in range(0, len(dfm), CHUNKSIZE):
    F = facing.loc[dfm.iloc[i:i+CHUNKSIZE]['facing'], facing_cols].reset_index(drop=True)
    L = label.loc[dfm.iloc[i:i+CHUNKSIZE]['label'], label_cols].reset_index(drop=True)
    mask.append(BoundingBoxContains(pd.concat([F, L], axis=1)))
dfm = dfm.loc[pd.concat(mask, ignore_index=True)]

Output:

>>> dfm
              DateTime           cameraId  facing  label
19 2022-05-17 11:00:00  Z4301160003414164       1      9
49 2022-05-17 11:00:00  Z4301160003414164       4      9
59 2022-05-17 11:00:00  Z4301160003414164       5      9
79 2022-05-17 11:00:00  Z4301160003414164       7      9
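With a small chunk size, building two dataframes and a concat per chunk can dominate the run time. A possible variant of the second pass, shown here only as a sketch, pulls the columns out once as NumPy arrays and indexes them by position. It assumes facing and label have the default 0..n-1 index (as they do straight after read_csv), so the values stored in dfm are also row positions, and it assumes the bounding columns are min/max corners. filter_pairs is a hypothetical helper name and the frames are toy data.

```python
import numpy as np
import pandas as pd

def filter_pairs(dfm, facing, label, chunksize=100_000):
    """Keep the rows of dfm whose facing center lies inside the label box."""
    # Extract the needed columns once as plain NumPy arrays.
    cx = facing['facing_center_x'].to_numpy()
    cy = facing['facing_center_y'].to_numpy()
    x0, x1 = label['boundingX0'].to_numpy(), label['boundingX1'].to_numpy()
    y0, y1 = label['boundingY0'].to_numpy(), label['boundingY1'].to_numpy()
    f_pos = dfm['facing'].to_numpy()
    l_pos = dfm['label'].to_numpy()

    keep = np.empty(len(dfm), dtype=bool)
    for start in range(0, len(dfm), chunksize):
        f = f_pos[start:start + chunksize]
        l = l_pos[start:start + chunksize]
        # Corner test: X0 <= cx <= X1 and Y0 <= cy <= Y1.
        keep[start:start + chunksize] = (
            (x0[l] <= cx[f]) & (cx[f] <= x1[l])
            & (y0[l] <= cy[f]) & (cy[f] <= y1[l])
        )
    return dfm[keep]

facing = pd.DataFrame({'facing_center_x': [150.0, 450.0, 150.0],
                       'facing_center_y': [120.0, 120.0, 900.0]})
label = pd.DataFrame({'boundingX0': [100.0, 400.0], 'boundingX1': [200.0, 500.0],
                      'boundingY0': [100.0, 100.0], 'boundingY1': [200.0, 200.0]})
dfm = pd.DataFrame({'facing': [0, 0, 1, 1, 2, 2], 'label': [0, 1, 0, 1, 0, 1]})

result = filter_pairs(dfm, facing, label, chunksize=4)
print(result)
```

The chunk loop only bounds the size of the temporary arrays. If memory allows, a single chunk covering all of dfm gives the same result.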

Update 3

The last step is to rebuild the dataframe from the facing and label columns of dfm:

out = facing.loc[dfm['facing']].assign(barcode=label.loc[dfm['label'], 'barcode'].values)
print(out)

# Output
   index  boundingX0  boundingX1  boundingY0  boundingY1           cameraId  \
1      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164   
4      4      1909.0      2002.0      1983.0      2151.0  Z4301160003414164   
5      5      1722.0      1808.0      1982.0      2150.0  Z4301160003414164   
7      7      2359.0      2469.0      2512.0      2629.0  Z4301160003414164   

  filename        Date  Hour  facing_center_x  facing_center_y  \
1        A  2022-05-17    11           1859.0           2068.5   
4        A  2022-05-17    11           1955.5           2067.0   
5        A  2022-05-17    11           1765.0           2066.0   
7        A  2022-05-17    11           2414.0           2570.5   

             barcode  
1  N4131465932913278  
4  N4131465932913278  
5  N4131465932913278  
7  N4131465932913278
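The .values in the reconstruction step matters because the selected barcodes are indexed by label rows, while the selected facing rows keep their facing index. A small toy example (made-up frames, not the data above) of the difference between assigning by position and assigning by index:

```python
import pandas as pd

facing = pd.DataFrame({'facing_center_x': [10.0, 20.0, 30.0]})
label = pd.DataFrame({'barcode': ['AAA', 'BBB', 'CCC']})
dfm = pd.DataFrame({'facing': [0, 2], 'label': [1, 0]})

picked = label.loc[dfm['label'], 'barcode']  # Series indexed by label rows 1, 0

# With .values the barcodes are assigned by position, one per row of dfm.
by_position = facing.loc[dfm['facing']].assign(barcode=picked.values)

# Without .values pandas aligns on the index, mixing up facing and label rows.
by_index = facing.loc[dfm['facing']].assign(barcode=picked)

print(by_position)
print(by_index)
```

Here by_position pairs facing row 0 with BBB and facing row 2 with AAA, as dfm specifies, whereas by_index matches on index labels and leaves facing row 2 without a barcode.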