
Efficiently label points inside a bounding box


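I have a dataframe containing bounding boxes around products on a shelf. Each row holds the bounds of the bounding box, the camera that took the picture, the date and hour the picture was taken, and the center of the bounding box, which I computed. One piece of information is missing: which product it is (no ID, no barcode).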

index  boundingX0  boundingX1  boundingY0  boundingY1           cameraId  \
0      0      3167.0      3276.0      2532.0      2662.0  Z4301160003414164   
1      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164   
2      2      3278.0      3387.0      2532.0      2663.0  Z4301160003414164   
3      3      1264.0      1373.0       946.0      1097.0  Z4301160003414164   
4      4      1909.0      2002.0      1983.0      2151.0  Z4301160003414164   
5      5      1722.0      1808.0      1982.0      2150.0  Z4301160003414164   
6      6      3163.0      3281.0      2301.0      2460.0  Z4301160003414164   
7      7      2359.0      2469.0      2512.0      2629.0  Z4301160003414164   
8      8      1381.0      1496.0       947.0      1097.0  Z4301160003414164   
9      9      1053.0      1172.0      1958.0      2146.0  Z4301160003414164   

  filename        Date  Hour  facing_center_x  facing_center_y  
0        A  2022-05-17    13           3221.5           2597.0  
1        A  2022-05-17    13           1859.0           2068.5  
2        A  2022-05-17    13           3332.5           2597.5  
3        A  2022-05-17    13           1318.5           1021.5  
4        A  2022-05-17    13           1955.5           2067.0  
5        A  2022-05-17    13           1765.0           2066.0  
6        A  2022-05-17    13           3222.0           2380.5  
7        A  2022-05-17    13           2414.0           2570.5  
8        A  2022-05-17    13           1438.5           1022.0  
9        A  2022-05-17    13           1112.5           2052.0  


index        Date           cameraId filename        itemId  \
0      0  2022-05-17  Z4301160003414164        A  5.903282e+07   
1      1  2022-05-17  Z4301160003414164        A  5.903282e+07   
2      2  2022-05-17  Z4301160003414164        A  8.013546e+07   
3      3  2022-05-17  Z4301160003414164        A  8.013546e+07   
4      4  2022-05-17  Z4301160003414164        A  3.760011e+10   
5      5  2022-05-17  Z4301160003414164        A  3.760011e+10   
6      6  2022-05-17  Z4301160003414164        A  3.017620e+12   
7      7  2022-05-17  Z4301160003414164        A  3.017620e+12   
8      8  2022-05-17  Z4301160003414164        A  3.017761e+12   
9      9  2022-05-17  Z4301160003414164        A  3.088541e+12   

             barcode       x       y  boundingX0  boundingX1  boundingY0  \
0  N4131466489013277  2117.0  1828.0      2117.0      3232.0      1540.0   
1  N4131466408713275  3233.0  1832.0      3233.0      3995.0      1540.0   
2  N4131466510613278  2905.0  1099.0      2905.0      4055.0       846.0   
3  N4131465123513276  2921.0   757.0      2921.0      4145.0       457.0   
4  N4131466272113278  1684.0   760.0      1684.0      2920.0       460.0   
5  N4131465122713277  1212.0   761.0      1212.0      1683.0       461.0   
6  N4131465130213271  2127.0  1461.0      2127.0      4013.0      1185.0   
7  N4131466226313279  2122.0  2158.0      2122.0      3981.0      1900.0   
8  N4141461925413272  4254.0  3081.0      4254.0      4598.0      2769.0   
9  N4131465932913278  1323.0  1817.0      1323.0      1478.0      1539.0   

   boundingY1  Hour  
0      1828.0    11  
1      1832.0    11  
2      1099.0    11  
3       757.0    11  
4       760.0    11  
5       761.0    11  
6      1461.0    11  
7      2158.0    11  
8      3081.0    11  
9      1817.0    11  

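What I want to do is place the bounding-box centers from facing inside the product-area bounding boxes in label. If a center falls inside a given box, the barcode of that box should be attached to facing.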
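As a concrete reading of that rule, here is a minimal sketch. It assumes that (boundingX0, boundingY0) and (boundingX1, boundingY1) are opposite corners of a label box, which the column names suggest but which is an assumption. The helper name `center_in_box` and the sample values are made up for illustration.

```python
def center_in_box(cx, cy, x0, x1, y0, y1):
    """Return True when the point (cx, cy) lies inside the box.

    Assumes (x0, y0) and (x1, y1) are opposite corners with x0 <= x1
    and y0 <= y1; points on the border count as inside.
    """
    return x0 <= cx <= x1 and y0 <= cy <= y1


# Made-up label box and facing centers, for illustration only.
box = dict(x0=1000.0, x1=2000.0, y0=500.0, y1=900.0)
inside = center_in_box(1500.0, 700.0, **box)   # center within both ranges
outside = center_in_box(2500.0, 700.0, **box)  # x is right of the box
```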



facing_index = list(set(facing.index))
label_index = list(set(label.index))

LABEL  =[]
for i in range(len(label_index)):
    f = label[label.index == i]
    cameraId = f.cameraId.iloc[0]
    date     = f.Date.iloc[0]
    hour     = f.Hour.iloc[0]
    for j in range(len(facing_index)):
        g = facing[(facing['cameraId']==cameraId) & (facing['Date']==date) & (facing['Hour']==hour)]
        points = [(g['facing_center_x'], g['facing_center_y'])]
        pts = np.array(points)
        ll = np.array([f['boundingX0'], f['boundingY0']])  # lower-left
        ur = np.array([f['boundingX1'], f['boundingY1']])  # upper-right
        inidx = np.all(np.logical_and(ll <= pts, pts <= ur), axis=1)
        inbox = pts[inidx]
        outbox = pts[np.logical_not(inidx)] 
        if len(inbox)>0:

LABEL = pd.concat(LABEL)

The problem is that this takes a very long time, because label has more than 125,000 rows and facing has more than 400,000 rows.


def BoundingBoxContains(rectangle,point):
    logic = rectangle[0] < point[0] < rectangle[0]+rectangle[2] and rectangle[1] < point[1] < rectangle[1]+rectangle[3]
    return logic


LABEL  =[]
for i in range(len(label_index)):
    f = label[label.index == i]
    BoundingBox   = (f.boundingX0[i],f.boundingX1[i],f.boundingY0[i],f.boundingY1[i])
    f = f.reset_index()
    date     = f.Date.iloc[0]
    filename     = f.filename.iloc[0]
    for j in range(len(facing_index)):
        g = facing[(facing['Date']==date) & (facing['filename']==filename)].reset_index()
        K = len(g)
        for k in range(K):
            gk = g[g.index==k]
            facingCenter = (gk['facing_center_x'][k], gk['facing_center_y'][k])
            a = BoundingBoxContains(BoundingBox, facingCenter)
            if a == True:
                gk['barcode'] = f.barcode



   level_0  index  boundingX0  boundingX1  boundingY0  boundingY1  \
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   
0        0      0      3167.0      3276.0      2532.0      2662.0   

            cameraId filename        Date  Hour  facing_center_x  \
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   
0  Z4301160003414164        A  2022-05-17    13           3221.5   

   facing_center_y            barcode  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276  
0           2597.0  N4131465122613276 


IIUC, I think you should first merge the facing and label dataframes on the Date, Hour and cameraId columns, then apply your BoundingBoxContains function.

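If you have enough memory, you can use merge without any special precaution. The apply part is just a loop over each row, and it is the part that can really be optimized with multiprocessing. If the first part succeeds, I can suggest using multiprocessing.Pool.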
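A rough sketch of what the multiprocessing.Pool step could look like once the merged frame fits in memory: split the merged rows into chunks, test each chunk in a worker process, and concatenate the boolean results. The names `check_chunk` and `parallel_mask`, and the `label_`-prefixed box columns, are hypothetical. The check also assumes the box columns are corner coordinates.

```python
from multiprocessing import Pool

import numpy as np
import pandas as pd


def check_chunk(chunk):
    """Boolean Series: is each facing center inside its candidate label box?

    Assumes hypothetical columns label_boundingX0 .. label_boundingY1 that
    hold the corners of the label box paired with each facing row.
    """
    in_x = chunk['facing_center_x'].between(
        chunk['label_boundingX0'], chunk['label_boundingX1'])
    in_y = chunk['facing_center_y'].between(
        chunk['label_boundingY0'], chunk['label_boundingY1'])
    return in_x & in_y


def parallel_mask(merged, n_workers=4, n_chunks=16):
    """Evaluate check_chunk on row chunks of `merged` in a process pool."""
    # Chunk boundaries as row positions, e.g. 0, n/16, 2n/16, ..., n.
    bounds = np.linspace(0, len(merged), n_chunks + 1, dtype=int)
    chunks = [merged.iloc[a:b] for a, b in zip(bounds[:-1], bounds[1:])]
    with Pool(n_workers) as pool:
        parts = pool.map(check_chunk, chunks)
    # Each part keeps its rows' index, so the result lines up with `merged`.
    return pd.concat(parts)
```

`parallel_mask` would be called from a script's `if __name__ == '__main__':` block, for example `out = merged[parallel_mask(merged)]`. Whether the process pool beats a single vectorized pass depends on the cost of shipping chunks to the workers, so it is worth timing both.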

Implementing MP


def BoundingBoxContains(rectangle, point):
    logic = rectangle[0] < point[0] < rectangle[0]+rectangle[2] and rectangle[1] < point[1] < rectangle[1]+rectangle[3]
    return logic

bbox_contains = lambda x: BoundingBoxContains((x.boundingX0, x.boundingX1, x.boundingY0, x.boundingY1),
                                              (x.facing_center_x, x.facing_center_y))

cols = ['Date', 'Hour', 'cameraId', 'barcode']
out = facing.merge(label[cols], on=cols[:-1])
out = out.loc[out.apply(bbox_contains, axis=1)]

Note: I had to modify Hour (13 -> 11) so that the sample rows match.



>>> out.drop_duplicates(cols)  # if you want to keep only one instance per cols
    index  boundingX0  boundingX1  boundingY0  boundingY1           cameraId filename        Date  Hour  facing_center_x  facing_center_y            barcode
10      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131466489013277
11      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131466408713275
12      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131466510613278
13      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131465123513276
14      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131466272113278
15      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131465122713277
16      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131465130213271
17      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131466226313279
18      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4141461925413272
19      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164        A  2022-05-17    11           1859.0           2068.5  N4131465932913278
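The row-wise apply can usually be replaced by column-wise comparisons. The sketch below is one possible variant, not the code above: it carries the label box columns through the merge, which `label[cols]` above does not, and renames them with a hypothetical `label_` prefix so they stay distinct from the facing box columns. It also assumes the box columns are corner coordinates. The sample rows are made up.

```python
import pandas as pd

keys = ['Date', 'Hour', 'cameraId']
box = ['boundingX0', 'boundingX1', 'boundingY0', 'boundingY1']

# Made-up sample rows, for illustration only.
facing = pd.DataFrame({
    'Date': ['2022-05-17'] * 3,
    'Hour': [11, 11, 11],
    'cameraId': ['cam1', 'cam1', 'cam2'],
    'facing_center_x': [150.0, 450.0, 150.0],
    'facing_center_y': [150.0, 150.0, 150.0],
})
label = pd.DataFrame({
    'Date': ['2022-05-17'] * 2,
    'Hour': [11, 11],
    'cameraId': ['cam1', 'cam1'],
    'boundingX0': [100.0, 400.0],
    'boundingX1': [200.0, 500.0],
    'boundingY0': [100.0, 100.0],
    'boundingY1': [200.0, 200.0],
    'barcode': ['A', 'B'],
})

# Prefix the label box columns so they stay distinct after the merge.
label_boxes = label[keys + box + ['barcode']].rename(
    columns={c: 'label_' + c for c in box})
merged = facing.merge(label_boxes, on=keys)

# Column-wise containment test, evaluated for all candidate pairs at once.
inside = (
    merged['facing_center_x'].between(
        merged['label_boundingX0'], merged['label_boundingX1'])
    & merged['facing_center_y'].between(
        merged['label_boundingY0'], merged['label_boundingY1'])
)
out = merged.loc[inside]
```

With these sample rows, the merge produces every facing/label pair that shares the same keys, and the mask keeps only the pairs where the center lies inside the label box.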

Update 1

Do you have enough memory on GCP to create this dataframe?

# Assumed completion of a truncated line: categories drawn from both frames
cam_cat = pd.CategoricalDtype(np.unique(np.concatenate([facing['cameraId'].unique(),
                                                        label['cameraId'].unique()])))

df1 = pd.DataFrame({
    'DateTime': pd.to_datetime(facing['Date'] + ' ' + facing['Hour'].astype(str)),
    'cameraId': facing['cameraId'].astype(cam_cat),
    'facing': facing['index']
})

df2 = pd.DataFrame({
    'DateTime': pd.to_datetime(label['Date'] + ' ' + label['Hour'].astype(str)),
    'cameraId': label['cameraId'].astype(cam_cat),
    'label': label['index']
})

# Lightweight merge to use with multiprocessing
dfm = df1.merge(df2, on=['DateTime', 'cameraId'])
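Whether the merge fits in memory can be estimated before running it. An inner merge yields, for every key present in both frames, the product of the two group sizes, so the row count follows from group sizes alone. The sketch below uses made-up rows and a hypothetical helper name `estimate_merge_rows`, and assumes the key columns contain no missing values.

```python
import pandas as pd


def estimate_merge_rows(left, right, keys):
    """Row count of an inner merge of `left` and `right` on `keys`.

    Assumes the key columns contain no missing values.
    """
    n_left = left.groupby(keys).size()
    n_right = right.groupby(keys).size()
    # Keys present on only one side become NaN and contribute nothing.
    return int(n_left.mul(n_right).dropna().sum())


# Made-up key columns, for illustration only.
df1 = pd.DataFrame({'DateTime': ['t1', 't1', 't1', 't2'],
                    'cameraId': ['c1', 'c1', 'c1', 'c1']})
df2 = pd.DataFrame({'DateTime': ['t1', 't1', 't3'],
                    'cameraId': ['c1', 'c1', 'c1']})
keys = ['DateTime', 'cameraId']

n_rows = estimate_merge_rows(df1, df2, keys)
```

Multiplying the estimated row count by the bytes per row of the merged frame gives a rough memory figure to compare against what is available.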

Update 2

Before using multiprocessing, can you check the output of dfm after this 2-pass filtering:

import pandas as pd
import numpy as np

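# Vectorized function
# Same arithmetic as the question's BoundingBoxContains when it is called
# with (boundingX0, boundingX1, boundingY0, boundingY1)
def BoundingBoxContains(df):
    m1 = df['facing_center_x'].between(df['boundingX0'], df['boundingX0'] + df['boundingY0'])
    m2 = df['facing_center_y'].between(df['boundingX1'], df['boundingX1'] + df['boundingY1'])
    return m1 & m2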

# Your load routine
facing = pd.read_csv('facing.csv')
label = pd.read_csv('label.csv')

# Create a category dtype from cameraId to reduce memory footprint
# (assumed completion of a truncated line: categories drawn from both frames)
cam_cat = pd.CategoricalDtype(np.unique(np.concatenate([facing['cameraId'].unique(),
                                                        label['cameraId'].unique()])))

# Extract real index (not 'index' column) from each dataframe
df1 = pd.DataFrame({
    'DateTime': pd.to_datetime(facing['Date'] + ' ' + facing['Hour'].astype(str)),
    'cameraId': facing['cameraId'].astype(cam_cat),
    'facing': facing.index
})

df2 = pd.DataFrame({
    'DateTime': pd.to_datetime(label['Date'] + ' ' + label['Hour'].astype(str)),
    'cameraId': label['cameraId'].astype(cam_cat),
    'label': label.index
})

# 1st pass: lookup on DateTime and cameraId to keep only possible match
# Cross product of facing / label with valid DateTime / cameraId
dfm = df1.merge(df2, on=['DateTime', 'cameraId'])

CHUNKSIZE = 10  # Rows of dfm checked per iteration
facing_cols = ['facing_center_x', 'facing_center_y']
label_cols = ['boundingX0', 'boundingX1', 'boundingY0', 'boundingY1', 'barcode']

# 2nd pass: match facing coords on bounding box
# Filter out the dataframe
mask = []
for i in range(0, len(dfm), CHUNKSIZE):
    F = facing.loc[dfm.iloc[i:i+CHUNKSIZE]['facing'], facing_cols].reset_index(drop=True)
    L = label.loc[dfm.iloc[i:i+CHUNKSIZE]['label'], label_cols].reset_index(drop=True)
    mask.append(BoundingBoxContains(pd.concat([F, L], axis=1)))
dfm = dfm.loc[pd.concat(mask, ignore_index=True)]


>>> dfm
              DateTime           cameraId  facing  label
19 2022-05-17 11:00:00  Z4301160003414164       1      9
49 2022-05-17 11:00:00  Z4301160003414164       4      9
59 2022-05-17 11:00:00  Z4301160003414164       5      9
79 2022-05-17 11:00:00  Z4301160003414164       7      9

Update 3

The final step is to rebuild the dataframe from the facing and label columns of dfm:

out = facing.loc[dfm['facing']].assign(barcode=label.loc[dfm['label'], 'barcode'].values)

# Output
   index  boundingX0  boundingX1  boundingY0  boundingY1           cameraId  \
1      1      1812.0      1906.0      1985.0      2152.0  Z4301160003414164   
4      4      1909.0      2002.0      1983.0      2151.0  Z4301160003414164   
5      5      1722.0      1808.0      1982.0      2150.0  Z4301160003414164   
7      7      2359.0      2469.0      2512.0      2629.0  Z4301160003414164   

  filename        Date  Hour  facing_center_x  facing_center_y  \
1        A  2022-05-17    11           1859.0           2068.5   
4        A  2022-05-17    11           1955.5           2067.0   
5        A  2022-05-17    11           1765.0           2066.0   
7        A  2022-05-17    11           2414.0           2570.5   

             barcode  
1  N4131465932913278  
4  N4131465932913278  
5  N4131465932913278  
7  N4131465932913278
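The `.values` in the rebuild step matters: `label.loc[...]` returns a Series indexed by label rows, and `assign` would otherwise align it against the facing index instead of attaching it by position. A small sketch with made-up frames (the names `picked`, `by_position` and `by_label` are illustrative) shows the difference.

```python
import pandas as pd

# Made-up frames, for illustration only.
facing = pd.DataFrame({'facing_center_x': [10.0, 20.0, 30.0]})
label = pd.DataFrame({'barcode': ['A', 'B']})
# Match table: facing row 2 matched label row 0, facing row 0 matched label row 1.
dfm = pd.DataFrame({'facing': [2, 0], 'label': [0, 1]})

picked = label.loc[dfm['label'], 'barcode']

# .values drops the label index, so barcodes are attached by position.
by_position = facing.loc[dfm['facing']].assign(barcode=picked.values)

# Without .values, pandas aligns on index labels: facing row 2 finds no
# entry in `picked`, and facing row 0 receives the barcode of label row 0.
by_label = facing.loc[dfm['facing']].assign(barcode=picked)
```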