Python DataFrame: assign a name to the rows that satisfy a condition

I have a DataFrame with parcel weights in one column, and I now need to assign each parcel to a bag that can hold it.

My code:

import pandas as pd

df = pd.DataFrame({'parcel': ['a', 'b', 'c', 'd', 'e'],
                   'weight': [85, 60, 15, 30, 150]})
# Each bag can hold up to 100 kg of parcels. I want to label each parcel with
# the bag it goes into. The bags are numbered 1, 2, 3, 4, 5, and I want to use as few bags as possible.

Expected result:

df = 
  parcel  weight  bag_num
0      a      85      1
1      b      60      2
2      c      15      1
3      d      30      2
4      e     150      NaN # This parcel is overweight, cannot be accommodated

My attempt:

df['bag_num'] = df['weight']<100
df['bag_num'].replace(False,np.nan,inplace=True)
df=
  parcel  weight bag_num
4      e     150     NaN
0      a      85    True
1      b      60    True
3      d      30    True
2      c      15    True

This is as far as I got, and I can't work out how to continue.

You can solve this by iterating over the rows of the DataFrame and assigning a bag number to each parcel as you go:

import pandas as pd

df = pd.DataFrame(
    {"parcel": ["a", "b", "c", "d", "e"], "weight": [85, 60, 15, 30, 150]}
)


MIN_BAG = 1
MAX_BAG = 5
bags_range = range(MIN_BAG, MAX_BAG + 1)

# We keep track of the bags and how much weight they hold at any moment
used_bags = {bag_idx: 0 for bag_idx in bags_range}

# Create empty df column
df["bag_num"] = pd.NA

for row in df.itertuples():

    parcel_weight = row.weight

    if parcel_weight > 100:
        continue

    for bag in bags_range:
        temp_weight = used_bags[bag] + parcel_weight
        if temp_weight <= 100:
            used_bags[bag] = temp_weight
            df.at[row.Index, "bag_num"] = bag
            break


print(df)

This produces the following result:

  parcel  weight bag_num
0      a      85       1
1      b      60       2
2      c      15       1
3      d      30       2
4      e     150    <NA>
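
Note that the loop above places parcels in the order they appear (a "first-fit" strategy), which does not always use the fewest bags. A common heuristic that often packs tighter is first-fit decreasing: handle the heaviest parcels first. The sketch below is one possible way to write that, not a guaranteed-optimal solution (bin packing is NP-hard in general). The names `assign_bags`, `capacity`, and `max_bags` are made up for this illustration.

```python
import pandas as pd


def assign_bags(df, capacity=100, max_bags=5):
    """First-fit decreasing: heaviest parcels first, each into the
    lowest-numbered bag that still has room (hypothetical helper)."""
    # Current load of every bag, keyed by bag number 1..max_bags
    loads = {bag: 0 for bag in range(1, max_bags + 1)}
    # Nullable integer column so unassigned parcels show up as <NA>
    bag_num = pd.Series(pd.NA, index=df.index, dtype="Int64")

    # Visit parcels from heaviest to lightest, keeping their original labels
    ordered = df["weight"].sort_values(ascending=False, kind="stable")
    for idx, weight in ordered.items():
        if weight > capacity:
            continue  # overweight parcel: no bag can take it
        for bag, load in loads.items():
            if load + weight <= capacity:
                loads[bag] = load + weight
                bag_num.loc[idx] = bag
                break

    return df.assign(bag_num=bag_num)


df = pd.DataFrame(
    {"parcel": ["a", "b", "c", "d", "e"], "weight": [85, 60, 15, 30, 150]}
)
result = assign_bags(df)

# Sanity check: total weight per bag, ignoring unassigned parcels
bag_loads = result.dropna(subset=["bag_num"]).groupby("bag_num")["weight"].sum()

print(result)
print(bag_loads)
```

For the sample data this gives the same assignment as the loop above (bags 1, 2, 1, 2 and no bag for parcel e). The two strategies can differ on other inputs. A parcel also stays unassigned if all `max_bags` bags are too full to take it.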