For all sets in a list, extract the first number only

I have a list like the following:

b = [{'dg_12.942_ch_293','dg_22.38_ca_627'},
{'dg_12.651_cd_286','dg_14.293_ce_334'}, 
{'dg_17.42_cr_432','dg_18.064_cm_461','dg_18.85_cn_474','dg_20.975_cf_489'}]

I want to keep only the first number of each item in each set:

b = [{'12','22'},
{'12','14'},
{'17','18','18','20'}]

Then I want to find the difference between the smallest and largest number in each set and put it in a list, so in this case I would have:

b = [3,2,3]

This gives the output [10, 2, 3] (the difference between 22 and 12 is 10):

b = [{'12','22'},
{'12','14'},
{'17','18','18','20'}]
l = []
for i in b:
    # Start from infinities so values outside -99..99 are handled too
    large, small = float('-inf'), float('inf')
    for j in i:
        j = int(j)
        if large < j:
            large = j
        if small > j:
            small = j
    l.append(large - small)
print(l)
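
The loop above starts from the already-trimmed strings. To get there from the original data, one possible sketch is shown here. It assumes every item follows the `dg_<number>_...` pattern; `raw` and `first_number` are names made up for this example.

```python
raw = [
    {'dg_12.942_ch_293', 'dg_22.38_ca_627'},
    {'dg_12.651_cd_286', 'dg_14.293_ce_334'},
    {'dg_17.42_cr_432', 'dg_18.064_cm_461', 'dg_18.85_cn_474', 'dg_20.975_cf_489'},
]

def first_number(item):
    # 'dg_12.942_ch_293' -> '12.942' -> '12'
    return item.split('_')[1].split('.')[0]

# Build one set of leading-number strings per original set
b = [{first_number(item) for item in group} for group in raw]
```

Because these are sets, the two items starting with 18 collapse into a single `'18'`, which does not affect the min/max difference.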

Ugly and without any sanity checks, but this gets the job done:

import re

SEARCH_NUMBER_REGEX = re.compile(r"(\d+)")

def foo(dataset):
    out = []
    for entries in dataset:
        numbers = []
        for entry in entries:
            # Search for the first number in the str
            n = SEARCH_NUMBER_REGEX.search(entry).group(1)
            n = int(n)
            numbers.append(n)
        
        # Sort the numbers and subtract the first one (smallest)
        # from the last one (largest)
        numbers.sort()
        out.append(numbers[-1] - numbers[0])

    return out

b = [
    {'dg_12.942_ch_293', 'dg_22.38_ca_627'}, 
    {'dg_12.651_cd_286', 'dg_14.293_ce_334'}, 
    {'dg_17.42_cr_432', 'dg_18.064_cm_461', 'dg_18.85_cn_474', 'dg_20.975_cf_489'}
]

print(foo(b))
# > [10, 2, 3]
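
If the sanity checks matter, a variant along these lines might do. This is only a sketch: the name `spreads`, skipping entries that contain no digits, and returning `None` for a group with no usable numbers are all assumptions, not something the question specifies.

```python
import re

FIRST_INT = re.compile(r"\d+")

def spreads(dataset):
    out = []
    for entries in dataset:
        numbers = []
        for entry in entries:
            match = FIRST_INT.search(entry)
            if match is None:
                # Assumption: entries without any digits are skipped
                continue
            numbers.append(int(match.group()))
        # Assumption: a group with no usable numbers yields None
        out.append(max(numbers) - min(numbers) if numbers else None)
    return out

data = [{'dg_12.942_ch_293', 'dg_22.38_ca_627'}, {'no_digits_here'}, set()]
print(spreads(data))  # [10, None, None]
```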

Here is another way:

import re

ba = [{'dg_12.942_ch_293', 'dg_22.38_ca_627'},
     {'dg_12.651_cd_286', 'dg_14.293_ce_334'},
     {'dg_17.42_cr_432', 'dg_18.064_cm_461', 'dg_18.85_cn_474', 'dg_20.975_cf_489'}]

bb = []

for s in ba:
    ns = sorted([int(re.search(r'(\d+)', ss)[0]) for ss in s])
    bb.append(ns[-1]-ns[0])

print(bb)

Output:

[10, 2, 3]

Or, if you want to get ridiculous (the `:=` operator needs Python 3.8+):

import re

ba = [{'dg_12.942_ch_293', 'dg_22.38_ca_627'},
     {'dg_12.651_cd_286', 'dg_14.293_ce_334'},
     {'dg_17.42_cr_432', 'dg_18.064_cm_461', 'dg_18.85_cn_474', 'dg_20.975_cf_489'}]

bb = [(n := sorted([int(re.search(r'(\d+)', ss)[0]) for ss in s]))[-1]-n[0] for s in ba]

print(bb)

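In your final output I see "[3,2,3]", but if I understand your question correctly, it should be [10,2,3]. Either way, my code below should at least point you in the right direction (hopefully).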

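This code iterates over each tuple in the list, pulls the leading number out of each string (since that is all we want to compare), and adds it to a list. It then evaluates those numbers, subtracts the smallest from the largest, and puts the result in a separate array. This "separate array" is the last one shown in your question.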

Good luck - hope this helps!

import re

b = [('dg_12.942_ch_293','dg_22.38_ca_627'), ('dg_12.651_cd_286','dg_14.293_ce_334'), ('dg_17.42_cr_432','dg_18.064_cm_461','dg_18.85_cn_474','dg_20.975_cf_489')]

final_array = []

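for tup in b:
    temp_array = []
    for num in tup:
        # First run of digits in the string, e.g. '12' from 'dg_12.942_ch_293'
        split_number = re.search(r'\d+', num).group()
        # Store as int so max/min compare numerically, not as text
        temp_array.append(int(split_number))

    difference = max(temp_array) - min(temp_array)
    final_array.append(difference)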

print(final_array)
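
One thing to keep in mind with `max` and `min` here: on strings they compare character by character, so the extracted values need to be integers before they are compared. A small illustration with made-up values:

```python
values = ['9', '10']  # hypothetical extracted numbers of different lengths

print(max(values))                  # '9', because '9' sorts after '1' as text
print(max(int(v) for v in values))  # 10, the numeric maximum
```

With the sample data every leading number has two digits, so both comparisons happen to agree, but that stops being true as soon as the lengths differ.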