Mapping feature and feature value from a line to a dictionary in python

Question

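I am stuck on the following problem. I have to create a function that extracts the target label and the features (with their values) from a line, and puts the features and their corresponding values into a dictionary. The line format is as follows (target feature1:feature_value feature2:feature_value, separated by whitespace), so for example: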

line = '1 0:2.0 3:4.0 123:1.0\n'

should return

({0: 2.0, 123: 1.0, 3: 4.0}, 1)

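So for each feature, I need to make everything before the ':' the dictionary key and everything after it the value for that key. But I don't know how to do that. So far I have the following code: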

def parse_line(line):
    parse_dict = {}
    split_line = line.split()
    target_label = ''
    for i in split_line:
        if target_label == '':
            target_label = i
        else:
            # and now I need to map everything before ':' to a key and everything after to the key value
            pass

    return parse_dict, int(target_label)

line = '1 0:2.0 3:4.0 123:1.0\n'
print(parse_line(line))

Thanks in advance.

Answer 1

You can do the parsing in a single line:

target, features = '1 0:2.0 3:4.0 123:1.0\n'.split(' ', 1)
parsed = (dict((kv.split(':') for kv in features.strip().split())), target)

示例：

>>> target, features = '1 0:2.0 3:4.0 123:1.0\n'.split(' ', 1)
>>> parsed = (dict((kv.split(':') for kv in features.strip().split())), target)
>>> parsed
({'123': '1.0', '3': '4.0', '0': '2.0'}, '1')
>>>

Note that the dict keys and values are strings, but you can do the conversion yourself :-)
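For the conversion mentioned above, here is a minimal sketch. It assumes the keys should be integers, the values floats, and the target an integer, as in the expected output from the question. The function name `parse_line` is borrowed from the question; the variable names are only illustrative.

```python
def parse_line(line):
    # Split off the target label; the remainder holds the feature:value pairs.
    target, _, features = line.strip().partition(' ')
    parsed = {}
    for pair in features.split():
        # Everything before the first ':' is the key, everything after is the value.
        key, value = pair.split(':', 1)
        parsed[int(key)] = float(value)
    return parsed, int(target)

print(parse_line('1 0:2.0 3:4.0 123:1.0\n'))
```

A malformed pair (for example one without a ':') would raise a ValueError here, which may or may not be what you want for your data.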

Answer 2

The same function you were using:

line = '1 0:2.0 3:4.0 123:1.0\n'

def parse_line(line):
    # split() with no argument also drops the trailing newline
    data = line.split()
    return (dict(i.split(":") for i in data[1:]), data[0])

print(parse_line(line))

Answer 3

You could also consider this:

# If the position of target doesn't matter:
>>> line = '1 0:2.0 3:4.0 123:1.0\n'
>>> result = [{i.split(':')[0] : i.split(':')[1]} if ':' in i else i for i in line.split()]
>>> print(tuple(result))
('1', {'0': '2.0'}, {'3': '4.0'}, {'123': '1.0'})

# If the position of target does matter:
>>> line = '1 0:2.0 3:4.0 123:1.0\n'
>>> result = [{i.split(':')[0] : i.split(':')[1]} for i in line.split()[1:] if ':' in i]
>>> result.append(line.split()[0])
>>> print(tuple(result))
({'0': '2.0'}, {'3': '4.0'}, {'123': '1.0'}, '1')
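Note that this gives one single-entry dict per feature rather than the single dictionary the question asks for. A rough sketch of collecting everything into one dict, still keeping keys and values as strings (the names `tokens`, `features` and `target` are just illustrative):

```python
line = '1 0:2.0 3:4.0 123:1.0\n'

tokens = line.split()
target = tokens[0]

features = {}
for token in tokens[1:]:
    # Skip anything that is not a feature:value pair.
    if ':' in token:
        key, value = token.split(':', 1)
        features[key] = value

result = (features, target)
print(result)
```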
