Regex word extraction from a string, storing the results in a Python list

Question

I am new to regular expressions and want to extract specific words from a Python string. Here is the string:

'1. feature name: occupation_Transport-moving coefficient: 0.1776 2. feature name: education coefficient: 0.0726 3. feature name: occupation_Machine-op-inspct coefficient: 0.0661 4. feature name: occupation_Armed-Forces coefficient: 0.0006 5. feature name: workclass_Without-pay coefficient: -0.0194 6. feature name: occupation_Handlers-cleaners coefficient: -0.1256 7. feature name: occupation_Farming-fishing coefficient: -0.3938 8. feature name: GDP Group coefficient: -0.4138 9. feature name: occupation_Other-service coefficient: -0.4294 10. feature name: occupation_Priv-house-serv coefficient: -0.6560 '

The result I am looking for:

[occupation_Transport-moving,education,occupation_Machine-op-inspct,occupation_Armed-Forces,workclass_Without-pay,occupation_Handlers-cleaners,occupation_Farming-fishing,GDP Group,occupation_Other-service,occupation_Priv-house-serv]

I have tried the following, but it returns the entire string starting from the first ':': re.findall(':\s(.*)<',txt)

Thanks in advance for your help.

Answer 1

Use

:\s*([^:.<]+)<

See the regex proof.

Explanation

--------------------------------------------------------------------------------
  :                        ':'
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [^:.<]+                  any character except: ':', '.', '<' (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  <                        '<'
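
As a minimal sketch of how the pattern might be used with re.findall: the pattern requires a '<' immediately after each feature name, and the string as shown in the question contains no '<', so the `tagged` sample below assumes some markup such as <br> follows each value. That sample, the shortened three-item input, and all variable names are illustrative assumptions, not part of the original post. The second pattern is an alternative for the string exactly as displayed, anchored on the literal labels "feature name:" and "coefficient:" instead of '<'.

```python
import re

# Hypothetical input: assumes a tag such as <br> follows each value,
# because the answer's pattern needs a '<' right after the feature name.
tagged = (
    "1. feature name: occupation_Transport-moving<br> coefficient: 0.1776<br> "
    "2. feature name: education<br> coefficient: 0.0726<br> "
    "3. feature name: GDP Group<br> coefficient: -0.4138<br> "
)

# The attempt from the question: the greedy '.*' runs to the LAST '<',
# so everything after the first ': ' comes back as a single match.
greedy = re.findall(r":\s(.*)<", tagged)

# The answer's pattern: the negated class stops at ':', '.' or '<',
# so coefficients (which contain '.') never reach the closing '<'.
names_tagged = re.findall(r":\s*([^:.<]+)<", tagged)

# Alternative for the string as displayed (no '<' at all): capture lazily
# between the two literal labels.
plain = (
    "1. feature name: occupation_Transport-moving coefficient: 0.1776 "
    "2. feature name: education coefficient: 0.0726 "
    "3. feature name: GDP Group coefficient: -0.4138 "
)
names_plain = re.findall(r"feature name:\s*(.*?)\s*coefficient:", plain)

print(names_tagged)
print(names_plain)
```

Both calls return a list of strings, so the names are already stored in a Python list. Note that the answer's character class excludes '.', so a feature name that itself contains a dot would not match under that pattern.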
