
Capture opening but not the closing tag


(?<tag>.)(?: href="(?<url>.+?)")?>(?<text>.+?)<

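It works, but I want "tag" to be empty for segments that are not wrapped in a tag. With the current regex, those segments capture the closing tag of the preceding segment instead... :(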



      match: "p>The <",
      tag: "p",
      url: null,
      text: "The "
      match: "a href=\"https://www.legislation.gov.uk/ukpga/2010/23/contents\">UK Bribery Act<",
      tag: "a",
      url: "https://www.legislation.gov.uk/ukpga/2010/23/contents",
      text: "UK Bribery Act"
      match: "/a> (“the Act”) received Royal Assent in April 2010 and came into ... <",
      tag: null,
      url: null,
      text: " (“the Act”) received Royal Assent in April 2010 and came into ... "
      match: "a href=\"http://www.oecd.org/daf/anti-bribery/ConvCombatBribery_ENG.pdf\">OECD anti-bribery Convention<",
      tag: "a",
      url: "http://www.oecd.org/daf/anti-bribery/ConvCombatBribery_ENG.pdf",
      text: "OECD anti-bribery Convention"
      match: "/a>. The Act outlined four prime offences, including the introduction ... <",
      tag: null,
      url: null,
      text: ". The Act outlined four prime offences, including the introduction ... "
      match: "b>rest is history<",
      tag: "b",
      url: null,
      text: "rest is history"


Based on what I see in regex101's MATCH INFORMATION box, I think this works:

/(?:(?<tag>(?<!\/).)|(?:\/.))(?: href="(?<url>.+?)")?>(?<text>.+?)</gm
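For anyone who wants to check this outside regex101, here is a minimal sketch of the pattern above run in JavaScript. The input string is an assumption: the original HTML is not shown, so it is reconstructed from the listed matches, with the elided "..." parts left exactly as they appear. The names `html`, `pattern` and `segments` are only illustrative. JavaScript reports unmatched named groups as `undefined`, so the sketch maps them to `null` to mirror what regex101 displays.

```javascript
// Assumed sample input, rebuilt from the matches listed in the question.
// The "..." fragments are kept verbatim; the real text behind them is unknown.
const html =
  '<p>The <a href="https://www.legislation.gov.uk/ukpga/2010/23/contents">UK Bribery Act</a>' +
  ' (“the Act”) received Royal Assent in April 2010 and came into ... ' +
  '<a href="http://www.oecd.org/daf/anti-bribery/ConvCombatBribery_ENG.pdf">OECD anti-bribery Convention</a>' +
  '. The Act outlined four prime offences, including the introduction ... ' +
  '<b>rest is history</b>';

// First alternative: one character NOT preceded by "/" goes into "tag".
// Second alternative: "/" plus one character (a closing tag) is consumed
// without being captured, so "tag" stays unset for that segment.
const pattern = /(?:(?<tag>(?<!\/).)|(?:\/.))(?: href="(?<url>.+?)")?>(?<text>.+?)</gm;

// Collect every match; unmatched named groups are undefined in JavaScript,
// so they are normalised to null here.
const segments = [...html.matchAll(pattern)].map((m) => ({
  tag: m.groups.tag ?? null,
  url: m.groups.url ?? null,
  text: m.groups.text,
}));

console.log(segments.map((s) => s.tag));
// → [ 'p', 'a', null, 'a', null, 'b' ]
```

On this assumed input the segments that follow a closing `</a>` come back with `tag` set to `null`, which is the behaviour asked for. Note that `(?<tag>.)` still only captures single-character tag names, as in the original pattern.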