Use RegEx to execute multiple find/replace commands in Notepad++ in one click

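The goal is to extract a list of field names from XML (Adobe LiveCycle Designer). I created the fields in Designer, copied the XML of the relevant fields, pasted it into Notepad++, and then ran find/replace (Ctrl+H) to get only the field names, one field per line.

This makes it much easier to write the SQL statements that add these fields to the database for registration.

The XML looks like this: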

<field xmlns="http://www.xfa.org/schema/xfa-template/2.8/" y="0in" x="0.343mm" w="8.881pt" h="9.108pt" name="detcon_recreation_only">
   <ui>
      <checkButton size="8.881pt">
         <border>
            <edge stroke="lowered"/>
            <fill/>
         </border>
      </checkButton>
   </ui>
   <font size="0pt" typeface="Adobe Pi Std"/>
   <para vAlign="middle"/>
   <value>
      <text>0</text>
   </value>
   <items>
      <text>1</text>
      <text>0</text>
      <text/>
   </items>
</field>
<field xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_special_housing" y="5.393mm" w="27.94mm" h="4.134mm" x="0.343mm">
   <ui>
      <choiceList>
         <border>
            <edge stroke="lowered"/>
         </border>
         <margin/>
      </choiceList>
   </ui>
   <font typeface="Arial Narrow" size="6pt"/>
   <margin topInset="0mm" bottomInset="0mm" leftInset="0mm" rightInset="0mm"/>
   <para vAlign="middle"/>
   <value>
      <text>NA</text>
   </value>
   <items>
      <text>Not Applicable</text>
      <text>Hotel Component</text>
   </items>
   <items save="1" presence="hidden">
      <text>NA</text>
      <text>HC</text>
   </items>
</field>
<exclGroup xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_photo_taken" x="0in" y="0in">
   <?templateDesigner itemValuesSpecified 1?>
   <field w="12.446mm" h="3.825mm" name="lb_yes">
      <ui>
         <checkButton size="1.7639mm" shape="round">
            <border>
               <?templateDesigner StyleID apcb1?>
               <edge/>
               <fill/>
            </border>
         </checkButton>
      </ui>
      <font typeface="Myriad Pro"/>
      <margin leftInset="1mm" rightInset="1mm"/>
      <para vAlign="middle"/>
      <caption placement="right" reserve="7.698mm">
         <para vAlign="middle" spaceAbove="0pt" spaceBelow="0pt" textIndent="0pt" marginLeft="0pt" marginRight="0pt"/>
         <font size="8pt" typeface="Arial Narrow" baselineShift="0pt"/>
         <value>
            <text>YES</text>
         </value>
      </caption>
      <value>
         <text xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
      </value>
      <items>
         <text>1</text>
      </items>
   </field>
   <field w="28.702mm" h="3.825mm" name="lb_no" x="13.233mm">
      <ui>
         <checkButton size="1.7639mm" shape="round">
            <border>
               <?templateDesigner StyleID apcb1?>
               <edge/>
               <fill/>
            </border>
         </checkButton>
      </ui>
      <font typeface="Myriad Pro"/>
      <margin leftInset="1mm" rightInset="1mm"/>
      <para vAlign="middle"/>
      <caption placement="right" reserve="23.954mm">
         <para vAlign="middle" spaceAbove="0pt" spaceBelow="0pt" textIndent="0pt" marginLeft="0pt" marginRight="0pt"/>
         <font size="8pt" typeface="Arial Narrow" baselineShift="0pt"/>
         <value>
            <text>NO (see comments)</text>
         </value>
      </caption>
      <value>
         <text xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/>
      </value>
      <items>
         <text>0</text>
      </items>
   </field>
   <border>
      <edge presence="hidden"/>
   </border>
   <?templateDesigner expand 1?></exclGroup>

So I came up with the following RegEx find/replace steps to get only the field names, one field per line.

Find, to get the field names: (?i)<(field|exclGroup).*name="([a-z_]\w*)".*$
Replace with: $2

Then another find/replace to delete all the other lines:
Find: ^.*<(?!.*name=).*.*[\r\n]*
Replace with: blank

If you run the two find/replace sessions above, you end up with a list of field names, one per line.
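For anyone who wants to sanity-check the two passes outside Notepad++, here is a minimal Python sketch of the same idea. It applies the first pattern from the question line by line and keeps only capture group 2. The function name `extract_names` and the shortened sample are assumptions made for illustration, and Python's `re` module is not the engine Notepad++ uses, so edge cases may behave differently.

```python
import re

# First pattern from the question: group 1 is the tag, group 2 is the name.
NAME_RE = re.compile(r'(?i)<(field|exclGroup).*name="([a-z_]\w*)".*$')


def extract_names(xml_text):
    """Return the name attribute of every line that opens a field or exclGroup."""
    names = []
    for line in xml_text.splitlines():
        match = NAME_RE.search(line)
        if match:  # lines without a matching tag are dropped (the second pass)
            names.append(match.group(2))
    return names


# Shortened, hypothetical sample in the shape of the XML above.
SAMPLE = '''<field xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_special_housing" y="5.393mm">
   <ui>
      <choiceList/>
   </ui>
</field>
<exclGroup xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_photo_taken" x="0in" y="0in">
   <field w="12.446mm" h="3.825mm" name="lb_yes">
   </field>
</exclGroup>'''

if __name__ == "__main__":
    print("\n".join(extract_names(SAMPLE)))
```

Like the two-pass find/replace, this also lists nested fields such as lb_yes, because the pattern only looks at one line at a time.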

What I'd like to do is perform the above in a single find/replace session, and then turn the result into SQL statements, also with find/replace, using this template:

INSERT INTO table_name (element_id, element_name, element_type, default_value, required, clone) 
VALUES (12345,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12346,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12347,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12348,"field_name_goes_here","/Tx", "", "N", "Y"),
VALUES (12349,"field_name_goes_here","/Tx", "", "N", "Y"),

The element_id field is sequential, but don't worry about that; I can handle it in Excel.
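As an alternative to numbering the rows in Excel, the sequential element_id could be generated by a short script. The following is only a sketch: `build_value_rows` is a hypothetical helper, the starting id 12345 is taken from the template, and the rows deliberately mirror the template's shape as written, so they may need adjusting for a particular SQL dialect.

```python
def build_value_rows(names, first_id=12345):
    """Build one template-shaped row per field name with a sequential id."""
    rows = []
    for offset, name in enumerate(names):
        # Same layout as the template row, with the id and name filled in.
        rows.append(f'VALUES ({first_id + offset},"{name}","/Tx", "", "N", "Y"),')
    return rows


if __name__ == "__main__":
    for row in build_value_rows(["detcon_recreation_only", "detcon_special_housing"]):
        print(row)
```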

Thanks for your help, Tarek


One small request. The exclGroup element is slightly different from the field element, which makes things a little more complicated. Your RegEx is so sophisticated that I couldn't modify it to include the exclGroup element; I think I need more time to digest it. Could you modify it to include exclGroup and extract only its name, without extracting the inner field elements of the exclGroup?

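OK, here you go.
It does make things a bit more complicated.

I have two versions that do this. One uses recursion, one doesn't.

I'm posting the version that uses recursion.
If you need the non-recursive one, let me know and I'll post it.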

Find: (?:(?!<(?:field|exclGroup)(?!\w)(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)[\S\s])*(?><(field|exclGroup)(?=(?:[^>"']|"[^"]*"|'[^']*')*?\sname\s*=\s*(?:(['"])([\S\s]*?)\2))\s+(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)(?:(?&core)|)</\1\s*>(?:(?!<(?:field|exclGroup)(?!\w)(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)[\S\s])*(?(DEFINE)(?<core>(?>(?><([\w:]+)(?>"[\S\s]*?"|'[\S\s]*?'|(?:(?!/>)[^>])?)+>)(?:(?&core)|)</\5\s*>|(?!</[\w:]+\s*>)(?>[\S\s]))+))

Replace with: VALUES (12345,"$3","/Tx", "", "N", "Y"),\r\n

https://regex101.com/r/icnF3i/1

Formatted (in case it's needed)

 (?:                           # Prefix - Optional any chars that don't start a field or exclGroup tag
      (?!
           < 
           (?: field | exclGroup )
           (?! \w )
           (?>
                " [\S\s]*? "
             |  ' [\S\s]*? '
             |  (?:
                     (?! /> )
                     [^>] 
                )?
           )+
           >
      )
      [\S\s] 
 )*

 (?>                           # open 'field' or 'exclGroup' tag ------------------
      < 
      ( field | exclGroup )         # (1)
      (?=                           # Assertion (a pseudo atomic group)
           (?: [^>"'] | " [^"]* " | ' [^']* ' )*?
           \s name \s* = \s* 
           (?:
                ( ['"] )                      # (2), Quote
                ( [\S\s]*? )                  # (3), Name value - only thing we want
                \2
           )
      )
      \s+ 
      (?>
           " [\S\s]*? "
        |  ' [\S\s]*? '
        |  (?:
                (?! /> )
                [^>] 
           )?
      )+
      >
 )
 (?:
      (?&core)                      # Call the core recursion function (balanced tags)
   |  
 )
 </ \1 \s* >                   # Close 'field' or 'exclGroup' tag ------------------

 (?:                           # Postfix - Optional any chars that don't start a field or exclGroup tag
      (?!
           < 
           (?: field | exclGroup )
           (?! \w )
           (?>
                " [\S\s]*? "
             |  ' [\S\s]*? '
             |  (?:
                     (?! /> )
                     [^>] 
                )?
           )+
           >
      )
      [\S\s] 
 )*

 # ---------------------------------------------------------

 (?(DEFINE)
      (?<core>                      # (4 start), Inner balanced tags
           (?>
                (?>
                     < 
                     ( [\w:]+ )                    # (5), Any open tag
                     (?>
                          " [\S\s]*? "
                       |  ' [\S\s]*? '
                       |  (?:
                               (?! /> )
                               [^>] 
                          )?
                     )+
                     >
                )
                (?:                           # Recurse core 
                     (?&core) 
                  |  
                )
                </ \5 \s* >                   # Balanced close tag (I can see you 5)
             |  
                (?! </ [\w:]+ \s* > )         # Any char not starting a close tag (passive)
                (?> [\S\s] )
           )+

      )                             # (4 end)
 )

You can see the non-recursive version here: https://regex101.com/r/ztOrP5/1
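To cross-check what the regex should return (the names of top-level field and exclGroup elements, without the fields nested inside an exclGroup), the same list can be produced with an XML parser. This is a different technique from the regex above, not a port of it: the sketch uses Python's `xml.etree.ElementTree`, wraps the pasted fragment in a made-up `<root>` element so that it parses as one document, and assumes the fragment is well-formed and has no XML declaration. The function name `top_level_names` is hypothetical.

```python
import xml.etree.ElementTree as ET


def top_level_names(xml_fragment):
    """Return name attributes of top-level field/exclGroup elements only."""
    # Wrap the fragment so several sibling elements form one parseable document.
    root = ET.fromstring(f"<root>{xml_fragment}</root>")
    names = []
    for child in root:  # direct children only, so nested fields are skipped
        if not isinstance(child.tag, str):
            continue  # ignore anything that is not a regular element
        local_name = child.tag.rsplit("}", 1)[-1]  # drop "{namespace}" prefix
        if local_name in ("field", "exclGroup") and "name" in child.attrib:
            names.append(child.get("name"))
    return names


# Shortened, hypothetical sample in the shape of the XML in the question.
SAMPLE = '''<field xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_recreation_only">
   <ui/>
</field>
<exclGroup xmlns="http://www.xfa.org/schema/xfa-template/2.8/" name="detcon_photo_taken">
   <?templateDesigner itemValuesSpecified 1?>
   <field name="lb_yes"/>
   <field name="lb_no"/>
</exclGroup>'''

if __name__ == "__main__":
    print(top_level_names(SAMPLE))
```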

I'm trying to simplify the RegEx.

Here is my simplified version:

Regex: (?|(?><field.*name\s*=\s*"([a-z_]\w*)"(?:.|\n)*?(?:<\/field>))|(?:<exclGroup.*name\s*=\s*"([a-z_]\w*)"(?:.|\n)*?(?:<\/exclGroup>)))

Replace with:

See it here: https://regex101.com/r/icnF3i/3

Thank you for your feedback.

Thanks to sln (https://whosebug.com/users/557597/sln) for helping me get to this level.

Edit:
The RegEx above does not work in Notepad++.

To use the same approach in Notepad++, use the following find/replace combination:

Find: (?i)<(field|exclGroup).*name\s*=\s*"([a-z_]\w*)"[\s\S]*?<\/\1>
Replace with: \(12345,"$2","/Tx", "", "N", "Y"\),
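The final find/replace can be tried out in a script as well. The sketch below is an approximation in Python's `re` module, not the Notepad++ engine: it assumes the closing tag is matched with a backreference to group 1, takes the name from group 2, and numbers the rows from 12345 as in the template. `to_sql_rows` is a hypothetical helper name. Because the match is lazy, it would stop early on a field nested inside another field, which the sample XML does not contain.

```python
import itertools
import re

# Opening field/exclGroup tag with a name, then everything up to the
# first closing tag of the same kind (backreference to group 1).
PATTERN = re.compile(
    r'(?i)<(field|exclGroup).*name\s*=\s*"([a-z_]\w*)"[\s\S]*?</\1>'
)


def to_sql_rows(xml_text, first_id=12345):
    """Replace each top-level element with one row and a sequential id."""
    ids = itertools.count(first_id)

    def make_row(match):
        return f'({next(ids)},"{match.group(2)}","/Tx", "", "N", "Y"),'

    return PATTERN.sub(make_row, xml_text)


# Shortened, hypothetical sample in the shape of the XML in the question.
SAMPLE = (
    '<field xmlns="x" name="detcon_special_housing">\n'
    '   <ui/>\n'
    '</field>\n'
    '<exclGroup xmlns="x" name="detcon_photo_taken">\n'
    '   <field w="12.446mm" name="lb_yes">\n'
    '   </field>\n'
    '</exclGroup>\n'
)

if __name__ == "__main__":
    print(to_sql_rows(SAMPLE))
```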