PowerShell: extract mediaid from a string

I want to extract the mediaid from the following string:

"\u003cimage mediaid=\"{2EB3AFF5-24C6-4C1F-8957-37CBFCBED751}\" alt=\"Caulfield to Dandenong Level Crossing Removal\" height=\"\" width=\"\" hspace=\"\" vspace=\"\" /\u003e",

How can I do this?

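You can use the Substring method. Its syntax is: .Substring( StartIndex [, length] ). The offset of 148 used here is counted back from the end of this particular string, so it only holds while everything after the GUID keeps the same length.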

$string= '\u003cimage mediaid=\"{2EB3AFF5-24C6-4C1F-8957-37CBFCBED751}\" alt=\"Caulfield to Dandenong Level Crossing Removal\" height=\"\" width=\"\" hspace=\"\" vspace=\"\" /\u003e'
$mediaid = $string.Substring($string.Length - 148, 36)
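
A sketch of a less position-dependent variant that finds the braces with IndexOf instead of counting characters. It assumes the string contains a single {...} pair; the names $start and $end are illustrative only:

```powershell
# Same sample string as above; single quotes keep the backslashes literal.
$string = '\u003cimage mediaid=\"{2EB3AFF5-24C6-4C1F-8957-37CBFCBED751}\" alt=\"Caulfield to Dandenong Level Crossing Removal\" height=\"\" width=\"\" hspace=\"\" vspace=\"\" /\u003e'

# Assumption: the first '{' and the first '}' after it delimit the GUID.
$start = $string.IndexOf('{') + 1      # index of the first GUID character
$end   = $string.IndexOf('}', $start)  # index of the closing brace

# Take everything between the two braces.
$mediaid = $string.Substring($start, $end - $start)
$mediaid
```

For the sample string this yields the same 36 characters as the fixed offset, but it keeps working if the alt text changes length.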

Another option is to use a regular expression:

$text = '"\u003cimage mediaid=\"{2EB3AFF5-24C6-4C1F-8957-37CBFCBED751}\" alt=\"Caulfield to Dandenong Level Crossing Removal\" height=\"\" width=\"\" hspace=\"\" vspace=\"\" /\u003e"'

[Regex]::Match($text, '(?<={).*(?=})').Value

which returns:

2EB3AFF5-24C6-4C1F-8957-37CBFCBED751

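Since the question as posted does not spell out its requirements, some kind of regular expression is probably what is needed.

To extract mediaid, or whatever identifier name precedes the quoted GUID in this case, you can do the following: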

$str = '"\u003cimage mediaid=\"{2EB3AFF5-24C6-4C1F-8957-37CBFCBED751}\" alt=\"Caulfield to Dandenong Level Crossing Removal\" height=\"\" width=\"\" hspace=\"\" vspace=\"\" /\u003e",'
$regex = [regex]'\S+(?==\\"{[-A-F0-9]+}\\")'
$regex.Match($str).Value

To extract the value of mediaid, you can do the following:

$str = '"\u003cimage mediaid=\"{2EB3AFF5-24C6-4C1F-8957-37CBFCBED751}\" alt=\"Caulfield to Dandenong Level Crossing Removal\" height=\"\" width=\"\" hspace=\"\" vspace=\"\" /\u003e",'
$regex = [regex]'(?<=mediaid=)\S+'
$regex.Match($str).Value
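
Because \S+ runs up to the next whitespace, the lookbehind pattern returns the value with its surrounding \" sequences still attached. A minimal sketch that uses a capture group to return only the braced GUID; the pattern and the variable names $m and $value are assumptions for illustration, not part of the answer above:

```powershell
$str = '"\u003cimage mediaid=\"{2EB3AFF5-24C6-4C1F-8957-37CBFCBED751}\" alt=\"Caulfield to Dandenong Level Crossing Removal\" height=\"\" width=\"\" hspace=\"\" vspace=\"\" /\u003e",'

# \\" matches the literal backslash-quote pair in the text;
# the parentheses capture only the {GUID} part between them.
$m = [regex]::Match($str, 'mediaid=\\"(\{[^}]+\})\\"')

if ($m.Success) {
    # Group 0 is the whole match, group 1 is the first capture.
    $value = $m.Groups[1].Value
    $value
}
```

Checking $m.Success first avoids silently working with an empty string when the attribute is missing.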

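Explanation:

  • \S+: matches a non-whitespace character one or more (+) times
  • (?=<something>): positive lookahead for <something>; the characters are checked but not captured
  • (?<=<something>): positive lookbehind for <something>; the characters are checked but not captured
  • [A-F0-9-]: character class matching uppercase A-F, zero through nine, and - (hyphen). Note that uppercase is enforced here because we are using the .NET Regex class. Normally, Windows PowerShell is case-insensitive unless told otherwise, e.g. $str -match '[A-F]' will match a-fA-F.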
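
The case-sensitivity difference described in the last bullet can be checked with a short sketch. The lowercase GUID below is a made-up test value, and the variable names are illustrative:

```powershell
# Hypothetical input: the sample GUID written in lowercase.
$lower = '{2eb3aff5-24c6-4c1f-8957-37cbfcbed751}'

# The .NET Regex class is case-sensitive by default, so [A-F] rejects a-f.
$strict = [regex]::IsMatch($lower, '{[-A-F0-9]+}')

# The -match operator is case-insensitive by default.
$loose = $lower -match '{[-A-F0-9]+}'

# Passing RegexOptions.IgnoreCase makes the .NET class behave like -match.
# PowerShell converts the string 'IgnoreCase' to the RegexOptions enum.
$relaxed = [regex]::IsMatch($lower, '{[-A-F0-9]+}', 'IgnoreCase')

"$strict $loose $relaxed"
```

On this input the three results are False, True and True, so a pattern written for the [regex] class may need IgnoreCase (or an a-f range) if lowercase GUIDs can occur.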

You can also do this with the regex -replace operator: capture the part you want to keep and replace the whole string with that capture:

$string= '\u003cimage mediaid=\"{2EB3AFF5-24C6-4C1F-8957-37CBFCBED751}\" alt=\"Caulfield to Dandenong Level Crossing Removal\" height=\"\" width=\"\" hspace=\"\" vspace=\"\" /\u003e'
$string -replace '.*mediaid=\\?"({[0-9A-F-]+}).*', '$1'

Result:

{2EB3AFF5-24C6-4C1F-8957-37CBFCBED751}

Regex details:

.                    Match any single character that is not a line break character
   *                 Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
mediaid=             Match the characters “mediaid=” literally
\\                   Match the character “\” literally
   ?                 Between zero and one times, as many times as possible, giving back as needed (greedy)
"                    Match the character “"” literally
(                    Match the regular expression below and capture its match into backreference number 1
   {                 Match the character “{” literally
   [0-9A-F-]         Match a single character present in the list below
                     A character in the range between “0” and “9”
                     A character in the range between “A” and “F”
                     The character “-”
      +              Between one and unlimited times, as many times as possible, giving back as needed (greedy)
   }                 Match the character “}” literally
)
.                    Match any single character that is not a line break character
   *                 Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
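
One thing to keep in mind with -replace is that it returns the input unchanged when the pattern does not match. A hedged sketch of an alternative that tests with -match first and reads a named group from $Matches; the group name id and the $null fallback are assumptions for illustration:

```powershell
$string = '\u003cimage mediaid=\"{2EB3AFF5-24C6-4C1F-8957-37CBFCBED751}\" alt=\"Caulfield to Dandenong Level Crossing Removal\" height=\"\" width=\"\" hspace=\"\" vspace=\"\" /\u003e'

# -match fills the automatic $Matches table only when the pattern matches.
# (?<id>...) is a named capture group holding the braced GUID.
if ($string -match 'mediaid=\\?"(?<id>\{[0-9A-F-]+\})') {
    $mediaid = $Matches['id']
} else {
    # No mediaid attribute found; make the failure explicit.
    $mediaid = $null
}
$mediaid
```

This keeps the "no match" case distinguishable from a real result, which a bare -replace does not.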