Bash extract after substring and before substring

Question

Suppose I have a string:

random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f

I want a shell command that extracts everything after "authentication_token = '" and before the next '.

Basically, I want to return pYWastSemJrMqwJycZPZ.

How can I do this?

Answer 1

If your grep supports -P, you can use a PCRE regular expression:

$ echo "random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f" | grep -oP "authentication_token = '\K[^']*"
pYWastSemJrMqwJycZPZ

$ echo "random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f" | grep -oP "authentication_token = '\K[^']*(?=')"
pYWastSemJrMqwJycZPZ

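\K discards the characters matched so far, so they are left out of the printed result.
[^']* is a negated character class that matches any character except ', zero or more times.
(?=') is a positive lookahead asserting that the match must be followed by a single quote.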
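
When the input has several lines, grep -o prints one match per matching occurrence. A minimal sketch of keeping only the first one, assuming GNU grep with PCRE support; the function name first_token and the sample input are made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print the first authentication_token value read from stdin.
# Assumes a grep that supports -P (GNU grep built with PCRE).
first_token() {
    grep -oP "authentication_token = '\K[^']*" | head -n 1
}

# Illustrative input with two tokens; only the first one is printed.
printf '%s\n' \
    "authentication_token = 'first', gravatar_hash = 'x'" \
    "authentication_token = 'second'" | first_token
```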

Answer 2

Using parameter expansion:

#!/bin/bash
text="random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f"
token=${text##* authentication_token = \'}   # Remove the left part.
token=${token%%\'*}                          # Remove the right part.
echo "$token"

Note that it works even if the random text contains authentication token = '...'.
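
One caveat with parameter expansion: if the marker is absent, ${text##*marker} leaves the text unchanged, so the result is unrelated text rather than an empty string. A sketch of a guarded variant follows; the function name extract_token is an assumption, not part of the answer above:

```shell
#!/bin/bash
# Hypothetical helper: print the token, or return 1 if the marker is missing.
extract_token() {
    local text=$1
    local marker="authentication_token = '"
    local token

    # Without this check a missing marker would leave $text untouched below.
    [[ $text == *"$marker"* ]] || return 1

    token=${text##*"$marker"}   # remove everything up to and including the last marker
    token=${token%%\'*}         # remove everything from the next single quote onward
    printf '%s\n' "$token"
}

text="random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f"
extract_token "$text"
```

Quoting "$marker" inside the expansion makes bash treat it as a literal string rather than a glob pattern.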

Answer 3

IMO, grep -oP is the best solution. For completeness, here are a couple of alternatives:

sed 's/.*authentication_token = '\''//; s/'\''.*//' <<<"$string"

awk -F "'" '{for (i=1; i<NF; i+=2) if ($i ~ /authentication_token = $/) {print $(i+1); break}}' <<< "$string"

Answer 4

Use bash's regular expression matching facilities.

$ regex="_token = '([^']+)'"
$ string="random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f'"
$ [[ $string =~ $regex ]] && hash=${BASH_REMATCH[1]}
$ echo "$hash"
pYWastSemJrMqwJycZPZ

Using a variable instead of a literal regular expression simplifies quoting the spaces and single quotes.
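
Because the regex lives in a variable, it can also be built from a key name. A rough sketch, assuming the key contains no regex metacharacters; get_value is a hypothetical name:

```shell
#!/bin/bash
# Hypothetical helper: print the single-quoted value that follows "<key> = ".
# Assumes $1 (the key) contains no regex metacharacters.
get_value() {
    local key=$1 string=$2
    local regex="$key = '([^']+)'"

    # Leave $regex unquoted so bash treats it as a regex, not a literal string.
    [[ $string =~ $regex ]] || return 1
    printf '%s\n' "${BASH_REMATCH[1]}"
}

string="random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f'"
get_value authentication_token "$string"
get_value gravatar_hash "$string"
```

The function returns a non-zero status when nothing matches, so callers can tell a missing key from an empty result.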

Answer 5

My simple version is

sed -r "s/(.*authentication_token = ')([^']*)(.*)/\2/"
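
A sed substitution passes lines without a match through unchanged. A sketch that prints only the lines where the substitution succeeded, using -n together with the p flag and assuming a sed that accepts -E (GNU and BSD sed both do); the function name token_lines is made up for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print the token from each stdin line that has one,
# and print nothing for lines without a token.
token_lines() {
    # -n suppresses default output; the trailing p prints only substituted lines.
    sed -nE "s/.*authentication_token = '([^']*).*/\1/p"
}

printf '%s\n' \
    "a line without any token" \
    "random text before authentication_token = 'pYWastSemJrMqwJycZPZ', gravatar_hash = 'd74a97f" \
    | token_lines
```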
