Check if each element of an array is present in a string in bash, ignoring certain characters and order

Question

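On the web I found answers for checking whether an element of an array is present in a string. But I want to check whether each element of the array is present in the string.

For example, str1 = "This_is_a_big_sentence"

Initially str2 would be something like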

str2 = "Sentence_This_big"

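Now I want to check whether str1 contains "sentence", "this" and "big" (all 3, ignoring order and case).

So I used arr=(${str2//_/ }). How do I proceed from here? I know the comm command finds an intersection, but it needs sorted lists, and I also need to ignore the _ underscores.

I get my str2 by using a command that finds the extension of files of a particular type: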

    for i in `ls snooze.*`; do echo $i | cut -d "." -f2; done
    # Up to here I get str2 and need to check it as described above. I am not sure how to do this; I tried putting str2 into an array and now just need to check whether all elements of my array occur in str1 (ignoring case and order).

Any help would be appreciated. I did try using this link.

Answer 1

Now I wanted to search if string a contains "sentence"&"this"&"big" (All 3, ignore alphabetic order and case)

Here is one approach:

#!/bin/bash
str1="This_is_a_big_sentence"
str2="Sentence_This_big"
if ! grep -qvwFf <(sed 's/_/\n/g' <<<${str1,,}) <(sed 's/_/\n/g' <<<${str2,,})
then
    echo "All words present"
else
    echo "Some words missing"
fi

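How it works

${str1,,} returns the string str1 with all uppercase letters replaced by lowercase ones.
sed 's/_/\n/g' <<<${str1,,} returns the string str1, converted to lowercase, with each underscore replaced by a newline so that every word is on its own line.
<(sed 's/_/\n/g' <<<${str1,,}) returns a file-like object containing all the words of str1, each in lowercase and on a separate line.

Creating a file-like object this way is called process substitution. Here it lets us treat the output of a shell command as a file to read.
<(sed 's/_/\n/g' <<<${str2,,}) does the same for str2.
Assuming file1 and file2 each hold one word per line, grep -vwFf file1 file2 removes from file2 every word that occurs in file1. If no words remain, every word in file2 occurs in file1.

With the -q option added, grep produces no output but still sets an exit code, which we can use in the if statement.

In the actual command, file1 and file2 are replaced by our file-like objects.

The remaining grep options can be understood as follows:
- -w tells grep to look for whole words only.
- -F tells grep to look for fixed strings rather than regular expressions.
- -f tells grep to read the patterns to match from the file (or file-like object) that follows.
- -v tells grep to remove the matching words (the default is to keep them).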
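
To see which words are missing, rather than only whether any are, the same grep call can be run without -q. The following is a minimal sketch under a few assumptions: bash 4 or later (for ${var,,}), and tr in place of sed to split on underscores. The function name missing_words is made up for illustration and is not part of the answer above.

```bash
#!/bin/bash
# missing_words HAYSTACK NEEDLES   (hypothetical helper name)
# Prints every underscore-separated word of NEEDLES that does not occur
# as a whole word in HAYSTACK, ignoring case. Prints nothing if all occur.
missing_words() {
    local haystack=${1,,} needles=${2,,}
    # tr turns each underscore into a newline, giving one word per line.
    # The first process substitution supplies the patterns (-f), the
    # second supplies the lines to filter; -v keeps the non-matching ones.
    grep -vwFf <(tr '_' '\n' <<<"$haystack") <(tr '_' '\n' <<<"$needles")
}

missing_words "This_is_a_big_sentence" "Sentence_This_big"
missing_words "This_is_a_big_sentence" "Sentence_This_small_big"
```

The first call prints nothing because every word is present; the second prints the one word that str1 lacks. As with the original command, grep exits with status 1 when no line is left over, which is why the answer negates it in the if statement.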

Answer 2

Here is one way.

if [ "$(echo "This_BIG_senTence" | grep -ioE 'this|big|sentence' | wc -l)" == "3" ]; then echo "matched"; fi

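How it works: the grep option -i makes grep case-insensitive, -E enables extended regular expressions, and -o prints each match on its own line. With the matches separated by line, wc -l counts the lines. Since we have 3 conditions, we check whether the count equals 3. Grep returns the matches it finds, so for a single input string the example above prints one line per condition, 3 in this case, so there won't be any problems.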
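
One caveat with counting lines: an input such as "this_THIS_this" also yields 3 matches, even though only one of the three words occurs. A possible variation, sketched here as an assumption rather than as part of the answer, lowercases the matches and counts each distinct word once. The function name count_distinct is hypothetical.

```bash
#!/bin/bash
# count_distinct STRING   (hypothetical helper name)
# Counts how many different words among this/big/sentence occur in STRING,
# ignoring case. A word that matches several times is counted once.
count_distinct() {
    grep -ioE 'this|big|sentence' <<<"$1" \
        | tr '[:upper:]' '[:lower:]' \
        | sort -u \
        | wc -l
}

# -eq compares numerically, so any padding in the wc output is harmless.
if [ "$(count_distinct "This_BIG_senTence")" -eq 3 ]; then
    echo "matched"
fi
if [ "$(count_distinct "this_THIS_this")" -eq 3 ]; then
    echo "matched"
else
    echo "not matched"
fi
```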

Note that you can also build a chain of greps and check whether the result is empty.

if [ -n "$(echo "This_BIG_SenTence" | grep -i this | grep -i big | grep -i sentence)" ]; then echo matched; else echo not_matched; fi

Answer 3

Here is an awk solution that checks whether all the words of one string are present in another string:

str1="This_is_a_big_sentence"
str2="Sentence_This_big"

awk -v RS=_ 'FNR==NR{a[tolower($1)]; next} {delete a[tolower($1)]} END{print (length(a)) ? "Not all words" : "All words"}' <(echo "$str2") <(echo "$str1")

With indentation:

awk -v RS=_ 'FNR==NR {
   a[tolower($1)];
   next
}
{ delete a[tolower($1)] }
END {
   print (length(a)) ? "Not all words" : "All words"
}' <(echo "$str2") <(echo "$str1")

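Explanation:

-v RS=_ - use _ as the record separator
FNR==NR - true while reading the first input, str2
a[tolower($1)]; next - populate array a with each lowercased word as a key
{delete a[tolower($1)]} - for each word in str1, delete that key from array a
END - if the length of array a is still not 0, some words are left over.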
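
The same store-then-delete logic can also be written without awk. The sketch below assumes bash 4 or later (associative arrays and ${var,,}); the function name all_words_present and the array name pending are made up for illustration.

```bash
#!/bin/bash
# all_words_present HAYSTACK NEEDLES   (hypothetical helper name)
# Mirrors the awk logic: store each word of NEEDLES as a key, delete the
# keys that also occur in HAYSTACK, and succeed only if no key is left.
all_words_present() {
    local -A pending=()
    local -a words
    local word
    # Split the lowercased NEEDLES on underscores and record each word.
    IFS=_ read -ra words <<<"${2,,}"
    for word in "${words[@]}"; do
        [[ -n $word ]] && pending[$word]=1
    done
    # Split the lowercased HAYSTACK and remove every word found there.
    IFS=_ read -ra words <<<"${1,,}"
    for word in "${words[@]}"; do
        [[ -n $word ]] && unset "pending[$word]"
    done
    # Exit status 0 only when nothing is left pending.
    (( ${#pending[@]} == 0 ))
}

if all_words_present "This_is_a_big_sentence" "Sentence_This_big"; then
    echo "All words"
else
    echo "Not all words"
fi
```

Because the function reports its result through the exit status, it can be used directly in an if statement, much like grep -q.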

Answer 4

Here is another solution:

#!/bin/bash
str1="This_is_a_big_sentence"
str2="sentence_This_big"
var=0
var2=0

while read in
do
        if echo "$str1" | grep -qi -- "$in"
        then
                var=$((var+1))
        fi
        var2=$((var2+1))
done < <(echo "$str2" | sed -e 's/\(.*\)/\L\1/' -e 's/_/\n/g')

if [[ $var -eq $var2 && $var -ne 0 ]]
then
        echo "matched"
else
        echo "not matched"
fi

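What this script does: it lowercases all of str2 with sed -e 's/\(.*\)/\L\1/', which replaces every character with its lowercase form, and then replaces each underscore _ with a newline \n using the sed expression sed -e 's/_/\n/g', which is another substitution.

The individual words are then fed into the while loop, which checks str1 against each word read. On every match we increment var, and on every iteration of the while loop we increment var2. If var == var2, all the words of str2 were found in str1. Hope this helps.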

Answer 5

Now I understand what you mean. Try this:

#!/bin/bash

# add 4 non-matching examples
> snooze.foo_bar
> snooze.bar_go
> snooze.go_foo
> snooze.no_match

# add 3 matching examples
> snooze.foo_bar_go
> snooze.goXX_XXfoo_XXbarXX
> snooze.bar_go_foo_Ok

str1=("foo" "bar" "go")
for i in `ls snooze.*`; do
    str2=${i#snooze.}
    j=0
    found=1
    while [[ $j -lt ${#str1[@]} ]]; do
       if ! echo $str2 | eval grep ${str1[$j]} >& /dev/null; then
           found=0
           break
       fi
       ((j++))
    done
    if [[ $found -ne 0 ]]; then
        echo Match found: $str2
    fi
done

This script prints:

Match found: bar_go_foo_Ok
Match found: foo_bar_go
Match found: goXX_XXfoo_XXbarXX

Alternatively, the if..grep line above can be replaced with

if [[ ! $str2 =~  `eval echo ${str1[$j]}` ]]; then

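which uses bash's regular expression matching.

Note: I did not pay much attention to special characters in the search strings, such as "\" or " " (space), which may cause problems.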

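--- Some explanation ---

In the if .. grep line, $j is first evaluated to the running index, which goes from 0 to the number of elements in $str1 minus 1. Then eval re-evaluates the whole grep command, causing ${str1[jjj]} to be evaluated again (here, jjj is the already evaluated index).

The strategy is to set found=1 (found by default); then, when any grep fails, we set found to 0 and break out of the inner j loop.

Everything else should be straightforward.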
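
The same fail-fast strategy can be sketched without eval or grep, using bash's own pattern matching. This is an illustrative variation, not the answer's code: the function name contains_all is hypothetical, and the sample names are passed in directly instead of being read from snooze.* files. Like the grep version, it matches substrings case-sensitively.

```bash
#!/bin/bash
# contains_all NAME WORD...   (hypothetical helper name)
# Succeeds if every WORD occurs somewhere in NAME as a literal substring.
contains_all() {
    local name=$1 word
    shift
    for word in "$@"; do
        # The quoted "$word" is matched literally, so characters such as
        # * or \ inside it are not treated as pattern syntax.
        # Return at the first missing word, like found=0 plus break.
        [[ $name == *"$word"* ]] || return 1
    done
    return 0
}

words=("foo" "bar" "go")
for name in foo_bar foo_bar_go goXX_XXfoo_XXbarXX no_match; do
    if contains_all "$name" "${words[@]}"; then
        echo "Match found: $name"
    fi
done
```

Quoting the word inside [[ ... ]] is what sidesteps the special-character concern mentioned in the note above, at least for this substring test.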
