Grep the nth string from a very large file in constant time (independent of file size)?

Question

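Is there a grep-like tool (sed/awk) on Linux that can find the nth occurrence of a string (regex) in a very large file? I would also like to find the number of occurrences of the search string within the file. Keep in mind that the file is really large (> 2 GB).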

Answer 1

I would like to find the number of occurrences of the search string within the file

If the search string cannot contain whitespace, the following may be enough:

awk -v RS=" " '/string/{i++} END{print "string count : " (i + 0)}' file

However, its speed depends on the RAM available on the system.
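Note that the one-liner above counts at most one match per space-separated record, and a newline does not end a record, so two hits separated only by a line break are counted once. As a minimal sketch of an alternative (not part of the original answer), the following counts every match line by line with awk's `gsub`, which returns the number of substitutions it made. The temporary file, the variable names, and the pattern `one` are assumptions chosen for illustration.

```shell
# Hypothetical sample input, created only for this illustration.
file=$(mktemp)
printf '%s\n' 'one two one' 'two' 'one' 'two two' 'two one' > "$file"

# gsub() returns how many matches it replaced on the current line;
# replacing each match with itself ("&") leaves the text unchanged.
# Adding 0 in END prints 0 instead of an empty string when nothing matched.
count=$(awk '{ n += gsub(/one/, "&") } END { print n + 0 }' "$file")
echo "string count : $count"

rm -f "$file"
```

Because awk reads one line at a time here, memory use depends on the longest line rather than on the size of the file.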

Answer 2

Grep solution:

grep -on regexp < file.txt

For example, given a test.txt containing:

one two one

two

one

two two

two one

grep -on one < test.txt

1:one

1:one

3:one

5:one

grep -on one < test.txt | wc -l

4

grep -m1 one < test.txt | tail -n1

one two one

Update: The solution no longer uses cat. Thanks to @tripleee for the tip.
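To pick out the nth occurrence itself, rather than the nth matching line, one option is to take the nth line of the `grep -on` output and stop there. This is a sketch under assumptions, not the answer's own method: the variable `n` and the temporary file are introduced here for illustration, and the file reproduces the test.txt contents shown above.

```shell
# Temporary copy of the sample input (same contents as test.txt above).
file=$(mktemp)
printf '%s\n' 'one two one' 'two' 'one' 'two two' 'two one' > "$file"

n=3   # which occurrence to report (illustrative value)

# grep -on prints one "line:match" pair per occurrence;
# sed prints only the nth pair and then quits, closing the pipe.
nth=$(grep -on 'one' < "$file" | sed -n "${n}{p;q;}")
echo "$nth"

rm -f "$file"
```

With the sample file this prints `3:one`, meaning the third occurrence is on line 3. Since sed quits after the nth pair, grep usually does not need to scan the rest of a large file, but the cost still grows with how far into the file that occurrence sits, so this is not constant time.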