Getting text between two words in a text file

Question

For brevity, I have a text file that looks like this (on Windows):

Blah Blah Blah Blah
Blah Blah Blah 2016
START-OF-FILE
ABC
ABCDE Blah Blah Blah
Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah
END-OF-FILE
Blah Blah Blah
Blah Blah Blah

I want only the text between START-OF-FILE and END-OF-FILE:

ABC
ABCDE Blah Blah Blah
Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah

I tried using findstr, but it did not work well. Can anyone help?

This is what I have so far:

@echo off
setlocal enabledelayedexpansion

set quote=

for /f "tokens=*" %%a in (infile.txt) do (
  set str=%%a
  set str=!str:"=:!

  if not "!str!"=="!str::=!" (
    if defined quote (
      set quote=
      for %%b in (^"%%a) do set str=%%~b
      if not "!str!"==START-OF-FILE if not "!str: =!"==END-OF-FILE echo !str! >> outfile.txt
    ) else (
      set quote=1
      for %%b in (%%a^") do set str=%%~b
    )
  )

  if defined quote (
    if not "!str!"==START-OF-FILE if not "!str: =!"==END-OF-FILE echo !str! >> outfile.txt
  )
)

This is the result:

2016" 
START-OF-FILE 
ABC
ABCDE Blah Blah Blah
Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah
END-OF-FILE
Blah Blah Blah

I need the output without 2016", START-OF-FILE, END-OF-FILE, and the line after END-OF-FILE (Blah Blah Blah).

Answer 1

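You can use

List<String> lines = Files.readAllLines(Paths.get("myfile.txt"));

to get all lines of the file as a list (readAllLines returns a List<String>, not an array). From there it is easy to iterate over them and pick out what you want.

// Needs java.nio.file.Files, java.nio.file.Paths and java.util.List
StringBuilder result = new StringBuilder();
boolean withinBounds = false;
for (String line : lines) {
 // Check the end marker first so the marker line itself is not collected
 if (line.equals("END-OF-FILE")) {
  withinBounds = false;
 }
 if (withinBounds) {
  // do whatever you want to do with the lines between your markers here
  result.append(line).append("\n");
 }
 // Check the start marker last so collecting begins on the following line
 if (line.equals("START-OF-FILE")) {
  withinBounds = true;
 }
}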

Note that this is untested, but the general idea should work for you. It also assumes that each marker sits on a line of its own.
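
For reference, here is a self-contained sketch of the same flag-based loop that compiles and runs on its own. The class name BetweenMarkers, the method extract, and the sample data are illustrative assumptions, not part of the answer above. Trimming each line before comparing it with a marker is also an assumption, made so that stray trailing spaces on a marker line do not break the match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; all names here are illustrative only.
class BetweenMarkers {

    // Returns the lines strictly between the first `start` line and the
    // next `end` line. If `end` never appears, everything after `start`
    // is returned; if `start` never appears, the result is empty.
    static List<String> extract(List<String> lines, String start, String end) {
        List<String> result = new ArrayList<>();
        boolean withinBounds = false;
        for (String line : lines) {
            String trimmed = line.trim(); // tolerate surrounding whitespace
            if (withinBounds && trimmed.equals(end)) {
                break; // closing marker reached: stop reading
            }
            if (withinBounds) {
                result.add(line);
            } else if (trimmed.equals(start)) {
                withinBounds = true; // start collecting from the next line
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "Blah Blah Blah 2016",
                "START-OF-FILE",
                "ABC",
                "ABCDE Blah Blah Blah",
                "END-OF-FILE",
                "Blah Blah Blah");
        // Prints only the lines between the two markers.
        extract(sample, "START-OF-FILE", "END-OF-FILE")
                .forEach(System.out::println);
    }
}
```

To run it against a real file, the sample list could be replaced with Files.readAllLines(Paths.get("infile.txt")), where infile.txt stands for whatever your input file is called.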

Answer 2

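Using Windows PowerShell

If you know the start and end line numbers, this is a two-step process. The first line cuts off the top, and the second line cuts off the bottom. In each command, n is a placeholder for the number of lines to keep.

Get-Content file.txt | select -last n > output.txt

Get-Content output.txt | select -first n > output2.txt

If you do not know where the start and end are, you need this extra step, run once for each marker...

type file.txt | Select-String -Pattern "START-OF-FILE" | Select-Object LineNumber

type file.txt | Select-String -Pattern "END-OF-FILE" | Select-Object LineNumber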
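
The line-number idea (locate both markers first, then slice out the lines between them) can also be sketched in Java, the language used in Answer 1. This is only an illustration under stated assumptions: the class LineNumberSlice and its methods are hypothetical names, and markers are matched as whole trimmed lines, whereas Select-String matches a pattern anywhere in a line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; all names here are illustrative only.
class LineNumberSlice {

    // Returns the 0-based index of the first line equal to `marker`
    // at or after position `from`, or -1 if there is none.
    static int indexOfLine(List<String> lines, String marker, int from) {
        for (int i = from; i < lines.size(); i++) {
            if (lines.get(i).trim().equals(marker)) {
                return i;
            }
        }
        return -1;
    }

    // Step 1: find both marker positions. Step 2: copy the slice between them.
    static List<String> between(List<String> lines, String start, String end) {
        int startIndex = indexOfLine(lines, start, 0);
        if (startIndex < 0) {
            return Collections.emptyList(); // no start marker
        }
        int endIndex = indexOfLine(lines, end, startIndex + 1);
        if (endIndex < 0) {
            return Collections.emptyList(); // no end marker after the start
        }
        // subList excludes endIndex, so neither marker line is copied.
        return new ArrayList<>(lines.subList(startIndex + 1, endIndex));
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "Blah Blah Blah 2016",
                "START-OF-FILE",
                "ABC",
                "ABCDE Blah Blah Blah",
                "END-OF-FILE",
                "Blah Blah Blah");
        between(sample, "START-OF-FILE", "END-OF-FILE")
                .forEach(System.out::println);
    }
}
```

Unlike the two-step PowerShell commands, this sketch needs no intermediate output file, because the slice is taken in memory.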

Answer 3

@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q36416492.txt"
SET "outfile=%destdir%\outfile.txt"
SET "output="
(
FOR /f "usebackqdelims=" %%a IN ("%filename1%") DO (
 IF "%%a"=="END-OF-FILE" SET "output="
 IF DEFINED output ECHO(%%a
 IF "%%a"=="START-OF-FILE" SET "output=Y"
)
)>"%outfile%"

GOTO :EOF

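You would need to change the settings of sourcedir and destdir to suit your circumstances.

I used a file named q36416492.txt containing your data for my testing.

Produces the file defined as %outfile%.

This relies on the fact that if defined interprets the run-time value of the variable.

Read each line of the file. If the line matches the ON-trigger string (START-OF-FILE), set output to a value; if it matches the OFF-trigger string (END-OF-FILE), clear it. A line is echoed only while the flag output is defined. Because the OFF test comes before the echo and the ON test comes after it, neither marker line is written to the output.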

Answer 4

@echo off
setlocal EnableDelayedExpansion

set "skip="
for /F "delims=:" %%a in ('findstr /N "START-OF-FILE END-OF-FILE" input.txt') do (
   if not defined skip (
      set "skip=%%a"
   ) else (
      set /A "lines=%%a-skip-1"
   )
)
(for /F "skip=%skip% delims=" %%a in (input.txt) do (
   echo %%a
   set /A lines-=1
   if !lines! equ 0 goto break
)) > output.txt
:break

windows

batch-file

findstr