Double if conditional in the line.startswith strategy

Question

I have a file, data.dat, in the following format:

REAL PART 

FREQ     1.6     5.4     2.1    13.15    13.15    17.71
FREQ     51.64   51.64   82.11  133.15   133.15   167.71

.
.
.

IMAGINARY PART 

FREQ     51.64    51.64     82.12   132.15    129.15    161.71
FREQ     5.64     51.64     83.09   131.15    120.15    160.7

.
.
.

REAL PART 

FREQ     1.6     5.4     2.1    13.15    15.15    17.71
FREQ     51.64   57.64   82.11  183.15   133.15   167.71

.
.
.

IMAGINARY PART 

FREQ     53.64    53.64     81.12   132.15    129.15    161.71
FREQ     5.64     55.64     83.09   131.15    120.15    160.7

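The REAL and IMAGINARY blocks are repeated throughout the whole document.

Within the REAL PART blocks, I would like to split each line that starts with FREQ.

I have managed to:

1) split the lines and extract the values of FREQ,

2) append this result to a list of lists, and

3) create a final list, All_frequencies: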

import itertools

FREQ = []
fname = 'data.dat'
f = open(fname, 'r')
for line in f:
    if line.startswith(' FREQ'):
        FREQS = line.split()
        FREQ.append(FREQS)

print('Final FREQ = ', FREQ)
All_frequencies = list(itertools.chain.from_iterable(FREQ))
print('All_frequencies = ', All_frequencies)

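The problem with this code is that it also extracts the FREQ values from the IMAGINARY PART blocks. Only the FREQ values from the REAL PART blocks should be extracted.

I have tried something like: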

if line.startswith('REAL PART'):
   if line.startswith('IMAGINARY PART'):
      code...

or:

if line.startswith(' REAL') and line.startswith(' FREQ'):
   code...

But this does not work. I would appreciate it if you could help me.

Answer 1

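We start with a flag set to False. When we find a line containing "REAL", we set the flag to True and start copying the data below the REAL header. This continues until we find a line containing "IMAGINARY", which sets the flag back to False, so the following lines are skipped until another "REAL" is found (and the flag becomes True again).

Using the flag concept in a simple way: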

with open('this.txt', 'r') as content:
    my_lines = content.readlines()

my_real_flag = False
with open('another.txt', 'w') as f:
    for line in my_lines:
        if "REAL" in line:
            my_real_flag = True
        elif "IMAGINARY" in line:
            my_real_flag = False
        if my_real_flag:
            # do code here because we found real frequencies
            f.write(line)
        # otherwise my_real_flag is False, so the line is skipped

this.txt looks like this:

REAL
1
2
3
IMAGINARY
4
5
6
REAL
1
2
3
IMAGINARY
4
5
6

another.txt ends up looking like this:

REAL
1
2
3
REAL
1
2
3
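
The same flag idea could be applied to the data.dat layout from the question to collect the numbers instead of copying lines. This is only a sketch, assuming Python 3, that block headers begin with REAL or IMAGINARY once surrounding whitespace is stripped, and that data rows begin with FREQ. The function name real_freq_values and the inline sample are made up for illustration:

```python
import io

def real_freq_values(lines):
    """Collect the numeric fields of FREQ rows found inside REAL blocks."""
    values = []
    in_real = False  # the flag: True only while inside a REAL block
    for line in lines:
        text = line.strip()  # tolerate leading/trailing whitespace
        if text.startswith("REAL"):
            in_real = True
        elif text.startswith("IMAGINARY"):
            in_real = False
        elif in_real and text.startswith("FREQ"):
            # drop the FREQ label and keep the numbers as floats
            values.extend(float(tok) for tok in text.split()[1:])
    return values

# Hypothetical sample in the same layout as data.dat
sample = io.StringIO(
    "REAL PART\n"
    "FREQ 1.6 5.4\n"
    "IMAGINARY PART\n"
    "FREQ 51.64 82.12\n"
    "REAL PART\n"
    "FREQ 2.1 13.15\n"
)
result = real_freq_values(sample)
print(result)
```

With a real file, the open file object can be passed in place of the sample, since the function only needs an iterable of lines.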

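Original answer, which only works when there is a single REAL part

If the file is "small" enough to be read as one whole string and there is only one instance of "IMAGINARY PART", you can do this: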

file_str = file_str.split("IMAGINARY PART")[0]

This gives you everything above the "IMAGINARY PART" line.

You can then apply the rest of your code to this file_str, a string that contains only the real part.

In more detail, file_str is a str obtained as follows:

with open('data.dat', 'r') as my_data:
    file_str = my_data.read()

The "with" block is referenced all over Stack Exchange, so there are probably better explanations than mine. Intuitively, I think of it as

"open a file named 'data.dat' with the ability to only read it and name it as the variable my_data. Once it's opened, read the entirety of the file into a str, file_str, using my_data.read(), then close 'data.dat'"

Now that you have a str, you can apply any applicable str function to it.
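
To make the single-block case concrete, here is a rough sketch in Python 3. The file contents are a hypothetical stand-in for what my_data.read() would return, and the names real_str and real_freqs are invented for the example:

```python
# Hypothetical file contents: one REAL block followed by one IMAGINARY block
file_str = (
    "REAL PART\n"
    "\n"
    "FREQ 1.6 5.4 2.1\n"
    "FREQ 51.64 51.64 82.11\n"
    "\n"
    "IMAGINARY PART\n"
    "\n"
    "FREQ 51.64 51.64 82.12\n"
)

# Keep only the text above the first "IMAGINARY PART" marker
real_str = file_str.split("IMAGINARY PART")[0]

# Ordinary str methods now apply: walk the lines and keep the FREQ rows
real_freqs = []
for line in real_str.splitlines():
    if line.strip().startswith("FREQ"):
        real_freqs.extend(line.split()[1:])  # skip the FREQ label

print(real_freqs)
```

Because split(...)[0] keeps only the text before the first marker, any later REAL block would be lost, which is why this only suits files with a single REAL part.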

If "IMAGINARY PART" occurs frequently throughout the file, or the file is too large, Tadgh's suggestion of ~~a flag~~ a break works well.

for line in f:
    if "IMAGINARY PART" not in line:
        pass  # do stuff
    else:
        f.close()
        break

Answer 2

You need to keep track of which section you are looking at, so you can use a flag to do this:

section = None #will change to either "real" or "imag"
for line in f:
    if line.startswith("IMAGINARY PART"):
        section = "imag"
    elif line.startswith('REAL PART'):
        section = "real"
    else:
        freqs = line.split()
        if section == "real":
            FREQ.append(freqs)
        #elif section == "imag":
        #    IMAG_FREQ.append(freqs)

By the way, you could consider extending FREQ instead of appending to it and then needing itertools.chain.from_iterable.
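
The difference between the two can be seen in a small sketch. The rows list below is a hypothetical stand-in for the results of line.split():

```python
import itertools

# Hypothetical split lines, as line.split() would return them
rows = [["FREQ", "1.6", "5.4"], ["FREQ", "51.64", "82.11"]]

# append builds a list of lists, which then has to be flattened
nested = []
for row in rows:
    nested.append(row)
flattened = list(itertools.chain.from_iterable(nested))

# extend adds the items of each row directly, so the list is flat from the start
flat = []
for row in rows:
    flat.extend(row)

print(flat == flattened)
```

Both routes end with the same flat list; extend just skips the intermediate list of lists.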

Answer 3

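Think of it as a state machine with two states. When the program reads a line with REAL at the beginning, it enters the REAL state and aggregates values. When it reads a line with IMAGINARY, it enters the alternate state and ignores values.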

REAL, IMAGINARY = 1,2

FREQ = []
fname = 'data.dat'
f = open(fname)
state = None
for line in f:
    line = line.strip()
    if not line: continue
    if line.startswith('REAL'):
        state = REAL
        continue
    elif line.startswith('IMAGINARY'):
        state = IMAGINARY
        continue
    else:
        pass
    if state == IMAGINARY:
        continue
    freqs = line.split()[1:]
    FREQ.extend(freqs)

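I'm assuming you only want the numerical values; hence the [1:] at the end of the assignment to freqs near the end of the script.

Using your data file without the ellipses, this produces the following result in FREQ: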

['1.6', '5.4', '2.1', '13.15', '13.15', '17.71', '51.64', '51.64', '82.11', '133.15', '133.15', '167.71', '1.6', '5.4', '2.1', '13.15', '15.15', '17.71', '51.64', '57.64', '82.11', '183.15', '133.15', '167.71']
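
If numbers rather than strings are needed, the entries could be converted afterwards. A small sketch, assuming every entry is a valid decimal literal; the short list here stands in for the full FREQ result, and freq_values is a name chosen for the example:

```python
FREQ = ['1.6', '5.4', '2.1', '13.15']  # first few entries of the result above

# float() raises ValueError if an entry is not numeric
freq_values = [float(s) for s in FREQ]

print(freq_values)
print(max(freq_values))
```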

Answer 4

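Based on the sample data in the question, lines starting with 'REAL' or 'IMAGINARY' don't carry any data; they just mark the beginning of a block. If that's the case (and you don't change the question again), you only need to keep track of which block you are in. You can also use yield instead of building up an ever larger list of frequencies, as long as this code is in a function.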

import itertools

def read_real_parts(fname):
    real_part = False
    with open(fname, 'r') as f:  # closes the file once the generator is exhausted
        for line in f:
            if line.startswith(' REAL'):
                real_part = True
            elif line.startswith(' IMAGINARY'):
                real_part = False
            elif line.startswith(' FREQ') and real_part:
                FREQS = line.split()
                yield FREQS

FREQ = read_real_parts('data.dat') #this gives you a generator
All_frequencies = list(itertools.chain.from_iterable(FREQ)) #then convert to list
