Python string slice does not work as expected
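I want to read from either a file or standard input, so I wrote the code below. When Debug is true it reads from a file via fileinput.input(filename); otherwise it reads via fileinput.input(). The strange thing is that when I print 'line', it looks fine in the output file. But when I index into the line, or inspect it at a breakpoint in the debugger, it looks like "疖풵UBWEWUB", whereas the correct input (the content of the input file) is "WUBWEWUB".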
import os
import sys
import psutil
import fileinput

if __debug__:
    Debug = True
    import defs
else: Debug = False

if(Debug == True):
    def memory():
        pass
    inputFilePath = os.path.join(defs.IO_Dir, "input.txt")
    inf = open(inputFilePath, "r")
    outf = open(os.path.join(defs.IO_Dir, "output.txt"), "w")
    logf = open(os.path.join(defs.IO_Dir, "log.txt"), "a")
    logf.write(f"Program started at : {gettime()}\n")
    def write(str):
        print(str, file = outf)
        pass
    inval = inputFilePath
    sys.stdout = outf
    pass
else:
    inval = None
    pass

if(Debug): print("Start------------------------")
for x in fileinput.input(inval):
    line = x.strip()
    if(line == "Exit"):
        break
    ans = ""
    i = 0
    l = 3
    print(type(line))
    print(type(line[0]))
    for i in range(2, len(line) - 3):
        if(line[i] == 'W' and line[i+1] == 'U' and line[i+2] == 'B'):
            if(i + 3 == len(line)):
                break
                pass
            l = i + 3
            if(ans[-1] != ' '):
                ans = ans + " "
                pass
        elif(i >= l):
            ans = ans + line[i]
            pass
        pass
    pass
print(ans)
print(line)
if(Debug): print("End------------------------")
Output file:
Start------------------------
<class 'str'>
<class 'str'>
BWE
WUBWEWUB
End------------------------
I thought it was an encoding problem, but the type of the line is just 'str', and I can't figure out why the strange characters appear.
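Seeing <class 'str'> does not by itself rule out an encoding problem: every decoded line is a str, whichever codec produced it. Below is a minimal sketch, assuming the file is UTF-8 with a leading BOM. The sample bytes and the names raw, as_utf8, as_latin1 and as_sig are made up for illustration, and latin-1 is only a stand-in for whatever locale codec open() picked on the asker's machine. It shows how the same bytes decode under different codecs, and why mis-decoded text can still look correct once it is written back to a file with the same codec.

```python
import codecs

# Hypothetical file contents: a UTF-8 BOM followed by the expected text.
raw = codecs.BOM_UTF8 + b"WUBWEWUB"

as_utf8 = raw.decode("utf-8")      # BOM survives as the single character U+FEFF
as_latin1 = raw.decode("latin-1")  # each of the three BOM bytes becomes its own character
as_sig = raw.decode("utf-8-sig")   # leading BOM is stripped

for text in (as_utf8, as_latin1, as_sig):
    # All three are str; ascii() makes invisible or unexpected characters visible.
    print(type(text).__name__, len(text), ascii(text))

# Re-encoding the mis-decoded text with the same codec restores the original bytes,
# so a file written this way looks fine even though the str in memory does not.
round_trip = as_latin1.encode("latin-1")
print(round_trip == raw)
```

Printing ascii(line) or [hex(ord(c)) for c in line[:3]] in the original loop would be a quick way to check whether something similar is happening there.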
The text file had a BOM header, and it was included in the slices (line[0], line[1]). I'm using Visual Studio, and I removed the BOM by re-saving the file as "UTF-8 without signature".
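If re-saving the file is not an option, another approach is to have Python strip the BOM while decoding. The sketch below assumes the input is UTF-8, with or without a BOM. It uses a throwaway temporary file as a stand-in for input.txt, and the names path and first are illustrative only. fileinput.hook_encoded("utf-8-sig") makes fileinput open each file with that codec, which drops a leading BOM if one is present.

```python
import codecs
import fileinput
import os
import tempfile

# Create a throwaway file that starts with a UTF-8 BOM (stand-in for input.txt).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(codecs.BOM_UTF8 + b"WUBWEWUB\n")
    path = tmp.name

try:
    # The hook tells fileinput which codec to use when it opens each file.
    hook = fileinput.hook_encoded("utf-8-sig")
    with fileinput.input(path, openhook=hook) as lines:
        first = next(iter(lines)).strip()
finally:
    os.remove(path)

# The first character is now the real first letter, not part of the BOM.
print(ascii(first), first[0])
```

As far as I know, the hook only applies to files that fileinput opens itself, not to standard input, so the stdin branch of the original script would be unaffected. On Python 3.10 and later, fileinput.input() also accepts an encoding argument directly.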