(Python 3) How to pass a binary file as text without saving it first
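Or perhaps a better title: how to avoid unwanted extra carriage returns when passing binary-read data to a text-mode write.
Python 3.6, Windows. The input file first needs a binary search/replace, then a regex search/replace.
I first open the input file in binary mode, do the work, and save the result in binary mode to a temporary file. Then I open that file in text mode, do the regex search/replace, and save it in text mode (under a name resembling the input file's name).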
import re

def fixbin(infile):
    with open(infile, 'rb') as f:
        file = f.read()
    # a few bytearray operations here, then:
    with open('bin.tmp', 'wb') as f:
        f.write(file)

def fix4801(fname, ext):
    outfile = '{}_OK{}'.format(fname, ext)
    with open('bin.tmp', encoding='utf-8-sig', mode='r') as f, \
            open(outfile, encoding='utf-8-sig', mode='w') as g:
        infile = f.read()
        x = re.sub(r'(\n4801.+\n)4801', r' ', infile)
        g.write(x)

infile, fname, ext = get_infile()  # function get_infile not shown for brevity
fixbin(infile)
fix4801(fname, ext)
It works, but it's ugly. I would rather pass the output straight to the next step, like this:
import re

def fixbin(infile):
    with open(infile, 'rb') as f:
        file = f.read()
    # a few bytearray operations here, and then
    return file.decode('utf-8')

def fix4801(infile):
    x = re.sub(r'(\n4801.+\n)4801', r' ', infile)
    return x

...
temp = fixbin(infile)
result = fix4801(temp)
outfile = '{}_OK{}'.format(fname, ext)
with open(outfile, encoding='utf-8-sig', mode='w') as g:
    g.write(result)
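But then the output file (on Windows) ends up with unwanted extra carriage returns. The symptom is described here, but the cause is different: I am not using os.linesep. In other words, there is no os.linesep in my code. (There may be in the underlying libraries; I haven't checked.)
What am I doing wrong?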
Python » Documentation : open
open(file, mode='r', buffering=-1, encoding=None, errors=None,
newline=None, closefd=True, opener=None)
The default is newline=None. If newline is '' or '\n', no translation takes place.
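The newline parameter also matters on the reading side, which may be why the temp-file version behaved differently. The sketch below assumes the input uses Windows (CRLF) line endings; the file name sample.txt and its contents are made up for illustration. Reading in text mode with the default newline=None turns '\r\n' into '\n', while reading bytes and decoding them keeps the '\r' in the string:

```python
import os
import tempfile

# Hypothetical sample data with Windows (CRLF) line endings.
raw = b"4801 first\r\n4801 second\r\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "wb") as f:
        f.write(raw)

    # Binary read + decode: the carriage returns survive in the string.
    with open(path, "rb") as f:
        decoded = f.read().decode("utf-8")

    # Text-mode read with the default newline=None: universal newlines
    # mode translates '\r\n' to '\n' on the way in.
    with open(path, encoding="utf-8") as f:
        text = f.read()

print(repr(decoded))
print(repr(text))
```

So a string produced by decode() can still contain '\r\n', and a later text-mode write with the default settings on Windows would add a second '\r' in front of each '\n'.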
Try the following and see whether it makes a difference:
# change
with open(outfile, encoding='utf-8-sig', mode='w') as g:
# to
with open(outfile, encoding='utf-8-sig', mode='w', newline='') as g:
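Put together, the in-memory version from the question might look roughly like the sketch below. This is only an illustration under a few assumptions: the helper write_text is hypothetical, the binary search/replace step is left as a comment because the question does not show it, and decoding with 'utf-8-sig' (rather than 'utf-8') is my assumption to mirror how the temp file was read in the first version.

```python
import re

def fixbin(infile):
    """Read the file as bytes, patch it, and return the decoded text."""
    with open(infile, 'rb') as f:
        data = f.read()
    # (the binary search/replace steps from the question would go here)
    # Assumption: 'utf-8-sig' drops a leading BOM, if there is one,
    # as the text-mode read of the temp file did in the first version.
    return data.decode('utf-8-sig')

def fix4801(text):
    # Same pattern and replacement as in the question.
    return re.sub(r'(\n4801.+\n)4801', r' ', text)

def write_text(outfile, text):
    # Hypothetical helper. newline='' disables translation of '\n' on
    # write, so '\r\n' sequences already in the text are written unchanged.
    with open(outfile, encoding='utf-8-sig', mode='w', newline='') as g:
        g.write(text)
```

Usage would then be something like write_text(outfile, fix4801(fixbin(infile))). An alternative design is to replace '\r\n' with '\n' right after decoding and keep the default newline=None on write, which leaves the regex working on '\n'-only text.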
Question: ... there is no os.linesep in my code.
Python » Documentation : open
When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
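To make the quoted rule concrete, here is a minimal sketch. It passes newline='\r\n' explicitly to imitate what the default newline=None does on Windows (where os.linesep is '\r\n'), so the result should be the same on any platform. The file names and sample text are hypothetical.

```python
import os
import tempfile

text = "line one\r\nline two\r\n"  # a string that still contains '\r\n'

with tempfile.TemporaryDirectory() as tmp:
    translated_path = os.path.join(tmp, "translated.txt")
    verbatim_path = os.path.join(tmp, "verbatim.txt")

    # Every '\n' written is replaced by '\r\n', so '\r\n' becomes '\r\r\n'.
    with open(translated_path, "w", encoding="utf-8", newline="\r\n") as g:
        g.write(text)

    # newline='' writes the string unchanged.
    with open(verbatim_path, "w", encoding="utf-8", newline="") as g:
        g.write(text)

    with open(translated_path, "rb") as f:
        translated = f.read()
    with open(verbatim_path, "rb") as f:
        verbatim = f.read()

print(translated)  # b'line one\r\r\nline two\r\r\n'
print(verbatim)    # b'line one\r\nline two\r\n'
```

So os.linesep never has to appear in your own code: the translation happens inside the text-mode file object returned by open.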