file.write() sometimes (but not always) writes text to file

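I am using file.write() to add numeric data to a text file. However, after 516159 characters something odd happens: about half the time I run my code, the last 7k or so characters are missing from the file. The other half of the time, it works fine. Here is some code: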

#Create or open file (it strangely couldn't create the file without using mode='x')
try:
  corpus_txt = open("corpus.txt", mode = "x")
except:
  corpus_txt = open("corpus.txt", mode = "w")

corpus_txt.truncate(0)#delete contents

content_length = 0

#X_train is a 2D array of integers
for sentence in X_train:
  for word in sentence:

    corpus_txt.write(str(word)+" ")
    content_length += len(str(word)+" ")

  corpus_txt.write("\n")
  content_length += 1

corpus_txt = open("corpus.txt")
content = corpus_txt.read()
corpus_txt.close()

print("FILE LENGTH (chars):", len(content))
print("TOTAL LENGTH OF TEXT ADDED TO FILE:", content_length)

When I run this repeatedly with my data:

Some other information:

Any help/explanation is much appreciated. Thanks!

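You need to close() the file after writing to it; otherwise the data is not guaranteed to have been flushed to disk, and a subsequent open() will not "see" the writes you made. Using the context manager syntax (with open(...) as ...:) is considered best practice because it makes this kind of mistake nearly impossible: the file is closed, and therefore flushed, as soon as the block exits.

This should work: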

with open("corpus.txt", mode="w") as corpus_txt:

    # opening with "w" automatically overwrites previous contents
    content_length = 0

    #X_train is a 2D array of integers
    for sentence in X_train:
        for word in sentence:
            corpus_txt.write(str(word)+" ")
            content_length += len(str(word)+" ")
        corpus_txt.write("\n")
        content_length += 1

with open("corpus.txt") as corpus_txt:
    content = corpus_txt.read()

print("FILE LENGTH (chars):", len(content))
print("TOTAL LENGTH OF TEXT ADDED TO FILE:", content_length)
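
To see the buffering effect in isolation, here is a minimal sketch. The scratch file name demo.txt and the temporary directory are placeholders of mine, not part of the original code, and exactly how much text is readable before flush() depends on the buffer size of your Python build, so the first count is printed rather than assumed:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
data = "12345 " * 10  # 60 characters, far smaller than a typical write buffer

writer = open(path, mode="w")
writer.write(data)  # the text may still be sitting in the in-memory buffer

with open(path) as reader:
    before_flush = reader.read()  # may be empty or only part of the text

writer.flush()  # hand the buffered text over to the operating system

with open(path) as reader:
    after_flush = reader.read()  # the full text is visible now

writer.close()

print("before flush:", len(before_flush), "chars")
print("after flush: ", len(after_flush), "chars")
```

If you really need to keep the file open for further writes, calling flush() before reading it back is the alternative to closing it. The missing tail in the question is what you would expect if the last partially filled buffer had not been written out yet when the file was reopened.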

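Unrelated to the file-writing problem: I would suggest simplifying this by building content as a single string up front (it is evidently small enough to fit in memory), so that you don't need all the extra bookkeeping to work out how long it is: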

with open("corpus.txt", mode="w") as corpus_txt:
    content = "\n".join(
        " ".join(str(word) for word in sentence)
        for sentence in X_train
    ) + "\n"
    corpus_txt.write(content)
print(f"File length as written: {len(content)}")

with open("corpus.txt") as corpus_txt:
    content = corpus_txt.read()
print(f"File length as read: {len(content)}")