Use of For loop in processing directory contents in Python

Question

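I am trying to loop through a series of text files in a directory, looking for occurrences of certain types of words and prefixing each word found with a user-defined tag. My code is as follows.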

ACC_Tagged_Test = 'C:/ACC_Tag_Test'

for filename in glob.glob(os.path.join(ACC_Tagged_Test, '*.txt')):
 with open(filename) as f:
    data = f.read()
    data = data.lower()

modals = {"could":1, "would":1, "should":1, "can":1, "may":1, "might":1}
personal_attribute = {"believes":1, "guess":1, "surmise":1, "considers":1, 
"presume":1, "speculate":1, "postulate":1, "surmised":1, "assume":1}
approx_adapt = {"broadly":1, "mainly":1, "mostly":1, "loosely":1, 
"generally":1, "usually":1,"typically":1, "regularly":1, "widely":1}
plaus_shields = {"wonder":1, "suspect":1, "theorize":1, "hypothesize":1, 
"cogitate":1, "contemplate":1, "deliberate":1}

format_modal = "<555>{} ".format
format_attribute = "<666>{} ".format
format_app_adaptor = "<777>{} ".format
format_plaus_shield = "<888>{} ".format


data = " ".join(format_modal(word) if word in modals else word for word in data.split())

data = " ".join(format_attribute(word) if word in personal_attribute else word for word in data.split())

data = " ".join(format_app_adaptor(word) if word in approx_adapt else word for word in data.split())

data = " ".join(format_plaus_shield(word) if word in plaus_shields else word for word in data.split())

with open (filename, "w") as f:

 f.write(str(data))
 print(data) # This is just added in order to check on screen all files
              # Are being processed.

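My problem is that, while the code works on the last file in the directory, it does not work on the earlier files (there are ten files in this case). I have tried adding a second for loop above the file-writing statements, but that did not work at all. Can anyone explain what I am doing wrong here?

Regards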

Answer 1

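I am assuming that all of your code is supposed to be inside your for loop. You are overwriting your text file, so it looks as if only your last iteration is working: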

# this overwrites the file
with open(filename, "w") as fh:
    fh.write(str(data))

Change it to:

# this appends to the file
with open(filename, "a") as fh:
    fh.write(str(data))

This appends to your text file and does not overwrite previously added data with the data from the last loop iteration.
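The difference between the two modes can be checked with a small sketch. It assumes a throwaway temporary file, and the helper name `write_twice` is made up for illustration; only `open`, `tempfile.mkstemp`, and `os.remove` are real calls.

```python
import os
import tempfile


def write_twice(mode):
    """Write two strings to a fresh temp file, reopening it with `mode` each time."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        for text in ("first ", "second"):
            with open(path, mode) as fh:
                fh.write(text)
        with open(path) as fh:
            return fh.read()
    finally:
        os.remove(path)


overwritten = write_twice("w")  # each open truncates the file before writing
appended = write_twice("a")     # each open adds to the end of the file
print(repr(overwritten))
print(repr(appended))
```

Applied to the script in the question, where each file already contains the original text when it is reopened, "a" would leave that text in place and add the tagged version after it.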

Answer 2

My guess is that your code only handles the last file because it is not indented so that all of the relevant code sits inside the for loop.
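The effect can be reproduced without any files. This is a minimal sketch with made-up file names: a statement placed after the loop body runs once, using whatever the loop variable held on the final iteration.

```python
names = ["a.txt", "b.txt", "c.txt"]  # hypothetical file names

# Statement dedented out of the loop: it runs once, after the loop ends.
processed_outside = []
for name in names:
    current = name.upper()
processed_outside.append(current)

# Same statement indented into the loop: it runs once per name.
processed_inside = []
for name in names:
    current = name.upper()
    processed_inside.append(current)

print(processed_outside)
print(processed_inside)
```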

Try this indentation:

import glob
import os

ACC_Tagged_Test = 'C:/ACC_Tag_Test'

for filename in glob.glob(os.path.join(ACC_Tagged_Test, '*.txt')):
  with open(filename) as f:
      data = f.read()
      data = data.lower()

  modals = {"could":1, "would":1, "should":1, "can":1, "may":1, "might":1}
  personal_attribute = {"believes":1, "guess":1, "surmise":1, "considers":1, 
  "presume":1, "speculate":1, "postulate":1, "surmised":1, "assume":1}
  approx_adapt = {"broadly":1, "mainly":1, "mostly":1, "loosely":1, 
  "generally":1, "usually":1,"typically":1, "regularly":1, "widely":1}
  plaus_shields = {"wonder":1, "suspect":1, "theorize":1, "hypothesize":1, 
  "cogitate":1, "contemplate":1, "deliberate":1}

  format_modal = "<555>{} ".format
  format_attribute = "<666>{} ".format
  format_app_adaptor = "<777>{} ".format
  format_plaus_shield = "<888>{} ".format


  data = " ".join(format_modal(word) if word in modals else word for word in data.split())

  data = " ".join(format_attribute(word) if word in personal_attribute else word for word in data.split())

  data = " ".join(format_app_adaptor(word) if word in approx_adapt else word for word in data.split())

  data = " ".join(format_plaus_shield(word) if word in plaus_shields else word for word in data.split())

  with open(filename, "w") as f:
    f.write(data)
  # Printed only to confirm on screen that every file is being processed.
  print(data)
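One possible way to make the per-file work harder to mis-indent is to move it into functions. The following is only a sketch under some assumptions: the names `TAGS`, `WORD_TO_TAG`, `tag_text`, and `tag_directory` are invented here, the word lists are shortened, the four passes are merged into one lookup, and the trailing space from the format strings is dropped. The demo uses a temporary directory rather than `C:/ACC_Tag_Test`.

```python
import tempfile
from pathlib import Path

# Tag numbers come from the code above; the word lists are shortened for brevity.
TAGS = {
    "555": {"could", "would", "should", "can", "may", "might"},
    "666": {"believes", "guess", "assume"},
    "777": {"broadly", "mainly", "usually"},
    "888": {"wonder", "suspect", "theorize"},
}
# Invert into one word -> tag lookup so each word is checked only once.
WORD_TO_TAG = {word: tag for tag, words in TAGS.items() for word in words}


def tag_text(text):
    """Lower-case the text and prefix every listed word with its <tag>."""
    return " ".join(
        "<{}>{}".format(WORD_TO_TAG[w], w) if w in WORD_TO_TAG else w
        for w in text.lower().split()
    )


def tag_directory(directory):
    """Rewrite every .txt file in `directory` in place; return the paths handled."""
    handled = []
    for path in sorted(Path(directory).glob("*.txt")):
        path.write_text(tag_text(path.read_text()))
        handled.append(path)
    return handled


# Demo on three throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        Path(tmp, "doc{}.txt".format(i)).write_text("It MAY rain and I suspect wind")
    handled = tag_directory(tmp)
    results = [p.read_text() for p in handled]

print(results[0])
```

As in the original code, a word with punctuation attached (for example "may,") is not matched, because the text is split on whitespace only.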

python

for-loop

tagging