How to check whether the files in a folder and all of its subfolders contain a particular string

import os

path = '.'  # placeholder: the directory to search
match_str = ['20210624']
not_match_str = ['20210625']
for root, dirs, files in os.walk(path):
    for name in files:
        if name.endswith(".txt"):
            # TODO: keep files that contain match_str `20210624` but not not_match_str `20210625`
            pass

I know I can walk the directory tree with os.walk.

You can do this with pathlib's glob.

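import pathlib

path = pathlib.Path(".")  # placeholder: the directory to search
# Path.glob matches names in this directory only; Path.rglob would also descend into subfolders
maybe_valids = list(path.glob("*20210624*.txt"))
valids = [elem for elem in maybe_valids if "20210625" not in elem.name]
print(valids)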

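maybe_valids collects every entry whose name contains "20210624" and ends with .txt; valids keeps only those whose name does not also contain "20210625". Note that this filters on file names, not on file contents.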
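If the strings have to be looked for inside the files, and in subfolders as well, the same pathlib idea can be extended. The following is only a minimal sketch: the function name find_txt_files is hypothetical, and it assumes the files are text that can be decoded as UTF-8 (undecodable bytes are ignored).

```python
import pathlib


def find_txt_files(root, match, not_match):
    """Return .txt files under root whose text contains match but not not_match.

    Hypothetical helper for illustration; assumes UTF-8 text files.
    """
    hits = []
    # rglob walks root and every subfolder below it
    for p in pathlib.Path(root).rglob("*.txt"):
        if not p.is_file():
            continue
        text = p.read_text(encoding="utf-8", errors="ignore")
        if match in text and not_match not in text:
            hits.append(p)
    return hits
```

It could be called as find_txt_files(".", "20210624", "20210625"), which returns a list of pathlib.Path objects.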

Continuing from your code:

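if name.endswith(".txt"):
    # open() needs the full path; os.walk yields bare file names
    with open(os.path.join(root, name), mode='r') as f:
        a = f.read()  # read once; a second f.read() would return ''
    if match_str[0] in a:
        # the string is present
        print(os.path.join(root, name))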

If you have several entries in match_str, you can loop over them with a for loop. In the same way, you can use the not in operator to check not_match_str.
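One way to write that check for several strings is sketched below. The helper name text_matches is hypothetical, and it assumes that every entry of match_str must be present and no entry of not_match_str may be present; if any single match is enough, all would become any.

```python
def text_matches(text, match_strs, not_match_strs):
    """True if text contains every match string and none of the excluded ones.

    Hypothetical helper; the "all must match" rule is an assumption.
    """
    has_all = all(s in text for s in match_strs)
    has_none = not any(s in text for s in not_match_strs)
    return has_all and has_none
```

Inside the loop it would be used as: if text_matches(a, match_str, not_match_str): ...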

You can set the recursive keyword argument of glob.glob() to True so that the program searches the folder, its subfolders, and so on recursively:

from glob import glob

# raw strings keep the backslashes literal ('\U' is otherwise an invalid escape)
path = r'C:\Users\User\Desktop'
for file in glob(path + r'\**\*.txt', recursive=True):
    with open(file) as f:
        text = f.read()
        if '20210624' in text and '20210625' not in text:
            print(file)

If you do not want to print the full path of each file, only the file name, then:

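import os
from glob import glob

# raw strings keep the backslashes literal ('\U' is otherwise an invalid escape)
path = r'C:\Users\User\Desktop'
for file in glob(path + r'\**\*.txt', recursive=True):
    with open(file) as f:
        text = f.read()
        if '20210624' in text and '20210625' not in text:
            # basename strips the directory part of the path
            print(os.path.basename(file))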

To use the os.walk() method instead, you can use the str.endswith() method (as you did in your post), like this:

import os

# raw string keeps the backslashes literal
for path, _, files in os.walk(r'C:\Users\User\Desktop'):
    for file in files:
        if file.endswith('.txt'):
            with open(os.path.join(path, file)) as f:
                text = f.read()
                if '20210624' in text and '20210625' not in text:
                    print(file)

And to search only down to a maximum depth of subdirectories:

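import os

levels = 2
root = r'C:\Users\User\Desktop'  # raw string keeps the backslashes literal
total = root.count(os.sep) + levels

for path, dirs, files in os.walk(root):
    if path.count(os.sep) >= total:
        # prune in place so os.walk does not descend any deeper;
        # a break here would also skip sibling folders not yet visited
        dirs[:] = []
    for file in files:
        if file.endswith('.txt'):
            print(os.path.join(path, file))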

You can get the file names with a few simple shell commands:

find . -name "*.txt" | xargs grep -l "20210624" | xargs grep -L "20210625"