Listing the names of files that contain part of a string, from Python
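I am trying to list all files in a directory whose names contain a string that I specify. I want to change that string on each iteration of a loop. The code I am using is: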
from subprocess import Popen
from subprocess import call
species_array = ["homo_sapiens", "pan_troglodytes", "pongo_abelii", "gorilla_gorilla", "macaca_mulatta", "callithrix_jacchus", "bos_taurus", "canis_familiaris", "equus_caballus", "felis_catus", "ovis_aries", "sus_scrofa", "oryctolagus_cuniculus", "rattus_norvegicus", "mus_caroli", "mus_pahari", "mus_musculus"]
run_length = (len(species_array) - 5)
path = "/homes/varshith/maf_files/1/testmafs/HAL_Files/"
for i in range (run_length):
    s = Popen("find", path, "-name", *species_array[i+1]*)
    print s.communicate()[0]
The file names should contain species_array[i+1]. Thanks in advance.
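If you want to use find, you need to pass the arguments as a list when shell=False (the default). check_output will work for your case. You can slice the list instead of using range, and you need str.format to wrap each species/element in *: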
from subprocess import check_output
species_array = ["homo_sapiens", "pan_troglodytes", "pongo_abelii", "gorilla_gorilla", "macaca_mulatta", "callithrix_jacchus", "bos_taurus", "canis_familiaris", "equus_caballus", "felis_catus", "ovis_aries", "sus_scrofa", "oryctolagus_cuniculus", "rattus_norvegicus", "mus_caroli", "mus_pahari", "mus_musculus"]
path = "/homes/varshith/maf_files/1/testmafs/HAL_Files/"
# [1:-4] visits the same indices (1..12) as the question's range loop with i+1
for ele in species_array[1:-4]:
    s = check_output(["find", path, "-name", "*{0}*".format(ele)])
    print(s)
For Python 2.6, which lacks check_output, use Popen:
from subprocess import Popen, PIPE
species_array = ["homo_sapiens", "pan_troglodytes", "pongo_abelii", "gorilla_gorilla", "macaca_mulatta", "callithrix_jacchus", "bos_taurus", "canis_familiaris", "equus_caballus", "felis_catus", "ovis_aries", "sus_scrofa", "oryctolagus_cuniculus", "rattus_norvegicus", "mus_caroli", "mus_pahari", "mus_musculus"]
path = "/homes/varshith/maf_files/1/testmafs/HAL_Files/"
# [1:-4] visits the same indices (1..12) as the question's range loop with i+1
for ele in species_array[1:-4]:
    s = Popen(["find", path, "-name", "*{0}*".format(ele)], stdout=PIPE, stderr=PIPE)
    out, err = s.communicate()
    print(out, err)
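The reason the list form matters is that no shell is involved, so the asterisks reach find untouched and need no quoting. A minimal sketch of that idea follows; the helper names find_command and find_matching are made up for illustration and are not part of subprocess, and the decode step assumes the output is in the default encoding.

```python
from subprocess import check_output


def find_command(directory, substring):
    """Build the argument list for: find DIRECTORY -name '*SUBSTRING*'."""
    # Hypothetical helper. Because a list is passed and no shell runs,
    # the asterisks are handed to find as-is.
    return ["find", directory, "-name", "*{0}*".format(substring)]


def find_matching(directory, substring):
    """Run find and return the matching paths as a list of strings."""
    raw = check_output(find_command(directory, substring))
    # check_output returns bytes on Python 3, so decode before splitting.
    return raw.decode().splitlines()


# Example call (not executed here, since it needs find and a real directory):
# find_matching("/homes/varshith/maf_files/1/testmafs/HAL_Files/", "pan_troglodytes")
```

Keeping the command construction separate from the call makes it easy to print the argument list and check it before anything is executed.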
Your loop is all wrong. Python is more expressive than that:
1) You can skip the first element by starting the range at 1:
for i in range(1, len(species_arr) - 4):
...and then use i in the loop rather than i+1.
2) Simpler (and more idiomatic) is to use a list slice:
for species in species_arr[1:-4]:
3) You can format strings in Python with the format() method.
Here is an example that uses these concepts:
species_arr = [
    "homo_sapiens",
    "pan_troglodytes",
    "pongo_abelii",
    "gorilla_gorilla",
    "macaca_mulatta",
    "callithrix_jacchus",
    "bos_taurus",
    "canis_familiaris",
    "equus_caballus",
    "felis_catus",
    "ovis_aries",
    "sus_scrofa",
    "oryctolagus_cuniculus",
    "rattus_norvegicus",
    "mus_caroli",
    "mus_pahari",
    "mus_musculus"
]
chop_from_end = 4
for species in species_arr[1:-chop_from_end]:
    fname = "*{0}*".format(species)
    print(fname)
--output:--
*pan_troglodytes*
*pongo_abelii*
*gorilla_gorilla*
*macaca_mulatta*
*callithrix_jacchus*
*bos_taurus*
*canis_familiaris*
*equus_caballus*
*felis_catus*
*ovis_aries*
*sus_scrofa*
*oryctolagus_cuniculus*
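The format() method was introduced in Python 3.0, but it was backported to Python 2.6 (in a more limited form). If for some reason your installation doesn't have the format() method, you can use the old way: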
fname = "*%s*" % species
See more format() examples here:
https://docs.python.org/3/library/string.html#format-examples
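As a quick check of points 1) to 3), here is a small sketch showing that the index-based loop and the slice pick the same elements, and that format() and the % operator build the same pattern. The shortened list and the value of chop_from_end are illustrative only.

```python
species_arr = ["homo_sapiens", "pan_troglodytes", "pongo_abelii",
               "gorilla_gorilla", "macaca_mulatta", "callithrix_jacchus"]
chop_from_end = 2  # illustrative value; must be at least 1

# Point 1: start the range at 1 and stop chop_from_end short of the end.
by_index = [species_arr[i] for i in range(1, len(species_arr) - chop_from_end)]
# Point 2: the same selection written as a slice.
by_slice = species_arr[1:-chop_from_end]

# Point 3: the same pattern built with format() and with the older % operator.
new_style = ["*{0}*".format(s) for s in by_slice]
old_style = ["*%s*" % s for s in by_slice]

print(by_index == by_slice)    # True
print(new_style == old_style)  # True
print(new_style)               # ['*pan_troglodytes*', '*pongo_abelii*', '*gorilla_gorilla*']
```

One caveat with the slice form: if chop_from_end were 0, species_arr[1:-0] would be an empty list rather than "everything after the first element", so that case needs species_arr[1:] instead.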
4) Here is what you can do with the glob module:
import glob
import os.path
import pprint
base_dir = '/Users/7stud/python_programs/dir1'
names = ['a', 'b', 'c']
for name in names:
    fname = "*{0}*".format(name)
    path = os.path.join(base_dir, fname)
    pprint.pprint(glob.glob(path))
    print('-' * 20)
--output:--
['/Users/7stud/python_programs/dir1/__pycache__',
'/Users/7stud/python_programs/dir1/a.txt',
'/Users/7stud/python_programs/dir1/aa.txt',
'/Users/7stud/python_programs/dir1/ab.txt',
'/Users/7stud/python_programs/dir1/ba.txt']
--------------------
['/Users/7stud/python_programs/dir1/ab.txt',
'/Users/7stud/python_programs/dir1/b.txt',
'/Users/7stud/python_programs/dir1/ba.txt']
--------------------
['/Users/7stud/python_programs/dir1/__pycache__']
--------------------
Or, as a dictionary of name, matches pairs:
results = dict(
    (
        name,
        glob.iglob(os.path.join(base_dir, "*{0}*".format(name)))
    )
    for name in names
)
for name, _iter in results.items():
    print("{0}:".format(name))
    pprint.pprint(list(_iter))
--output:--
a:
['/Users/7stud/python_programs/dir1/__pycache__',
'/Users/7stud/python_programs/dir1/a.txt',
'/Users/7stud/python_programs/dir1/aa.txt',
'/Users/7stud/python_programs/dir1/ab.txt',
'/Users/7stud/python_programs/dir1/ba.txt']
c:
['/Users/7stud/python_programs/dir1/__pycache__']
b:
['/Users/7stud/python_programs/dir1/ab.txt',
'/Users/7stud/python_programs/dir1/b.txt',
'/Users/7stud/python_programs/dir1/ba.txt']
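One difference worth noting: glob.glob with a pattern like the one above only looks inside base_dir, whereas find also descends into subdirectories. If the recursive behaviour is wanted without calling find, something along these lines might work. It is only a sketch: find_by_substring is a made-up helper name, the demo tree is created in a temporary directory, and unlike find it does not test the starting directory's own name.

```python
import fnmatch
import os
import tempfile


def find_by_substring(base_dir, substring):
    """Return paths under base_dir whose file or directory name contains substring."""
    pattern = "*{0}*".format(substring)
    matches = []
    for dirpath, dirnames, filenames in os.walk(base_dir):
        # Check directory names as well as file names, roughly like find -name.
        for name in fnmatch.filter(dirnames + filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Demo on a throwaway directory tree so the sketch runs anywhere.
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "sub"))
for rel_path in ["a.txt", "b.txt", os.path.join("sub", "ba.txt")]:
    open(os.path.join(demo_dir, rel_path), "w").close()

found = [os.path.relpath(p, demo_dir) for p in find_by_substring(demo_dir, "a")]
print(found)
```

Because the substring is dropped into a wildcard pattern, characters such as [ or ? in it would be treated as pattern syntax rather than literally.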