How to extract the stems out of multiple file paths using pathlib?

I'm trying to extract the stems from multiple file paths using pathlib, but I haven't been able to get it working.

Here is the code I tried:

base_path = Path(__file__).parent
paths = (base_path / "../dictionary/files/").glob('**/*')
files = [x for x in paths if x.is_file()]
for i in range(len(files)):
     stem_name = files.stem[i]

Here is the error:

for i in range(len(files)):
TypeError: object of type 'generator' has no len()

I have text files named 1.txt, 2.txt, and 3.txt.

Expected:

1
2
3

Iterate over the list directly instead of indexing it:

for file_ in files:
    stem = file_.stem
    print(stem)

You're close.

You should index files (which is the list). Each element of the list (files[i]) is then a <class 'pathlib.PosixPath'> instance (on a POSIX system), which has a .stem attribute.

for i in range(len(files)):
    stem_name = files[i].stem

(test-py38) gino:Q$ cat test.py
from pathlib import Path

base_path = Path(__file__).parent
paths = (base_path / "./files").glob('**/*')
files = [x for x in paths if x.is_file()]
for i in range(len(files)):
    stem_name = files[i].stem
    print(stem_name)

(test-py38) gino:Q$ ls files
1.txt  2.txt  3.txt

(test-py38) gino:Q$ python test.py
2
3
1

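Note that the run above prints 2, 3, 1 rather than the expected 1, 2, 3: glob does not promise any particular order. A minimal sketch of one way to get a stable order, assuming plain lexicographic sorting of the stems is what you want (the helper name sorted_stems and the temporary directory are only for illustration; names like 10.txt would sort before 2.txt):

```python
from pathlib import Path
import tempfile


def sorted_stems(directory):
    """Return the stems of all files under `directory`, sorted as strings."""
    paths = Path(directory).glob("**/*")
    return sorted(p.stem for p in paths if p.is_file())


# Demo against a throwaway directory holding 1.txt, 2.txt and 3.txt.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("3.txt", "1.txt", "2.txt"):
        (Path(tmp) / name).touch()
    stems = sorted_stems(tmp)
    print(stems)  # ['1', '2', '3']
```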
I'm not sure about this error, though, because it can't be reproduced from the posted code:

for i in range(len(files)):
    TypeError: object of type 'generator' has no len()

This can only be reproduced if you used map to create files, or if you used a generator expression (files = (...)) instead of a list comprehension (files = [...]). In both cases, you would be calling len() on a lazy iterator, and that won't work because neither generators nor map objects support len() (with map, the message names 'map' instead of 'generator').

(test-py38) gino:Q$ cat test.py
from pathlib import Path

base_path = Path(__file__).parent
paths = (base_path / "./files").glob('**/*')
files = (x for x in paths if x.is_file())  # <---- generator expression
for i in range(len(files)):
    stem_name = files[i].stem
    print(stem_name)

(test-py38) gino:Q$ python test.py
Traceback (most recent call last):
  File "test.py", line 6, in <module>
    for i in range(len(files)):
TypeError: object of type 'generator' has no len()

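The same failure can be shown without touching the filesystem. This is only a sketch with a made-up generator of numbers; it also shows that wrapping the generator in list() makes len() and indexing work again, at the cost of consuming the generator once:

```python
numbers = (n for n in range(3))  # generator expression: lazy, has no length

try:
    len(numbers)
except TypeError as exc:
    message = str(exc)
    print(message)  # object of type 'generator' has no len()

# Materialize the generator into a list; now len() and indexing are available.
items = list(numbers)
print(len(items), items[0])  # 3 0
```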
If you need to loop through a generator, don't use indexing.

files = (x for x in paths if x.is_file())
for a_file in files:
    stem_name = a_file.stem
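
If you still want a running index while looping over a generator, enumerate provides one without needing len() or indexing. A small sketch, where the hard-coded paths are hypothetical stand-ins for the glob results:

```python
from pathlib import PurePosixPath

# Hypothetical paths standing in for what glob would yield.
paths = [
    PurePosixPath("files/1.txt"),
    PurePosixPath("files/2.txt"),
    PurePosixPath("files/3.txt"),
]
files = (p for p in paths)  # a generator, as in the example above

numbered = []
for i, a_file in enumerate(files):
    # enumerate supplies the counter; the generator is consumed lazily.
    numbered.append((i, a_file.stem))
    print(i, a_file.stem)
```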