How to get only the filename without the extension?
Suppose you have these file paths and want to get the filename without the extension from each of them:
relfilepath
0 20210322636.pdf
12 factuur-f23622.pdf
14 ingram micro.pdf
19 upfront.nl domein - Copy.pdf
21 upfront.nl domein.pdf
Name: relfilepath, dtype: object
I came up with the following, but it gives me a problem: the first item turns into a number and is output as "20210322636.0".
from pathlib import Path
for i, row in dffinalselection.iterrows():
    dffinalselection['xmlfilename'][i] = Path(dffinalselection['relfilepath'][i]).stem
dffinalselection['xmlfilename'] = dffinalselection['xmlfilename'].astype(str)
This is wrong, because it should be "20210322636".
Please help!
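If the column values are always a filename or file path, split each value from the right on `.` with the maximum number of splits (the `n` parameter) set to 1, and take the first element of the result.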
>>> df['relfilepath'].str.rsplit('.', n=1).str[0]
0 20210322636
12 factuur-f23622
14 ingram micro
19 upfront.nl domein - Copy
21 upfront.nl domein
Name: relfilepath, dtype: object
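For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame name `df` and the new column name `xmlfilename` are taken from the thread, and the sample values and index labels are copied from the question. It is meant as an illustration under those assumptions, not as the asker's real data.

```python
import pandas as pd

# Sample data copied from the question, including its non-contiguous index labels.
df = pd.DataFrame(
    {
        "relfilepath": [
            "20210322636.pdf",
            "factuur-f23622.pdf",
            "ingram micro.pdf",
            "upfront.nl domein - Copy.pdf",
            "upfront.nl domein.pdf",
        ]
    },
    index=[0, 12, 14, 19, 21],
)

# Split at most once, starting from the right, and keep the part before the last dot.
df["xmlfilename"] = df["relfilepath"].str.rsplit(".", n=1).str[0]

print(df["xmlfilename"].tolist())
```

Because the whole column is assigned in one step, every value stays a string, so the first item should not pick up a trailing ".0".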
Your approach is right, but the way you manipulate the DataFrame is not.
from pathlib import Path
for i, row in dffinalselection.iterrows():
    dffinalselection['xmlfilename'][i] = Path(dffinalselection['relfilepath'][i]).stem # THIS WILL NOT RELIABLY MUTATE THE DATAFRAME
dffinalselection['xmlfilename'] = dffinalselection['xmlfilename'].astype(str) # THIS OVERWROTE EVERYTHING
Instead, just do:
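from pathlib import Path
dffinalselection['xmlfilename'] = ''
for row in dffinalselection.itertuples():
    # itertuples() exposes the row label as `Index` (capital I); `row.index` is the tuple's own method
    dffinalselection.at[row.Index, 'xmlfilename'] = Path(row.relfilepath).stem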
Or,
dffinalselection['xmlfilename'] = dffinalselection['relfilepath'].apply(lambda value: Path(value).stem)
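One thing worth knowing when choosing between `Path(...).stem` and splitting on the last dot: the two agree on plain filenames like the ones in the question, but can differ once a path contains directories or a leading dot. The sketch below uses only the standard library. The helper names `stem_via_pathlib` and `stem_via_rsplit` are made up for illustration, and the last three sample paths are hypothetical, not from the question. `PurePosixPath` is used so that `/` is treated as the separator on every platform.

```python
from pathlib import PurePosixPath


def stem_via_pathlib(path: str) -> str:
    """Final path component without its last suffix."""
    return PurePosixPath(path).stem


def stem_via_rsplit(path: str) -> str:
    """Everything before the last dot in the whole string."""
    return path.rsplit(".", 1)[0]


samples = [
    "20210322636.pdf",               # from the question
    "upfront.nl domein - Copy.pdf",  # from the question
    "invoices/factuur-f23622.pdf",   # hypothetical: path with a directory
    ".hidden",                       # hypothetical: dotfile without an extension
    "my.dir/README",                 # hypothetical: dot only in the directory name
]

for sample in samples:
    print(repr(sample), "->", repr(stem_via_pathlib(sample)), "vs", repr(stem_via_rsplit(sample)))
```

If the column holds bare filenames, as in the question, either approach should do. If it may hold full paths, the `pathlib` version is probably the safer choice.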