Python: grab substring between two specific characters
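I have a folder containing hundreds of files named like this:
"2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"
Convention:
year_month_ID_zone_date_0_L2A_B01.tif
("_0_L2A_B01.tif" and "zone" never change)
I need to iterate over every file and build a path from its name so that I can download it.
For example:
name = "2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"
path = "2017/5/S2B_7VEG_20170528_0_L2A/B01.tif"
The path convention needs to be: path = year/month/ID_zone_date_0_L2A/B01.tif
I was thinking of a loop that "cuts" my string into several parts every time it encounters a "_" character, then stitches the parts back together in the right order to create the path name.
I tried this, but it didn't work: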
import re

filename = "2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"
try:
    found = re.search('_(.+?)_', filename).group(1)
except AttributeError:
    # _ not found in the original string
    found = ''  # apply your error handling
How can I do this in Python?
No need for a regex; you can just use split().
filename = "2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"
parts = filename.split("_")
year = parts[0]
month = parts[1]
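Building on the split() idea, here is a minimal sketch that goes all the way to the layout asked for in the question. The helper name build_path is hypothetical, and dropping the month's leading zero with int() is an assumption based on the example path "2017/5/..." in the question.

```python
def build_path(name):
    """Turn year_month_rest_band.tif into year/month/rest/band.tif."""
    # Split off only the first two fields; everything else stays in one piece.
    year, month, rest = name.split("_", 2)
    # Split once from the right to separate the final "B01.tif" part.
    stem, band = rest.rsplit("_", 1)
    # int() drops the leading zero ("05" -> 5), as in the question's example.
    return "/".join([year, str(int(month)), stem, band])


print(build_path("2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"))
```

Using maxsplit and rsplit keeps the underscores inside ID_zone_date_0_L2A untouched, so nothing has to be re-joined afterwards.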
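Since you only have one delimiter, you may as well simply use Python's built-in split function:
import os

filename = "2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"
items = filename.split('_')
year, month = items[:2]
new_filename = '_'.join(items[2:])
path = os.path.join(year, month, new_filename)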
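One thing to keep in mind: os.path.join uses the separator of the platform it runs on, so on Windows the result contains backslashes. If the path is meant for a download URL, a sketch like the following, which swaps in posixpath.join, always produces forward slashes. The function name remote_path is hypothetical, and whether the path ends up in a URL is an assumption.

```python
import posixpath


def remote_path(filename):
    """Join year, month and the rest of the name with forward slashes."""
    items = filename.split("_")
    year, month = items[:2]
    new_filename = "_".join(items[2:])
    # posixpath.join always uses "/", regardless of the operating system.
    return posixpath.join(year, month, new_filename)


print(remote_path("2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"))
```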
filename = "2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"
temp = filename.split('_')
result = "/".join(temp)
print(result)
The result is:
2017/05/S2B/7VEG/20170528/0/L2A/B01.tif
Note that this replaces every underscore, so ID_zone_date_0_L2A is split up as well.
Try the following code snippet:
import re

filename = "2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"
# \1..\4 refer back to the four captured groups
found = re.sub(r'(\d+)_(\d+)_(.*)_(.*)\.tif', r'\1/\2/\3/\4.tif', filename)
print(found)  # prints 2017/05/S2B_7VEG_20170528_0_L2A/B01.tif
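If you go the regex route, a variant with named groups and re.fullmatch may be easier to read and lets you skip files that don't follow the convention. This is only a sketch: the pattern assumes a four-digit year and a two-digit month, to_path is a hypothetical name, and the int() call for the month follows the "2017/5/..." example from the question.

```python
import re

# Assumed layout: 4-digit year, 2-digit month, then anything, then the band file.
PATTERN = re.compile(
    r"(?P<year>\d{4})_(?P<month>\d{2})_(?P<stem>.+)_(?P<band>[^_]+\.tif)"
)


def to_path(name):
    """Return year/month/stem/band, or None if the name does not match."""
    m = PATTERN.fullmatch(name)
    if m is None:
        return None
    # int() turns "05" into 5, matching the example path in the question.
    return "{}/{}/{}/{}".format(m["year"], int(m["month"]), m["stem"], m["band"])


print(to_path("2017_05_S2B_7VEG_20170528_0_L2A_B01.tif"))
print(to_path("notes.txt"))
```

Returning None for names that don't match avoids the AttributeError handling from the original attempt.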
Maybe you can do it like this:
from os import listdir, makedirs
from os.path import isfile, join

my_path = 'your_source_dir'
files_name = [f for f in listdir(my_path) if isfile(join(my_path, f))]

def create_dir(files_name):
    for file in files_name:
        # maxsplit must be an integer; the first two fields are year and month
        year, month = file.split('_', 2)[:2]
        # creates year/month and does nothing if it already exists
        makedirs(join(year, month), exist_ok=True)
        ### your download code
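To tie the directory creation to the layout from the question, something like this might work. It is a sketch under assumptions: prepare_target and the root argument are hypothetical names, the month loses its leading zero as in the question's example, and the download step itself is left out.

```python
from pathlib import Path


def prepare_target(root, name):
    """Create root/year/month/stem and return the full destination path."""
    year, month, rest = name.split("_", 2)
    stem, band = rest.rsplit("_", 1)
    target_dir = Path(root) / year / str(int(month)) / stem
    # parents=True creates missing intermediate folders;
    # exist_ok=True makes repeated calls harmless.
    target_dir.mkdir(parents=True, exist_ok=True)
    return target_dir / band
```

You could then call prepare_target(my_path, f) for each f in files_name and hand the returned path to your download code.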