How to convert strings with UTC-# at the end to a DateTimeField in Python?
I have a list of strings in the following format: '2/24/2021 3:37:04 PM UTC-6'
How would I convert this?
I have tried
datetime.strptime(my_date_object, '%m/%d/%Y %I:%M:%S %p %Z')
but I get an error saying "unconverted data remains: -6"
Is this because of the UTC-6 at the end?
The approach @MrFuppes mentioned is the simplest way to go.
OK, it seems you need to split the string on 'UTC' and parse the offset separately. You can then build the tzinfo from a timedelta:
import datetime

input_string = '2/24/2021 3:37:04 PM UTC-6'
try:
    dtm_string, utc_offset = input_string.split("UTC", maxsplit=1)
except ValueError:
    # Couldn't split the string, so there is no "UTC" in it
    print("Warning! No timezone!")
    dtm_string = input_string
    utc_offset = "0"

dtm_string = dtm_string.strip()  # Remove leading/trailing whitespace: '2/24/2021 3:37:04 PM'
utc_offset = int(utc_offset)     # Convert the UTC offset to an integer: -6
tzinfo = datetime.timezone(datetime.timedelta(hours=utc_offset))
result_datetime = datetime.datetime.strptime(dtm_string, '%m/%d/%Y %I:%M:%S %p').replace(tzinfo=tzinfo)
print(result_datetime)
# prints 2021-02-24 15:37:04-06:00
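Since the question mentions a list of strings, the same split-and-parse idea can be wrapped in a small function and applied to each item. This is only a sketch: the name parse_with_utc_offset is made up for illustration, and it assumes the offset is always a whole number of hours and that a missing offset means UTC+0.

```python
import datetime

def parse_with_utc_offset(text, fmt='%m/%d/%Y %I:%M:%S %p'):
    """Hypothetical helper: parse 'M/D/YYYY H:MM:SS AM/PM UTC-6' style strings."""
    # partition() never raises; if "UTC" is absent, utc_offset is an empty string
    dtm_string, _, utc_offset = text.partition("UTC")
    # Assumption: no "UTC" marker (or nothing after it) means an offset of 0 hours
    hours = int(utc_offset) if utc_offset.strip() else 0
    tz = datetime.timezone(datetime.timedelta(hours=hours))
    return datetime.datetime.strptime(dtm_string.strip(), fmt).replace(tzinfo=tz)

strings = ['2/24/2021 3:37:04 PM UTC-6', '12/1/2021 12:05:00 AM UTC+2']
parsed = [parse_with_utc_offset(s) for s in strings]
for dt in parsed:
    print(dt.isoformat())
# prints 2021-02-24T15:37:04-06:00
# prints 2021-12-01T00:05:00+02:00
```

int() accepts a leading sign, so "-6" and "+2" both convert directly. Offsets with minutes (for example UTC+5:30) would need extra handling that this sketch does not attempt.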
Alternatively, you can avoid datetime.strptime altogether, since the relevant components are easy to extract with a regular expression:
import datetime
import re

rex = r"(\d{1,2})\/(\d{1,2})\/(\d{4}) (\d{1,2}):(\d{2}):(\d{2}) (AM|PM) UTC(\+|-)(\d{1,2})"
input_string = '2/24/2021 3:37:04 PM UTC-6'
r = re.findall(rex, input_string)
# gives: [('2', '24', '2021', '3', '37', '04', 'PM', '-', '6')]
mm = int(r[0][0])
dd = int(r[0][1])
yy = int(r[0][2])
hrs = int(r[0][3]) % 12  # 12 AM is hour 0; 12 PM becomes 12 again below
mins = int(r[0][4])
secs = int(r[0][5])
if r[0][6].upper() == "PM":
    hrs = hrs + 12

tzoffset = int(f"{r[0][7]}{r[0][8]}")  # sign + digits, e.g. "-6" -> -6
tzinfo = datetime.timezone(datetime.timedelta(hours=tzoffset))
result_datetime = datetime.datetime(yy, mm, dd, hrs, mins, secs, tzinfo=tzinfo)
print(result_datetime)
# prints 2021-02-24 15:37:04-06:00
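Regular expression: (\d{1,2})\/(\d{1,2})\/(\d{4}) (\d{1,2}):(\d{2}):(\d{2}) (AM|PM) UTC(\+|-)(\d{1,2})
Explanation:
(\d{1,2}): One or two digits. The surrounding parentheses make this a capturing group. The same construct is used for the month, day, hour, and UTC offset.
\/: A forward slash.
(\d{4}): Exactly four digits, also a capturing group. The minutes and seconds use the similar construct (\d{2}).
(AM|PM): Either "AM" or "PM".
UTC(\+|-)(\d{1,2}): "UTC", followed by a plus or minus sign, followed by one or two digits.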
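The capturing groups explained above can also be given names, which avoids indexing into tuples by position. The following is a sketch rather than part of the original answer: PATTERN and parse_named are illustrative names, and it uses re.fullmatch so that strings in an unexpected format raise an error instead of producing an IndexError later.

```python
import datetime
import re

# Same pattern as above, with named groups; sign and digits of the offset are one group
PATTERN = re.compile(
    r"(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4}) "
    r"(?P<hour>\d{1,2}):(?P<minute>\d{2}):(?P<second>\d{2}) "
    r"(?P<ampm>AM|PM) UTC(?P<offset>[+-]\d{1,2})"
)

def parse_named(text):
    """Hypothetical helper: build an aware datetime from the named groups."""
    m = PATTERN.fullmatch(text)
    if m is None:
        raise ValueError(f"unexpected format: {text!r}")
    hour = int(m["hour"]) % 12          # 12 AM -> 0, 12 PM -> 0 (then +12 below)
    if m["ampm"] == "PM":
        hour += 12
    tz = datetime.timezone(datetime.timedelta(hours=int(m["offset"])))
    return datetime.datetime(
        int(m["year"]), int(m["month"]), int(m["day"]),
        hour, int(m["minute"]), int(m["second"]), tzinfo=tz,
    )

print(parse_named('2/24/2021 3:37:04 PM UTC-6'))
# prints 2021-02-24 15:37:04-06:00
```

Named groups make the code a little longer but keep each field self-describing if the pattern ever changes.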