Format different types of dates in a CSV file using Python 2.6
I have formatted my CSV file, and it now looks like this:
100|1000|newyork|2015/10/04|2015/10/04 16:23:37.040000|
101|1001|london|2015/10/04|2015/10/04 16:23:37.040000|
102|1002|california|2015/10/04|2015/10/04 16:23:37.041000|
103|1003|Delhi|2015/10/04|2015/10/04 16:23:37.041000|
104|1004|Mumbai|2015/10/04|2015/10/04 16:23:37.041000|
105|1005|Islamabad|2015/10/04|2015/10/04 16:23:37.041000|
106|1006|karachi|2015/10/04|2015/10/04 16:23:37.041000|
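I now have dates in two different formats, and I want to convert both to the 'YYmmdd' format.
Can anyone suggest the best way to achieve this?
Note: the file name must not change. For reference, this is how I produced the formatted file shown above: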
import contextlib
import csv
import os

inputfile = r'c:\Working\HK.txt'
outputfile = inputfile + '.tmp'
# contextlib.nested opens both files in one with statement on Python 2.6
with contextlib.nested(open(inputfile, 'rb'), open(outputfile, 'wb')) as (inf, outf):
    reader = csv.reader(inf)
    writer = csv.writer(outf, delimiter='|')
    for row in reader:
        writer.writerow([col.replace('|', ' ') for col in row])
    writer.writerow([])
# Replace the original file so that the file name stays the same
os.remove(inputfile)
os.rename(outputfile, inputfile)
I think this should work. You can adjust the output date format by changing the format string passed to strftime.
#!/usr/bin/python
from dateutil.parser import parse

lines = ['100|1000|newyork|2015/10/04|2015/10/04 16:23:37.040000|',
         '101|1001|london|2015/10/04|2015/10/04 16:23:37.040000|',
         '102|1002|california|2015/10/04|2015/10/04 16:23:37.041000|',
         '103|1003|Delhi|2015/10/04|2015/10/04 16:23:37.041000|',
         '104|1004|Mumbai|2015/10/04|2015/10/04 16:23:37.041000|',
         '105|1005|Islamabad|2015/10/04|2015/10/04 16:23:37.041000|',
         '106|1006|karachi|2015/10/04|2015/10/04 16:23:37.041000|']

for line in lines:
    parts = line.split("|")
    # Column 3 holds a plain date, column 4 a date with a time
    tmp_date = parse(parts[3])
    parts[3] = tmp_date.strftime('%Y%m%d')
    tmp_date = parse(parts[4])
    parts[4] = tmp_date.strftime('%Y%m%d')
    new_line = "|".join(parts)
    print(new_line)
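dateutil is a third-party package, so it may not be installed alongside Python 2.6. As a minimal sketch, the same conversion can be done with only the standard library, assuming the two date columns always use the layouts shown in the question. The names `KNOWN_FORMATS`, `to_compact` and `convert_line` are made up for illustration. The second call shows that '%y%m%d' produces a two-digit year, in case that is what 'YYmmdd' was meant to be.

```python
from datetime import datetime

# Input layouts seen in the sample data (an assumption about the real file).
KNOWN_FORMATS = ('%Y/%m/%d %H:%M:%S.%f', '%Y/%m/%d')


def to_compact(text, fmt='%Y%m%d'):
    """Reformat a date string written in one of KNOWN_FORMATS (hypothetical helper)."""
    for known in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, known).strftime(fmt)
        except ValueError:
            continue  # try the next layout
    return text  # leave unrecognised values untouched


def convert_line(line, fmt='%Y%m%d'):
    """Convert columns 3 and 4 of one pipe-delimited line."""
    parts = line.split('|')
    for index in (3, 4):
        parts[index] = to_compact(parts[index], fmt)
    return '|'.join(parts)


sample = '100|1000|newyork|2015/10/04|2015/10/04 16:23:37.040000|'
print(convert_line(sample))            # 100|1000|newyork|20151004|20151004|
print(convert_line(sample, '%y%m%d'))  # 100|1000|newyork|151004|151004|
```

Listing the layouts explicitly makes the parsing stricter than dateutil's guessing, at the cost of having to add a new entry whenever another layout appears in the file.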
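If you have Python 2.6+, you can do it in pure Python with the re module:
from __future__ import print_function
import re

# Nested with blocks: the "with a, b:" form needs Python 2.7 or later
with open('data', 'r') as f:
    with open('data_out', 'w') as f_out:
        for line in f:
            # Join year, month and day: |2015/10/04 becomes |20151004
            line = re.sub(r'(\|\d{4})/(\d{2})/(\d{2})', r'\1\2\3', line)
            # Drop the time part that precedes the closing delimiter
            line = re.sub(r'\s+\d{2}:\d{2}:\d{2}\.\d+(\|)', r'\1', line)
            # line already ends with a newline, so suppress the one print adds
            print(line, end='', file=f_out)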
This is what I got in my data_out:
100|1000|newyork|20151004|20151004|
101|1001|london|20151004|20151004|
102|1002|california|20151004|20151004|
103|1003|Delhi|20151004|20151004|
104|1004|Mumbai|20151004|20151004|
105|1005|Islamabad|20151004|20151004|
106|1006|karachi|20151004|20151004|
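The question also requires that the file name stay the same. One way to do that is to combine the regex substitution with the temporary-file-and-rename pattern from the question. This is a rough sketch: the function name `rewrite_in_place` and the single combined pattern `DATE_RE` are illustrative and do not come from the answers above. It uses nested with blocks so that it should also run on Python 2.6.

```python
import os
import re

# A date, optionally followed by a time with fractional seconds.
DATE_RE = re.compile(r'(\d{4})/(\d{2})/(\d{2})(?: \d{2}:\d{2}:\d{2}\.\d+)?')


def rewrite_in_place(path):
    """Rewrite the dates in `path` while keeping its file name (hypothetical helper)."""
    tmp_path = path + '.tmp'
    with open(path, 'r') as src:
        with open(tmp_path, 'w') as dst:
            for line in src:
                # Keep only year, month and day; any time part is dropped
                dst.write(DATE_RE.sub(r'\1\2\3', line))
    # On Windows, os.rename fails if the target exists, so remove it first
    os.remove(path)
    os.rename(tmp_path, path)
```

Calling `rewrite_in_place(r'c:\Working\HK.txt')` would then update the file from the question in place. Note that the pattern matches any `YYYY/MM/DD` text on a line, not only columns 3 and 4, which is fine for the sample data but may need tightening for other files.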