Cookiecutter template with iso-8859-1 (latin1) characters
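I am creating my own cookiecutter template for Python projects, which mostly use utf-8 encoding. However, the template also contains .ini and .php files encoded in iso-8859-1 (latin1). These resources must stay in latin1 because they are part of legacy code.
When I run: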
cookiecutter cookiecutter-mytemplate # <- directory of my project
I get the following error during code generation:
Traceback (most recent call last):
  ...
  File ".../lib/python2.7/site-packages/cookiecutter/cli.py", line 123, in main
    default_config=default_config,
  File ".../lib/python2.7/site-packages/cookiecutter/main.py", line 91, in cookiecutter
    output_dir=output_dir
  File ".../lib/python2.7/site-packages/cookiecutter/generate.py", line 349, in generate_files
    generate_file(project_dir, infile, context, env)
  File ".../lib/python2.7/site-packages/cookiecutter/generate.py", line 166, in generate_file
    tmpl = env.get_template(infile_fwd_slashes)
  File ".../lib/python2.7/site-packages/jinja2/environment.py", line 830, in get_template
    return self._load_template(name, self.make_globals(globals))
  File ".../lib/python2.7/site-packages/jinja2/environment.py", line 804, in _load_template
    template = self.loader.load(self, name, globals)
  File ".../lib/python2.7/site-packages/jinja2/loaders.py", line 113, in load
    source, filename, uptodate = self.get_source(environment, name)
  File ".../lib/python2.7/site-packages/jinja2/loaders.py", line 175, in get_source
    contents = f.read().decode(self.encoding)
  File ".../lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 308: invalid continuation byte
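As a result, the project is only partially generated: generation stops while parsing an iso-8859-1 file that contains an "é".
Can I use the pre-/post-generation hooks to convert my resources to utf-8 before the template is rendered, and then convert them back to iso-8859-1 afterwards? If so, how?
Is there a way to handle non-utf-8 files?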
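The decoding failure in the traceback can be reproduced outside cookiecutter. The following is only a minimal sketch: the sample string and the helper name try_decode are made up for illustration and are not taken from the template. It shows that the latin1 byte 0xe9 ("é") followed by an ordinary ASCII byte is not valid utf-8, while the same bytes decode cleanly as iso-8859-1.

```python
# Hypothetical sample: "café au lait" stored as iso-8859-1 bytes.
latin1_bytes = u"caf\u00e9 au lait".encode("iso-8859-1")


def try_decode(data, encoding):
    """Return the decoded text, or a short failure message (illustrative helper)."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return u"failed: " + exc.reason


# In utf-8, 0xe9 announces a multi-byte sequence, but the next byte is a
# plain space, so the decoder gives up, as in the traceback.
utf8_result = try_decode(latin1_bytes, "utf-8")

# Every byte value is a valid iso-8859-1 character, so this always succeeds.
latin1_result = try_decode(latin1_bytes, "iso-8859-1")

print(utf8_result)
print(latin1_result)
```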
In the end, I stored all .ini and .php files in utf-8 in the template and used a post-generation hook to convert them to iso-8859-1.
Here is the code of hooks/post_gen_project.py:
# coding: utf-8
from __future__ import print_function, unicode_literals

import io
import os


def convert_resources(src_dir):
    # Only convert files under "src" when the generated project has one.
    if "src" in os.listdir(src_dir):
        src_dir = os.path.join(src_dir, "src")
    print("src_dir: " + src_dir)
    for root_dir, dirnames, filenames in os.walk(src_dir):
        for filename in filenames:
            ext = os.path.splitext(filename)[1]
            if ext in ('.ini', '.php'):
                src_path = os.path.join(root_dir, filename)
                print(" Converting '{relpath}'...".format(relpath=os.path.relpath(src_path, src_dir)))
                # Read the rendered utf-8 text, then rewrite the same file as iso-8859-1.
                with io.open(src_path, mode="r", encoding="utf-8") as fd:
                    content = fd.read()
                with io.open(src_path, mode="w", encoding="iso-8859-1") as fd:
                    fd.write(content)


if __name__ == '__main__':
    # The hook is run from the root of the generated project.
    convert_resources(os.getcwd())
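To check the conversion step on its own, here is a small sketch that re-encodes a single file in a temporary directory. The helper name reencode_file and the sample file settings.ini are hypothetical, not part of the hook above. As an assumption on my side, it encodes the text before reopening the file for writing, so a character that does not exist in iso-8859-1 would raise UnicodeEncodeError before the file is truncated.

```python
import io
import os
import tempfile


def reencode_file(path, src_encoding="utf-8", dst_encoding="iso-8859-1"):
    """Rewrite `path` in place from src_encoding to dst_encoding (illustrative helper)."""
    # newline="" keeps the original line endings untouched.
    with io.open(path, mode="r", encoding=src_encoding, newline="") as fd:
        content = fd.read()
    # Encode first: if a character is not representable, this raises
    # UnicodeEncodeError while the file on disk is still intact.
    data = content.encode(dst_encoding)
    with io.open(path, mode="wb") as fd:
        fd.write(data)


# Hypothetical sample file, written as utf-8 the way the template stores it.
tmp_dir = tempfile.mkdtemp()
sample = os.path.join(tmp_dir, "settings.ini")
with io.open(sample, mode="w", encoding="utf-8", newline="") as fd:
    fd.write(u"name=caf\u00e9\n")

reencode_file(sample)

# Read the raw bytes back to inspect the resulting encoding.
with io.open(sample, mode="rb") as fd:
    converted = fd.read()
print(repr(converted))
```

After the call, "é" is stored as the single byte 0xe9 instead of the two-byte utf-8 sequence 0xc3 0xa9.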