Must the encoding definition be in the 1st/2nd line in Python?

From PEP 263:

To define a source code encoding, a magic comment must be placed into the source files either as first or second line in the file, such as:

# coding=<encoding name>

or (using formats recognized by popular editors):

#!/usr/bin/python
# -*- coding: <encoding name> -*-

What if, in some cases, license information occupies the top lines? For example, from https://github.com/google/seq2seq/blob/master/seq2seq/training/utils.py:

# Copyright 2017 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# -*- coding: utf-8 -*-
"""Miscellaneous training utility functions.
"""

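Will the Python interpreter still "magically" accept the encoding declaration? It would be great if an answer could explain why it must be in the first two lines and include a pointer to the interpreter code.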

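Yes. In Python 2, a UTF-8 encoded file needs that encoding marker. If the marker sits past the second line and the file contains any non-ASCII characters, you will get an error like this: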

File "encoded.py", line 5
SyntaxError: Non-ASCII character '\xe1' in file encoded.py on line 5, but 
no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

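If the file contains only ASCII characters, it will still work even when the UTF-8 encoding marker comes after line 2. ASCII is a subset of UTF-8, so the late encoding directive is essentially ignored. (That appears to be the case for the particular utils.py you cited.)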

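Many parsers and other file processors require such magic directives to appear at the start of the file, because they must be scanned and taken into account before the rest of the file can be interpreted correctly. Allowing them later would be inefficient: the whole file would have to be scanned just to find a few "magic" special cases.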
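To make the "only scan the top of the file" idea concrete, here is a minimal sketch. The regular expression is the one PEP 263 gives for the magic comment; the function name find_declared_encoding and the "utf-8" default are assumptions for illustration only. The real tokenizer does more than this (BOM handling, for example), so treat it as an approximation of the rule rather than the interpreter's actual logic.

```python
import re

# Pattern given in PEP 263 for recognising the magic comment.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def find_declared_encoding(source_lines, default="utf-8"):
    """Return the encoding named in the first two lines, else `default`.

    Hypothetical helper: only lines 1 and 2 are ever inspected, so the
    cost is constant no matter how long the file is.
    """
    for line in source_lines[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return default


# Cookie on line 2, after a shebang: it is found.
print(find_declared_encoding(["#!/usr/bin/python", "# -*- coding: latin-1 -*-"]))

# Cookie on line 3, after a license header: it is never looked at.
print(find_declared_encoding(["# Copyright 2017", "#", "# -*- coding: latin-1 -*-"]))
```

The second call falls back to the default because the loop stops after two lines, which mirrors what happens to the late declaration in the utils.py file from the question.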

You get some leeway in Python 3, which assumes UTF-8 encoding by default. You will still want to include the declaration if your file is encoded some other way.

The specification allows the first two lines so that a shebang (#!...) can be used on Unix systems.

No, it is not allowed after the second line.

Here is the piece of code in CPython's tokenizer that checks for (and parses) the encoding cookie: https://github.com/python/cpython/blob/9e52c907b5511393ab7e44321e9521fe0967e34d/Parser/tokenizer.c#L613-L616
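If reading the C source is inconvenient, the same behavior can be observed from Python 3 with the standard library's tokenize.detect_encoding, which inspects at most the first two lines. The wrapper name declared_encoding and the sample byte strings below are made up for this illustration.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Ask the standard library which encoding it detects for this source."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


# Cookie on line 3, below a license-style header.
late_cookie = b"# Copyright notice\n#\n# -*- coding: latin-1 -*-\nx = 1\n"

# Cookie on line 2, directly below a shebang.
early_cookie = b"#!/usr/bin/python\n# -*- coding: latin-1 -*-\nx = 1\n"

print(declared_encoding(late_cookie))   # utf-8 (the late cookie is ignored)
print(declared_encoding(early_cookie))  # iso-8859-1 (normalised name for latin-1)
```

Note that detect_encoding reports a normalised codec name, so latin-1 comes back as iso-8859-1.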