Date

:tada: Python 3.7.0 在本周发布了 :tada:

What’s New In Python 3.7

Syntax changes

asyncawait 成为关键字

In [9]: import keyword

In [10]: all(map(keyword.iskeyword, ('async', 'await')))
Out[10]: True

dict 对象保持插入顺序的特性已经被声明为 Python 语言规范的一部分

New Features

PEP 563 - Postponed Evaluation of Annotations

推迟了注解的求值时间,解决了下面的两个问题:

  • 注解只能使用当前作用域中已经存在的名称,换句话说它不支持 forward reference
  • 注解对于 Python 程序的启动时间存在不利影响

由于会破坏兼容性所以需要通过 from __future__ import annotations 来开启

# 支持 forward reference
from __future__ import annotations
def hello() -> Hello:
    return Hello()

class Hello:
    pass

原理是解释器将注解以 AST 等价的字符串形式存储,而不是在定义时对注解中的表达式进行求值。这种策略优化了 Python 的启动速度。可以自行对比 Python 3.6 和 Python 3.7 下此段代码的区别

from __future__ import annotations  # Only Python 3.7

def func() -> int:
    return 0

print(func.__annotations__)
# output Python 3.7
# {'return': 'int'}
# output Python 3.6
# {'return': <class 'int'>}

当然可以通过 typing.get_type_hints() 将字符串形式的注解转换为实际的类型

from __future__ import annotations
import typing

def func() -> int:
    return 0

print(typing.get_type_hints(func))

将字符串形式注解转换为实际类型的时候需要给与一个作用域,这点也可以从函数签名上看出 typing.get_type_hints(obj[, globals[, locals]])。如果人为地去干预作用域的话,是不是可以做一些有趣的事情呢?

PEP 538 - Legacy C Locale Coercion

PEP 538 updates the default interpreter command line interface to automatically coerce that locale to an available UTF-8 based locale as described in the documentation of the new PYTHONCOERCECLOCALE environment variable. Automatically setting LC_CTYPE this way means that both the core interpreter and locale-aware C extensions (such as readline) will assume the use of UTF-8 as the default text encoding, rather than ASCII.

PEP 540 - Forced UTF-8 Runtime Mode

通过 -X utf8 命令行参数或者 PYTHONUTF8 环境变量可以强制 CPython 开启 UTF-8 模式。在 UTF-8 模式下,CPython 会忽略本地设置,而使用 UTF-8 编码。sys.stdinsys.stdout 流的错误处理将被设置为 surrogateescape

PEP 553 - Built-in breakpoint()

简化了调试流程,原先如果使用 pdb 调试的话需要至少添加 27 个字符

foo()
import pdb; pdb.set_trace()
bar()

Python 3.7 添加了新的内建函数 breakpoint()breakpoint() 会调用 sys.breakpointhook(),其默认行为是 import pdb; pdb.set_trace()。可以通过修改 sys.breakpointhook 的绑定来自定义调试行为。另外 sys.breakpointhook 的默认实现里是支持通过 PYTHONBREAKPOINT 环境变量来自定义调试的。sys.__breakpointhook__ 存储了默认的 sys.breakpointhook 实现,所以可以通过 sys.breakpointhook = sys.__breakpointhook__ 来还原

PEP 539 - New C API for Thread-Local Storage

弃用原先的 TLS API,因为在所有平台上它都是使用 int 来表示 TLS key。对于官方支持的平台而言,这不是问题,但这不符合 POSIX 标准,也不具有任何实际意义上的可移植性。PEP 539 提供了新的 Thread Specific Storage (TSS) API,其使用新的类型 Py_tss_t 来表示 key

PEP 562 - Customization of Access to Module Attributes

可以在模块中定义 __getattr__()__dir__() 了。典型的应用场景是

  • module attribute deprecation
  • lazy loading

首先来看 module attribute deprecation 的例子

from warnings import warn

deprecated_names = ["old_function", ...]

def _deprecated_old_function(arg, other):
    ...

def __getattr__(name):
    if name in deprecated_names:
        warn(f"{name} is deprecated", DeprecationWarning)
        return globals()[f"_deprecated_{name}"]
    raise AttributeError(f"module {__name__} has no attribute {name}")

再来看一下 lazy loading,这本身是一种 设计模式

# lib/__init__.py

import importlib

__all__ = ['submod', ...]

def __getattr__(name):
    if name in __all__:
        return importlib.import_module("." + name, __name__)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")

# lib/submod.py

print("Submodule loaded")
class HeavyClass:
    ...

# main.py

import lib
lib.submodule.HeavyClass  # prints "Submodule loaded"

另外我发现了自己的一个知识盲区,下面的代码是会输出两次 xx

In [2]: !cat m.py
def __getattr__(attr):
    print(attr)

In [3]: from m import xx
__path__
xx
xx

原因可以在 Python 的 文档 中找到

The from form uses a slightly more complex process:

  1. find the module specified in the from clause, loading and initializing it if necessary;
  2. for each of the identifiers specified in the import clauses:
  3. check if the imported module has an attribute by that name
  4. if not, attempt to import a submodule with that name and then check the imported module again for that attribute
  5. if the attribute is not found, ImportError is raised.
  6. otherwise, a reference to that value is stored in the local namespace, using the name in the as clause if it is present, otherwise using the attribute name

PEP 564 - New Time Functions With Nanosecond Resolution

time 添加了 6 个纳秒级别精度的 API,返回值均为整数

time.clock_gettime_ns()
time.clock_settime_ns()
time.monotonic_ns()
time.perf_counter_ns()
time.process_time_ns()
time.time_ns()

PEP 565 - Show DeprecationWarning in __main__

Python 3.2 起 DeprecationWarning 默认会被忽略,现在直接在 __main__ 中执行的代码可以重新显示这些信息了。

As a result, developers of single file scripts and those using Python interactively should once again start seeing deprecation warnings for the APIs they use, but deprecation warnings triggered by imported application, library and framework modules will continue to be hidden by default.

目前标准库允许开发者选择三种不同的 deprecation warning 行为,分别如下:

  • FutureWarning: always displayed by default, recommended for warnings intended to be seen by application end users (e.g. for deprecated application configuration settings).
  • DeprecationWarning: displayed by default only in __main__ and when running tests, recommended for warnings intended to be seen by other Python developers where a version upgrade may result in changed behaviour or an error.
  • PendingDeprecationWarning: displayed by default only when running tests, intended for cases where a future version upgrade will change the warning category to DeprecationWarning or FutureWarning.

PEP 560 - Core Support for typing module and Generic Types

引入:

  • __class_getitem__()
  • __mro_entries__

用法可以参考 PEP 560 Spec 部分

PEP 552 - Hash-based .pyc Files

Python 通过比对代码文件的元数据(最后修改时间和大小)和 pyc 文件头部的元数据来确定是否需要更新 pyc 文件。当文件系统的时间戳比较粗糙时,Python 可能会错过更新。另外在缓存文件中添加时间戳对于 build reproduciblity 和基于内容的构建系统来说是存在问题的。

PEP 552 扩展了 pyc 格式,可以使用文件的哈希值进行判断。默认情况下,Python 仍然使用基于时间戳的方式检测,并且不会在运行时生成基于哈希的 pyc 文件。基于哈希的 pyc 文件可以用 py_compilecompileall 生成。

基于哈希的 pyc 文件有两种变种: checked 和 uncheckd。下面的说明来源于 文档

For checked hash-based .pyc files, Python validates the cache file by hashing the source file and comparing the resulting hash with the hash in the cache file. If a checked hash-based cache file is found to be invalid, Python regenerates it and writes a new checked hash-based cache file. For unchecked hash-based .pyc files, Python simply assumes the cache file is valid if it exists. Hash-based .pyc files validation behavior may be overridden with the --check-hash-based-pycs flag.

Development Runtime Mode: -X dev

-X dev 命令行选项和 PYTHONDEVMODE 环境变量可以用于启用 CPython 的开发者模式

Other Language Changes

内容比较多,请参考 原文

New library modules

Improved Modules

标准库做了一些修改,内容比较多,请参考 原文