Sans-I/O : 无I/O网络协议实现



写在前面

原文地址:Writing I/O-Free (Sans-I/O) Protocol Implementations — Sans I/O 1.0.0 documentation (sans-io.readthedocs.io)

本文档旨在介绍不执行 I/O 的网络协议实现方案(Sans-I/O),并提供如何在 Python 中进行实现的具体说明。

由于作者水平有限,难免出现错误,如发现,请联系指正。


什么是 I/O-free 协议实现方案

An I/O-free protocol implementation (colloquially referred to as a “sans-IO” implementation) is an implementation of a network protocol that contains no code that does any form of network I/O or any form of asynchronous flow control. Put another way, a sans-IO protocol implementation is one that is defined entirely in terms of synchronous functions returning synchronous results, and that does not block or wait for any form of I/O. Examples of this kind of application can be found on the landing page.

Such a protocol implementation is, at the surface level, not very useful. After all, it’s not very helpful to write a network protocol implementation that doesn’t speak to the network! However, it turns out that writing protocol implementations in this manner provides a number of extremely useful benefits to the wider software ecosystem, as well as to the quality of your own code.

In the rest of this document we’ll outline what those benefits are, and also describe some of the techniques that are commonly employed to write protocol stacks in this manner.

无I/O协议实现(通俗地称为“sans IO”实现)是一种网络协议的实现模式,它不包含任何形式的执行网络I/O或任何形式的异步流控制的代码。换句话说,sans IO协议实现完全是根据返回同步结果的同步函数定义的,并且不阻塞或等待任何形式的I/O。此类应用程序的示例可以在主页上找到。

这样的协议实现在表面上不是很有用。毕竟,编写一个不与网络对话的网络协议实现并没有多大帮助!然而,事实证明,以这种方式编写协议实现为更广泛的软件生态系统以及您自己的代码质量提供了许多非常有用的好处。

在本文档的其余部分中,我们将概述这些好处是什么,并描述以这种方式编写协议栈时常用的一些技术。


为什么要选择 I/O-free 实现

Writing sans-IO protocol implementations provides a number of useful benefits, both to the implementation itself and to the wider software ecosystem. We’ll discuss these in turn.

编写 sans-IO 风格的协议实现,无论对协议实现本身还是对更广泛的软件生态系统,都能带来许多有用的好处。我们将依次讨论这些好处。

简洁性、可测试性与正确性(Simplicity, Testability, and Correctness)

简洁性

On a self-serving level, writing a protocol implementation containing no I/O makes writing a high-quality implementation substantially easier. It is no secret that network I/O is particularly prone to a wide variety of unexpected failure modes that can occur at almost any time, even in the simplest cases. When the protocol implementation no longer drives its own I/O but instead has data passed to and from it using in-memory buffers of bytes, the space of possible failures is substantially decreased.

Given that writing to and reading from memory never fails, the implementation has a much simpler time managing its data. The only concern the implementation now has is around buffering incomplete data (that is, data that cannot yet be fully parsed). Buffering data is generally a fairly simple concern compared to dealing with sockets, and is usually required anyway (as anyone who has encountered a short recv response can testify). All of this means that the implementation has a much simpler time managing its input and output data.

This simplicity also ends up stretching to flow control within the implementation as well. Given that the implementation now spends all its time reading from and writing to byte buffers, it is never possible for the implementation to block or need to stop except when it runs out of room in its buffers. There is never any requirement to pause computation in order to wait for more data to arrive or for data to be sent, and it is relatively easy to structure your implementation so it does not have to be safe to re-entrancy. All of this has the effect of vastly limiting the number of possible flows of control through the implementation, which drastically helps your ability to understand the implementation.

从自我服务的角度来看,编写一个不包含I/O的协议实现会使编写一个高质量的实现变得更加容易。众所周知,网络I/O特别容易出现各种各样的意外故障模式,这些故障模式几乎可以在任何时候发生,即使是在最简单的情况下。当协议实现不再驱动自己的I/O,而是使用内存中的字节缓冲区来传递数据时,可能的故障空间会大大减少。

考虑到对内存的写入和读取从来不会失败,实现在管理数据时要省心得多。实现现在唯一需要关心的,就是缓冲不完整的数据(即尚无法完整解析的数据)。与处理套接字相比,缓冲数据通常是个相当简单的问题,而且往往本来就是必需的(任何遇到过 recv 只返回部分数据的人都可以作证)。所有这些都意味着,实现在管理其输入和输出数据时要简单得多。
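
为了更直观地说明“缓冲不完整数据”这件事有多简单,下面给出一个极简的示意草图:实现只是把收到的字节追加到内部缓冲区,凑齐完整帧时才产出结果。其中的类名 LengthPrefixedProtocol、方法名 receive_data 等都是为说明而假设的,并非任何现有库的 API。

```python
# 极简示意:解析“1 字节长度前缀 + 负载”的帧格式(所有名字均为假设)。

class LengthPrefixedProtocol:
    def __init__(self) -> None:
        self._buffer = bytearray()  # 唯一的状态:内部输入缓冲区

    def receive_data(self, data: bytes) -> list[bytes]:
        """把从网络读到的字节交给实现,返回已凑齐的完整帧(可能为空列表)。"""
        self._buffer += data
        frames = []
        while self._buffer:
            length = self._buffer[0]
            if len(self._buffer) - 1 < length:
                break  # 数据尚不完整:只缓冲,等调用方下次再喂字节
            frames.append(bytes(self._buffer[1:1 + length]))
            del self._buffer[:1 + length]
        return frames


conn = LengthPrefixedProtocol()
assert conn.receive_data(b"\x05he") == []                    # 半个帧:不产出任何结果
assert conn.receive_data(b"llo\x02hi") == [b"hello", b"hi"]  # 凑齐后一次产出两帧
```

可以看到,唯一的“状态”就是那个内存缓冲区,整个过程中不存在任何可能失败的 I/O 调用。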

这种简洁性最终也延伸到了实现内部的流控制。由于实现现在把所有时间都花在读写字节缓冲区上,除非缓冲区空间不足,否则它永远不可能阻塞或需要停下来。它从不需要暂停计算去等待更多数据到达或等待数据被发送出去,而且很容易把实现组织成不必考虑可重入安全的形式。所有这些都极大地限制了贯穿该实现的可能控制流数量,从而大大有助于您理解这个实现。

可测试性

From simplicity flows testability. Because there are far fewer possible flows of control through the program, it is much, much easier to hit 100% branch coverage from your tests, because there are far fewer possible entry points and locations in the state space.

Additionally, because the code under test deals only with buffers of bytes for both its input and output, the test code no longer needs to pretend to manage sockets. It becomes extremely simple to write tests that validate the correctness of the implementation because those tests simply shove sequences of bytes in and validate the sequences of bytes that come out. As there is no I/O involved in the implementation (even mocked-out I/O), there is no risk of non-determinism in the tests. Either the test passes or it does not: it will never be a “flaky” test that requires a difficult-to-reproduce test environment.

It also becomes possible to make certain assertions about the code under test that are generally impossible when the implementation is making I/O calls. For example, it is not unreasonable to make the assertion that the tests should be able to attain 100% code and branch coverage using only the public APIs: that is, only by passing in byte sequences or calling public API functions. If a branch of code is not reachable by making those calls, it is quite literally impossible to reach without the user monkeypatching your implementation. Such code is entirely unnecessary, and can be safely removed entirely.

Finally, because you have no mocking and no actual I/O, your tests become extremely fast and safe to run in parallel. It is also pretty easy to provide test fixtures and combinatorial expansion of tests, which makes it possible to provide thousands of test cases. It is also very easy to write new test cases to reproduce bugs and to prevent regressions.

Achieving all of this when the implementation uses asynchronous flow control primitives or actual I/O is much harder. To validate the correctness of the implementation in the face of all possible I/O errors is extremely difficult, and requires essentially triggering all of these possible I/O errors at all locations they could possibly occur in your code, and then validating that the implementation handles that error as you’d expect. A similar notion is true for asynchronous flow control primitives: these need to be fired in all possible orders to make the same guarantees about correctness.

可测试性源于简洁性。因为程序中可能的控制流要少得多,状态空间中可能的入口点和位置也少得多,所以让测试达到 100% 的分支覆盖率就容易得多。

此外,由于被测代码的输入和输出都只是字节缓冲区,测试代码不再需要假装管理套接字。编写验证实现正确性的测试变得非常简单,因为这些测试只需塞入字节序列,再校验输出的字节序列即可。由于实现中不涉及任何 I/O(连模拟出来的 I/O 也没有),测试中就不存在非确定性的风险。测试要么通过,要么不通过:它永远不会变成需要难以复现的测试环境的“不稳定”(flaky)测试。
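
“塞进字节、校验字节”式的测试写起来确实就是这么直接。下面是几条示意性的测试,沿用上文草图中假设的 LengthPrefixedProtocol(同样只是示意,可直接用 pytest 或普通的 assert 运行):

```python
# 示意性测试:不需要 socket、mock 或事件循环,输入输出都只是字节。
# 假设 LengthPrefixedProtocol 即上文草图中假设的类。

def test_whole_frame_in_one_call():
    conn = LengthPrefixedProtocol()
    assert conn.receive_data(b"\x03abc") == [b"abc"]

def test_frame_split_at_every_possible_boundary():
    # 同一段字节流,无论被 TCP 切成哪两段,解析结果都应完全一致(确定性)
    payload = b"\x03abc\x01x"
    for cut in range(len(payload) + 1):
        conn = LengthPrefixedProtocol()
        frames = conn.receive_data(payload[:cut]) + conn.receive_data(payload[cut:])
        assert frames == [b"abc", b"x"]
```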

此外,还可以对被测代码做出一些在实现自己进行 I/O 调用时通常无法做出的断言。例如,要求测试仅通过公共 API(也就是仅通过传入字节序列或调用公共 API 函数)就能达到 100% 的代码和分支覆盖率,并非不合理。如果某个代码分支无法通过这些调用到达,那么除非用户对您的实现做 monkeypatch,否则它根本不可能被执行到。这样的代码完全没有必要,可以放心地整段删除。

最后,因为没有模拟和实际的I/O,所以并行运行的测试变得非常快速和安全。提供测试夹具和测试组合扩展也很容易,这使得提供数千个测试用例成为可能。编写新的测试用例来重现错误和防止回归也非常容易。

当实现使用异步流控制原语或实际I/O时,实现所有这些都要困难得多。面对所有可能的I/O错误,要验证实现的正确性是极其困难的,需要在代码中可能发生的所有位置触发所有这些可能的I/O错误,然后验证实现是否如您所期望的那样处理该错误。异步流控制原语也有一个类似的概念:这些原语需要以所有可能的顺序触发,以对正确性做出相同的保证。

正确性

From simplicity and testability flows correctness. It is possible to develop an extremely high degree of confidence about the correctness of the protocol implementation due to the relative simplicity of the implementation and due to the extremely high quantity of test coverage that you can achieve.

While this level of correctness may not be achievable by the application that uses the implementation, it is nonetheless extremely helpful to know that the protocol implementation is highly likely to behave in a reproducible, consistent way, and to generate well-formed output in all cases.

This is particularly valuable with network protocols, as it allows you to drastically increase the probability that your implementation will be able to interoperate with other implementations in the wild. And in the event that you have an interoperability problem, it will be easy to reproduce that problem in test conditions and confirm whether it is your implementation or the other one that has misunderstood the protocol.

Finally, from a truly selfish perspective, the more correct your implementation is, the fewer bug reports you’ll have to deal with from your users!

正确性源于简洁性和可测试性。由于实现相对简单,又能达到极高的测试覆盖率,因此可以对协议实现的正确性建立极高的信心。

虽然使用该实现的应用程序可能无法实现这种级别的正确性,但知道协议实现极有可能以可复制、一致的方式运行,并在所有情况下生成格式良好的输出,这是非常有帮助的。

这对于网络协议尤其有价值,因为它允许您大幅增加实现与其他实现互操作的可能性。如果您有互操作性问题,在测试条件下很容易重现该问题,并确认是您的实现还是另一个实现误解了协议。

最后,从一个真正自私的角度来看,你的实现越正确,你需要处理的用户错误报告就越少!

可重用性

The less selfish improvement that is obtained from writing sans-IO protocol implementations is that they become dramatically more re-useable. The Python ecosystem as it stands in 2016 contains a number of implementations of almost every common network protocol, and to within a rounding error exactly none of them share non-trivial protocol code.

This is an enormous amount of duplicated effort. Writing a protocol stack for a relatively simple protocol is a decent amount of work, and writing one for a complex protocol is an extremely substantial effort that can take hundreds of person-hours. Duplicating this effort is a poor allocation of resources that the open source and free software communities can increasingly ill-afford.

While the duplication of effort is bad enough, we are also repeatedly writing the same bugs. This is somewhat inevitable given the difficulty of producing a correct I/O-based protocol implementation (see Simplicity, Testability, and Correctness), but it is also caused because these various implementations often have no overlap in their development teams. This causes us to repeatedly stumble into the same subtle issues without being able to share knowledge about them, let alone share code to fix the problem. This leads to further multiplicative inefficiencies.

There is obviously plenty of great reasons to write a competing implementation for a network protocol: you may want to learn how the protocol works, or you may believe that the current implementations have poor APIs or poor correctness. However, many reimplementations do not occur for these reasons: instead, they occur because all current implementations either bake their I/O in or they bake their expected flow control mechanisms. For example, aiohttp was not able to use httplib’s parser, because httplib bakes its socket calls into that parser, making it unsuitable for an asyncio environment.

By keeping async flow control and I/O out of your protocol implementation, it provides the ability to use that implementation across all forms of flow control. This means that the core of the protocol implementation is divorced entirely from the way I/O is done or the way the API is designed. This provides the Python community with huge advantages:

  • people who want to experiment with simpler or better API designs can do so without needing to write a protocol implementation or being constrained by the pre-existing API designs.
  • those who want to pursue unusual asynchronous flow control approaches (e.g. curio) can obtain new implementations that are compatible with those new approaches with minimal effort and without needing to be an expert in all protocols.
  • people with unusual or high-performance I/O requirements can take control of their own I/O code without needing to rewrite the entire protocol stack. For example, people wanting to write high-performance HTTP/2 implementations will want to architect their I/O around the TCP_NOTSENT_LOWAT socket option, which is not easily possible with most I/O-included implementations.

This also allows us to centralize our work. If all, or even most, Python libraries centre around the same small number of implementations of popular protocols, that makes it possible for the best protocol experts in the Python community to focus their efforts on fixing bugs and adding features to the core protocol implementations, leading to a “rising tide lifts all boats” effect on the community.

编写 sans-IO 协议实现带来的另一个不那么“自私”的改进是,它们的可重用性大大提高。2016 年的 Python 生态系统中,几乎每种常见网络协议都有多个实现,而它们之间几乎完全没有共享任何非平凡的协议代码。

这是大量的重复劳动。为一个相对简单的协议编写协议栈已经是不小的工作量,为一个复杂协议编写协议栈更是极其繁重的工作,可能耗费数百人时。重复这种劳动是一种糟糕的资源分配方式,而开源和自由软件社区越来越负担不起这种浪费。

重复劳动已经够糟糕了,我们还在反复写出同样的 bug。考虑到编写正确的、自带 I/O 的协议实现有多困难(参见“简洁性、可测试性与正确性”),这在一定程度上不可避免;但另一个原因是,这些不同实现的开发团队之间往往没有交集。这使得我们反复栽在同样细微的问题上,却无法分享相关的经验,更不用说共享修复问题的代码了。这会导致进一步的、成倍放大的低效。

显然,为一个网络协议另写一个与现有实现竞争的实现有很多很好的理由:您可能想了解该协议是如何工作的,或者您可能认为现有实现的 API 或正确性很差。然而,许多重新实现并不是出于这些原因,而是因为现有的所有实现要么把 I/O 写死在里面,要么把预期的流控制机制写死在里面。例如,aiohttp 无法使用 httplib 的解析器,因为 httplib 把套接字调用直接嵌进了解析器,使其不适用于 asyncio 环境。

通过把异步流控制和 I/O 排除在协议实现之外,该实现就可以在任何形式的流控制之下使用。这意味着协议实现的核心与 I/O 的执行方式或 API 的设计方式完全解耦。这为 Python 社区带来了巨大的优势:

  • 想要尝试更简单或更好的API设计的人可以这样做,而无需编写协议实现,也无需受到现有API设计的限制。
  • 那些想要追求不同寻常的异步流控制方法(例如curio)的人可以用最少的努力获得与这些新方法兼容的新实现,而无需成为所有协议的专家。
  • 具有特殊或高性能 I/O 需求的人可以掌控自己的 I/O 代码,而无需重写整个协议栈。例如,想编写高性能 HTTP/2 实现的人会希望围绕 TCP_NOTSENT_LOWAT 套接字选项来设计他们的 I/O,而这在大多数自带 I/O 的实现中是很难做到的。

这也使我们能够集中力量。如果所有(甚至大多数)Python 库都围绕同样少数几个流行协议实现来构建,那么 Python 社区中最优秀的协议专家就可以把精力集中在修复错误和为这些核心协议实现添加功能上,从而对整个社区产生“水涨船高”的效果。


如何实现 I/O-free 网络协议

Assuming that Why Write I/O-Free Protocol Implementations? has convinced you, the logical next question is: how do you write a protocol implementation that does no I/O?

While each protocol is unique, there are several core design principles that can be used to help provide the scaffolding for your sans-IO implementation.

假设“为什么要选择 I/O-free 实现”一节已经说服了您,那么顺理成章的下一个问题就是:如何编写一个不做任何 I/O 的协议实现?虽然每种协议各不相同,但有几条核心设计原则可以帮助您为 sans-IO 实现搭建基本框架。

输入和输出

When it comes to network protocols, at a fundamental level they all consume and produce byte sequences. For protocols implemented over TCP (or any SOCK_STREAM-type socket), they use a byte stream. For protocols implemented over UDP, or over any lower-level protocol than that (e.g. directly over IP), they communicate in terms of datagrams, rather than byte streams.

For byte-stream based protocols, the protocol implementation can use a single input buffer and a single output buffer. For input (that is, receiving data from the network), the calling code is responsible for delivering data to the implementation via a single input (often via a method called receive_bytes, or something similar). The implementation will then append these bytes to its internal byte buffer. At this point, it can choose to either eagerly process those bytes, or do so lazily at the behest of the calling code.

When it comes to generating output, a byte-stream based protocol has two options. It can either write its bytes to an internal buffer and provide an API for extracting bytes from that buffer, as done by hyper-h2, or it can return bytes directly when the calling code triggers events (more on this later), as done by h11. The distinction between these two choices is not enormously important, as one can easily be transformed into the other, but using an internal byte buffer is recommended if it is possible that the act of receiving input bytes can cause output bytes to be produced: that is, if the protocol implementation sometimes automatically responds to the peer without user input.

For datagram based protocols, it is usually important to preserve the datagram boundaries. For this reason, while the general structure of the above points remains the same, the inputs and outputs should be changed to consume and return iterables of bytestrings. Each element in the iterable will correspond to a single datagram.

在最基础的层面上,所有网络协议都在消费并产生字节序列。基于 TCP(或任何 SOCK_STREAM 类型套接字)实现的协议使用的是字节流(byte stream);而基于 UDP 或任何更底层协议(例如直接基于 IP)实现的协议,则以数据报(datagram)而非字节流进行通信。

对于基于字节流的协议,协议实现可以使用单个输入缓冲区和单个输出缓冲区。就输入而言(即从网络接收数据),调用代码负责通过单一入口(通常是名为 receive_bytes 之类的方法)把数据交给实现。实现随后会把这些字节追加到自己的内部字节缓冲区。此时,它既可以选择立刻处理这些字节,也可以等调用代码提出要求时再延迟处理。

在生成输出时,基于字节流的协议有两个选项。它既可以像hyper-h2那样将字节写入内部缓冲区,并提供从该缓冲区提取字节的API,也可以像h11那样在调用代码触发事件时直接返回字节(稍后将详细介绍)。这两种选择之间的区别并不十分重要,因为一种选择可以很容易地转换为另一种选择,但如果接收输入字节的行为可能导致产生输出字节,则建议使用内部字节缓冲区:即,如果协议实现有时会自动响应对等方,而无需用户输入。
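
落到代码上,基于字节流的 sans-IO 实现大体就是“一进一出两个内存缓冲区”。下面是一个骨架式的示意,其中 receive_data、send_message、data_to_send 等名字均为假设,只是模仿 hyper-h2 那种“内部输出缓冲区 + 取字节 API”的风格,并非其真实接口:

```python
# 字节流协议的 sans-IO 骨架(示意):输入、输出各一个内存缓冲区。

class StreamConnection:
    def __init__(self) -> None:
        self._inbuf = bytearray()   # 尚未解析完的输入字节
        self._outbuf = bytearray()  # 等待调用方取走、写到网络上的输出字节

    def receive_data(self, data: bytes) -> list[object]:
        """调用方把从 socket 读到的字节交进来;返回解析出的事件列表。"""
        self._inbuf += data
        events: list[object] = []
        # 在这里按协议格式从 self._inbuf 中尽可能多地解析出事件;
        # 如果协议要求自动应答(例如收到 PING 要回 PONG),
        # 就把应答字节写入 self._outbuf,而不是自己去做 I/O。
        return events

    def send_message(self, payload: bytes) -> None:
        """调用方要发送数据时,只是把序列化结果写进输出缓冲区。"""
        self._outbuf += len(payload).to_bytes(4, "big") + payload

    def data_to_send(self) -> bytes:
        """调用方取走待发送的字节,并自行决定如何写到网络上。"""
        data = bytes(self._outbuf)
        self._outbuf.clear()
        return data
```

调用方从网络读到什么就喂给 receive_data;想往网络写时,先调用 data_to_send 把字节取走,再由调用方自己完成 I/O。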

对于基于数据报的协议,保留数据报边界通常很重要。因此,虽然总体结构与上面相同,但输入和输出应改为消费并返回字节串(bytestring)的可迭代对象,其中每个元素对应一个数据报。
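
对应的数据报版本只需把输入和输出换成“每个元素对应一个数据报”的可迭代对象。下面同样是一个假设性的骨架:

```python
# 数据报协议的 sans-IO 骨架(示意):不做拼接,保留数据报边界。
from typing import Iterable

class DatagramConnection:
    def __init__(self) -> None:
        self._outgoing: list[bytes] = []

    def receive_datagrams(self, datagrams: Iterable[bytes]) -> list[object]:
        events: list[object] = []
        for dgram in datagrams:
            # 每个数据报独立解析,绝不与前后的数据报拼在一起
            events.append(("datagram-received", dgram))
        return events

    def send_datagram(self, payload: bytes) -> None:
        self._outgoing.append(payload)

    def datagrams_to_send(self) -> list[bytes]:
        out, self._outgoing = self._outgoing, []
        return out
```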

事件 Events

The major abstraction used by most of the sans-IO protocol stacks is to translate the bytes received from the network into “events”. Essentially, this abstraction defines a network protocol as a serialization mechanism for a sequence of semantic “events” that can occur on that protocol.

In this abstraction model, both peers in a protocol emit and receive events. In terms of receiving events, events can either be returned to the calling code immediately whenever bytes are provided, or they can be lazily produced in response to the calling code’s request. Both approaches have their advantages and disadvantages, and it doesn’t matter enormously which is chosen.

When it comes to emitting events, there are several possible approaches, but two are in active use. The first, and comfortably the most common, is to emit events using function calls. For example, a HTTP implementation may have a function call entitled send_headers which emits a bytestream that, if received by the same implementation, would cause a RequestReceived event to be emitted. This is the approach used by hyper-h2.

However, an alternative approach is to have a single method that accepts events, the same events that the implementation emits. This is the approach used by h11. This approach has the substantial advantage of symmetry of input and output to the implementation, but the moderate disadvantage of being a slightly uncomfortable programming approach for many developers.

Either approach works well.

大多数 sans-IO 协议栈使用的主要抽象,是把从网络接收到的字节转换为“事件”。本质上,这种抽象把网络协议定义为该协议上可能发生的一系列语义“事件”的序列化机制。

在这个抽象模型中,协议的两个对等方都会发出并接收事件。就接收事件而言,既可以在提供字节时立即把事件返回给调用代码,也可以等调用代码请求时再延迟生成事件。这两种方法各有优缺点,选择哪一种并不十分要紧。

在发出事件方面,有几种可能的做法,其中两种正在被实际使用。第一种,也是目前明显最常见的一种,是通过函数调用来发出事件。例如,一个 HTTP 实现可能有一个名为 send_headers 的函数,它产生的字节流如果被同一个实现接收,就会产出一个 RequestReceived 事件。这是 hyper-h2 采用的做法。

另一种做法则是提供一个接受事件的单一方法,它接受的事件与实现自身产出的事件类型相同。这是 h11 采用的做法。这种做法的显著优点是实现的输入与输出是对称的,但缺点是对许多开发者来说,这种编程方式多少有些别扭。

这两种方法都很有效。
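
把“事件”落到代码上,大致就是一组简单的数据类,再配上两种发送风格:一种是每类输出各给一个方法(类似 hyper-h2 的 send_headers),另一种是用同一个 send(event) 接收与实现自身产出类型相同的事件对象(类似 h11)。下面的事件类与方法签名都只是示意性假设,并非这两个库的真实 API:

```python
# 事件抽象示意:协议被视作“语义事件序列”的序列化机制(名字均为假设)。
from dataclasses import dataclass

@dataclass
class RequestReceived:
    headers: list[tuple[bytes, bytes]]

@dataclass
class DataReceived:
    data: bytes

class MethodStyleConnection:
    """风格一:每种输出各一个方法,返回应写到网络上的字节。"""
    def send_headers(self, headers: list[tuple[bytes, bytes]]) -> bytes:
        return b"".join(k + b": " + v + b"\r\n" for k, v in headers) + b"\r\n"

class EventStyleConnection:
    """风格二:输入输出对称,发送时传入与接收时相同类型的事件对象。"""
    def send(self, event: object) -> bytes:
        if isinstance(event, RequestReceived):
            return b"".join(k + b": " + v + b"\r\n" for k, v in event.headers) + b"\r\n"
        if isinstance(event, DataReceived):
            return event.data
        raise TypeError(f"unsupported event: {event!r}")
```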


与I/O集成

At some point, of course, your sans-IO protocol implementation needs to be joined to some actual I/O. There are two obvious possible goals when doing this. The first is to provide a complete native-feeling API for the given I/O model. The second is to provide an implementation that can easily be swapped to run in multiple I/O models. Each has different design requirements.

If you are designing for a full native-feeling API for a given I/O model (e.g. Twisted or asyncio), you will want to buy entirely into that platform’s standard design patterns. You can liberally use flow control primitives and the appropriate interfaces and I/O mechanisms to transfer data. This allows you to build a module like, for example, aiohttp without having to reimplement HTTP from the ground up. It also allows you to optimise for common use-cases, and generally provide a no-friction interface.

Another possibility is to try as much as possible to push your I/O and flow control primitives to the edges of the program or library, providing integration points for multiple backends. This requires substantial care and discipline, as it requires that your entire codebase be predicated around sans-IO primitives except for a very tiny nucleus of code that uses the I/O and flow control primitives of the given platform. This allows you to have a single codebase that drops neatly into multiple I/O and flow control paradigms with very little change, though at the cost of quite possibly not feeling very native in some or all of them.

当然,到了某个时刻,您的 sans-IO 协议实现需要和真正的 I/O 对接起来。这样做时有两个明显的可能目标:第一个是为给定的 I/O 模型提供一套完整的、具有原生体验的 API;第二个是提供一个可以轻松切换、能在多种 I/O 模型下运行的实现。两者的设计要求各不相同。

如果您要为某个给定的 I/O 模型(例如 Twisted 或 asyncio)设计一套具有完整原生体验的 API,那么您会希望完全采用该平台的标准设计模式。您可以放心地使用流控制原语以及相应的接口和 I/O 机制来传输数据。这使您可以构建像 aiohttp 这样的模块,而无需从头重新实现 HTTP。它还允许您针对常见用例进行优化,总体上提供一个毫无摩擦的接口。
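
以 asyncio 为例,“为特定 I/O 模型提供原生体验”通常就是在该平台的标准接口里直接驱动 sans-IO 内核。下面的草图假设存在前文那样的 StreamConnection(名字与事件均为假设),把它接到 asyncio.Protocol 上:

```python
# 示意:把 sans-IO 内核接入 asyncio 的 Protocol / Transport 接口。
import asyncio

class SansIOAdapter(asyncio.Protocol):
    def __init__(self) -> None:
        self.conn = StreamConnection()  # 假设的 sans-IO 内核,见前文草图
        self.transport = None           # 由 connection_made 填充

    def connection_made(self, transport: asyncio.Transport) -> None:
        self.transport = transport
        self.transport.write(self.conn.data_to_send())   # 发送初始字节(如有)

    def data_received(self, data: bytes) -> None:
        for event in self.conn.receive_data(data):        # 字节 -> 事件
            self.handle_event(event)
        self.transport.write(self.conn.data_to_send())    # 事件处理可能产生了待发字节

    def handle_event(self, event: object) -> None:
        pass  # 应用逻辑放在这里

# 用法示意:await loop.create_connection(SansIOAdapter, "example.com", 80)
```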

另一种可能性是尽量把 I/O 和流控制原语推到程序或库的边缘,并为多种后端提供集成点。这需要相当的谨慎和自律,因为它要求整个代码库都建立在 sans-IO 原语之上,只留下一个非常小的代码内核去使用给定平台的 I/O 和流控制原语。这使您可以用同一份代码库,几乎不加改动地接入多种 I/O 和流控制范式,代价则是在其中部分乃至全部范式下可能都不那么“原生”。
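
而“把 I/O 推到边缘”的那个极小内核也可以非常直白:同一个 sans-IO 内核,换成阻塞 socket 时只需要几行驱动代码(同样假设存在前文的 StreamConnection):

```python
# 示意:用阻塞 socket 驱动同一个 sans-IO 内核,I/O 只存在于这个小循环里。
import socket

def run_blocking(host: str, port: int) -> None:
    conn = StreamConnection()                 # 假设的 sans-IO 内核,见前文草图
    with socket.create_connection((host, port)) as sock:
        sock.sendall(conn.data_to_send())     # 初始字节(如有)
        while True:
            data = sock.recv(65536)
            if not data:                      # 对端关闭连接
                break
            for event in conn.receive_data(data):
                print("event:", event)        # 应用逻辑
            sock.sendall(conn.data_to_send()) # 把协议内核产出的字节写回网络
```

除了这两小段驱动代码,所有协议逻辑都留在 sans-IO 内核里,这正是“同一份代码库可以换用不同 I/O 范式”的含义。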