Timeouts

Networking code is inherently complex due to the unpredictable nature of network failures and the possibility of a remote peer that is coded incorrectly—or even maliciously! Therefore, your code needs to deal with unexpected circumstances. One common failure mode that you should guard against is a slow or unresponsive peer.

This page describes the timeout behavior in trio-websocket and shows various examples for implementing timeouts in your own code. Before reading this, you might find it helpful to read “Timeouts and cancellation for humans”, an article written by Trio’s author that describes an overall philosophy regarding timeouts. The short version is that Trio discourages libraries from using internal timeouts. Instead, it encourages the caller to enforce timeouts, which makes timeout code easier to compose and reason about.

On the other hand, this library is intended to be safe to use, and omitting timeouts could be a dangerous flaw. Therefore, this library takes a balanced approach to timeouts, where high-level APIs have internal timeouts, but you may disable them or use lower-level APIs if you want more control over the behavior.

Message Timeouts

As a motivating example, let’s write a client that sends one message and then expects to receive one message. To guard against a misbehaving server or network, we want to place a 15 second timeout on this combined send/receive operation. In other libraries, you might find that the APIs have timeout arguments, but that style of timeout is very tedious when composing multiple operations. In Trio, we have helpful abstractions like cancel scopes, allowing us to implement our example like this:

async with open_websocket_url('ws://my.example/') as ws:
    with trio.fail_after(15):
        await ws.send_message('test')
        msg = await ws.get_message()
        print('Received message: {}'.format(msg))

The 15 second timeout covers the cumulative time to send one message and to wait for one response. It raises TooSlowError if the runtime exceeds 15 seconds.

Connection Timeouts

The example in the previous section ignores one obvious problem: what if connecting to the server or closing the connection takes a long time? How do we apply a timeout to those operations? One option is to put the entire connection inside a cancel scope:

with trio.fail_after(15):
    async with open_websocket_url('ws://my.example/') as ws:
        await ws.send_message('test')
        msg = await ws.get_message()
        print('Received message: {}'.format(msg))

The approach suffices if we want to compose all four operations into one timeout: connect, send message, get message, and disconnect. But this approach will not work if want to separate the timeouts for connecting/disconnecting from the timeouts for sending and receiving. Let’s write a new client that sends messages periodically, waiting up to 15 seconds for a response to each message before sending the next message.

async with open_websocket_url('ws://my.example/') as ws:
    for _ in range(10):
        await trio.sleep(30)
        with trio.fail_after(15):
            await ws.send_message('test')
            msg = await ws.get_message()
            print('Received message: {}'.format(msg))

In this scenario, the for loop will take at least 300 seconds to run, so we would like to specify timeouts that apply to connecting and disconnecting but do not apply to the contents of the context manager block. This is tricky because the connecting and disconnecting are handled automatically inside the context manager open_websocket_url(). Here’s one possible approach:

with trio.fail_after(10) as cancel_scope:
    async with open_websocket_url('ws://my.example'):
        cancel_scope.deadline = math.inf
        for _ in range(10):
            await trio.sleep(30)
            with trio.fail_after(15):
                await ws.send_message('test')
                msg = await ws.get_message()
                print('Received message: {}'.format(msg))
        cancel_scope.deadline = trio.current_time() + 5

This example places a 10 second timeout on connecting and a separate 5 second timeout on disconnecting. This is accomplished by wrapping the entire operation in a cancel scope and then modifying the cancel scope’s deadline when entering and exiting the context manager block.

This approach works but it is a bit complicated, and we don’t want our safety mechanisms to be complicated! Therefore, the high-level client APIs open_websocket() and open_websocket_url() contain internal timeouts that apply only to connecting and disconnecting. Let’s rewrite the previous example to use the library’s internal timeouts:

async with open_websocket_url('ws://my.example/', connect_timeout=10,
        disconnect_timeout=5) as ws:
    for _ in range(10):
        await trio.sleep(30)
        with trio.fail_after(15):
            await ws.send_message('test')
            msg = await ws.get_message()
            print('Received message: {}'.format(msg))

Just like the previous example, this puts a 10 second timeout on connecting, a separate 5 second timeout on disconnecting. These internal timeouts violate the Trio philosophy of composable timeouts, but hopefully the examples in this section have convinced you that breaking the rules a bit is justified by the improved safety and ergonomics of this version.

In fact, these timeouts have actually been present in all of our examples so far! We just didn’t see them because those arguments have default values. If you really don’t like the internal timeouts, you can disable them by passing math.inf, or you can use the low-level APIs instead.

Timeouts on Low-level APIs

In the previous section, we saw how the library’s high-level APIs have internal timeouts. The low-level APIs, like connect_websocket() and connect_websocket_url() do not have internal timeouts, nor are they context managers. These characteristics make the low-level APIs suitable for situations where you want very fine-grained control over timeout behavior.

async with trio.open_nursery():
    with trio.fail_after(10):
        connection = await connect_websocket_url(nursery, 'ws://my.example/')
    try:
        for _ in range(10):
            await trio.sleep(30)
            with trio.fail_after(15):
                await ws.send_message('test')
                msg = await ws.get_message()
                print('Received message: {}'.format(msg))
    finally:
        with trio.fail_after(5):
            await connection.aclose()

This example applies the same 10 second timeout for connecting and 5 second timeout for disconnecting as seen in the previous section, but it uses the lower-level APIs. This approach gives you more control but the low-level APIs also require more boilerplate, such as creating a nursery and using try/finally to ensure that the connection is always closed.

Server Timeouts

The server API also has internal timeouts. These timeouts are configured when the server is created, and they are enforced on each connection.

async def handler(request):
    ws = await request.accept()
    msg = await ws.get_message()
    print('Received message: {}'.format(msg))

await serve_websocket(handler, 'localhost', 8080, ssl_context=None,
    connect_timeout=10, disconnect_timeout=5)

The server timeouts work slightly differently from the client timeouts. The server’s connect timeout measures the time between receiving a new TCP connection and calling the user’s handler. The connect timeout includes waiting for the client’s side of the handshake (which is represented by the request object), but it does not include the server’s side of the handshake. The server handshake needs to be performed inside the user’s handler, e.g. await request.accept(). The disconnect timeout applies to the time between the handler exiting and the connection being closed.

Each handler is spawned inside of a nursery, so there is no way for connect and disconnect timeouts to raise exceptions to your code. (If they did raise exceptions, they would cancel your nursery and crash your server!) Instead, connect timeouts cause the connection to be silently closed, and the handler is never called. For disconnect timeouts, your handler has already exited, so a timeout will cause the connection to be silently closed.

As with the client APIs, you can disable the internal timeouts by passing math.inf or you can use low-level APIs like wrap_server_stream().