Crypto VWAP Feed

This is a real-time Volume Weighted Average Price (VWAP) calculation engine for common cryptocurrency trading pairs, such as:

  • BTC-USD (Bitcoin to USD)
  • ETH-USD (Ethereum to USD)
  • ETH-BTC (Ethereum to Bitcoin)
  • ...

How it works

It connects to the Coinbase WebSocket feed and listens for messages that represent trades. Coinbase calls these matches:

A trade occurred between two orders. The aggressor or taker order is the one executing immediately after being received and the maker order is a resting order on the book. The side field indicates the maker order side. If the side is sell this indicates the maker was a sell order and the match is considered an up-tick. A buy side match is a down-tick.
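As a rough illustration of what consuming a match might look like, the sketch below parses one raw message into a small record. The field names (`type`, `sequence`, `product_id`, `price`, `size`, `side`) are assumed from the Coinbase match message format, and `Match` and `parse_match` are hypothetical names, not necessarily what this project uses.

```python
import json
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Match:
    """One trade, reduced to the fields the VWAP calculation needs."""
    product_id: str
    sequence: int
    price: Decimal
    size: Decimal
    side: str


def parse_match(raw: str) -> Optional[Match]:
    """Return a Match for "match" messages, or None for any other type."""
    msg = json.loads(raw)
    if msg.get("type") != "match":
        return None
    return Match(
        product_id=msg["product_id"],
        sequence=int(msg["sequence"]),
        # Prices and sizes are assumed to arrive as decimal strings.
        price=Decimal(msg["price"]),
        size=Decimal(msg["size"]),
        side=msg["side"],
    )
```

Parsing into `Decimal` rather than `float` is a design choice made here to avoid rounding error when prices and sizes are multiplied and summed.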

For every trading pair we're interested in (e.g. BTC-USD), we compute the VWAP over at most 200 data points. It is computed as:

$$P_{VWAP} = \dfrac{\sum_j P_j \cdot Q_j}{\sum_j Q_j}$$

where:

  • $P_{VWAP}$ is the Volume Weighted Average Price.
  • $P_{j}$ is price of trade $j$.
  • $Q_{j}$ is quantity of trade $j$.
  • $j$ is each individual trade that takes place over the defined period of time, excluding cross trades and basket cross trades.
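The formula translates almost directly into code. The following is a minimal sketch, assuming trades are given as `(price, quantity)` pairs; the function name `vwap` is illustrative and not taken from this repository.

```python
from decimal import Decimal
from typing import Iterable, Optional, Tuple


def vwap(trades: Iterable[Tuple[Decimal, Decimal]]) -> Optional[Decimal]:
    """Compute sum(P_j * Q_j) / sum(Q_j), or None when there is no volume."""
    total_pq = Decimal(0)
    total_q = Decimal(0)
    for price, quantity in trades:
        total_pq += price * quantity
        total_q += quantity
    if total_q == 0:
        return None
    return total_pq / total_q


# Two trades: 1 unit at 100 and 3 units at 110.
# (100*1 + 110*3) / (1 + 3) = 430 / 4 = 107.5
print(vwap([(Decimal("100"), Decimal("1")), (Decimal("110"), Decimal("3"))]))
```

Note how the larger trade pulls the result above the simple average of the two prices (105).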

Out-of-order messages

According to Coinbase:

While a websocket connection is over TCP, the websocket servers receive market data in a manner which can result in dropped messages. Your feed consumer should either be designed to expect and handle sequence gaps and out-of-order messages, or use channels that guarantee delivery of messages.

To handle out-of-order messages, we use the sequence numbers provided by Coinbase. We store messages in a priority queue or list that is sorted by these sequence numbers. As new messages arrive and the window grows past 200 data points, the oldest messages, i.e., those with the smallest sequence numbers, are dropped.
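A sequence-ordered window of this kind could look roughly like the sketch below. It uses a plain sorted list with `bisect` from the standard library; the class name `SequenceWindow` and its methods are hypothetical, and the actual implementation may use a different structure.

```python
import bisect
from typing import List, Tuple

MAX_POINTS = 200  # window size described above

Point = Tuple[int, float, float]  # (sequence, price, quantity)


class SequenceWindow:
    """Keeps at most max_points trades, ordered by sequence number."""

    def __init__(self, max_points: int = MAX_POINTS) -> None:
        self.max_points = max_points
        self._points: List[Point] = []

    def add(self, sequence: int, price: float, quantity: float) -> None:
        # Insert at the position that keeps the list sorted by sequence,
        # regardless of the order in which messages arrive.
        bisect.insort(self._points, (sequence, price, quantity))
        # Once the window is over capacity, drop the smallest sequence.
        if len(self._points) > self.max_points:
            self._points.pop(0)

    def points(self) -> List[Point]:
        return list(self._points)
```

With this approach, a late message whose sequence is smaller than everything in a full window is inserted at the front and immediately dropped again, so it never affects the result.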

Limitations and future improvements

Use a Sorted Set Time Series from Redis

We're currently storing these data points in memory. That means that if the application crashes, all data points are lost and it takes a while to accumulate them again.

As an improvement, we can store this data in a Redis queue. More specifically, we can use Redis to create a time series that is sorted by lexicographic order. This will allow us to efficiently iterate over the time series sorted by the sequence numbers provided by Coinbase.

Cache the VWAP sum or compute it partially

We also recompute the VWAP sums from scratch every time. That's very wasteful, since we iterate over all 200 points in the time series whenever a new message is received from the Coinbase WebSocket. It was done this way just as a quick proof-of-concept, but it should be optimized before going to production.

We cannot simply reuse the last calculation result by keeping the previous sums in memory and adding the current data point. Because the messages arrive out of order, the point that leaves the window is not necessarily the one that arrived first, so we need to be a bit more creative.
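One possible approach, sketched here as an assumption rather than the project's chosen design, is to keep running totals of $\sum_j P_j \cdot Q_j$ and $\sum_j Q_j$ next to the sequence-sorted window, adding each inserted point's contribution and subtracting the contribution of whichever point is evicted. The class name `RunningVwap` is hypothetical.

```python
import bisect
from decimal import Decimal
from typing import List, Optional, Tuple


class RunningVwap:
    """VWAP over a sequence-sorted window, updated without a full re-sum."""

    def __init__(self, max_points: int = 200) -> None:
        self.max_points = max_points
        self._points: List[Tuple[int, Decimal, Decimal]] = []
        self._sum_pq = Decimal(0)  # running sum of price * quantity
        self._sum_q = Decimal(0)   # running sum of quantity

    def add(self, sequence: int, price: Decimal, quantity: Decimal) -> None:
        # Insert in sequence order and add this point's contribution.
        bisect.insort(self._points, (sequence, price, quantity))
        self._sum_pq += price * quantity
        self._sum_q += quantity
        # Evict the smallest sequence and subtract its contribution.
        if len(self._points) > self.max_points:
            _, old_price, old_quantity = self._points.pop(0)
            self._sum_pq -= old_price * old_quantity
            self._sum_q -= old_quantity

    @property
    def value(self) -> Optional[Decimal]:
        if self._sum_q == 0:
            return None
        return self._sum_pq / self._sum_q
```

Because the evicted point is always the one with the smallest sequence, arrival order no longer matters for the totals. `Decimal` is used so that repeated additions and subtractions do not accumulate floating-point drift.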

Fault tolerance

Lastly, we need better error handling and fault tolerance, in case Coinbase sends us "error" messages or the WebSocket connection simply drops because of a TCP timeout.
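A common building block for this kind of fault tolerance is reconnecting with exponential backoff. The sketch below only generates the wait times between reconnection attempts; the function name `backoff_delays` and its default parameters are illustrative assumptions, not part of this project.

```python
import random
from typing import Iterator, Optional


def backoff_delays(
    base: float = 1.0,
    cap: float = 60.0,
    jitter: float = 0.1,
    rng: Optional[random.Random] = None,
) -> Iterator[float]:
    """Yield wait times that double after each failure, up to cap seconds."""
    rng = rng or random.Random()
    delay = base
    while True:
        # Add a little random jitter so many clients do not retry in lockstep.
        yield delay + rng.uniform(0, jitter * delay)
        delay = min(cap, delay * 2)
```

A reconnect loop would take the next delay from this generator after each failed connection attempt, sleep for that long, and start a fresh generator once a connection succeeds.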

Requirements

  • docker or Docker Desktop
  • docker-compose (included with Docker Desktop)
  • make or GNU Make (probably already installed in your OS). It is used to build the Docker container dependencies.

Setup

This will build the necessary Docker images:

./build.sh

Run the test suite

Tests are run by pytest within the CI Docker containers.

./test.sh

Run lint (code style checks)

Checks are run by Flake8 within the CI Docker containers.

./lint.sh

Run the application

./run.sh