FPGA designs of any complexity can have many parts that need settings, that provide results, or must be controlled. A register interface is a very convenient way to read and write settings, to preload counters, or to reset and trigger subsystems. All registers and 'actionable items' in the user logic are given an address, and these addresses can be read/written with a single interface. We will call this the command interface.
Any suitable frontend can be used for the command interface. Typically this will be a UART, like the fluart that was designed hand-in-hand with cmd_proc. Any other interface (SPI, I2C, ...) can work as well, as long as it is byte-based. cmd_proc only works with octets on the command side.
On the side of the user logic, one address bus, a read data bus and a write data bus are needed. The size of the address and data buses can be specified as a generic. The two data buses are the same size. Both sizes must be multiples of 8 bits. Reads and writes are each acknowledged with a single-cycle strobe on their respective control signals.
Four commands are supported by the cmd_proc: read and write, each in ASCII (readable text) and binary formats. The ASCII commands are suitable for use with a terminal program, allowing you to control the FPGA simply by typing. Both input and output are readable strings. When under program control, the binary commands are probably more convenient. Here, the bytes are not translated to hexadecimal characters as they are with the ASCII commands, but used as-is.
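To make the difference between the two formats concrete, here is a minimal Python sketch that encodes the same 16-bit value both ways. The function names are made up for this illustration and are not part of cmd_proc. It only shows how a value is encoded, not a complete command.

```python
def as_ascii_hex(value: int, size_bits: int) -> bytes:
    """ASCII format: two hexadecimal characters per byte, most significant first."""
    digits = size_bits // 4
    return format(value, f"0{digits}x").encode("ascii")


def as_binary(value: int, size_bits: int) -> bytes:
    """Binary format: the raw bytes as-is, most significant byte first."""
    return value.to_bytes(size_bits // 8, "big")


# The same 16-bit value takes four characters in ASCII and two bytes in binary.
print(as_ascii_hex(0x1234, 16).hex())  # 31323334, i.e. the characters '1' '2' '3' '4'
print(as_binary(0x1234, 16).hex())     # 1234
```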
The format is always the same:
- a single character indicates the type of command;
- the address specifies which register must be read or written;
- a read command returns the data, while a write command specifies what must be written.
For simplicity, the command processor does not support backspacing or history. You could set your terminal emulator to line-buffered to solve the first issue. Also, there is no remote echo. That means you will have to turn on local echo if you want to see what you're typing. This is because the command processor does not know whether you're sending text or binary commands. It would be unusual to echo back binary data.
- ASCII read
- The command has the form raaaa\n, where r or R
indicates the ASCII read, and aaaa are the address bytes. The
internal address buffer is initialized to 0. Any bytes specified in the
command are shifted in from the right; any remaining leading bits will
be 0. For example, if the address bus is 16 bits wide, four characters
are needed to fully specify an address. To read from address 0x1234, the
command would be r1234\n. To read from address 0x12,
r12\n is sufficient. To read from address 0, just type
r\n. When more bytes are given than the width of the address
bus allows, the older ones simply drop out of the buffer:
r123456\n will read from address 0x3456 in our example. A line
feed character (LF, ASCII 10, 0x0a, ctrl-J) or carriage return (CR,
ASCII 13, 0x0d, ctrl-M) ends the command. The often-used combination
CR/LF (0x0d/0x0a) is also valid, since extra bytes are ignored. Pressing
escape (ESC, ASCII 27, 0x1b, ctrl-[) aborts the command at any time (if
not captured by your terminal emulator). Any whitespace in the command
string is ignored, so if you prefer sending r 12 34\n, that
will work. The hexadecimal bytes are case-insensitive.
The response will consist of hexadecimal bytes only. Leading zeros are not suppressed. The response is ended with a line feed character only. You may need to set your terminal emulator to translate this to CR/LF.
- ASCII write
- To write data dd to address aaaa, issue the command
waaaa,dd\n. The same rules apply as for the read command: all
characters are case-insensitive. Everything except valid hexadecimal
characters, comma (ASCII 44, 0x2c), line feed and carriage return is
ignored, allowing you to pipe in nicely-formatted strings. Address and
data buffers are initialised to 0. Address and data bytes are filled
from the right, extra bytes drop out on the left. Escape aborts.
The write command gives no response.
- Binary read
- A binary read is started by sending a null byte (NULL, ASCII 0, 0x00).
The command processor now assumes that you know what you're doing, because
the following bytes are used directly as the address. The right number of
bytes must be supplied: one for an 8-bit address, two for a 16-bit address
and so on. The command does not have to be terminated. There is no way to
abort the command.
The response is issued as an unterminated string of binary bytes. For example, a 32-bit data bus would result in four bytes, with the most significant byte sent first.
- Binary write
- A binary write starts with a binary 1 byte (SOH, ASCII 1, 0x01). The
address and data bytes follow without separator and without ending. The
right number of address and data bytes must be supplied. Most
significant bytes go first.
There is no response.
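The four commands described above can be sketched from the host side as follows. This is an illustration in Python under the rules stated in this section, not code from this repository; all function names are hypothetical. The last function models how the ASCII address and data buffers fill from the right, which may help when predicting what a short or overlong field will do.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def ascii_read(address: int, addr_bits: int = 16) -> bytes:
    """'r', the address in hexadecimal, then a line feed."""
    return b"r" + format(address, f"0{addr_bits // 4}x").encode("ascii") + b"\n"


def ascii_write(address: int, data: int, addr_bits: int = 16, data_bits: int = 16) -> bytes:
    """'w', hexadecimal address, comma, hexadecimal data, then a line feed."""
    addr = format(address, f"0{addr_bits // 4}x")
    dat = format(data, f"0{data_bits // 4}x")
    return f"w{addr},{dat}\n".encode("ascii")


def binary_read(address: int, addr_bits: int = 16) -> bytes:
    """0x00 followed by the raw address bytes, most significant first, no terminator."""
    return b"\x00" + address.to_bytes(addr_bits // 8, "big")


def binary_write(address: int, data: int, addr_bits: int = 16, data_bits: int = 16) -> bytes:
    """0x01 followed by raw address bytes and raw data bytes, no separator."""
    return (b"\x01"
            + address.to_bytes(addr_bits // 8, "big")
            + data.to_bytes(data_bits // 8, "big"))


def decode_ascii_response(response: bytes) -> int:
    """An ASCII read returns hexadecimal characters ended by a line feed."""
    return int(response.decode("ascii").strip(), 16)


def decode_binary_response(response: bytes) -> int:
    """A binary read returns raw bytes, most significant byte first."""
    return int.from_bytes(response, "big")


def shift_in_hex(field: str, size_bits: int) -> int:
    """Model of an ASCII buffer: starts at 0, hex digits shift in from the
    right, older digits drop out on the left. Other characters, such as
    whitespace, are skipped."""
    value = 0
    mask = (1 << size_bits) - 1
    for ch in field:
        if ch in HEX_DIGITS:
            value = ((value << 4) | int(ch, 16)) & mask
    return value


print(ascii_read(0x1234))                  # b'r1234\n'
print(binary_write(0x1234, 0xBEEF).hex())  # 011234beef
print(hex(shift_in_hex("123456", 16)))     # 0x3456
```

The binary helpers always send the full number of bytes, since the binary commands have no terminator and no way to abort.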
generic (
    ADDR_SIZE: natural := 16; -- bits, in multiples of 8!
    DATA_SIZE: natural := 16
);
port (
    clk: in std_logic;
    reset: in std_logic;
    -- UART interface
    rx_data: in std_logic_vector(7 downto 0);
    rx_data_valid: in std_logic;
    tx_data: out std_logic_vector(7 downto 0);
    tx_req: out std_logic;
    tx_busy: in std_logic;
    -- user logic interface
    address: out std_logic_vector(ADDR_SIZE - 1 downto 0);
    rd_data: in std_logic_vector(DATA_SIZE - 1 downto 0);
    wr_data: out std_logic_vector(DATA_SIZE - 1 downto 0);
    read_req: out std_logic;
    write_req: out std_logic
);
- ADDR_SIZE and DATA_SIZE
- Specify the bit widths of the internal address and data bus, respectively. Any size is allowed, as long as it's a multiple of eight. The model is based on byte (octet) transfers only.
- clk and reset
- The system clock and reset. The cmd_proc is a synchronous
design, acting on the rising edge of the clock input. Reset is
synchronous as well and is active high.
The UART interface is designed for the exquisitely suitable fluart, also in this repository.
- rx_data and rx_data_valid
- Upon reception of a valid word (i.e., start bit low and stop bit high), rx_data_valid must be high for one clock cycle. The data on rx_data will then be latched.
- tx_data, tx_req and tx_busy
- tx_req is set high for one clock cycle to transmit
tx_data. tx_busy is assumed to be high while a
transfer is in progress. All activity will be halted while waiting for
the interface to be ready; there is no timeout. The fluart's
tx_end signal is not used.
The signals controlling the user logic are kept as generic as possible. An example of the typical usage is also in this repository, as explained below.
- address, rd_data and wr_data
- The address, a std_logic_vector having ADDR_SIZE bits, is always sourced by cmd_proc. It is part of the command received on the command interface. In response to a read command, rd_data is the data from the user logic, after any address decoding. When a write command is received, the data to be written will be on wr_data. The two data buses are both DATA_SIZE bits wide. There is no slicing or shifting done in the cmd_proc, since it has no knowledge of what will be done with the data in the user logic.
- read_req and write_req
- In many cases, the data to be read will be selected using an asynchronous multiplexer. As long as it is done within a clock cycle, the output is valid. There may be cases, however, where it is convenient to know when the data is read. The read_req signal is provided for this purpose. When writing data, a strobe is indispensable. The write_req signal indicates that the data on wr_data are valid.
For convenience, the model uses features from the VHDL2008 standard. The all keyword saves a lot of typing and bookkeeping for non-clocked (asynchronous) processes. VHDL2008's maximum is not supported by Quartus yet, although all simulators understand it and the function is only used at compile time. A similar function is given instead.
Quite some thought went into the hexadecimal-to-binary and binary-to-hexadecimal converters. The case statements reduce to less logic than typecasts and calculations.
A testbench is supplied in cmd_proc_tb.vhdl. There is no serial port; the interface is exercised directly. The address and data buses are both 16 bits wide. The four supported commands are given in this order: ASCII read, binary read, ASCII write, binary write.
Use your preferred simulator. For GHDL, the following commands are sufficient:
ghdl -c --std=08 cmd_proc.vhdl cmd_proc_tb.vhdl -r cmd_proc_tb --vcd=cmd_proc_tb.vcd --stop-time=3us
gtkwave cmd_proc_tb.vcd &
You'll find a synthesizable example in example_top.vhdl. The command processor is set up for a 16-bit address bus and a 32-bit data bus. Both are seriously oversized for the purpose, but it shows the flexibility.
A serial interface is provided, with the system clock frequency set to 50MHz and the bit rate to 115.2kb/s. The fluart was designed for this purpose.
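For a rough idea of what these numbers mean in practice, the arithmetic can be sketched in Python. The frame format is an assumption here: one start bit, eight data bits and one stop bit, with no parity. Check the fluart documentation before relying on the result. The figures cover time on the wire only, not any processing delay.

```python
CLOCK_HZ = 50_000_000  # system clock of the example
BIT_RATE = 115_200     # serial bit rate of the example
BITS_PER_FRAME = 10    # assumption: 1 start + 8 data + 1 stop bit, no parity


def clocks_per_bit() -> float:
    """Number of system clock cycles that fit in one serial bit time."""
    return CLOCK_HZ / BIT_RATE


def wire_time_s(n_bytes: int) -> float:
    """Time needed to move n_bytes over the serial line, in seconds."""
    return n_bytes * BITS_PER_FRAME / BIT_RATE


# Binary read in the example: 1 command byte and 2 address bytes go out,
# 4 data bytes come back (16-bit address bus, 32-bit data bus).
binary_read_bytes = 1 + 16 // 8 + 32 // 8

print(f"{clocks_per_bit():.2f} clock cycles per bit")
print(f"{wire_time_s(binary_read_bytes) * 1e6:.0f} us per binary read")
```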
The address layout is as follows:
- address 0 (0x0000)
- Always reads the constant 32-bit value 0x01020304. Writes are ignored. Use it to check the endianness of the receiving program. The most significant end is transmitted first.
- address 1 (0x0001)
- The lower 8 bits of the data word written to this address are placed on output pins. Connect these to your status LEDs. The current state of the pins can be read back from the same address.
- address 2 (0x0002)
- This is a 32-bit scratch register that can be read and written. The contents have no effect; it's storage only.
- address 256 (0x0100)
- This is a 32-bit counter that is incremented by one on every system clock cycle. Writes have no effect. If you know the clock frequency, this allows you to assess the performance of the system.
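When writing host software against this layout, a small behavioural model can stand in for the hardware. The Python sketch below is such a model, not a translation of example_top.vhdl. Three details are assumptions: unmapped addresses read as 0, the upper 24 bits of address 1 read as 0, and the counter starts at 0. The two helper functions show the endianness check and the cycle count between two counter reads.

```python
class ExampleRegisters:
    """Behavioural model of the example address map (hypothetical helper)."""

    ID_VALUE = 0x01020304

    def __init__(self) -> None:
        self.leds = 0      # address 1, lower 8 bits drive the output pins
        self.scratch = 0   # address 2, storage only
        self.counter = 0   # address 0x0100, assumed to start at 0

    def tick(self, cycles: int = 1) -> None:
        """Advance the free-running 32-bit counter by a number of clock cycles."""
        self.counter = (self.counter + cycles) & 0xFFFFFFFF

    def read(self, address: int) -> int:
        if address == 0x0000:
            return self.ID_VALUE
        if address == 0x0001:
            return self.leds
        if address == 0x0002:
            return self.scratch
        if address == 0x0100:
            return self.counter
        return 0  # assumption: unmapped addresses read as 0

    def write(self, address: int, data: int) -> None:
        if address == 0x0001:
            self.leds = data & 0xFF
        elif address == 0x0002:
            self.scratch = data & 0xFFFFFFFF
        # writes to any other address are ignored


def byte_order_ok(response: bytes) -> bool:
    """True if a binary read of address 0 decodes correctly as big-endian."""
    return int.from_bytes(response, "big") == ExampleRegisters.ID_VALUE


def elapsed_cycles(before: int, after: int) -> int:
    """Clock cycles between two counter reads, allowing for one wrap-around."""
    return (after - before) & 0xFFFFFFFF


regs = ExampleRegisters()
regs.write(0x0001, 0x1A5)    # only the lower 8 bits reach the pins
print(hex(regs.read(0x0001)))
start = regs.read(0x0100)
regs.tick(50_000)            # 1 ms at 50 MHz
print(elapsed_cycles(start, regs.read(0x0100)))
```

Dividing the elapsed cycle count by the clock frequency gives the elapsed time in seconds.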
...is a goal in itself. If you have suggestions to simplify the logic and/or improve readability of the VHDL, let me know! There are too many style preferences to keep everyone happy, so please don't focus on indentation, naming etcetera. Live and let live.