Embedded Project Oberon on Altera FPGA

General discussions about using the Astrobe IDE to program the FPGA RISC5 cpu used in Project Oberon 2013

Post by gray » Fri Jun 02, 2023 6:17 am

I have implemented the Embedded Project Oberon (EPO) RISC5 CPU and environment in an Intel Altera FPGA, specifically using the Terasic Cyclone V GX Starter Kit board [1]. It's a nice board in the same price range as the Digilent Arty A7 100, with lots of LEDs, switches, buttons, SRAM, seven-segment displays, SD card, as well as headers and connectors. Particularly useful are the eight green LEDs for the Oberon LED() procedure, especially if you're digging into Oberon's Inner Core and the boot-loader. :)

Terasic provides a useful tool, called SystemBuilder, to generate the pin-allocation files for the different peripherals on the board. You select the peripherals in a GUI, and the tool creates the corresponding file. This replaces Digilent's "master xdc" file approach. The SystemBuilder can be used incrementally, so you can start small, and then just generate the additional file entries later when you add more peripherals.

To configure the FPGA, Quartus Prime takes the place of Vivado, which is used for the Xilinx FPGAs. There's a free version, Quartus Prime Lite [2]. In general, Quartus is easy to use. It's pretty fast, compiling this configuration in under two and a half minutes on my machine. Volatile configuration of the FPGA is also fast, but programming the non-volatile configuration device on the board takes longer than with Vivado. However, you can continue using the current configuration during the programming process.

This FPGA configuration boots the EPO disk image for the SD card as provided with version 8.0 of Astrobe for RISC5 (but check out the remark about I2C). It is a simplified version of my "standard platform", which also includes devices such as stack overflow monitor, watchdog, logging, calltrace, and more. The structure of the RISC5Top.v file is therefore a bit different, as are the different modules for the processor environment.

RISC5Top.v: due to the increasing complexity of my platform, I had to find a more systematic approach to structure the system. I have taken inspiration from THM Oberon [3]. All functionality is implemented strictly in modules; there's no register logic at the top level. All modules have a uniform interface (apart from module-specifics), which allows me to systematically add or remove devices, as well as design them. Hardware functions and features are encapsulated within modules.

Xilinx-specifics: (Embedded) Project Oberon uses three Xilinx-specific components, namely for the clocks, the CPU registers, and the IO tri-state buffers. I have replaced the registers and the IO buffers with generic modules. The clock PLL is always FPGA-specific, but it's now encapsulated within a module. EPO uses specific buffers for the clocks, but I have found that this is not required, as Quartus and also Vivado will recognise the clocks and use the clock distribution infrastructure available on the specific FPGA. The same holds true for the GPIO tri-state buffers, for which the tools will select appropriate building blocks, and place the devices physically near the corresponding IO pins.

RAM: unfortunately, the Cyclone V GX FPGA offers fewer on-chip RAM blocks than the Artix-7, so I was only able to get 416 kB maximum. (See below about using the SRAM instead.) As with the Artix-7 FPGA, the critical timings of the design are between the CPU registers and the RAM (via the bus), since the latter is implemented on the FPGA as several cell columns across the chip, which results in long paths. When I ran into "timing requirements not met", I had to reduce the system clock frequency to 38 MHz. If you reduce the amount of RAM to, say, 320 kB, you can increase the clock frequency. The RAM module allows the RAM size to be adjusted in steps of one kB via a parameter. Don't forget to adjust the bootloader accordingly.
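To make the sizes above concrete, here is a small Python sketch (not part of the repository) that derives the word count and address widths for a given RAM size, plus the clock period at a given frequency. It assumes 32-bit words and a power-of-two address range; the function names are made up for illustration. For 416 kB it yields a 19-bit byte address, which matches the adr[18:0] slice passed to the RAM module in RISC5Top.v.

```python
def ram_geometry(num_kbytes: int) -> dict:
    """Word count and address widths for a RAM of num_kbytes kB of 32-bit words."""
    num_bytes = num_kbytes * 1024
    num_words = num_bytes // 4
    # number of bits needed to address every word (0 .. num_words - 1)
    word_addr_bits = (num_words - 1).bit_length()
    return {
        "bytes": num_bytes,
        "words": num_words,
        "word_addr_bits": word_addr_bits,
        "byte_addr_bits": word_addr_bits + 2,  # two extra bits select the byte in a word
    }


def clock_period_ns(freq_hz: int) -> float:
    """Clock period in nanoseconds for a clock frequency in Hz."""
    return 1e9 / freq_hz


if __name__ == "__main__":
    for kb in (416, 320):
        print(kb, "kB:", ram_geometry(kb))
    print("period at 38 MHz: %.2f ns" % clock_period_ns(38_000_000))
```

The period is the budget that the longest register-to-RAM path has to fit into, which is why a smaller RAM (shorter paths) can tolerate a higher clock.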

RS232: the transmitter and receiver are combined into a single module, including buffers for both directions. The size of the buffers is parameterised. With EPO's software, the buffered devices behave like unbuffered ones. To make use of, say, buffered transmission, just act on the "buffer not full" hardware signal in lieu of "buffer empty".
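The difference between waiting for "buffer empty" and "buffer not full" can be illustrated with a small software model. This is only a sketch in Python, not the Verilog module: the class TxBuffer, its methods, and the helper send() are hypothetical names, and the real device's status bit layout is not modelled.

```python
from collections import deque


class TxBuffer:
    """Toy model of a transmit FIFO with 'empty' and 'not full' status flags."""

    def __init__(self, slots: int = 256):
        self.slots = slots
        self.fifo = deque()

    @property
    def empty(self) -> bool:
        return len(self.fifo) == 0

    @property
    def not_full(self) -> bool:
        return len(self.fifo) < self.slots

    def put(self, byte: int) -> bool:
        """Queue one byte; returns False if the buffer is full."""
        if not self.not_full:
            return False
        self.fifo.append(byte & 0xFF)
        return True

    def shift_out(self):
        """Model the hardware sending the oldest byte; None if nothing is queued."""
        return self.fifo.popleft() if self.fifo else None


def send(buf: TxBuffer, data: bytes) -> int:
    """Queue bytes while 'not full' holds; return the number accepted."""
    count = 0
    for b in data:
        if not buf.put(b):
            break
        count += 1
    return count


if __name__ == "__main__":
    buf = TxBuffer(slots=4)
    print(send(buf, b"hello"), buf.empty, buf.not_full)
```

Software that polls "empty" can only hand over one byte per transmission time, whereas polling "not full" lets it queue up to the buffer size and move on.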

SPI: the SPI device is an extended version of EPO's. It is compatible with EPO's bootloader, Kernel, SPI device driver, and the Real-time clock. I had to introduce an 'epo_compat' parameter, as otherwise the additional configuration bits, which EPO obviously cannot "know", get zeroed out. The extensions include selectable SPI mode (clock polarity and clock phase, i.e. CPOL and CPHA), as well as transmission modes for 16 bits and MSByte first, which are useful with some peripherals such as displays.
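For readers less familiar with the CPOL/CPHA terminology, the following Python sketch spells out the conventional numbering of the four SPI modes and what each implies for the clock. It describes the general SPI convention only; how the extended device encodes these bits in its control register is not shown here, and the function names are illustrative.

```python
def spi_mode(cpol: int, cpha: int) -> int:
    """Conventional SPI mode number: mode = CPOL * 2 + CPHA."""
    return ((cpol & 1) << 1) | (cpha & 1)


def describe(mode: int) -> dict:
    """Clock idle level and data sampling edge for SPI mode 0..3."""
    cpol = (mode >> 1) & 1
    cpha = mode & 1
    # The leading edge leaves the idle level, the trailing edge returns to it.
    leading = "falling" if cpol else "rising"
    trailing = "rising" if cpol else "falling"
    return {
        "cpol": cpol,
        "cpha": cpha,
        "clock_idle": "high" if cpol else "low",
        # CPHA = 0 samples on the leading edge, CPHA = 1 on the trailing edge
        "sample_edge": leading if cpha == 0 else trailing,
    }


if __name__ == "__main__":
    for m in range(4):
        print(m, describe(m))
```

SD cards are normally driven in mode 0, while some displays expect mode 3, which is where a selectable mode becomes handy.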

I2C: since I needed to write a "bus adapter" for the I2C device anyway, I have reduced the number of IO addresses used from six to four in the process. This required two small corresponding changes in the software module I2C, which is provided in the 'oberon/lib' directory. As I had to make these changes anyway, I have also defined a CONST for the system clock frequency, so the I2C serial clock will use the correct frequency as intended via I2C.Init().

IO: the IO address range is extended to 256 bytes, hence allowing for 64 device addresses, while keeping compatibility with the traditional IO addresses of (Embedded) Project Oberon. The overall address map has reserved space for three additional 256-byte IO address blocks should that be required (cf. bottom of RISC5Top.v for the map). There's a PDF outlining the currently implemented 256-byte IO address space in the 'doc' directory.
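Oberon software usually refers to IO registers by negative addresses (for example -64 for the millisecond timer). As a rough sketch of how these relate to the 24-bit bus addresses and the 64 word slots of the IO block, assuming the block at 0FFFF00H as described above (helper names are hypothetical):

```python
IO_BLOCK_BASE = 0xFFFF00   # start of the 256-byte IO block
ADDR_MASK = 0xFFFFFF       # 24-bit address bus


def io_address(offset: int) -> int:
    """24-bit bus address for a negative Oberon IO address such as -64."""
    return offset & ADDR_MASK


def in_io_block(addr: int) -> bool:
    """True if addr lies in the 256-byte IO block (adr[23:8] == FFFFH)."""
    return (addr >> 8) == (IO_BLOCK_BASE >> 8)


def device_slot(addr: int) -> int:
    """Word slot 0..63 within the IO block."""
    return (addr & 0xFF) >> 2


if __name__ == "__main__":
    for off in (-4, -64, -80, -256):
        a = io_address(off)
        print(off, hex(a), in_io_block(a), device_slot(a))
```

The traditional 16 word addresses (-64 to -4) occupy the top slots 48 to 63, so the extension only adds slots below them.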

LEDs, buttons, switches, 7-seg displays: the eight green LEDs are operated via Oberon's LED() procedure. The red LEDs can be switched by software as well as by direct input signals from devices in the FPGA. The state of the buttons and switches can be read from software, but there are also direct output signals to other devices in the hardware. There's a module LSB in 'oberon/lib'.

HSMC connector: this connector can be used for different types of expansion cards. I mostly use it to extend the number of directly usable IO pins using the GPIO-HSTC card [4]. I have the 16 digital channels of my oscilloscope wired up to this card, as well as some other test peripherals, and just move it to my other Terasic board, the DE2-115, when I work there, without the need to rewire these connections again and again. Very handy indeed. Quartus allows integrating an external logic analyser interface directly on the chip, which can connect to any signal inside. The connector is not wired for this configuration/project, but I can provide the assignment file if needed.

FPGA pin allocations: refer to the file RISC5Top.qsf in the build directory for how the input/output ports as defined in RISC5Top.v are allocated to the peripherals on the board. Obviously, the connections to the LEDs, switches, buttons, the 7-segment displays, the SD card, and the RS232 transmitter/receiver are given by the board design. GPIO, SPI (chip select 1 and 2), and I2C are allocated on the 40 pin expansion header. There's also a corresponding PDF in the 'doc' folder for the expansion header.

Quartus project: in the directory 'build' is a Quartus project file to build the configuration.

SRAM: I have experimented with using the 512 kB SRAM instead of the FPGA's RAM blocks, but only have a version that runs with a 20 MHz system clock. The SRAM is only 16 bits wide, so two read/write cycles are required. As of now, I have tried to get it running with the two memory cycles happening within one CPU clock cycle, i.e. without stalling the CPU. A 40 MHz CPU clock leaves only 12.5 ns per memory cycle, so with the SRAM address set-up time of 10 ns for reading, and combinatorial gate delays through the bus and CPU logic of about 8 ns, things are getting (too) tight. I will try to get a version running that stalls the CPU. SRAM should be less prone to path timing issues than the FPGA RAM.
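The timing argument can be checked with a bit of arithmetic. The Python sketch below uses the figures quoted above (10 ns address set-up, about 8 ns logic delay, two memory cycles per CPU clock) and assumes, as a simplification, that the two delays simply add up; real timing closure depends on the actual paths reported by the tools.

```python
SRAM_ADDR_SETUP_NS = 10.0   # from the post: SRAM address set-up time for reading
LOGIC_DELAY_NS = 8.0        # from the post: approximate bus and CPU logic delay
CYCLES_PER_CPU_CLOCK = 2    # 16-bit SRAM, 32-bit word: two accesses per CPU clock


def memory_slot_ns(cpu_clock_hz: float) -> float:
    """Time available for one memory cycle when both fit in one CPU clock."""
    return 1e9 / cpu_clock_hz / CYCLES_PER_CPU_CLOCK


def fits(cpu_clock_hz: float) -> bool:
    """True if one memory cycle covers set-up time plus logic delay (additive model)."""
    return memory_slot_ns(cpu_clock_hz) >= SRAM_ADDR_SETUP_NS + LOGIC_DELAY_NS


def max_clock_hz() -> float:
    """Highest CPU clock under this simple additive model."""
    return 1e9 / (CYCLES_PER_CPU_CLOCK * (SRAM_ADDR_SETUP_NS + LOGIC_DELAY_NS))


if __name__ == "__main__":
    for f in (20e6, 40e6):
        print("%.0f MHz: %.1f ns per memory cycle, fits: %s"
              % (f / 1e6, memory_slot_ns(f), fits(f)))
    print("upper bound: %.1f MHz" % (max_clock_hz() / 1e6))
```

Under these assumptions the limit lands somewhere below 28 MHz, which is consistent with 20 MHz working and 40 MHz not.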

Repository: please find the Verilog and Oberon modules as well as the other files as described above in this GitHub repository [5].

Here's the RISC5Top.v file if you want to get a quick overview without exploring the repo.

/**
  RISC5 CPU and environment definition for Embedded Project Oberon
  --
  Board and technology: Terasic Cyclone V Starter Kit, Altera Cyclone V GX
  --
  Base/origin:
    * Embedded Project Oberon
    * THM-oberon
  --
  2023 Gray, gray@grayraven.org
  https://oberon-rts.org/licences
**/

`timescale 1ns / 1ps
`default_nettype none

module RISC5Top #(
  parameter
    clock_freq = 38_000_000,  // as set in module 'clocks'
    prom_file = "../bootload/BootLoad-416k-64k.mem",
    rs232_buf_slots = 256,    // RS232 buffer size in bytes, same for tx and rx
    num_gpio = 26             // number of GPIO pins
  )(
  // clock
  input wire clk_in,
  // RS-232
  input wire rs232_0_rxd,
  output wire rs232_0_txd,
  // SD card (SPI CS = 0)
  output wire sdcard_cs_n,
  output wire sdcard_sclk,
  output wire sdcard_mosi,
  input wire sdcard_miso,
  // SPI CS = 1 and 2
  output wire [2:1] spi_cs_n,
  output wire [2:1] spi_sclk,
  output wire [2:1] spi_mosi,
  input wire [2:1] spi_miso,
  // LEDs, switches, buttons, 7-segment
  output wire [7:0] led_g,
  output wire [9:0] led_r,
  output wire [6:0] hex1_n,
  output wire [6:0] hex0_n,
  input wire [3:0] btn_in_n,
  input wire [9:0] swi_in,
  // GPIO
  inout wire [num_gpio-1:0] gpio,
  // I2C
  inout wire i2c_scl,
  inout wire i2c_sda
);

  // clk
  wire clk_ok;                // clocks stable
  wire clk;
  wire clk_rst;
  // reset
  wire rst;                   // active high
  // cpu
  wire [23:0] adr;            // address bus
  wire [31:0] inbus;          // data to RISC core from RAM or IO
  wire [31:0] codebus;        // code to RISC core from RAM or ROM
  wire [31:0] outbus;         // data from RISC core
  wire rd;                    // CPU read
  wire wr;                    // CPU write
  wire ben;                   // CPU byte enable
  wire irq;                   // interrupt request to CPU
  // prom
  wire prom_stb;
  wire [31:0] prom_dout;
  // ram
  wire ram_stb;
  wire [31:0] ram_dout;
  // i/o
  wire io_en;                 // i/o enable
  wire [31:0] io_out;         // io devices output
  // ms timer
  wire tmr_stb;
  wire [31:0] tmr_dout;       // running milliseconds since reset
  wire tmr_ms_tick;           // millisecond timer tick signal
  wire tmr_ack;
  // lsb
  wire lsb_stb;
  wire [9:0] lsb_leds_r_in;  // direct signals in for red LEDs
  wire [31:0] lsb_dout;      // buttons, switches status
  wire [3:0] lsb_btn;        // button signals out, 'clk' synced
  wire [9:0] lsb_swi;        // switch signals out, 'clk' synced
  wire lsb_ack;
  // rs232
  wire rs232_0_stb;
  wire [31:0] rs232_0_dout;   // received data, status
  wire rs232_0_ack;
  // spi
  wire spi_0_stb;
  wire [31:0] spi_0_dout;     // received data, status
  wire spi_0_sclk_d;          // sclk signal from device
  wire spi_0_mosi_d;          // mosi signal from device
  wire spi_0_miso_d;          // miso signal to device
  wire [2:0] spi_0_cs_n_d;    // chip selects from device
  wire spi_0_ack;
  // gpio
  wire gpio_stb;
  wire [31:0] gpio_dout;      // pin data, in/out control status
  wire gpio_ack;
  // i2c
  wire i2c_stb;
  wire [31:0] i2c_dout;
  wire i2c_ack;

  // clocks
  clocks clocks_0 (
    // in
    .rst(clk_rst),
    .clk_in(clk_in),
    //out
    .clk_ok(clk_ok),
    .clk(clk)
  );

  // reset
  reset reset_0 (
    // in
    .clk(clk),
    .clk_ok(clk_ok),
    .rst_in_n(btn_in_n[3]),
    // out
    .rst_out(rst)
  );

  // CPU
  RISC5 #(.start_addr(24'hFFE000)) risc5_0 (
    // in
    .clk(clk),
    .rst(~rst),
    .irq(irq),
    .codebus(codebus[31:0]),
    .inbus(inbus[31:0]),
    // out
    .rd(rd),
    .wr(wr),
    .ben(ben),
    .adr(adr[23:0]),
    .outbus(outbus[31:0])
  );

  // boot ROM
  prom #(.mem_file(prom_file)) prom_0 (
    // in
    .clk(~clk),
    .en(prom_stb),
    .addr(adr[10:2]),
    // out
    .data_out(prom_dout[31:0])
  );

  // RAM 416k
  ramg5 #(.num_kbytes(416)) ram_0 (
    // in
    .clk(clk),
    .en(ram_stb),
    .we(wr),
    .be(ben),
    .addr(adr[18:0]),
    .data_in(outbus[31:0]),
    // out
    .data_out(ram_dout[31:0])
  );

  // ms timer
  // one IO address
  ms_timer #(.clock_freq(clock_freq)) tmr_0 (
    // in
    .clk(clk),
    .rst(rst),
    .stb(tmr_stb),
    // out
    .data_out(tmr_dout[31:0]),
    .ms_tick(tmr_ms_tick),
    .ack(tmr_ack)
  );

  // LEDs, switches, buttons, 7-seg displays
  // one IO address
  assign lsb_leds_r_in[9:0] = 10'b0; // only 'clk' synced signals
  lsb_s lsb_0 (
    // in
    .clk(clk),
    .rst(rst),
    .stb(lsb_stb),
    .we(wr),
    .leds_r_in(lsb_leds_r_in[9:0]),
    .data_in(outbus[31:0]),
    // out
    .data_out(lsb_dout[31:0]),
    .ack(lsb_ack),
    .btn_out(lsb_btn[3:0]),
    .swi_out(lsb_swi[9:0]),
    // external in
    .btn_in_n(btn_in_n[3:0]),
    .swi_in(swi_in[9:0]),
    // external out
    .leds_g(led_g[7:0]),
    .leds_r(led_r[9:0]),
    .hex1_n(hex1_n[6:0]),
    .hex0_n(hex0_n[6:0])
  );

  // RS232 buffered
  // two consecutive IO addresses
  rs232 #(.clock_freq(clock_freq), .buf_slots(rs232_buf_slots)) rs232_0 (
    // in
    .clk(clk),
    .rst(rst),
    .stb(rs232_0_stb),
    .we(wr),
    .addr(adr[2]),
    .data_in(outbus[7:0]),
    // out
    .data_out(rs232_0_dout[31:0]),
    .ack(rs232_0_ack),
    // external
    .rxd(rs232_0_rxd),
    .txd(rs232_0_txd)
  );

  // SPI
  // two consecutive IO addresses
  spie #(.epo_compat(1'b1), .slow_div(80)) spie_0 (
    // in
    .clk(clk),
    .rst(rst),
    .stb(spi_0_stb),
    .we(wr),
    .addr(adr[2]),
    .data_in(outbus[31:0]),
    // out
    .data_out(spi_0_dout[31:0]),
    .ack(spi_0_ack),
    // external out
    .cs_n(spi_0_cs_n_d[2:0]),
    .sclk(spi_0_sclk_d),
    .mosi(spi_0_mosi_d),
    // external in
    .miso(spi_0_miso_d)
  );

  assign sdcard_cs_n = spi_0_cs_n_d[0];
  assign sdcard_sclk = spi_0_sclk_d;
  assign sdcard_mosi = spi_0_mosi_d;

  assign spi_cs_n[1] = spi_0_cs_n_d[1];
  assign spi_sclk[1] = spi_0_sclk_d;
  assign spi_mosi[1] = spi_0_mosi_d;

  assign spi_cs_n[2] = spi_0_cs_n_d[2];
  assign spi_sclk[2] = spi_0_sclk_d;
  assign spi_mosi[2] = spi_0_mosi_d;

  assign spi_0_miso_d = sdcard_miso & spi_miso[1] & spi_miso[2];

  // GPIO
  // two consecutive IO addresses
  gpio #(.num_gpio(num_gpio)) gpio_0 (
    // in
    .clk(clk),
    .rst(rst),
    .stb(gpio_stb),
    .we(wr),
    .addr(adr[2]),
    .data_in(outbus[num_gpio-1:0]),
    // out
    .data_out(gpio_dout),
    .ack(gpio_ack),
    // external
    .io_pin(gpio[num_gpio-1:0])
  );

  // I2C
  // four consecutive IO addresses
  i2ce i2ce_0 (
    // in
    .clk(clk),
    .rst(rst),
    .stb(i2c_stb),
    .we(wr),
    .addr(adr[3:2]),
    .data_in(outbus[31:0]),
    // out
    .data_out(i2c_dout),
    .ack(i2c_ack),
    // external
    .scl(i2c_scl),
    .sda(i2c_sda)
  );


  // address decoding
  // ----------------
  // cf. memory map below

  // max RAM address space at 000000H to 0FFE000H (16 MB - 8 kB)
  // adr[23:0] = 0FFE000H => adr[23:13] = 11'h7FF
  assign ram_stb = (adr[23:13] != 11'h7FF);

  // codebus multiplexer: RAM or PROM
  // PROM: 2 kB at  0FFE000H => initial code address for CPU
  // PROM uses adr[10:2] (word address)
  assign prom_stb = (adr[23:12] == 12'hFFE && adr[11] == 1'b0);
  assign codebus[31:0] = ~prom_stb ? ram_dout[31:0] : prom_dout[31:0];

  // inbus multiplexer: RAM or IO
  // IO block: 256 bytes (64 words) at 0FFFF00H
  // there's space reserved for three more 256 bytes IO blocks
  // at: 0FFFE00H, 0FFFD00H, 0FFFC00H
  assign io_en = (adr[23:8] == 16'hFFFF);
  assign inbus[31:0] = ~io_en ? ram_dout[31:0] : io_out[31:0];

  // the traditional 16 IO word addresses of (Embedded) Project Oberon
  assign i2c_stb     = (io_en && adr[7:4] == 4'b1111);    // -16, -12, -8, -4
  assign gpio_stb    = (io_en && adr[7:3] == 5'b11100);   // -32 (data), -28 (ctrl/status)
  assign spi_0_stb   = (io_en && adr[7:3] == 5'b11010);   // -48 (data), -44 (ctrl/status)
  assign rs232_0_stb = (io_en && adr[7:3] == 5'b11001);   // -56 (data), -52 (ctrl/status)
  assign lsb_stb     = (io_en && adr[7:2] == 6'b110001);  // -60 note: system LEDs via LED()
  assign tmr_stb     = (io_en && adr[7:2] == 6'b110000);  // -64

  // extended IO address range
  // eg: calltrace (not implemented)
//  assign cts_stb     = (io_en && adr[7:3] == 5'b10110);  // -80, -76 (ctrl/status)

  // IO data out multiplexing
  // ------------------------
  assign io_out[31:0] =
    i2c_stb     ? i2c_dout[31:0] :
    gpio_stb    ? gpio_dout[31:0] :
    spi_0_stb   ? spi_0_dout[31:0] :
    rs232_0_stb ? rs232_0_dout[31:0] :
    lsb_stb     ? lsb_dout[31:0] :
    tmr_stb     ? tmr_dout[31:0] :
    32'h0;

endmodule

`resetall

/**
FFFFFF  +---------------------------+
        | 64 dev addr (1 word each) |     256 Bytes
FFFF00  +---------------------------+
        | 64 dev addr (unused)      |     256 Bytes
FFFE00  +---------------------------+
        | 64 dev addr (unused)      |     256 Bytes
FFFD00  +---------------------------+
        | 64 dev addr (unused)      |     256 Bytes
FFFC00  +---------------------------+
        |                           |
        |      -- unused --         |     3 kB
        |                           |
FFF000  +---------------------------+
        |                           |
        |     PROM (2k used)        |     4 kB
        |                           |
FFE000  +---------------------------+
        |                           |
        |                           |
        |                           |
        |                           |
        |                           |
        |          max              |
        |          RAM              |     16 MB - 8 kB
        |         space             |
        |                           |
        |                           |
        |                           |
        |                           |
        |                           |
000000  +---------------------------+
**/
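To experiment with the address map without synthesising anything, the decoding logic can be mirrored in a few lines of Python. This is a sketch that follows the assign statements in the listing above bit for bit; the function decode() and its string results are made up for illustration and do not exist in the repository.

```python
# (name, right shift, expected pattern) taken from the *_stb assigns above
IO_DEVICES = [
    ("i2c",     4, 0b1111),     # -16, -12, -8, -4
    ("gpio",    3, 0b11100),    # -32, -28
    ("spi_0",   3, 0b11010),    # -48, -44
    ("rs232_0", 3, 0b11001),    # -56, -52
    ("lsb",     2, 0b110001),   # -60
    ("tmr",     2, 0b110000),   # -64
]


def decode(adr: int) -> str:
    """Classify a 24-bit address the way the top-level decoder does."""
    adr &= 0xFFFFFF
    ram_stb = (adr >> 13) != 0x7FF                      # adr[23:13] != 11'h7FF
    prom_stb = (adr >> 12) == 0xFFE and ((adr >> 11) & 1) == 0
    io_en = (adr >> 8) == 0xFFFF                        # adr[23:8] == 16'hFFFF
    if ram_stb:
        return "ram"
    if prom_stb:
        return "prom"
    if io_en:
        low = adr & 0xFF
        for name, shift, pattern in IO_DEVICES:
            if (low >> shift) == pattern:
                return "io:" + name
        return "io:unassigned"
    return "unused"


if __name__ == "__main__":
    for a in (0x000000, 0xFFE000, 0xFFE800, -64, -60, -52, -4, -80):
        print(hex(a & 0xFFFFFF), decode(a))
```

Negative arguments are masked to 24 bits, so the Oberon-style IO addresses from the comments can be passed in directly.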

[1] https://www.terasic.com.tw/cgi-bin/page ... 0#contents
[2] https://www.intel.com/content/www/us/en ... ource.html
[3] https://github.com/hgeisse/THM-Oberon
[4] https://www.terasic.com.tw/cgi-bin/page ... 2#contents
[5] https://github.com/ygrayne/oberon-epo/t ... cv-sk-base
