ECE 4514 Digital Design II
22 January 2019 - INTRODUCTION

5:00 PM Class Introduction

Goodwin 145
TR 5PM-6:15 PM

Instructor Introduction

Class websites and information
  1/ canvas.vt.edu
  2/ piazza (link on canvas)
  3/ GitHub (link on canvas)

Description of the course - Digital Hardware Design
  * HDL down to technology (ASIC or FPGA)
  * Familiarity with tools for
      HDL compilation into hardware
      HDL verification using simulation
      HDL optimization to meet constraints
  * Practice and hands-on experiments

Objectives as listed in Syllabus
  * Apply advanced design strategies that include testing and
    debugging techniques;
  * Meet specified design constraints, such as performance, power,
    and area, using contemporary techniques;
  * Use multiple clocks and asynchronous system techniques for
    high-speed data transfer;
  * Prototype complex digital systems that meet specific design
    constraints;
  * Compare and contrast the relative capabilities of various
    contemporary digital hardware technologies

Prereqs
  * Have taken ECE 3544 with C- or better
  * I assume you are familiar with basics of Verilog
    - structural design
    - behavioral design
    - dataflow design

Equipment
  * Software:
      Altera Quartus Prime Lite >= 17.0
      Altera Modelsim >= 10.5b
      Verilator (used as coding style checker)
      (additional software may be added)
  * There is no book; however, there will be assigned readings.
    See 'Communication' below
  * DE1-SoC kit
      Loaner board
      Instructions for loaning out will be provided on canvas

Course work
  * 10 Homework
    This is the BULK of the effort in this course
    - You have a weekly (sometimes, biweekly) assignment which will
      include coding assignments and experiments
    - We will make use of GitHub Classroom to distribute and collect
      assignments
    - There is no late policy in this course (!)
      How this works: to 'submit' your homework, you have to commit
      and push your files to a GitHub repository (terminology will be
      explained later).
      When the deadline rolls around, your latest push before the
      deadline counts as your submission.
    - All assignments are individual assignments.
  * Exams (3, equally distributed over the semester)
  * Quizzes (unknown number, online through canvas)
  * Weights
      60% Homework
      33% Exams
       7% Quizzes

Communication
  * Lecture notes are posted online (CANVAS)
  * Questions on Homework and Lectures can be posted on Piazza
    (link on CANVAS)
  * Questions regarding individual arrangements or exams can be sent
    to the instructor.

Honor Code

Special Needs

5:15 Concept of this course

Studying Hardware Design involves learning three different facets of
an engineering process:

  - You have to understand hardware design METHODS, techniques that
    are commonly used to design hardware
  - You have to understand the use of hardware design TOOLS, software
    programs that help to automate the design methods.
  - You have to build a sense of what APPLICATIONS are suitable for
    hardware design.

Lectures will give roughly equal attention to METHODS, TOOLS and
APPLICATIONS.

For example, let's say we will be building a floating point adder on
the DE1-SoC board.

  - Then the APPLICATION is the definition of floating point addition
    and the interface of the floating point adder to other components
    of the DE1-SoC board.
  - The METHOD may cover aspects that explain how to build a floating
    point addition algorithm using a datapath and a controller; how
    the floating point addition can be accelerated using pipeline
    registers; how the floating point addition is captured in a
    concrete Verilog program.
  - Finally, the TOOLS may cover aspects of how we are going to
    simulate the floating point adder using a simulator for Verilog,
    how that Verilog can be mapped into the DE1-SoC FPGA chip using
    the Quartus synthesis, mapping, and place-and-route tools, and
    so on.
(Quick culinary analogy: the 'application' is the dish made from raw
ingredients; the 'method' is the recipe minus the ingredients; the
'tools' are what you use to implement the recipe.)

Hence, doing an excellent job of learning about building a floating
point adder in an FPGA chip involves knowing something about the
APPLICATION, about the METHODS that you use to convert the application
into hardware, and about the TOOLS that help you effectively realize
it. In this course, we will cover all three.

-> Schedule in the syllabus: the METHODS and TOOLS columns refer to
   methods and tools as explained above. APPLICATIONS will be
   integrated as homework or as in-class examples.

5:20 Digital Hardware Design

- What is Digital Hardware Design?

  "Digital Hardware Design is a systematic engineering activity that
  uses a design method to convert a high-level behavioral
  specification into a low-level structural specification, using
  constraints such as target technology, performance, and cost."

- What do we mean by 'Digital'?

  Two fundamental properties set digital hardware apart from general
  purpose electric circuits:

  (1) Discrete voltage levels instead of continuous voltage levels

      We use a 'low' voltage and a 'high' voltage to represent a
      logic-0 and a logic-1, respectively.

                low                high
         -------+------------------+-------> Voltage
                |                  |
         .......                    ........
         logic-0      unknown       logic-1

      These voltage levels bring NOISE IMMUNITY. Digital hardware can
      compute without errors, can play the same record over and over
      without quality loss, can decode the same email over and over
      without introducing typos, ...

      Don't confuse NOISE IMMUNITY with ACCURACY. Digital hardware
      does not have infinite precision; the accuracy is determined by
      the number of bits one uses to encode a given value.

  (2) SYNCHRONOUS DESIGN = Discrete time instead of continuous time

      We use clock cycles to discretize the computation in digital
      hardware in small steps.
            --------> TIME

           +------+      +------+      +------+      +------+
           |      |      |      |      |      |      |
      -----+      +------+      +------+      +------+

           |             |             |             |
           |   cycle 1   |   cycle 2   |   cycle 3   |

      To synchronous digital hardware, time is not analog but moves
      in discrete increments. The signal that marks these increments
      is called a clock. The period of the clock is called the clock
      cycle.

      Not all digital hardware is synchronous; some hardware uses
      multiple clock signals or sometimes even no clock signal at all
      (asynchronous logic). However, 95% of the hardware design
      effort is synchronous design, and that is what we'll focus on.

- How do we do 'Digital Hardware Design'?

  There are two fundamentally different mechanisms in which you do
  hardware design:

    BOTTOM UP: create larger components from smaller ones
    TOP DOWN:  decompose larger components into smaller ones

  Bottom-Up and Top-Down are not particular to digital hardware
  design; you could use these concepts in software programming,
  architecture, city planning, ... as well.

  A finished hardware design looks like a network of smaller
  components. You are well familiar with a HARDWARE SCHEMATIC.

  - In a hardware schematic typical primitive components are digital
    gates (AND, OR, XOR, NOT) and registers.
  - A hardware schematic has a number of input ports and a number of
    output ports
  - A hardware schematic can be a component in a higher-level
    schematic (i.e., hardware schematics can have a hierarchy)

  Hardware schematics are very useful for bottom-up design. And
  schematics are still very common today for printed circuit board
  design.

  However, for digital hardware design, schematics are not very
  popular. They are useful for teaching, of course, or to explain a
  concept. But hardware schematics are not scalable; they are poor in
  handling complexity. Hardware schematics cannot reflect behavior;
  they only reflect structure. So hardware schematics are not
  convenient for TOP DOWN design.
  To cope with the rising complexity of hardware design, it is now
  common to use a HARDWARE DESCRIPTION LANGUAGE (or HDL)

  - Verilog
  - VHDL

  The critical difference between designing hardware by schematics
  and designing hardware by an HDL is that an HDL is better at
  capturing behavior without expressing structure. Hardware
  schematics are rigorously bottom-up; HDLs, on the other hand, have
  both a top-down and a bottom-up flavor.

5:40 Example of hardware design using HDL: HEARTBEAT

We're going to discuss a simple hardware design example, walking
through the HDL code for the design, converting the HDL into an
implementation, and demonstrating the outcome.

There are many aspects to the following example that will be discussed
later. This only shows an example of a design, illustrating every
step: capturing the hardware in an HDL, simulation, synthesis, and
implementation on a DE1-SoC board.

The design is that of a 'running lights' application. It lights up the
10 LEDs on a DE1-SoC board in sequence. The design needs to be such
that it completes one round in one second.

Let's start with a schematic at the level of the DE1-SoC board to see
how this will work:

                 +-----------------------------+      10X
                 |                             |      LED
  50 MHz CLK   ->+          FPGA CHIP          +--->|----- GND
         KEY   ->+                             |
                 |                             |
                 +-----------------------------+

The FPGA chip has a 50MHz clock signal. The FPGA chip also has a
pushbutton input (KEY) and a 10 LED output (LED). The pushbutton pulls
down the input KEY from '1' to '0' when the button is pressed.

We are to write a Verilog module that fits into the FPGA chip. We have
a tool, called Quartus, that takes this Verilog module as input, and
that converts it into a bitstream (programming file) for the FPGA.
That bitstream can be downloaded into the FPGA chip to realize the
functionality of the design.

5:45 How to design this running light?

Turning an LED on or off is as simple as driving the corresponding
output to logic '1' (on) or logic '0' (off).
So really what we want is: drive a running '1' across 10 outputs:

    0000000001
    0000000010
    0000000100
    0000001000
    0000010000
    0000100000
    0001000000
    0010000000
    0100000000
    1000000000   --> from here, back to start

There is also a timing requirement: the entire cycle has to be
finished in 1 second. That means that each output state will exist for
100ms before moving to the next output state.

A chain of flip-flops, wired back to back, can implement the shifting.

(I am showing small snippets of Verilog; this is not a complete
program but just an example)

    reg [9:0] LEDR;

    always @(posedge clk)
      begin
        LEDR <= {LEDR[8:0], LEDR[9]};
      end

The problem with that, though, is that these flip-flops must not shift
on every clock edge. They have to shift only once every 100ms. So the
example above is much, much too fast.

Rather than slowing down the clock signal (which is 50MHz), we will
slow down the rate at which the flops are updated. This maintains a
SYNCHRONOUS design (i.e., every flop is clocked at the same rate,
50MHz)

    always @(posedge clk)
      begin
        LEDR <= tick ? {LEDR[8:0], LEDR[9]} : LEDR;
      end

Now, how do we implement the tick signal? This is a signal that is
high (1) for a single clock cycle every 100ms. At a 50MHz clock, this
means that tick is high (1) for one clock cycle out of every
5,000,000 clock cycles (50,000,000 cycles/s x 0.1 s).

The easiest way to implement such an extremely low duty-cycle signal
is to use a counter that wraps around every 5,000,000 cycles. Because
the counter starts at 0, it has to wrap after reaching 4,999,999;
comparing against 5,000,000 would give a period of 5,000,001 cycles.

    always @(posedge clk)
      begin
        count <= tick ? 24'd0 : count + 24'd1;
      end

    // count runs 0 .. 4,999,999, which is exactly 5,000,000 cycles
    assign tick = (count == 24'd4999999);

Note how we used two processes: an always process to describe the
counter update, and an assign statement to capture the combinational
logic that describes counter overflow.
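The tick arithmetic above (clock frequency x tick period = number of
cycles to count) can also be left to the compiler. The following is
only a sketch and not part of the lecture code: the module name
tickgen, the parameters CLK_HZ and TICK_HZ, and the active-high
synchronous reset input are all hypothetical names chosen for
illustration. It assumes a tool that accepts the Verilog-2005 $clog2
system function in constant expressions, and it assumes CLK_HZ is at
least twice TICK_HZ.

```verilog
// Hypothetical helper (not from the lecture): a tick generator whose
// period is derived from parameters instead of a hand-computed constant.
module tickgen #(
    parameter CLK_HZ  = 50_000_000,  // assumed clock frequency (50 MHz)
    parameter TICK_HZ = 10           // assumed tick rate (10 Hz = every 100 ms)
) (
    input  wire clk,
    input  wire reset,               // assumed: synchronous, active high
    output wire tick
);
    // Clock cycles per tick: 50,000,000 / 10 = 5,000,000
    localparam integer CYCLES = CLK_HZ / TICK_HZ;
    // Bits needed to hold 0 .. CYCLES-1 (23 bits for 5,000,000).
    // Assumes CYCLES >= 2, otherwise the width would be zero.
    localparam integer WIDTH  = $clog2(CYCLES);

    reg [WIDTH-1:0] count;

    // Count 0 .. CYCLES-1, then wrap: exactly CYCLES clock cycles per tick
    always @(posedge clk)
        if (reset || tick)
            count <= 0;
        else
            count <= count + 1'b1;

    // High for a single clock cycle when the counter reaches its last value
    assign tick = (count == CYCLES - 1);
endmodule
```

One possible use of such a parameterized version: overriding TICK_HZ
with a much larger value in a testbench shortens the tick period, so a
full round of the running light can be observed without simulating a
whole second.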
So putting everything together, we're getting the following design:

    module heartbeatde1soc(
          input        CLOCK2_50,
          input        CLOCK3_50,
          input        CLOCK4_50,
          input        CLOCK_50,
          input  [3:0] KEY,
          output [9:0] LEDR
    );

       reg  [9:0]  ledr_reg;
       reg  [23:0] count_reg;
       wire        tick;

       // KEY[0] is an active-low synchronous reset
       always @(posedge CLOCK_50)
         if (KEY[0] == 1'b0)
           begin
             ledr_reg  <= 10'b1;
             count_reg <= 24'd0;
           end
         else
           begin
             count_reg <= tick ? 24'd0 : count_reg + 24'd1;
             ledr_reg  <= tick ? {ledr_reg[8:0], ledr_reg[9]} : ledr_reg;
           end

       // count_reg runs 0 .. 4,999,999: one tick every 5,000,000 cycles (100ms)
       assign tick = (count_reg == 24'd4999999);
       assign LEDR = ledr_reg;

    endmodule

+------------------------------------------------------------+
|
| REMARK: The following commands assume that you have installed:
|         a working Quartus prime lite (edition 17 or later)
|
|         It also assumes that you have correctly set your PATH
|         so that it can find the Quartus tools from the command
|         line in PowerShell.
|
|         On my machine, the following two paths are included:
|           C:\intelFPGA_lite\17.0\quartus\bin64;
|           C:\intelFPGA_lite\17.0\modelsim_ase\win32aloem;
|
|         If you don't know how to extend the search path on your
|         machine, post a question on Piazza
|
+------------------------------------------------------------+

5:55 Simulation

To simulate, we will use a tool called Modelsim.

Before we can simulate a hardware design, we need to write a
TESTBENCH, a Verilog module that encapsulates the hardware design and
that mimics the input/output pins of the FPGA (Keys and LEDs).

The testbench resets the design by simulating a keypress and then lets
the clock run at a clock period of 20ns.

    `timescale 1ns/1ns

    module heartbeatde1soctb;

       reg        CLOCK_50;
       reg        CLOCK2_50;
       reg        CLOCK3_50;
       reg        CLOCK4_50;
       wire [9:0] LEDR;
       reg  [3:0] KEY;

       heartbeatde1soc dut(CLOCK2_50,
                           CLOCK3_50,
                           CLOCK4_50,
                           CLOCK_50,
                           KEY,
                           LEDR);

       // toggle every 10ns: 20ns period, 50MHz
       always #10 CLOCK_50 <= ~CLOCK_50;

       initial
         begin
           $monitor("t=%8d LEDR=%10b", $time, LEDR);
           CLOCK_50  <= 0;
           CLOCK2_50 <= 0;
           CLOCK3_50 <= 0;
           CLOCK4_50 <= 0;
           KEY       <= 4'hf;      // all keys released
           #500  KEY[0] <= 0;      // t = 500ns: press the reset key
           #1000 KEY[0] <= 1;      // t = 1500ns: release the reset key
         end

    endmodule

How does this work?
  - Initially, all clock inputs are driven to 0 and all KEY inputs
    are at 1 (not pressed)
  - The clock starts running at 50MHz
  - At 500ns, the reset KEY goes down
  - 1000ns later (at 1500ns, since the delays accumulate), the reset
    KEY is released
  - The clock keeps running, thereby exercising the design

What makes the following statement a clock of 50MHz?

    always #10 CLOCK_50 <= ~CLOCK_50;

Two things:
  (1) `timescale 1ns/1ns means a time unit (#1) measures 1 ns
  (2) CLOCK_50 will toggle every 10ns, which gives a 20ns period
      => 50 MHz

The $monitor will print changes on LEDR to the console

To simulate, use the following commands:

(The commands I am illustrating use the command line interface and
PowerShell. I will mostly AVOID using GUIs and instead demonstrate as
much as possible using the command line. In some cases, of course,
GUIs WILL be important: when we inspect waveforms from a simulation,
when we inspect chip layout, etc...)

(Everything between the '=====' lines is a command as typed on the
command line)
(Lines starting with '#' are comment lines)

==================================================
# create a work directory
vlib work

# compile verilog
vlog heartbeatde1soc.v
vlog heartbeatde1soctb.v

# simulate 5 seconds of simulated time
# without the -c command line parameter, you will start the GUI
vsim -c heartbeatde1soctb -do "run 5000ms"
===================================================

6:05 Synthesizing the design

We will use Quartus as a synthesis tool. The tool goes through several
steps to convert the Verilog into a bitstream.
For a quick, automatic compilation, run the following command:

==================================================
# compile the design to a bitstream
quartus_sh --flow compile heartbeatde1soc
==================================================

After the command finishes, you will have a bitstream for this design:
heartbeatde1soc.sof

Finally, we can download this design to the DE1-SoC board with the
following command:

==================================================
# download the bitstream
quartus_pgm -m jtag -o "p;heartbeatde1soc.sof@2"
==================================================

After pressing the reset button, the running lights start.

6:10 What we will cover next:

Over the next few weeks, I will be doing a brief review of the most
essential concepts of Digital Design I. The objective is to define a
hardware design method called 'RTL Design' or 'Register Transfer Level
Design.' This will be the basis for our experiments and further
learning.

I will also be introducing the tools that we're going to use for RTL
design and for Homework: GitHub, Modelsim, Quartus

Homework 1 will be posted on THURSDAY. You will have 1 WEEK to solve
it. Homework 1 will be a warm-up assignment building from Digital
Design I knowledge.