Introduction

The purpose of this homework is to introduce the Hard Processor System on your DE1-SoC board. Before you can start with this assignment, you have to complete the following preparatory tasks.

  • Download the Linux image for your DE1-SoC board
  • Copy the image on an SD-card, and boot Linux on your board
  • Configure the network on your DE1-SoC, so that you can access the system over the network

The steps to achieve this are explained in Lecture 16 and Lecture 17. If you have not yet set up your board to run Linux, do that first before continuing with this homework.

In this homework, you will receive a complete hardware implementation of an HPS-based system that drives three PIO ports on the FPGA board. The PIO ports are connected to the HEX display, the red LEDs and the push buttons (KEY buttons). Your task will be to (a) synthesize the hardware, (b) download the resulting bitstream to the HPS and configure the FPGA, (c) write a small performance evaluation program and (d) analyze the output of this program.

You have to write a report as part of the Homework. The assignment writeup will include the label REPORT to indicate questions that have to be answered in the report. The questions are summarized at the end of the assignment writeup, as well.

The command lines in this assignment assume an environment identical to the SoC Command Shell.

Downloading and Compiling the Design

git clone https://github.com/vt-ece4530-f19/homework-6-USERID
  • Inspect the repository and identify the following directories
Directory            Purpose
example-hps-pio      Hardware reference implementation
software_test        Sample software program
software_q1          Software application for Question 1
software_rbfloader   Utility program to download bitstreams on FPGA-SoC
conversion           Script to convert bitstreams into Raw Bitstream Format (rbf)


  • Go to the example-hps-pio directory. Use Quartus and Platform Designer to open the examplehpspio design and the platformhps platform.

  • REPORT: Determine the topology of the design and make a diagram of its main components. The diagram does not need to be overly detailed; buses can be drawn as wires, and processors, memories, peripherals, etc. can be drawn as blocks. However, make sure that the overall system architecture is apparent from the drawing.

  • The platform contains three PIO ports. For each PIO port, find out its width and which element of the DE1-SoC board it connects to.

  • REPORT: Add a table in your report similar to the following.

PIO port   Width            Input or Output   Purpose
PIO_0      number of bits   input or output   what board peripheral is driven by it
PIO_1
PIO_2


  • You do not have to compile the hardware into a bitstream. You can use the following prebuilt bitstream: examplehpspio.sof.

  • Next, convert the bitstream from sof format to raw bitstream format (rbf). Go to the conversion subdirectory and type make. When the script completes, you will find two files: examplehpspio.rbf and hps_0.h. The former is a version of the bitstream suitable for download from within the ARM on the FPGA. The latter is an include file used by software applications. It is similar in purpose to the system.h file of Nios systems; it describes the main components of the platform as seen from the software.

cd conversion
make
  • Next, compile the software that is used to download the bitstream to the board.
cd software_rbfloader
make
  • Next, compile the software application that will be used to test the hardware.
cd software_test
make
  • You are now ready to run the sample application. Copy all of the following files to the board.

    • The hardware bitstream examplehpspio.rbf
    • The bitstream loader hps_config_fpga
    • The software test application piotest

Here is the command I used to do this in one go. In your case, you may have to use something slightly different, compatible with the network connection available on your DE1-SoC board.

$ scp -P50444 software_test/piotest              \
              conversion/examplehpspio.rbf       \
              software_rbfloader/hps_config_fpga \
              root@172.29.39.72:/home/root
  • Now, log on to the DE1-SoC board.

  • On the DE1-SoC, download the bitstream as follows.

root@socfpga:~# ./hps_config_fpga examplehpspio.rbf
  • On the DE1-SoC, run the program as follows.
root@socfpga:~# ./piotest

You should see the HEX display counting and a counter on the red LEDs. By pressing the leftmost key (KEY3), you can reverse the direction of the counter. By pressing the rightmost key (KEY0), you can make it increment again. If all that is working fine, you are now ready for the main question.
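
The behavior described above can be sketched as a small piece of C. This is only an illustration of the expected counter logic, not the source of piotest: the type counter_state, the function counter_update and the convention that bit i of the argument is set while KEYi is pressed are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state of the demo counter: current value and direction. */
typedef struct {
  unsigned count;
  int step; /* +1 counts up, -1 counts down */
} counter_state;

/* Advance the counter by one step.
 * Assumption: bit i of `pressed` is 1 while KEYi is pressed, i.e. the
 * caller has already dealt with the polarity of the key PIO. */
void counter_update(counter_state *s, unsigned pressed) {
  assert(s != NULL);
  if (pressed & (1u << 3)) s->step = -1; /* KEY3: reverse direction */
  if (pressed & (1u << 0)) s->step = +1; /* KEY0: increment again */
  s->count += (unsigned)s->step;         /* unsigned arithmetic wraps */
}
```

In a real program, the value in count would be written to the HEX and LED PIO ports after each update.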

Question: Determine the overhead of FPGA fabric access

  • Investigate the software application in software_q1. It contains a loop that writes 1000 values to the HEX display, as well as 1000 values to the main memory. The key section of that program is the following:
volatile unsigned long *h2p_lw_hex_addr=NULL;

// This function writes to the HEX displays using a PIO configured in the FPGA
void printhex(unsigned j) {
  *h2p_lw_hex_addr = j;
}

// This function is identical to the previous one, but writes to a global variable in memory
volatile unsigned memhex;
void printhexmem(unsigned j) {
  memhex = j;
}

#define MEASUREMENTS 1000

int main(int argc, char **argv) {

  ...

  // initialize pointer to PIO port
  h2p_lw_hex_addr = virtual_base +
    ( ( unsigned long  )( ALT_LWFPGASLVS_OFST + PIO_1_BASE ) & ( unsigned long)( HW_REGS_MASK ) );

  // main measurement loop
  for (i = 0; i<MEASUREMENTS; i++) {
    printhex(i);
    printhexmem(i);
  }

 ...

}
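
The pointer initialization in the listing adds a masked offset to the base of a mapped register window. The sketch below isolates that arithmetic so that it can be checked on any machine. The function name pio_pointer and its parameters are assumptions made for this illustration; in the real program the values come from ALT_LWFPGASLVS_OFST, PIO_1_BASE and HW_REGS_MASK, and virtual_base is set up in the part of main that is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the address of a PIO register inside a mapped window.
 * virtual_base : start of the mapped window (hypothetical parameter)
 * bridge_ofst  : physical offset of the bridge, e.g. ALT_LWFPGASLVS_OFST
 * pio_base     : offset of the PIO on that bridge, e.g. PIO_1_BASE
 * mask         : window size minus one, e.g. HW_REGS_MASK */
volatile uint32_t *pio_pointer(void *virtual_base, unsigned long bridge_ofst,
                               unsigned long pio_base, unsigned long mask) {
  assert(virtual_base != NULL);
  /* The mask keeps only the offset of the register within the window. */
  unsigned long byte_offset = (bridge_ofst + pio_base) & mask;
  /* Cast to a byte pointer first so the offset is applied in bytes. */
  return (volatile uint32_t *)((uint8_t *)virtual_base + byte_offset);
}
```

The offset is applied through a byte pointer on purpose: adding the same number to a pointer to a wider type would scale it by the size of that type. The volatile qualifier on the result matters for the measurement as well, because it prevents the compiler from removing or merging the writes to the PIO.
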
  • The question is the following: determine, as accurately as possible, the performance difference between a call to printhex and a call to printhexmem.

  • You need to refer to Lecture 17 and the example https://github.com/vt-ece4530-f19/example-hps-hello to see how to measure time.

  • Keep in mind the key points of Lecture 5: You have to handle the accuracy errors (overhead of timekeeping) as well as the precision errors (variations during measurement).

  • You have to make the comparison using two types of measurement. First, as a CPU cycle count (PERF_COUNT_HW_CPU_CYCLES). Next, as a CPU instruction count (PERF_COUNT_HW_INSTRUCTIONS). Again, refer to Lecture 17 for a discussion on performance measurement using the 64-bit timer on the ARM.
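
PERF_COUNT_HW_CPU_CYCLES and PERF_COUNT_HW_INSTRUCTIONS are event identifiers of the Linux perf_event_open interface. The following is a minimal sketch of how such a counter can be opened, started and read. It is not necessarily the code used in Lecture 17 or in example-hps-hello; the helper names counter_open, counter_start and counter_stop are assumptions made for this sketch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Open one hardware counter for the calling thread.
 * config is e.g. PERF_COUNT_HW_CPU_CYCLES or PERF_COUNT_HW_INSTRUCTIONS.
 * Returns a file descriptor, or -1 if the counter is not available. */
int counter_open(uint64_t config) {
  struct perf_event_attr attr;
  memset(&attr, 0, sizeof(attr));
  attr.type = PERF_TYPE_HARDWARE;
  attr.size = sizeof(attr);
  attr.config = config;
  attr.disabled = 1; /* start stopped; enabled explicitly later */
  /* pid = 0, cpu = -1: this thread on any CPU; no group, no flags. */
  return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Reset the counter to zero and start counting. Returns 0 on success. */
int counter_start(int fd) {
  if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) < 0) return -1;
  return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) < 0 ? -1 : 0;
}

/* Stop counting and read the 64-bit count. Returns 0 on success. */
int counter_stop(int fd, uint64_t *value) {
  assert(value != NULL);
  if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) < 0) return -1;
  return read(fd, value, sizeof(*value)) == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

A measurement would call counter_start, then the code under test, then counter_stop. The start and stop calls themselves contribute to the count, so that overhead has to be measured separately and subtracted.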

  • Refer to the assembly listing hw6q1.lst, which will be generated after you build the software in software_q1. The assembly listing will show you exactly what instructions go into printhex and printhexmem.

  • REPORT: In your report, present the analysis of your performance comparison. Note that I am looking for a narrative, not a shortlist of numbers. Your analysis should contain at least the following elements.

    • Cycle count for printhex and cycle count for printhexmem
    • Causes of the possible cycle count difference
    • Instruction count for printhex and printhexmem
    • Causes for the possible instruction count difference.
    • Impact of compiler optimization (-O3 flag added to the CFLAGS macro in the Makefile)
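
One possible way to handle the accuracy and precision errors mentioned earlier is to repeat each measurement several times, subtract the measured timekeeping overhead from every sample, and report the spread of the corrected samples. The sketch below shows that post-processing step only. It is an illustration under assumptions, not the method prescribed in Lecture 5; the type sample_stats and the function summarize are names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Summary of a series of overhead-corrected measurements. */
typedef struct {
  uint64_t min;  /* smallest corrected sample */
  uint64_t max;  /* largest corrected sample */
  double mean;   /* average of the corrected samples */
} sample_stats;

/* raw      : counter values, one per repetition of the experiment
 * n        : number of samples (must be at least 1)
 * overhead : count obtained for an empty start/stop pair (accuracy error) */
sample_stats summarize(const uint64_t *raw, size_t n, uint64_t overhead) {
  assert(raw != NULL && n > 0);
  sample_stats st = { UINT64_MAX, 0, 0.0 };
  double sum = 0.0;
  for (size_t i = 0; i < n; i++) {
    /* Clamp at zero in case a sample is smaller than the overhead. */
    uint64_t v = raw[i] > overhead ? raw[i] - overhead : 0;
    if (v < st.min) st.min = v;
    if (v > st.max) st.max = v;
    sum += (double)v;
  }
  st.mean = sum / (double)n;
  return st;
}
```

The distance between min and max gives a first impression of the precision of the measurement, while the overhead subtraction addresses its accuracy.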

Report Contents

  • REPORT: Determine the topology of the design and make a diagram of its main components. The diagram does not need to be overly detailed; buses can be drawn as wires, and processors, memories, peripherals, etc. can be drawn as blocks. However, make sure that the overall system architecture is apparent from the drawing.

  • REPORT: Add a table in your report similar to the following.

PIO port   Width            Input or Output   Purpose
PIO_0      number of bits   input or output   what board peripheral is driven by it
PIO_1
PIO_2

  • REPORT: In your report, present the analysis of your performance comparison. Note that I am looking for a narrative, not a shortlist of numbers. Your analysis should contain at least the following elements.

    • Cycle count for printhex and cycle count for printhexmem
    • Causes of the possible cycle count difference
    • Instruction count for printhex and printhexmem
    • Causes for the possible instruction count difference.
    • Impact of compiler optimization (-O3 flag added to the CFLAGS macro in the Makefile)

What to turn in

Write your report as a PDF file, and add it to the root of your repository. In addition, push your modifications to software_q1 to the repository.

# add your report

git add myreport.pdf

# clean your implementation

cd example-hps-pio
quartus_sh --clean examplehpspio
cd ..

# push the result on github

git add *
git commit -m 'my homework 6 solution'
git push

Design Rubric

  • Topology Question: 4 points
    • Drawing contains all modules (2 points)
    • Modules are properly labeled (1 point)
    • System architecture shows off-chip elements (memories, switches, etc.) (1 point)
  • PIO Table Question: 3 points
    • Width, I/O designation, and Purpose are correct (1 point per row)
  • Performance Analysis Question: 13 points
    • Accuracy of cycle measurement is correctly estimated (2 points)
    • Precision of cycle measurement is correctly estimated (2 points)
    • Cycle count difference between printhex and printhexmem is correct (2 points)
    • Cycle count difference is correctly analyzed by reference to the assembly listing and the architecture of the design (2 points)
    • Instruction count difference between printhex and printhexmem is correct (2 points)
    • Impact of compiler optimization on cycle count is analyzed (2 points)
    • Impact of compiler optimization on instruction count is analyzed (1 point)