Lecture 3 - The MSP-430 Software Design Flow

Introduction
MSP-430 C Tidbits
MSP-430 C Compiler
A tight loop
Timing of the inner loop
Conclusions

Introduction

So far, we discussed the architecture of the MSP-430 and gave a brief overview of the MSP-430 instruction set architecture.

The MSP-430 is a 16-bit microcontroller. It has a 16-bit address bus and a 16-bit data bus. It can address 64K address locations as byte-addressable memory. That memory holds the program and the data. Recent versions of the MSP-430 architecture use an extended 20-bit address providing up to 1 MByte of addressable memory. Such extended architectures are called MSP-430X - but they will not be considered in this course.
The MSP-430 memory map uses the following conventions. Special Function Registers and Peripherals are located at the bottom of memory, in the lower 512 bytes (addresses 0x0000 to 0x01FF). The RAM area, used for global and local variables, starts just above that and extends toward higher addresses. For example, 4 Kbyte of RAM (an address range of 0x1000) would be mapped from 0x0200 to 0x11FF. The ROM area, used for program storage, grows from the top of memory down. For example, 8 Kbyte of ROM (an address range of 0x2000) would be mapped from 0xE000 to 0xFFFF. This leaves a hole between the RAM and ROM whenever less than 64 Kbyte of memory is physically present in the MSP-430 configuration. The topmost memory locations are reserved for an interrupt vector table, where each interrupt vector is a 16-bit address. Thus, to store 16 interrupt vectors, we use 32 bytes in the region 0xFFE0 to 0xFFFF. The most important interrupt vector is the reset vector, which holds the starting address of the program running on power-up. The reset vector is stored at 0xFFFE and 0xFFFF.
The MSP-430 is little endian. This means that the least significant byte is stored in the lower memory address. For example, the 16-bit constant ‘0x1234’, stored at byte memory address ‘0x1000’, is mapped as follows: 0x34 is stored at address 0x1000 and 0x12 is stored at address 0x1001. This becomes apparent when looking at a memory dump or at assembler listing files. For example:

0000a004 <__crt0_start>:
    a004:       31 40 00 42     mov     #16896, r1      ;#0x4200

The encoding of this instruction consists of two 16-bit words, 0x4031 and 0x4200. Because each word is stored least significant byte first, the assembly listing file shows that addresses 0xa004 to 0xa007 hold the byte sequence 0x31, 0x40, 0x00 and 0x42.
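The address arithmetic and the byte ordering described above can be checked with a few lines of C. This is a minimal sketch meant to run on a host computer, not on the MSP-430; the function names are made up for illustration and do not come from any MSP-430 header.

```c
#include <stdint.h>

/* Last address of a region that starts at `base` and spans `size` bytes. */
uint16_t region_last(uint16_t base, uint32_t size) {
    return (uint16_t)(base + size - 1u);
}

/* First address of a ROM region of `size` bytes that ends at 0xFFFF. */
uint16_t rom_first(uint32_t size) {
    return (uint16_t)(0x10000u - size);
}

/* Address of interrupt vector `n` (0..15); vector 15 is the reset vector. */
uint16_t vector_address(unsigned n) {
    return (uint16_t)(0xFFE0u + 2u * n);
}

/* Assemble a 16-bit word from two bytes stored in little-endian order. */
uint16_t word_from_le(const uint8_t *bytes) {
    return (uint16_t)(bytes[0] | ((uint16_t)bytes[1] << 8));
}
```

For a 4 Kbyte RAM starting at 0x0200, region_last returns 0x11FF, and applying word_from_le to the bytes 0x31, 0x40 from the listing gives the instruction word 0x4031.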

Today, we will discuss the software design tools for the MSP-430 in further detail. We break down the meaning of the command line, review the process of compiling, linking, and loading, and observe the execution of an MSP-430 program. As an example, we will determine the ‘spin rate’ of a tight loop written in C. We will use static analysis on the assembly listing to answer the question on the spin rate.

MSP-430 C Tidbits

Writing C for the MSP-430 follows the same rules as writing C for any other processor. There are a few machine dependencies, including the size of variables. As a 16-bit microcontroller, the MSP-430 defines int as a 16-bit type. Within the 16-bit address space, variables are aligned on byte boundaries or word boundaries.

Data Type	Size (bits)	Alignment (bits)
char	8	8
bool	8	8
short	16	16
int	16	16
long	32	16
long long	64	16
float	32	16
double	64	16
pointer (small)	16	16
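The practical consequence of a 16-bit int is that arithmetic wraps much earlier than on a desktop machine. The sketch below imitates this on a host computer with the fixed-width types from stdint.h, assuming that uint16_t stands in for the MSP-430 unsigned int and uint32_t for its unsigned long; the function names are illustrative only.

```c
#include <stdint.h>

/* 16-bit unsigned addition wraps modulo 65536. */
uint16_t add_u16(uint16_t a, uint16_t b) {
    return (uint16_t)(a + b);
}

/* Widening both operands to 32 bits before adding keeps the full result. */
uint32_t add_u32(uint16_t a, uint16_t b) {
    return (uint32_t)a + (uint32_t)b;
}
```

Adding 40000 and 30000 with add_u16 yields 4464 (70000 modulo 65536), while add_u32 yields 70000.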

MSP-430 GCC follows the MSP430 Application Binary Interface specifications defined by Texas Instruments. An Application Binary Interface (ABI) defines how a compiler organizes the output code, and how it implements functions and variables. For example, the ABI defines the size of data types such as integers and chars (section 2), the mechanics of a function call (section 3), the organization of the data and the stack (section 4) and the organization of the code (section 5), among other things. A detailed study of the ABI is not needed to understand the compiler output of MSP-430 GCC, but the document helps you navigate the assembly listings that MSP-430 GCC produces.

MSP-430 C Compiler

The MSP-430 C compiler converts a source code program in C into an executable program for the MSP-430. The compiler flow typically involves three activities.

The compiler converts the C source code into an object file, a file which contains instructions for the MSP-430 that correspond to the functions of the C code. While generating the object code, the compiler takes additional steps such as optimizing the sequence of instructions, expanding complex operations into simple operations supported by the MSP-430 instruction set, and mapping the C variables onto MSP-430 data memory and MSP-430 registers. Technically, the transformation of C source code into MSP-430 instructions is done in two steps. The first step, the compiler front-end, converts the high-level C language into an intermediate format that describes the C program in terms of generic register-to-register transfers. The second step, the compiler back-end, converts the intermediate register-transfer code into specific machine instructions. While the front-end is specific to the programming language used (C, C++, Ada, ...), the back-end is specific to the processor targeted.
The linker combines all object files, and possibly system libraries, together into an executable format. The main task of the linker is to resolve all unknown references in each object file. Furthermore, the linker also handles the organization of code into physical memory locations, following the guidelines and references given through the linker description file. Resolving an unknown reference (a function or a global variable) means that the linker will determine the address of the reference, such that the function can be called or the variable can be read. For example, a C program may make use of a function printf. The object code for this C program will contain a call to the printf function, but the call target address remains unknown until the object code is linked into an executable.
The loader converts the executable format into a memory-image format, a concrete representation of every part of the program precisely as it appears in the program and data memory of the MSP-430.

Each object file contains several sections, each of them serving a specific purpose for the compiled program. The .text section holds all the instructions, the .data section holds initialized global variables, and the .bss section holds uninitialized global variables. There is no restriction on how many sections a compiler can create; .text, .data and .bss are the three most important ones, but the MSP-430 Application Binary Interface defines several others. As the linker inspects the sections of each object file to be linked, it will group sections of the same type together.
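To make the section assignment concrete, the following sketch annotates a few declarations with the section a compiler would typically choose for them. The variable and function names are hypothetical, and the exact placement (in particular of constants) depends on the compiler and the linker description file.

```c
#include <stdint.h>

int16_t threshold = 100;   /* initialized global variable: .data */
int16_t samples[8];        /* uninitialized global variable: .bss, zero at startup */
const int16_t scale = 3;   /* read-only constant: typically a read-only data section */

/* The instructions of a function body are placed in .text. */
int16_t scaled_threshold(void) {
    return (int16_t)(threshold * scale + samples[0]);
}
```

Compiling such a file and running msp430-elf-objdump on the object file is a quick way to confirm where each symbol actually ends up.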

As the linker groups sections together, the overall program is formed, and eventually, all unknown references have to be resolved. When, at the end of the linking process, there are still unknown references, the linking process will abort with an error indicating the unresolved reference. Such an error may occur, for example, when an incomplete list of object files is used to create the executable, or when there are missing system libraries.

The executable will eventually contain a .text, .data and .bss section which is the combination of sections of the linked object files and libraries. The linker then maps each section into a designated memory area, under the direction of the linker description file. It is very well possible that the linker runs out of space and finds that the available memory space is too small to hold the entire .text, .data or .bss section. In that case, the linker will abort with an error indicating that the available memory in the target MSP-430 is too small.

The linker can only detect static memory usage, such as the memory used for program text and for global variables. Dynamic memory use, such as the maximum depth of the stack or the amount of heap, cannot be determined by the linker as it is oblivious to the execution path of the program.

A tight loop

We will study the execution time of the following program in C. The program repeatedly flips the bits of the P1 output port, calling a function delay() after every flip. We will work through each step of the compiling process, studying the output produced by each compilation step.

#include "omsp_system.h"

void delay() {
  volatile int k = 0;
  k = k + 1;
}

int main(void) {
  P1DIR = 0xFF;              // initialize for output
  WDTCTL = WDTPW | WDTHOLD;  // Disable watchdog timer

  P1OUT = 0xFF;
  while (1) {
    P1OUT = ~P1OUT;
    delay();
  }

  return 0;
}

You can download and compile this example like the other examples in the course, using git and make.

$ git clone https://github.com/vt-ece4530-f19/example-tightloop
$ cd example-tightloop/
$ make

Before discussing the compilation steps, it is helpful to consult the omsp_system.h file, as it contains the definitions of special names such as P1DIR and P1OUT. These are defined as macros that cast a constant address in the 64K address range of the MSP-430 to a pointer to a volatile byte, and then dereference that pointer. Reading or assigning such a name results in a read or write operation at an absolute memory address.

#define  P1OUT       (*(volatile unsigned char *) 0x0021)
#define  P1DIR       (*(volatile unsigned char *) 0x0022)

The address is the same as the one defined for the MSP430 port P1 addresses, as defined in the MSP430x1x Family Users Guide.

Figure: MSP-430x1x P1 Address Definitions
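The effect of these register macros can be tried out on a host computer by redirecting them to an array that stands in for the MSP-430 address space. This is only a sketch: sim_memory, SIM_REG8 and the SIM_ prefixed names are assumptions made for illustration and are not part of omsp_system.h.

```c
#include <stdint.h>

/* Host-side stand-in for the 64K MSP-430 address space (illustration only). */
uint8_t sim_memory[0x10000];

/* Same shape as the omsp_system.h macros, but indexing the simulated memory. */
#define SIM_REG8(addr) (*(volatile uint8_t *)&sim_memory[(addr)])
#define SIM_P1OUT SIM_REG8(0x0021)
#define SIM_P1DIR SIM_REG8(0x0022)

/* Mirror the port setup and one bit flip from the example program. */
void sim_port_flip_once(void) {
    SIM_P1DIR = 0xFF;                 /* byte write to address 0x0022 */
    SIM_P1OUT = 0xFF;                 /* byte write to address 0x0021 */
    SIM_P1OUT = (uint8_t)~SIM_P1OUT;  /* read, invert, write back */
}
```

After one call, the simulated byte at 0x0022 holds 0xFF and the byte at 0x0021 holds 0x00. The volatile qualifier keeps the compiler from removing or merging these accesses.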

Compiling with msp430-elf-gcc

/cygdrive/c/ti/msp430-gcc/bin/msp430-elf-gcc  -Wall -Os -mmcu=msp430c1111 -c main.c -o tightloop.o

Compiler Flag	Meaning
-Wall	Generate every warning possible. This is a cautious approach: inspect the messages from your compiler; they often hint at problems in the code
-Os	Optimize for size (Os). Ask the compiler to generate smallest possible code even at the expense of performance
-mmcu=msp430c1111	Use msp430c1111 as the target microcontroller. A flag like this instructs the compiler to select an instruction set specific to msp430c1111
-c main.c	The input C file; -c flag means compile-only
-o tightloop.o	The output object file

Many additional flags and options are possible; inspect them with man msp430-elf-gcc or with msp430-elf-gcc --help.

The utility msp430-elf-objdump helps you to inspect the output file tightloop.o. The contents of the sections generated by the compiler can be dumped with the -s flag (the -h flag lists only the section headers). In this case, tightloop.o has a .text section, as well as a .comment section and an .MSP430.attributes section. Only the .text section will contribute actual MSP-430 instructions.

$ msp430-elf-objdump -s tightloop.o

tightloop.o:     file format elf32-msp430

Contents of section .text:
 0000 21838143 00009153 00002153 30410a12  !..C...S..!S0A..
 0010 0912f243 2200b240 805a2001 f2432100  ...C"..@.Z ..C!.
 0020 7a402100 39400000 fae30000 89123040  z@!.9@........0@
 0030 0000                                 ..
Contents of section .comment:
 0000 00474343 3a20284d 6974746f 20537973  .GCC: (Mitto Sys
 0010 74656d73 204c696d 69746564 202d206d  tems Limited - m
 0020 73703433 302d6763 6320382e 322e302e  sp430-gcc 8.2.0.
 0030 35322920 382e322e 3000               52) 8.2.0.
Contents of section .MSP430.attributes:
 0000 41180000 006d7370 61626900 010d0000  A....mspabi.....
 0010 00040106 0108010a 01                 .........

It’s possible to disassemble the .text section using the -d flag. This code is not yet processed by the linker, and this can be spotted in several locations.

The disassembly starts at address 0, while the actual program in memory should start at a higher address (e.g., for 8KB program memory, the starting address should be 0xE000 or above).
The delay() function is called in the main function, using the instruction call r9. However, the register r9 is clearly initialized to zero before the inner loop starts. The absolute target address of the delay function is still unknown.
The branch instruction at the end of the inner loop branches to address 0, while it should branch back to the top of the loop at label .L3. The absolute branch target address is still unknown.

$ msp430-elf-objdump -d tightloop.o

tightloop.o:     file format elf32-msp430


Disassembly of section .text:

00000000 <delay>:
   0:   21 83           decd    r1              ;
   2:   81 43 00 00     mov     #0,     0(r1)   ;r3 As==00
   6:   91 53 00 00     inc     0(r1)           ;
   a:   21 53           incd    r1              ;
   c:   30 41           ret

0000000e <main>:
   e:   0a 12           push    r10             ;
  10:   09 12           push    r9              ;
  12:   f2 43 22 00     mov.b   #-1,    &0x0022 ;r3 As==11
  16:   b2 40 80 5a     mov     #23168, &0x0120 ;#0x5a80
  1a:   20 01
  1c:   f2 43 21 00     mov.b   #-1,    &0x0021 ;r3 As==11
  20:   7a 40 21 00     mov.b   #33,    r10     ;#0x0021
  24:   39 40 00 00     mov     #0,     r9      ;

00000028 <.L3>:
  28:   fa e3 00 00     xor.b   #-1,    0(r10)  ;r3 As==11
  2c:   89 12           call    r9              ;
  2e:   30 40 00 00     br      #0x0000         ;

It is useful to dig a little deeper into this code and explain what it does. The delay function starts by decrementing register r1, which serves as the MSP-430 stack pointer, by 2. Hence, this operation makes room on the stack for a new local variable. The variable is cleared by the next instruction and incremented by the instruction after that. Finally, register r1 (the stack pointer) is incremented by 2 again, releasing the space on the stack, and the function returns.

A similar analysis can be made for the main function. The MSP-430 has a rich set of addressing modes, which allows for very compact code. Consider for example the bit-flip instruction:

  28:   fa e3 00 00     xor.b   #-1,    0(r10)  ;r3 As==11

At this point in the code, register r10 contains 0x21, the absolute address of the P1OUT register of port P1. This instruction reads a byte value from address 0x21, flips all the bits by xoring them with the immediate constant -1 (all ones in two's complement), and writes the result back to address 0x21. In a single instruction, we can see immediate addressing (#-1), indexed addressing (0(r10)), in-place operation, and size-specialized operation (xor.b)!
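The behavior of this one instruction can be written out in C as a small model. The sketch below is a host-side illustration, not MSP-430 code; the function name and its parameters are invented here to mirror the operands of the instruction.

```c
#include <stdint.h>

/* Model of `xor.b #imm, offset(reg)`: a byte read-modify-write in memory. */
void xor_b_indexed(uint8_t *memory, uint16_t reg, int16_t offset, int16_t imm) {
    uint16_t address = (uint16_t)(reg + offset);  /* indexed addressing: register plus offset */
    memory[address] ^= (uint8_t)imm;              /* .b uses only the low byte of the operand */
}
```

With reg set to 0x21, offset 0 and imm -1, a byte value of 0xA5 at address 0x21 becomes 0x5A, and a second call restores 0xA5.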

Linking with msp430-elf-gcc

/cygdrive/c/ti/msp430-gcc/bin/msp430-elf-gcc -mmcu=msp430c1111 -T linker.msp430.x tightloop.o -o tightloop.elf

Compiler Flag	Meaning
-mmcu=msp430c1111	Use msp430c1111 as the target microcontroller. A flag like this instructs the compiler to use a memory organization specific to msp430c1111
-T linker.msp430.x	Selects the linker description file, which describes how to map compiler sections to physical memory
tightloop.o	The input object file
-o tightloop.elf	The output executable file

The Executable and Linkable Format (ELF) output file can be inspected using the msp430-elf-objdump utility and the same command line switches as before. The file is slightly larger than the object file because the executable contains, besides the main function and the delay function, code to initialize and start the program. The section __reset_vector contains the reset vector, set to 0xa004.

$ msp430-elf-objdump.exe -s tightloop.elf

tightloop.elf:     file format elf32-msp430

Contents of section __reset_vector:
 fffe 04a0                                 ..
Contents of section .rodata2:
 a000 00000000                             ....
Contents of section .text:
 a004 31400042 b01248a0 0c43b012 22a03041  1@.B..H..C..".0A
 a014 21838143 00009153 00002153 30410a12  !..C...S..!S0A..
 a024 0912f243 2200b240 805a2001 f2432100  ...C"..@.Z ..C!.
 a034 7a402100 394014a0 fae30000 89123040  z@!.9@........0@
 a044 3ca03041 b01246a0 b01246a0 3041b012  <.0A..F...F.0A..
 a054 12a03041                             ..0A
Contents of section .MSP430.attributes:
 0000 41180000 006d7370 61626900 010d0000  A....mspabi.....
 0010 00040106 0108010a 01                 .........
Contents of section .comment:
 0000 4743433a 20284d69 74746f20 53797374  GCC: (Mitto Syst
 0010 656d7320 4c696d69 74656420 2d206d73  ems Limited - ms
 0020 70343330 2d676363 20382e32 2e302e35  p430-gcc 8.2.0.5
 0030 32292038 2e322e30 00                 2) 8.2.0.

The disassembled output reveals several additional functions, including __crt0_start, __crt0_call_init_then_main, __crt0_run_fini_array and __msp430_fini. They are used to set up the C runtime environment (crt), which includes initializing the stack pointer, clearing and initializing memory areas, and low-level processor initialization. Eventually, the C runtime calls the main function.

Unlike the disassembled object file, the disassembled executable file has correctly resolved all unknown references. For example the starting address of the code is within a valid program memory space, the delay function is correctly called at address 0xa014, and the absolute branches in the main function use the correct target.

$ msp430-elf-objdump.exe -d tightloop.elf

tightloop.elf:     file format elf32-msp430


Disassembly of section .text:

0000a004 <__crt0_start>:
    a004:       31 40 00 42     mov     #16896, r1      ;#0x4200

0000a008 <__crt0_call_init_then_main>:
    a008:       b0 12 48 a0     call    #-24504 ;#0xa048

0000a00c <.Loc.203.1>:
    a00c:       0c 43           clr     r12             ;

0000a00e <.Loc.204.1>:
    a00e:       b0 12 22 a0     call    #-24542 ;#0xa022

0000a012 <__crt0_run_fini_array>:
    a012:       30 41           ret

0000a014 <delay>:
    a014:       21 83           decd    r1              ;
    a016:       81 43 00 00     mov     #0,     0(r1)   ;r3 As==00
    a01a:       91 53 00 00     inc     0(r1)           ;
    a01e:       21 53           incd    r1              ;
    a020:       30 41           ret

0000a022 <main>:
    a022:       0a 12           push    r10             ;
    a024:       09 12           push    r9              ;
    a026:       f2 43 22 00     mov.b   #-1,    &0x0022 ;r3 As==11
    a02a:       b2 40 80 5a     mov     #23168, &0x0120 ;#0x5a80
    a02e:       20 01
    a030:       f2 43 21 00     mov.b   #-1,    &0x0021 ;r3 As==11
    a034:       7a 40 21 00     mov.b   #33,    r10     ;#0x0021
    a038:       39 40 14 a0     mov     #-24556,r9      ;#0xa014

0000a03c <.L3>:
    a03c:       fa e3 00 00     xor.b   #-1,    0(r10)  ;r3 As==11
    a040:       89 12           call    r9              ;
    a042:       30 40 3c a0     br      #0xa03c         ;

0000a046 <__crt0_run_init_array>:
    a046:       30 41           ret

0000a048 <__msp430_init>:
    a048:       b0 12 46 a0     call    #-24506 ;#0xa046

0000a04c <.Loc.19.1>:
    a04c:       b0 12 46 a0     call    #-24506 ;#0xa046

0000a050 <.Loc.20.1>:
    a050:       30 41           ret

0000a052 <__msp430_fini>:
    a052:       b0 12 12 a0     call    #-24558 ;#0xa012

0000a056 <L0>:
    a056:       30 41           ret
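Comparing the two disassembly listings shows what the linker did to the placeholders. The code from tightloop.o, which started at offset 0 in the object file, now starts at 0xa014, so every symbol address is that base plus the symbol's offset in the object file. The sketch below is a simplified host-side model of this step with hypothetical function names; a real linker works from relocation entries stored in the ELF object file.

```c
#include <stddef.h>
#include <stdint.h>

/* Final address of a symbol: section load address plus its offset in the object file. */
uint16_t resolve(uint16_t section_base, uint16_t offset) {
    return (uint16_t)(section_base + offset);
}

/* Overwrite a 16-bit placeholder in the code bytes, least significant byte first. */
void patch_word(uint8_t *text, size_t at, uint16_t value) {
    text[at]     = (uint8_t)(value & 0xFF);
    text[at + 1] = (uint8_t)(value >> 8);
}
```

For example, delay sits at offset 0x0000 and .L3 at offset 0x0028 in the object file, which resolve to 0xa014 and 0xa03c. Patching the placeholder of mov #0, r9 (bytes 39 40 00 00) with 0xa014 gives 39 40 14 a0, matching the executable listing.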

Measuring the size of the sections

msp430-elf-size is a utility that tells how large a program is. It lists the size of each section, in this case .text (instructions), .data (initialized global var), .bss (uninitialized global var). msp430-elf-size will tell you the size of the static portion of your program.

/cygdrive/c/ti/msp430-gcc/bin/msp430-elf-size tightloop.elf
   text    data     bss     dec     hex filename
     86       4       4      94      5e tightloop.elf

Loading the executable

To run the program on the MSP-430 (or to simulate it on a Verilog model of the MSP-430), we have to convert the executable file into a memory image, a copy of the initial memory contents of the MSP-430 right before the fetch of the first instruction at the reset vector. This activity is also called loading the executable. For a small embedded processor, computing the load image is straightforward.

/cygdrive/c/ti/msp430-gcc/bin/msp430-elf-objcopy -O ihex tightloop.elf tightloop.a43

The msp430-elf-objcopy command creates an MSP-430 memory image of the executable program in Intel HEX format (-O ihex) and writes it into the tightloop.a43 file. The contents of the file are shown next, with extra spacing added for readability. Each line is a record indicating the number of data bytes contained, the starting address where they should be stored, the record type (with 00 marking a data record), the data bytes, and finally a checksum for error detection.

% num_bytes address rectype data_bytes_and_checksum
:04         A000    00      0000 0000 5C
:10         A004    00      3140 0042 B012 48A0 0C43 B012 22A0 3041 AB
:10         A014    00      2183 8143 0000 9153 0000 2153 3041 0A12 EF
:10         A024    00      0912 F243 2200 B240 805A 2001 F243 2100 77
:10         A034    00      7A40 2100 3940 14A0 FAE3 0000 8912 3040 2C
:10         A044    00      3CA0 3041 B012 46A0 B012 46A0 3041 B012 3C
:04         A054    00      12A0 3041 E5
:02         FFFE    00      04A0 5D
:04         0000    03      0000 A004 55
:00         0000    01      FF

The first few records of the memory dump hold the contents of the .rodata2 and .text sections, starting at address A000 and going up to A057. Next, the reset vector is stored at address FFFE, indicating A004 as the reset entry point. The record of type 03 repeats the program start address (A004), and the final record, of type 01, serves as the end-of-file marker.
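The checksum at the end of each record can be verified by hand or with a short program. In Intel HEX, the checksum is the two's complement of the low byte of the sum of all preceding bytes in the record (byte count, both address bytes, record type, and data). The following host-side sketch uses an illustrative function name, ihex_checksum, which is not part of any toolchain.

```c
#include <stddef.h>
#include <stdint.h>

/* Checksum of an Intel HEX record: two's complement of the byte sum of
 * count, address (high and low byte), record type, and data bytes. */
uint8_t ihex_checksum(uint8_t count, uint16_t address, uint8_t rectype,
                      const uint8_t *data) {
    uint8_t sum = (uint8_t)(count + (address >> 8) + (address & 0xFF) + rectype);
    for (size_t i = 0; i < count; i++) {
        sum = (uint8_t)(sum + data[i]);
    }
    return (uint8_t)(0x100u - sum);
}
```

For the reset vector record (2 data bytes 04 A0 at address FFFE, type 00), the bytes sum to 0x2A3, the low byte is 0xA3, and the checksum is 0x5D, as in the dump above.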

Timing of the inner loop

Finally, we will determine the timing for the inner loop, which contains just three instructions, one of which is a function call. The three instructions, and the contents of the function call, are shown next.

0000a03c <.L3>:
    a03c:       fa e3 00 00     xor.b   #-1,    0(r10)  ;r3 As==11
    a040:       89 12           call    r9              ;
    a042:       30 40 3c a0     br      #0xa03c         ;

0000a014 <delay>:
    a014:       21 83           decd    r1              ;
    a016:       81 43 00 00     mov     #0,     0(r1)   ;r3 As==00
    a01a:       91 53 00 00     inc     0(r1)           ;
    a01e:       21 53           incd    r1              ;
    a020:       30 41           ret

We recall that the MSP-430 instruction timing only depends on the type of instruction (no-operand, single-operand or two-operands), and on the addressing modes used by those instructions. Refer to Table 3-15 of the MSP430x1x Family Users Guide.

Assignment: Determine the round-trip time of one iteration of the inner loop.
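Once the cycle count of each instruction has been looked up, the remaining arithmetic is a sum and a division by the clock frequency. The helper below is a host-side sketch with hypothetical names; it deliberately contains no cycle counts for the listing above, since those are to be taken from the instruction timing table as part of the assignment.

```c
#include <stddef.h>

/* Sum per-instruction cycle counts for one pass through the loop. */
unsigned total_cycles(const unsigned *cycles, size_t n) {
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += cycles[i];
    }
    return sum;
}

/* Duration of one iteration in microseconds for a given CPU clock in Hz. */
double iteration_time_us(unsigned cycles, double clock_hz) {
    return (double)cycles * 1e6 / clock_hz;
}
```

As a made-up example unrelated to the listing, three instructions taking 2, 3 and 5 cycles add up to 10 cycles, which is 1.25 microseconds at an 8 MHz clock. The spin rate is the reciprocal of the iteration time.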

Conclusions

We discussed the implementation of the MSP-430 software design flow in more detail, covering compilation, linking, and loading. These steps are also used for more complex processors such as Nios-II and ARM-A9. Of course, the software environment, such as the presence of a real-time OS, will affect the precise implementation of the linking and loading steps. Regardless of the context, insight into the software design flow is valuable to an embedded designer.