Lecture 3 - The MSP-430 Software Design Flow
Introduction
So far, we discussed the architecture of the MSP-430 and gave a brief overview of the MSP-430 instruction set architecture.
-
The MSP-430 is a 16-bit microcontroller. It has a 16-bit address bus and a 16-bit data bus. It can address 64K address locations as byte-addressable memory. That memory holds the program and the data. Recent versions of the MSP-430 architecture use an extended 20-bit address providing up to 1 MByte of addressable memory. Such extended architectures are called MSP-430X - but they will not be considered in this course.
-
The MSP-430 memory map uses the following conventions. Special Function Registers and peripherals are located at the bottom of the memory, in the lowest 512 bytes (addresses 0x0000 to 0x01FF). The RAM area, used for global and local variables, starts just above that and extends toward higher addresses. For example, 4 Kbyte of RAM (an address range of 0x1000) would be mapped from 0x0200 to 0x11FF. The ROM area, used for program storage, grows from the top of memory down. For example, 8 Kbyte of ROM (an address range of 0x2000) would be mapped from 0xE000 to 0xFFFF. This leaves a hole between the RAM and ROM when less than 64 Kbyte of memory is physically present in the MSP-430 configuration. The topmost memory locations in the memory space are reserved for an interrupt vector table, where each interrupt vector is a 16-bit address. Thus, to store 16 interrupt vectors, we use 32 bytes in the region 0xFFE0 to 0xFFFF. The most important interrupt vector is the reset vector, which holds the starting address of the program running on power-up. The reset vector is stored at 0xFFFE and 0xFFFF.
-
The MSP-430 is little endian. This means that the least significant byte is stored at the lower memory address. For example, the 16-bit constant 0x1234, stored at byte address 0x1000, is mapped as follows: 0x34 is stored at address 0x1000 and 0x12 is stored at address 0x1001. This becomes apparent when looking at a memory dump or at assembler listing files. For example:
0000a004 <__crt0_start>:
a004: 31 40 00 42 mov #16896, r1 ;#0x4200
The opcode for this instruction consists of two 16-bit words, 0x4031 and 0x4200. But the assembly listing shows that addresses 0xa004 to 0xa007 hold the byte sequence 0x31, 0x40, 0x00 and 0x42: each word is stored with its least significant byte first.
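The byte order in the listing can be checked with a small host-side sketch. The helper name read_le16 and the array crt0_start_bytes are made up for this illustration; the four byte values are the ones shown at addresses 0xa004 to 0xa007.

```c
#include <stdint.h>

/* Assemble a 16-bit word from two bytes stored in little-endian order:
   the byte at the lower address is the least significant byte.
   (Hypothetical helper, not part of any MSP-430 library.) */
uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* The four bytes found at addresses 0xa004..0xa007 in the listing. */
const uint8_t crt0_start_bytes[4] = { 0x31, 0x40, 0x00, 0x42 };
```

Reading the first pair of bytes with read_le16 gives the instruction word 0x4031, and reading the second pair gives the immediate operand 0x4200.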
Today, we will discuss the software design tools for the MSP-430 in further detail. We break down the meaning of the command line, review the process of compiling, linking, and loading, and observe the execution of an MSP-430 program. As an example, we will determine the ‘spin rate’ of a tight loop written in C. We will use static analysis on the assembly listing to answer the question on the spin rate.
MSP-430 C Tidbits
Writing C for the MSP-430 follows the same rules as writing C for any other processor. There are a few machine dependencies, including the size of variables. A 16-bit microcontroller like MSP-430 defines a 16-bit integer. Within the 16-bit memory space, variables are aligned on byte-boundaries or word boundaries.
Data Type | Size (bits) | Alignment (bits) |
---|---|---|
char | 8 | 8 |
bool | 8 | 8 |
short | 16 | 16 |
int | 16 | 16 |
long | 32 | 16 |
long long | 64 | 16 |
float | 32 | 16 |
double | 64 | 16 |
pointer (small) | 16 | 16 |
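
One practical consequence of a 16-bit int is that intermediate results overflow sooner than on a desktop machine. The sketch below, with the made-up function name mul16, widens one operand before multiplying so that the product is computed in 32 bits regardless of the size of int.

```c
#include <stdint.h>

/* Multiply two 16-bit values without overflowing a 16-bit int.
   Casting one operand first forces a 32-bit multiplication.
   (Illustrative helper; the name is an assumption.) */
int32_t mul16(int16_t a, int16_t b)
{
    return (int32_t)a * b;
}
```

For example, 200 * 200 = 40000 does not fit in a signed 16-bit int (maximum 32767), but mul16(200, 200) returns the correct value.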
MSP-430 GCC follows the MSP430 Application Binary Interface specifications defined by Texas Instruments. An Application Binary Interface (ABI) defines how a compiler organizes the output code, and how it implements functions and variables. For example, the ABI defines the size of data types such as integers and chars (section 2), the mechanics of a function call (section 3), the organization of the data and the stack (section 4) and the organization of the code (section 5), among other things. A detailed study of the ABI is not needed to understand the compiler output of MSP-430 GCC, but the document helps you navigate the assembly listings that the compiler produces.
MSP-430 C Compiler
The MSP-430 C compiler converts a source code program in C into an executable program for the MSP-430. The compiler flow typically involves three activities.
-
The compiler converts the C source code into an object file, a file that contains MSP-430 instructions corresponding to the functions of the C code. While generating this code, the compiler takes additional steps such as optimizing the sequence of instructions, expanding complex operations into simple operations supported by the MSP-430 instruction set, and allocating the C variables to MSP-430 data memory and MSP-430 registers. Technically, the transformation of C source code into MSP-430 instructions is done in two steps. The first step, the compiler front-end, converts the high-level C language into an intermediate format that describes the C program in terms of generic register-to-register transfers. The second step, the compiler back-end, converts the intermediate register-transfer code into specific machine instructions. While the front-end is specific to the programming language used (C, C++, Ada, ...), the back-end is specific to the processor targeted.
-
The linker combines all object files, and possibly system libraries, together into an executable format. The main task of the linker is to resolve all unknown references in each object file. The linker also handles the placement of code at physical memory locations, following the directions given in the linker description file. Resolving an unknown reference (a function or a global variable) means that the linker will determine the address of the reference, such that the function can be called or the variable can be read. For example, a C program may make use of a function printf. The object code for this C program will contain a call to the printf function, but the call target address remains unknown until the object code is linked into an executable.
-
The loader converts the executable format into a memory-image format, a concrete representation of every part of the program precisely as it appears in the program and data memory of the MSP-430.
Each object file contains several sections, each of them serving a specific purpose for the compiled program. The .text section holds all the instructions, the .data section holds initialized global variables, and the .bss section holds uninitialized global variables. There is no restriction on how many sections a compiler can create; .text, .data and .bss are the three most important ones, but the MSP-430 Application Binary Interface defines several others. As the linker inspects the sections of each object file to be linked, it will group sections of the same type together.
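As a rough illustration of how C declarations relate to sections, consider the sketch below. All names are invented for this example, and the section noted in each comment is the typical placement; the exact section names chosen by a given compiler version and linker description file may differ.

```c
#include <stdint.h>

int16_t threshold = 100;                 /* initialized global: typically .data   */
int16_t sample_count;                    /* uninitialized global: typically .bss  */
const int16_t gain_table[2] = { 1, 2 };  /* read-only data: typically .rodata     */

/* The instructions of a function body end up in .text. */
int16_t over_threshold(int16_t sample)
{
    sample_count++;
    return sample > threshold;
}
```

Variables in .bss start out as zero because the C runtime clears that area before main is called, which is why sample_count needs no explicit initializer.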
As the linker groups sections together, the overall program is formed, and eventually, all unknown references have to be resolved. When, at the end of the linking process, there are still unknown references, the linking process will abort with an error indicating the unresolved reference. Such an error may occur, for example, when an incomplete list of object files is used to create the executable, or when there are missing system libraries.
The executable will eventually contain a .text, .data and .bss section, each of which is the combination of the corresponding sections of the linked object files and libraries. The linker then maps each section into a designated memory area, under the direction of the linker description file. It is quite possible that the linker runs out of space and finds that the available memory is too small to hold the entire .text, .data or .bss section. In that case, the linker will abort with an error indicating that the available memory in the target MSP-430 is too small.
The linker can only detect static memory usage, such as the memory used for program text and for global variables. Dynamic memory use, such as the maximum depth of the stack or the amount of heap, cannot be determined by the linker as it is oblivious to the execution path of the program.
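A small sketch makes the distinction concrete. In the hypothetical function below, the code size is fixed at link time, but the stack depth depends on the argument passed at run time (assuming the compiler does not transform the recursion into a loop), so the linker has no way to account for it.

```c
#include <stdint.h>

/* Recursive sum 0 + 1 + ... + n. Each nested call adds a stack frame,
   so the peak stack usage grows with the run-time value of n.
   (Illustrative only; an iterative version would be preferred on a
   small microcontroller.) */
uint16_t sum_to(uint16_t n)
{
    if (n == 0) {
        return 0;
    }
    return (uint16_t)(n + sum_to((uint16_t)(n - 1)));
}
```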
A tight loop
We will study the execution time of the following program in C. The program flips the bits of the P1 output port, calling a function delay() between flips. We will work through each step of the compilation process, studying the output produced by each step.
#include "omsp_system.h"

void delay() {
    volatile int k = 0;
    k = k + 1;
}

int main(void) {
    P1DIR = 0xFF;                // initialize for output
    WDTCTL = WDTPW | WDTHOLD;    // Disable watchdog timer
    P1OUT = 0xFF;
    while (1) {
        P1OUT = ~P1OUT;
        delay();
    }
    return 0;
}
You can download and compile the example in the same way as the other examples in the course, using git and make.
$ git clone https://github.com/vt-ece4530-f19/example-tightloop
$ cd example-tightloop/
$ make
Before discussing the compilation steps, it is helpful to consult the omsp_system.h file, as it contains the definitions of special names such as P1DIR and P1OUT. These are defined as macros that dereference a volatile pointer whose value is a constant address in the 64K address range of the MSP-430. Using such a name in an expression or an assignment results in a read or write operation at an absolute memory address.
#define P1OUT (*(volatile unsigned char *) 0x0021)
#define P1DIR (*(volatile unsigned char *) 0x0022)
These addresses match the port P1 register addresses defined in the MSP430x1x Family Users Guide.
Figure: MSP-430x1x P1 Address Definitions
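The effect of these macros can be tried out on a host computer with the sketch below. It is not the content of omsp_system.h: the array fake_mem and the macro REG8 are assumptions introduced here so that the absolute addresses land in ordinary memory instead of real peripheral registers.

```c
#include <stdint.h>

/* Host-side stand-in for the lowest 512 bytes of the MSP-430 address
   space. On the real target the macros use the absolute address
   directly, as in omsp_system.h. */
uint8_t fake_mem[0x200];

#define REG8(addr) (*(volatile uint8_t *)&fake_mem[(addr)])
#define P1OUT REG8(0x0021)
#define P1DIR REG8(0x0022)

/* Same access pattern as the loop body in main(): one byte read and
   one byte write at a fixed address. */
void toggle_port(void)
{
    P1OUT = (uint8_t)~P1OUT;
}
```

After P1OUT = 0xFF, one call to toggle_port leaves 0x00 at offset 0x21 of fake_mem, and a second call restores 0xFF. The volatile qualifier keeps the compiler from optimizing away or merging these accesses.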
Compiling with msp430-elf-gcc
/cygdrive/c/ti/msp430-gcc/bin/msp430-elf-gcc -Wall -Os -mmcu=msp430c1111 -c main.c -o tightloop.o
Compiler Flag | Meaning |
---|---|
-Wall | Enable a broad set of commonly useful warnings (despite the name, not every warning GCC offers). This is a cautious approach: inspect the messages from your compiler; they often hint at problems in the code |
-Os | Optimize for size (Os). Ask the compiler to generate smallest possible code even at the expense of performance |
-mmcu=msp430c1111 | Use msp430c1111 as the target microcontroller. A flag like this instructs the compiler to select an instruction set specific to msp430c1111 |
-c main.c | The input C file; -c flag means compile-only |
-o tightloop.o | The output object file |
Many additional flags and options are possible; inspect them with man msp430-elf-gcc or with msp430-elf-gcc --help.
The utility msp430-elf-objdump helps you to inspect the output file tightloop.o. The contents of the sections generated by the compiler can be dumped with the -s flag (the -h flag lists only the section headers). In this case, tightloop.o has a .text section, as well as a .comment section and an .MSP430.attributes section. Only the .text section contributes actual MSP-430 instructions.
$ msp430-elf-objdump -s tightloop.o
tightloop.o: file format elf32-msp430
Contents of section .text:
0000 21838143 00009153 00002153 30410a12 !..C...S..!S0A..
0010 0912f243 2200b240 805a2001 f2432100 ...C"..@.Z ..C!.
0020 7a402100 39400000 fae30000 89123040 z@!.9@........0@
0030 0000 ..
Contents of section .comment:
0000 00474343 3a20284d 6974746f 20537973 .GCC: (Mitto Sys
0010 74656d73 204c696d 69746564 202d206d tems Limited - m
0020 73703433 302d6763 6320382e 322e302e sp430-gcc 8.2.0.
0030 35322920 382e322e 3000 52) 8.2.0.
Contents of section .MSP430.attributes:
0000 41180000 006d7370 61626900 010d0000 A....mspabi.....
0010 00040106 0108010a 01 .........
It is possible to disassemble the .text section using the -d flag. This code is not yet processed by the linker, and this can be spotted in several locations.
-
The disassembly starts at address 0, while the actual program in memory should start at a higher address (e.g., for 8 KB of program memory, the starting address should be 0xE000 or above).
-
The delay() function is called in the main function, using the instruction call r9. However, register r9 is loaded with zero before the inner loop starts: the absolute target address of the delay function is still unknown.
-
The branch instruction at the end of the inner loop branches to address 0, while it should branch back to the top of the loop at label .L3. The absolute branch target address is still unknown.
$ msp430-elf-objdump -d tightloop.o
tightloop.o: file format elf32-msp430
Disassembly of section .text:
00000000 <delay>:
0: 21 83 decd r1 ;
2: 81 43 00 00 mov #0, 0(r1) ;r3 As==00
6: 91 53 00 00 inc 0(r1) ;
a: 21 53 incd r1 ;
c: 30 41 ret
0000000e <main>:
e: 0a 12 push r10 ;
10: 09 12 push r9 ;
12: f2 43 22 00 mov.b #-1, &0x0022 ;r3 As==11
16: b2 40 80 5a mov #23168, &0x0120 ;#0x5a80
1a: 20 01
1c: f2 43 21 00 mov.b #-1, &0x0021 ;r3 As==11
20: 7a 40 21 00 mov.b #33, r10 ;#0x0021
24: 39 40 00 00 mov #0, r9 ;
00000028 <.L3>:
28: fa e3 00 00 xor.b #-1, 0(r10) ;r3 As==11
2c: 89 12 call r9 ;
2e: 30 40 00 00 br #0x0000 ;
It is useful to dig a little deeper into this code and explain what it does. The delay function starts by decrementing register r1 by two (decd); r1 serves as the stack pointer on the MSP-430. Hence, this operation makes room on the stack for a new 16-bit local variable. The variable is cleared by the next instruction and incremented by the instruction after that. Finally, register r1 (the stack pointer) is incremented by two again, releasing the space on the stack, and the function returns.
A similar analysis can be made for the main function. The MSP-430 has a rich set of addressing modes, which allows for very compact code. Consider for example the bit-flip instruction:
28: fa e3 00 00 xor.b #-1, 0(r10) ;r3 As==11
At this point in the code, register r10 contains 0x21, the absolute address of the P1OUT register. This instruction reads a byte from address 0x21, flips all the bits by XORing them with the immediate constant -1 (all ones in two's complement), and writes the result back to address 0x21. In a single instruction, we see immediate addressing (#-1), indexed addressing (0(r10)), in-place operation, and size-specialized operation (xor.b)!
Linking with msp430-elf-gcc
/cygdrive/c/ti/msp430-gcc/bin/msp430-elf-gcc -mmcu=msp430c1111 -T linker.msp430.x tightloop.o -o tightloop.elf
Compiler Flag | Meaning |
---|---|
-mmcu=msp430c1111 | Use msp430c1111 as the target microcontroller. A flag like this instructs the compiler to use a memory organization specific to msp430c1111 |
-T linker.msp430.x | Selects the linker description file, which describes how to map compiler sections to physical memory |
tightloop.o | The input object file |
-o tightloop.elf | The output executable file |
The Executable and Linkable Format (ELF) output file can be inspected using the msp430-elf-objdump utility and the same command line switches as before. The file is somewhat larger than the object file because the executable contains, besides the main function and the delay function, code to initialize and start the program. The section __reset_vector contains the reset vector, set to 0xa004.
$ msp430-elf-objdump.exe -s tightloop.elf
tightloop.elf: file format elf32-msp430
Contents of section __reset_vector:
fffe 04a0 ..
Contents of section .rodata2:
a000 00000000 ....
Contents of section .text:
a004 31400042 b01248a0 0c43b012 22a03041 1@.B..H..C..".0A
a014 21838143 00009153 00002153 30410a12 !..C...S..!S0A..
a024 0912f243 2200b240 805a2001 f2432100 ...C"..@.Z ..C!.
a034 7a402100 394014a0 fae30000 89123040 z@!.9@........0@
a044 3ca03041 b01246a0 b01246a0 3041b012 <.0A..F...F.0A..
a054 12a03041 ..0A
Contents of section .MSP430.attributes:
0000 41180000 006d7370 61626900 010d0000 A....mspabi.....
0010 00040106 0108010a 01 .........
Contents of section .comment:
0000 4743433a 20284d69 74746f20 53797374 GCC: (Mitto Syst
0010 656d7320 4c696d69 74656420 2d206d73 ems Limited - ms
0020 70343330 2d676363 20382e32 2e302e35 p430-gcc 8.2.0.5
0030 32292038 2e322e30 00 2) 8.2.0.
The disassembled output reveals several additional functions, including __crt0_start, __crt0_call_init_then_main, __crt0_run_fini_array and __msp430_fini. They are used to set up the C runtime environment (crt): initializing the stack pointer, clearing and initializing memory areas, and performing low-level processor initialization. Eventually, the C runtime calls the main function.
Unlike the disassembled object file, the disassembled executable file has correctly resolved all unknown references. For example, the starting address of the code is within a valid program memory space, the delay function is correctly called at address 0xa014, and the absolute branches in the main function use the correct target.
$ msp430-elf-objdump.exe -d tightloop.elf
tightloop.elf: file format elf32-msp430
Disassembly of section .text:
0000a004 <__crt0_start>:
a004: 31 40 00 42 mov #16896, r1 ;#0x4200
0000a008 <__crt0_call_init_then_main>:
a008: b0 12 48 a0 call #-24504 ;#0xa048
0000a00c <.Loc.203.1>:
a00c: 0c 43 clr r12 ;
0000a00e <.Loc.204.1>:
a00e: b0 12 22 a0 call #-24542 ;#0xa022
0000a012 <__crt0_run_fini_array>:
a012: 30 41 ret
0000a014 <delay>:
a014: 21 83 decd r1 ;
a016: 81 43 00 00 mov #0, 0(r1) ;r3 As==00
a01a: 91 53 00 00 inc 0(r1) ;
a01e: 21 53 incd r1 ;
a020: 30 41 ret
0000a022 <main>:
a022: 0a 12 push r10 ;
a024: 09 12 push r9 ;
a026: f2 43 22 00 mov.b #-1, &0x0022 ;r3 As==11
a02a: b2 40 80 5a mov #23168, &0x0120 ;#0x5a80
a02e: 20 01
a030: f2 43 21 00 mov.b #-1, &0x0021 ;r3 As==11
a034: 7a 40 21 00 mov.b #33, r10 ;#0x0021
a038: 39 40 14 a0 mov #-24556,r9 ;#0xa014
0000a03c <.L3>:
a03c: fa e3 00 00 xor.b #-1, 0(r10) ;r3 As==11
a040: 89 12 call r9 ;
a042: 30 40 3c a0 br #0xa03c ;
0000a046 <__crt0_run_init_array>:
a046: 30 41 ret
0000a048 <__msp430_init>:
a048: b0 12 46 a0 call #-24506 ;#0xa046
0000a04c <.Loc.19.1>:
a04c: b0 12 46 a0 call #-24506 ;#0xa046
0000a050 <.Loc.20.1>:
a050: 30 41 ret
0000a052 <__msp430_fini>:
a052: b0 12 12 a0 call #-24558 ;#0xa012
0000a056 <L0>:
a056: 30 41 ret
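
Comparing the object file with the executable, the instruction that loads r9 changed from the bytes 39 40 00 00 to 39 40 14 a0. The sketch below mimics that single fix-up on a host computer. It is a simplification under stated assumptions: a real linker works from relocation entries stored in the object file, and the names patch_le16 and mov_r9 are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Overwrite a 16-bit little-endian field inside a code image.
   (Hypothetical helper illustrating what resolving a reference
   amounts to at the byte level.) */
void patch_le16(uint8_t *image, size_t offset, uint16_t value)
{
    image[offset]     = (uint8_t)(value & 0xFF);
    image[offset + 1] = (uint8_t)(value >> 8);
}

/* "mov #0, r9" as it appears at offset 0x24 of tightloop.o:
   the instruction word 0x4039 followed by a zero placeholder. */
uint8_t mov_r9[4] = { 0x39, 0x40, 0x00, 0x00 };
```

Calling patch_le16(mov_r9, 2, 0xA014) turns the placeholder into the address of delay, giving the byte sequence 39 40 14 a0 seen at address a038 in the executable.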
Measuring the size of the sections
msp430-elf-size is a utility that reports how large a program is. It lists the size of each section, in this case .text (instructions), .data (initialized global variables), and .bss (uninitialized global variables). msp430-elf-size thus tells you the size of the static portion of your program.
/cygdrive/c/ti/msp430-gcc/bin/msp430-elf-size tightloop.elf
text data bss dec hex filename
86 4 4 94 5e tightloop.elf
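
The dec and hex columns are simply the sum of the three section sizes, shown in decimal and hexadecimal. A minimal sketch of that arithmetic, with the made-up function name static_size:

```c
#include <stdint.h>

/* Total static footprint as reported in the "dec" column:
   the sum of the text, data and bss sizes in bytes. */
uint32_t static_size(uint32_t text, uint32_t data, uint32_t bss)
{
    return text + data + bss;
}
```

For the output above, static_size(86, 4, 4) gives 94, which is 0x5e in hexadecimal.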
Loading the executable
To run the program on the MSP-430 (or to simulate it on a Verilog model of the MSP-430), we have to convert the executable file into a memory image, a copy of the initial memory contents of the MSP-430 right before the fetch of the first instruction at the reset vector. This activity is also called loading the executable. For a small embedded processor, computing the load image is straightforward.
/cygdrive/c/ti/msp430-gcc/bin/msp430-elf-objcopy -O ihex tightloop.elf tightloop.a43
The msp430-elf-objcopy command creates a memory dump of the executable program in Intel HEX format (-O ihex) and writes it into the tightloop.a43 file. The contents of the file are shown next, with spacing added for readability. Each line is a record indicating the number of data bytes contained, the starting address where they should be stored, the record type (with 00 marking a data record), the data bytes, and finally a checksum for error detection.
% num_bytes address rectype data_bytes_and_checksum
:04 A000 00 0000 0000 5C
:10 A004 00 3140 0042 B012 48A0 0C43 B012 22A0 3041 AB
:10 A014 00 2183 8143 0000 9153 0000 2153 3041 0A12 EF
:10 A024 00 0912 F243 2200 B240 805A 2001 F243 2100 77
:10 A034 00 7A40 2100 3940 14A0 FAE3 0000 8912 3040 2C
:10 A044 00 3CA0 3041 B012 46A0 B012 46A0 3041 B012 3C
:04 A054 00 12A0 3041 E5
:02 FFFE 00 04A0 5D
:04 0000 03 0000 A004 55
:00 0000 01 FF
The first few lines of the memory dump are the contents of program memory, starting at address A000 and going up to A057. Next, the reset vector is stored at address FFFE, indicating A004 as the reset entry point. The line with record type 03 is a start address record that gives A004 as the program entry point, and the final line, with record type 01, serves as the end-of-file marker.
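The checksum at the end of each record can be verified by hand or with a few lines of C. In the Intel HEX format, the checksum is the two's complement of the low byte of the sum of all preceding bytes in the record (byte count, address, record type and data). The function and array names below are invented for this sketch; the byte values are taken from the reset vector record and the final record of the dump above.

```c
#include <stddef.h>
#include <stdint.h>

/* Intel HEX checksum: two's complement of the low byte of the sum of
   all record bytes before the checksum. */
uint8_t ihex_checksum(const uint8_t *rec, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + rec[i]);
    }
    return (uint8_t)(0x100u - sum);
}

/* ":02 FFFE 00 04A0 5D" without the trailing checksum byte. */
const uint8_t reset_record[6] = { 0x02, 0xFF, 0xFE, 0x00, 0x04, 0xA0 };

/* ":00 0000 01 FF" without the trailing checksum byte. */
const uint8_t last_record[4] = { 0x00, 0x00, 0x00, 0x01 };
```

Applied to reset_record the function returns 0x5D, and applied to last_record it returns 0xFF, matching the last byte of each line in the dump.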
Timing of the inner loop
Finally, we will determine the timing for the inner loop, which contains just three instructions, one of which is a function call. The three instructions, and the contents of the function call, are shown next.
0000a03c <.L3>:
a03c: fa e3 00 00 xor.b #-1, 0(r10) ;r3 As==11
a040: 89 12 call r9 ;
a042: 30 40 3c a0 br #0xa03c ;
0000a014 <delay>:
a014: 21 83 decd r1 ;
a016: 81 43 00 00 mov #0, 0(r1) ;r3 As==00
a01a: 91 53 00 00 inc 0(r1) ;
a01e: 21 53 incd r1 ;
a020: 30 41 ret
We recall that the MSP-430 instruction timing only depends on the type of instruction (no-operand, single-operand or two-operands), and on the addressing modes used by those instructions. Refer to Table 3-15 of the MSP430x1x Family Users Guide.
Assignment: Determine the round-trip time of one iteration of the inner loop.
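Once the cycle count of one iteration is known, converting it to time only requires the CPU clock frequency. The helper below is a sketch of that conversion; the function name is an assumption, and the cycle count itself has to come from your own analysis of the instruction timing table.

```c
#include <stdint.h>

/* Time for one loop iteration in nanoseconds, given the total cycle
   count of the iteration and the CPU clock frequency in Hz.
   The 64-bit intermediate avoids overflow in the multiplication. */
uint32_t loop_time_ns(uint32_t cycles, uint32_t clock_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000ull) / clock_hz);
}
```

As a placeholder example that is not the answer to the assignment, a 10-cycle iteration at a 1 MHz clock would take loop_time_ns(10, 1000000) = 10000 ns. Since P1OUT is flipped once per iteration, a full period of the output waveform spans two iterations.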
Conclusions
We discussed the implementation of the MSP-430 software design flow in more detail, covering compilation, linking, and loading. These steps are also used for more complex processors such as Nios-II and ARM-A9. Of course, the software environment, such as the presence of a real-time OS, will affect the precise implementation of the linking and loading step. Regardless of the context, the insight into the software design flow is valuable to an embedded designer.