ECE 2534 Spring 2018 Bit Manipulation
-------------------------------------

10:10 Bit Manipulation

What is 'bit manipulation' and why do we need it?

'Bit manipulation' is the term used when we work with the individual
bits of variables in a C program.

In microcontroller programs, we often work with variables where
each individual bit of the variable has its own meaning.

For example, think of GPIO ports: there are several control
bits for each chip pin. For pin P1.0, these are:

    P1OUT.0     the output bit
    P1IN.0      the input bit
    P1DIR.0     the direction bit
    P1REN.0     the pullup bit
    (as well as a few others)
    
The microcontroller organizes these bits in bytes, which work
on 8 chip pins at the same time.

    P1OUT       the output bits for P1.7 down to P1.0
    P1IN        the input bits for P1.7 down to P1.0
    P1DIR       the direction bits for P1.7 down to P1.0
    P1REN       the pullup bits for P1.7 down to P1.0
    
These variables, P1OUT, P1IN, P1DIR and P1REN are available as
pseudo-variables in C code. They have an equivalent type
of unsigned char (i.e., a byte).

In this lecture, we will learn:

    - how to read individual bits in a C variable
    - how to set/reset individual bits in a C variable
    - how to shift/rotate individual bits in a C variable

10:14 Data Types

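The C programming language supports multiple data types.
We will work with the following. The sizes listed are those of
a 32-bit ARM compiler, as used for the MSP432; other compilers
may use different sizes.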

                    bits        smallest    largest

signed char         8           -128        127
unsigned char       8           0           255
signed short        16          -32768      32767
unsigned short      16          0           65535
signed long         32          -2^31       2^31-1
unsigned long       32          0           2^32-1
signed long long    64          -2^63       2^63-1
unsigned long long  64          0           2^64-1

signed int          32          -2^31       2^31-1
unsigned int        32          0           2^32-1

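- Note that 'signed' is the default for short, int, long and long long.
  Plain 'char' is the exception: whether it is signed or unsigned
  depends on the compiler (ARM compilers typically make it unsigned).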

  So, writing
  
     signed short p;
     
  is the same as writing
  
     short p;
     
  It's good practice to always write signed/unsigned.
  It keeps your attention focused on exactly what
  variable type you work with.
  
  Some programmers will select variable names that
  reflect the data type. For example
  
        unsigned char ucLedValues;

  This is a form of defensive programming.
        
- Note that there is no 1-bit data type.

  We can store a single bit in an unsigned char
  
    unsigned char c;
    
    c = 1;
    
    if (c)  ...  // tests if bit is set
    if (!c) ...  // tests if bit is reset
  
    This code leaves 7 bits unused
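
The sizes in the table above depend on the compiler; the C language itself
only guarantees minimum sizes. A minimal sketch, assuming a C99 compiler
with <stdint.h>, of types whose width is fixed by their name. The helper
names bits_in and width_* are made up for this example.

```c
#include <limits.h>   /* CHAR_BIT: number of bits in one char */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* fixed-width types (C99) */

/* Hypothetical helper: width in bits of a type that occupies 'size' bytes. */
unsigned bits_in(size_t size)
{
    return (unsigned)(size * CHAR_BIT);
}

/* Widths of the fixed-width types; these do not depend on the compiler. */
unsigned width_u8(void)  { return bits_in(sizeof(uint8_t));  }
unsigned width_s16(void) { return bits_in(sizeof(int16_t));  }
unsigned width_u32(void) { return bits_in(sizeof(uint32_t)); }
unsigned width_u64(void) { return bits_in(sizeof(uint64_t)); }
```

Writing uint8_t instead of unsigned char serves the same purpose as always
writing signed/unsigned: the declaration states exactly what you work with.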

10:20 Overflow

- When a variable 'overflows', C will not cause an error.
  Instead, it only drops the excess bits and then carries on.
  
  So
  
    unsigned char b;
    b = 255;
    b++;
    
  will clear b. To see why, rewrite the value in b in binary
  and perform a binary addition. Then, re-interpret the binary
  value in the selected data type.
  
    b = 255 = (11111111)2
    b + 1             1
               --------
              100000000

  So you get a carry bit and 8 LSB equal to 0. Reading the 8 lower
  bits as an unsigned char, you read 0.
    
  Similarly
  
    signed char b;
    b = -128;
    b = b - 1;
    
    How much is b?
    
    b = -128 = (10000000)
    b - 1       11111111  <- two's complement of -1
                --------
               101111111

  So you get a carry bit, a 0, and seven 1s.
  Reading the 8 lower bits as signed char, you read 127.
  So oddly, -128 - 1 = 127. C does not complain.
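
The two wrap-around cases above can be tried out with a sketch like the
following. The function names are made up for this example. Note the
assumption in the signed case: the C standard leaves the result of storing
an out-of-range value in a signed type to the compiler, and the sketch
relies on the usual two's complement behavior.

```c
#include <stdint.h>

/* Unsigned wrap-around: well defined, the result is taken modulo 256. */
uint8_t increment_u8(uint8_t b)
{
    b++;
    return b;
}

/* Signed case: b - 1 is computed as an int (-129 for b = -128) and then
 * converted back to 8 bits. The conversion is implementation-defined;
 * two's complement compilers such as GCC keep the 8 lower bits. */
int8_t decrement_s8(int8_t b)
{
    b = (int8_t)(b - 1);
    return b;
}
```

With these, increment_u8(255) yields 0, and on a two's complement compiler
decrement_s8(-128) yields 127.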

  When you choose a data type for your application, always make sure
  that the data type can hold all the values that you're
  interested in.
  
  E.g., if you wish to count to 1000, then an unsigned char is not a good
  choice because it can only count to 255.
  
10:25 Hexadecimal, Decimal, Binary

  Each number can be represented in multiple number systems: hex, decimal, binary.
  In the implementation, all these numbers have an identical representation
  in bits.
  
  Hence, printing or writing something in decimal, hex, binary is a matter of
  convention.
  
  One nibble, or 4 bits, is the smallest quantity that can hold all 16 values
  of one hex digit.
  
    Decimal Hex     Binary
    ----------------------
    0       0       0000
    1       1       0001
    2       2       0010
    3       3       0011
    4       4       0100
    5       5       0101
    6       6       0110
    7       7       0111
    8       8       1000
    9       9       1001
    10      A       1010
    11      B       1011
    12      C       1100
    13      D       1101
    14      E       1110
    15      F       1111
    
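  When you write constants in C, you can write them in decimal or hex. Binary
  constants (0b...) only became standard in C23; before that they exist only
  as a compiler extension, so we will not rely on them.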
  
  So
    unsigned char b = 11;
  is the same as
    unsigned char b = 0xb;
    
  Hex is convenient because it offers quick conversion between binary and
  hex.
  
  For example, how to capture 11001110 as a hex number?
  Split the binary number in groups of four bits:
  
        1100 1110
        
  and replace each group with a hex digit from the table above
  
        C    E
        
  Therefore, 11001110 as a C constant can be written as 0xce
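
To check such a conversion by machine, one option is a small helper that
reads a string of binary digits. This is only a sketch; from_binary is a
made-up name, not a library function.

```c
#include <stdint.h>

/* Hypothetical helper: convert a string of '0'/'1' characters to a value.
 * Any other character (such as a space between nibbles) is skipped. */
uint32_t from_binary(const char *s)
{
    uint32_t value = 0;
    for (; *s != '\0'; s++) {
        if (*s == '0' || *s == '1') {
            /* make room for one more bit, then append it */
            value = (value << 1) | (uint32_t)(*s - '0');
        }
    }
    return value;
}
```

For example, from_binary("1100 1110") gives the same value as the
constant 0xce.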
  
10:30 Bit mask

A bit mask is a pattern that marks bit positions in a word.
For example, to indicate the third bit in a char, the bit mask
would be:

    0000 0100
    
Note that we count bits from the right (lsb position).

Here is a mask for the third and fourth bit in a char:

    0000 1100
    
The driverlib (technically, the include file with low-level
definitions for the MSP432P4 microcontroller) defines a few
macros for us that are useful when we construct bit masks.

    BIT0    is  0x1
    BIT1    is  0x2
    BIT2    is  0x4
    BIT3    is  0x8
    BIT4    is  0x10
    BIT5    is  0x20
    BIT6    is  0x40
    BIT7    is  0x80
    BIT8    is  0x100
    BIT9    is  0x200
    BITA    is  0x400
    BITB    is  0x800
    BITC    is  0x1000
    BITD    is  0x2000
    BITE    is  0x4000
    BITF    is  0x8000

Another notation for bit masks is this:

    bit x can be selected using (1 << x)
    
So  BIT5 can also be written as (1 << 5)

'<<' is the left-shift operator 
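
The shift notation is handy when the bit position is only known at run
time, or when a mask covers several adjacent bits. A minimal sketch,
assuming a 32-bit int; BIT(x) and mask_range are made-up names, not
part of driverlib.

```c
#include <stdint.h>

/* Hypothetical macro: mask for bit position x.
 * 1u keeps the constant unsigned, so that x = 31 is also well defined
 * when int is 32 bits wide. */
#define BIT(x)  (1u << (x))

/* Mask for a run of 'count' adjacent bits starting at position 'low'
 * (valid for count < 32). */
uint32_t mask_range(unsigned low, unsigned count)
{
    return ((1u << count) - 1u) << low;
}
```

Here BIT(5) equals 0x20, and mask_range(2, 2) gives 0000 1100, the mask
for the third and fourth bit.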

10:35 Set, reset and flip bits

* Setting bits

  You can set a selected bit in a variable using bitwise-OR with
  a bitmask for the corresponding bits

Eg. Set bit 4 in variable q:

unsigned char q = 0x5;  // some value

q = q | BIT4;

The '|' is the bitwise OR operator

    q = 00000101;  // 0x5
      | 00010000;  // BIT4
      = 00010101;  // 0x15

It's easy to extend this to setting multiple bits

q = q | BIT4 | BIT5 | BIT6;

    => q = 01110101;
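
As a sketch, the same operation wrapped in a function. The BITx macros are
redefined locally only so that the example compiles without the MSP432
header; set_bits is a made-up name.

```c
#include <stdint.h>

/* Local stand-ins for the header-file macros. */
#define BIT4  0x10
#define BIT5  0x20
#define BIT6  0x40

/* Set every bit of q that is 1 in mask; all other bits keep their value. */
uint8_t set_bits(uint8_t q, uint8_t mask)
{
    return (uint8_t)(q | mask);
}
```

In practice this is usually written with the compound operator, as in
q |= BIT4; which means the same as q = q | BIT4. Setting a bit that is
already set changes nothing.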

* Resetting bits

  You can reset a selected bit in a variable using bitwise-AND with
  the COMPLEMENT of the bitmask for the corresponding bits

MASK: BIT5  or (1<<5)
Complementary MASK: ~BIT5  or ~(1<<5)

The '~' operator is the bitwise complement operator: flip all the bits
in a variable

So BIT5 would be 00100000
and ~BIT5 would be 11011111

To reset a bit in a variable:

    q = q & (~BIT5);

    q = 11110011; // 0xF3
      & 11011111; // ~BIT5
      = 11010011; // 0xD3

To reset multiple bits, just chain the AND operations, eg:

q = q & (~BIT5) & (~BIT2) & (~BIT0);

We could also write

q = q & ~(BIT0 | BIT2 | BIT5);
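
A sketch of the reset operation as a function, with the macros again
redefined locally so that it compiles on its own; clear_bits is a
made-up name.

```c
#include <stdint.h>

/* Local stand-ins for the header-file macros. */
#define BIT0  0x01
#define BIT2  0x04
#define BIT5  0x20

/* Reset every bit of q that is 1 in mask; all other bits keep their value.
 * The cast drops the upper bits that '~' produces after promotion to int. */
uint8_t clear_bits(uint8_t q, uint8_t mask)
{
    return (uint8_t)(q & ~mask);
}
```

Clearing the three bits one at a time, or all at once with the combined
mask BIT0 | BIT2 | BIT5, gives the same result. The compound form is
q &= ~BIT5;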

* Flipping bits

  You can flip a selected bit in a variable using bitwise-XOR with
  the bitmask of the corresponding bits

Indeed, look at the XOR truth table, where q is the result:

    bit  mask    q
    0    0       0  -> mask=0: bit is unchanged
    0    1       1  -> mask=1: bit=0 flips to q=1
    1    0       1  -> mask=0: bit is unchanged
    1    1       0  -> mask=1: bit=1 flips to q=0

Let's say

  q = 0x0;

to flip bit 5:

  q = q ^ BIT5;

    q = 00000000;
      ^ 00100000;  // BIT5
      = 00100000;  // XOR result

To flip multiple bits, just chain the operations, eg

q = q ^ BIT5 ^ BIT2 ^ BIT0;
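
A sketch of the flip operation as a function, with locally defined macros
so that it compiles on its own; toggle_bits is a made-up name.

```c
#include <stdint.h>

/* Local stand-ins for the header-file macros. */
#define BIT0  0x01
#define BIT2  0x04
#define BIT5  0x20

/* Flip every bit of q that is 1 in mask; all other bits keep their value. */
uint8_t toggle_bits(uint8_t q, uint8_t mask)
{
    return (uint8_t)(q ^ mask);
}
```

Flipping the same bit twice restores the original value, which is what
makes XOR convenient for blinking an LED. Because BIT5, BIT2 and BIT0 do
not overlap, the chain q ^ BIT5 ^ BIT2 ^ BIT0 equals
q ^ (BIT5 | BIT2 | BIT0).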

10:45 Shifting left and right

We often need to extract the value of a subfield and to do
operations with it. In that case, we need to use bit masking
and, typically, shifting.

Example - let's say a byte contains a 3-bit subfield, on bit 3, 4, and
          5, which can have the value 0 to 7. How can we quickly read
          the value of the subfield and set it?

          b7 b6 b5 b4 b3 b2 b1 b0
                 *  *  *           -> we're interested in this bit-field
                                      - what value is stored there?
                                      - how to update that value?

  To READ the bitfield:
  
  unsigned char q;  //  a byte
  unsigned mask = BIT3 | BIT4 | BIT5; // 00111000
  unsigned field = (q & mask) >> 3;

  This returns the value of field, which can be 0 to 7
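
The read step can be sketched as a function. FIELD_SHIFT, FIELD_MASK and
read_field are names made up for this example.

```c
#include <stdint.h>

#define FIELD_SHIFT  3u                      /* the field starts at bit 3 */
#define FIELD_MASK   (0x7u << FIELD_SHIFT)   /* 00111000 */

/* Isolate bits 5..3 with the mask, then move them down to bits 2..0. */
unsigned read_field(uint8_t q)
{
    return (q & FIELD_MASK) >> FIELD_SHIFT;
}
```

For q = 00101000 (0x28), the masked value is 00101000 and the shift
turns it into 101, so read_field returns 5.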
  
  To UPDATE the bitfield:
  
  Let's say that we now have a new value for it, newfield.
  We set this into q as follows.

  First, clear the value of the field in q.
  Then, patch newfield into it.

  unsigned char q;  //  a byte
  unsigned mask = BIT3 | BIT4 | BIT5; // 00111000
  unsigned newfield = 4;  // some new value for the 3-bit field

  q = (q & ~mask) | (newfield << 3);

  Thus, shifting left and right is useful for alignment of bit fields.
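
The update step can be sketched as a function as well. The names are made
up for this example. One addition that is not in the notes above: the new
value is first limited to 3 bits, so that a value larger than 7 cannot
spill into the neighboring bits.

```c
#include <stdint.h>

#define FIELD_SHIFT  3u                      /* the field starts at bit 3 */
#define FIELD_MASK   (0x7u << FIELD_SHIFT)   /* 00111000 */

/* Replace bits 5..3 of q by newfield; all other bits keep their value. */
uint8_t write_field(uint8_t q, unsigned newfield)
{
    newfield &= 0x7u;  /* added guard: keep only the 3 lower bits */
    return (uint8_t)((q & ~FIELD_MASK) | (newfield << FIELD_SHIFT));
}
```

For q = 11111111 and newfield = 4 (100), the field is first cleared to
11000111 and then patched to 11100111 (0xE7).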

  Caveat: be aware of sign extension when shifting signed numbers.
  On a signed variable, the '>>' operator is an arithmetic shift
  on practically every compiler (the C standard leaves this choice
  to the compiler). It copies the sign bit into the vacated upper bits.

  For example, let's say that 

  signed char q = -128; // q = 0x80 = 10000000 binary

  Then, q >> 1 would result in the value -64 (= -128/2).
  In two's complement, -64 = 11000000 binary = 0xC0.

  A logical shift would ignore the sign and shift in a zero,
  and it would compute 01000000 binary = 0x40.

  C does not have a separate operator for logical shift. The type
  of the shifted variable decides: '>>' on an unsigned variable
  is always a logical shift.

  To avoid sign-extension, cast the variable to unsigned before
  shifting.
  
        signed char q = -128;                        // 0x80
        unsigned char v = ((unsigned char) q) >> 1;  // v = 0x40
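
The difference between the two shifts can be sketched with a pair of
functions. The names are made up for this example, and the signed version
assumes a compiler that shifts signed values arithmetically, which the C
standard allows but does not require.

```c
#include <stdint.h>

/* Right shift on a signed operand. For negative values the result is
 * implementation-defined; most compilers copy the sign bit. */
int8_t shift_right_signed(int8_t q)
{
    return (int8_t)(q >> 1);
}

/* Cast to unsigned first: a zero is shifted in at the top. */
uint8_t shift_right_logical(int8_t q)
{
    return (uint8_t)((uint8_t)q >> 1);
}
```

For q = -128 (0x80), the logical version returns 0x40. On a compiler with
arithmetic shift, the signed version returns -64 (0xC0).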
  
10:55 Bit-Rotation

  Bit rotation may be needed when working with hardware peripherals;
  for example, you can build rotating LED patterns, or you may
  need to rotate bits among two variables.
 
  Let's study the effect of rotation. We'll rotate left an 8-bit value q
  (an unsigned char):

     11101000  q

    rotl(q,x) means: rotate left over x bits 
     
    rotl(q,1)   ((q << 1) & 0xFE) | ((q >> 7) & 0x1)
    rotl(q,2)   ((q << 2) & 0xFC) | ((q >> 6) & 0x3)
    rotl(q,3)   ((q << 3) & 0xF8) | ((q >> 5) & 0x7)
    rotl(q,4)   ((q << 4) & 0xF0) | ((q >> 4) & 0xF)

  So any rotate left can be obtained by ORing two shifted/masked
  operations.
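
  The four cases above follow one pattern, which can be sketched as a
  single function. rotl8 is a made-up name. The explicit masks 0xFE, 0xFC
  and so on are not needed here, because the cast back to 8 bits drops the
  bits that were shifted out at the top.

```c
#include <stdint.h>

/* Rotate the 8-bit value q left over n bit positions. */
uint8_t rotl8(uint8_t q, unsigned n)
{
    n &= 7u;  /* rotating an 8-bit value over 8 bits is the same as over 0 */
    return (uint8_t)((q << n) | (q >> ((8u - n) & 7u)));
}
```

  For q = 11101000 (0xE8), rotl8(q, 1) gives 11010001 (0xD1) and
  rotl8(q, 4) gives 10001110 (0x8E).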

  In general we can write (for 32-bit unsigned integers and 0 < n < 32):

    rotate left n:  (x << n)  | (x >> (32-n))
    rotate right n: (x >> n)  | (x << (32-n))

  For n = 0 the second shift would be over 32 bits, which C leaves undefined.
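
  A sketch of both rotations for 32-bit words. rotl32 and rotr32 are
  made-up names. The count is reduced with & 31 so that no shift is ever
  over 32 bits, which C does not define for a 32-bit operand.

```c
#include <stdint.h>

/* Rotate the 32-bit value x left over n bit positions. */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Rotate the 32-bit value x right over n bit positions. */
uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

  For example, rotl32(0x12345678, 8) moves the top byte to the bottom and
  gives 0x34567812. Rotating right over the same count undoes it.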

10:55 Summary

- C data types
    char, int, long
    signed, unsigned
- Overflow
- Hex, Decimal, Binary
- Setting, Resetting, Flipping Bits
- Shifting Bits
- Rotating Words