Lecture 09 - 02/14/19 - The CORDIC Algorithm CORDIC) 5:00 Outline Continuing on from last week, we will now discuss the design and implementation of the CORDIC algorithm, and its use in Direct Digital Synthesis. (last week - 1. Verilog operators 2. Signed computations 3. Precision 4. Synthesis of expressions ) 5. Fixed point computations 6. Verilog lookup tables 7. CORDIC 8. Direct Digital Synthesis (DDS) 5. FIXED POINT computations ================================================================= A fixed-point number is a bit vector of W bits, from which L are fractional bits. Signed integers are a subset of fixed-point numbers. A signed integer can be thought of as a <32,0> number. Example: <10,4> data type W <---------------------------------------> L <--------------> +---+---+---+---+---+---+---+---+---+---+ | | | | | | | | | | | +---+---+---+---+---+---+---+---+---+---+ | 2^0 | 2^-1 2^1 | 2^-2 2^2 | 2^-3 2^3 | 2^-4 2^4 | 2^5 | binary point Fixed-point numbers of length W smaller than or equal to 32 bits can be implemented as integer data types. (a) Fractional Constant -> To map a fractional constant v into a fixed point number implemented as an integer k, we compute: k = (v << L); For example, in a <32,4> data type, the constant 0.25 would be represented as the integer k = (0.25 << 4) = 4 (b) Value of a fixed-point number Similarly, we can compute the value of a fixed point number v represented in an integer k as follows: v = (k / (1 << L)); For example, in a <32,8> data type, the value integer 15 represents the value v = 15 / 256 = 0.05859375 Operations using fixed-point numbers The advantage of fixed-point numbers is that the operations on fixed-point numbers are integer operations, and hence they are efficient to implement. Fixed-point numbers do not use normalization logic or exponent-computation logic such as is found in floating-point hardware. However, fixed-point operations have to handle the binary point at position L, mainly if an operation would affect it. The rules are as follows. 1. The addition of two fixed-point numbers yields a similar fixed point number = + 2. The multiplication of two fixed-point numbers creates a fixed-point number of twice the size <2W, 2L> = * 3. The convert a fixed-point number into , where L1 > L2, use a shift operation: = () >> (L1 – L2) 4. To apply rule 3 for multiplication, we can produce a result from operands as follows: = ( * ) >> L Note that this formula assumes that no overflow will occur, or in other words, that there are no significant bits in the result between the position W+1 and 2W. This behavior corresponds to what we are used to for integer arithmetic. Example Let’s compute 0.25 * 0.35 + 1.15 using <32,8> arithmetic. #define FRACBITS 8 int a = (int) (0.25 * (1 << FRACBITS)); int b = (int) (0.35 * (1 << FRACBITS)); int c = (int) (1.15 * (1 << FRACBITS)); int tmp1, tmp2; tmp1 = (a * b) >> FRACBITS; tmp2 = tmp1 + c; This code produces the following integer values. a 64 b 89 c 294 tmp1 22 tmp2 316 The resulting value, tmp2, corresponds to the value 316/(1 << FRACBITS) = 1.234375. Note that the floating-point result of the expression (0.25 * 0.35 + 1.15) is 1.2375. There is some precision loss using fixed-point computation because we do not keep all fractional bits required to store the result of 0.25 * 0.35, This precision loss is called the quantization error. In fixed-point computations, we try to minimize the quantization error. At the same time, we try to avoid overflow. For example, the fixed-point data type cannot represent values bigger than 0.99... or smaller than -1. 6. VERILOG lookup table ================================================================= Example of a combinational lookup table: module cordicconst(input wire clk, input wire reset, input wire [3:0] address, output wire [17:0] data); reg [17:0] q; always @(*) begin q = 18'h0; case (address) 4'h0: q = 18'h6487; 4'h1: q = 18'h3b58; 4'h2: q = 18'h1f5b; 4'h3: q = 18'hfea; 4'h4: q = 18'h7fd; 4'h5: q = 18'h3ff; 4'h6: q = 18'h1ff; 4'h7: q = 18'hff; 4'h8: q = 18'h7f; 4'h9: q = 18'h3f; 4'ha: q = 18'h1f; 4'hb: q = 18'hf; 4'hc: q = 18'h7; 4'hd: q = 18'h3; 4'he: q = 18'h1; 4'hf: q = 18'h0; default: q = 18'h0; endcase end // always @ (*) assign data = q; endmodule 7. CORDIC = Coordinate Rotation Digital Computer ================================================================= CORDIC is an algorithm to compute trigonometric functions using low-complexity hardware. Similar to a bit-serial addition, it computes one bit of the result at a time. In a nutshell, given an angle theta, the CORDIC algorithm evaluates sin(theta) and cosine(theta) using successive approximations. The algorithm was first used in airborne digital computation in the '50s, but it has since found widespread use in robotics and communications. Assume an angle alpha, 0 < alpha < pi/2 Then the CORDIC algorithm will rotate the vector (1, 0) to (x1, y1) such as x1 = cos(alpha) and y1 = sin(alpha). More general, the CORDIC algorithm will rotate (x0,y0) to (x1, y1) over an angle alpha: x1 = x0.cos(alpha) - y0.sin(alpha) y1 = x0.sin(alpha) + y0.cos(alpha) (*) The CORDIC idea The idea of CORDIC is to decompose the rotation by alpha in a series of rotations by alpha0, alpha1, alpha2, ... such that the sum of all these alpha's is close to the target alpha. To get the iterative formulas, divide them by cos(alpha) to obtain: x1 = (x0 - y0.tan(alpha)) . (1/cos(alpha)) y1 = (x0.tan(alpha) + y0) . (1/cos(alpha)) The term 1/cos(alpha) can further be written as 1/cos(alpha) = sqrt(1 + tan(alpha)*tan(alpha)) = k Now, imagine that we choose alpha0, alpha1, .. such that alpha0 = arctan(1) alpha1 = arctan(1/2) alpha2 = arctan(1/4) .. And assume that we can write alpha as: alpha = alpha0 +/- alpha1 +/- alpha2 +/- alpha3 +/- ... where the signs +/- are chosen such that we approximate, step by step, the resulting value alpha (x1, y1) = rotate(x0, y0, alpha0) (x2, y2) = rotate(x1, y1, alpha1) ... because of our choice of alphai, this becomes: x1 = (x0 - y0).k0 y1 = (x0 + y0).k0 x2 = (x1 - (+/- 1/2) y1).k1 y2 = ((+/- 1/2) x1 + y1).k1 x3 = (x2 - (+/- 1/4) y2).k2 y3 = ((+/- 1/4) x2 + y2).k2 ... Each time we choose the sign to get closer to the target alpha. Also, the k0, k1, k2 can be precomputed as a single factor. K = k0.k1.k2. .. = product(sqrt(1 + tan(alphai)*tan(alphai))) The main observation is that the algorithmic kernel is extremely simple: x(i+1) = x(i) +/- (y(i) >> i) y(i+1) = +/- (x(i) >> i) + y(i) We collect all the k'1 together and initialize the algorithm as (k.x0, k.y0) How to control the sign used in the iterations? To decide on the +/- sign, we use an angle 'accumulator' that decides, for each iteration of the algorithm, in what direction to rotate: accumulator = 0 if (target > accumulator) accumulator = accumulator + arctan(2^-i) else accumulator = accumulator + arctan(2^-i) The factors arctan(2^i) are fixed and are stored in a lookup table. (*) Computing sinus and cosine To compute sine and cosine, we initialize the algorithm as: x0 = k y0 = 0 And then do n iterations. With each iteration, the precision increases with a factor arctan(2^-i). Roughly speaking, you get one bit per iteration. Algorithm in C fixed cordicsine(fixed inangle) { fixed X, Y, TargetAngle, CurrAngle; unsigned Step; X=FIXED(AG_CONST); /* AG_CONST * cos(0) */ Y=0; /* AG_CONST * sin(0) */ TargetAngle = angleadj(inangle); CurrAngle=0; for(Step=0; Step < 16; Step++) { fixed NewX; if (TargetAngle > CurrAngle) { NewX = X - (Y >> Step); Y = (X >> Step) + Y; X = NewX; CurrAngle += Angles[Step]; } else { NewX = X + (Y >> Step); Y = -(X >> Step) + Y; X = NewX; CurrAngle -= Angles[Step]; } } if (quadrant(inangle) < 2) return Y; else return -Y; } (*) How to deal with arbitrary angles: 0->2PI? The CORDIC algorithm is defined in the first quadrant. if (0 < alpha <= PI/2) then sine = cordic(alpha) The other quadrants are obtained by symmetry arguments: if (PI/2 < beta <= PI) then beta' = PI - beta is in the first quadrant and sin(beta') == sin(beta) therefore sine = cordic(PI - beta) if (PI < beta <= 3PI/2) then beta' = beta - PI is in the first quadrant and sin(beta') == -sin(beta) therefore sine = -cordic(beta - PI) if (3PI/2 < beta <= 2PI) then beta' = 2PI - beta is in the first quadrant and sin(beta') == -sin(beta) therefore sine = -cordic(2PI - beta) So we can capture this in two C functions: fixed quadrant(fixed inangle) // input: inangle in the range 0 .. 2PI // output: 0-3 for quadrant 0-3 and fixed angleadj(fixed inangle) // input angle is 0 .. 3pi // output angle in first quadrant such that // abs(sin(inangle)) = sin(outangle) // if inangle >2pi, subtract 2pi. // This brings inangle in the range 0 - pi and keeps the same quardrant 8. DIRECT DIGITAL SYNTHESIS ================================================================= The idea of direct digital synthesis (DDS) is to compute waveforms at real time to generate them. DDS contains a phase accumulator, a phase-to-output converter, and a DAC. phase accumulator --> phase-to-output-converter --> DAC (analog waveform) To generate a sine-wave, we will create a CORDIC-based DDS. Parameters: - 16 bit resolution for DAC samples FIX<16,14> - 16 bit resolution for phase FIX<18,15> - CORDIC with 16-bit resolution FIX<18,15> C Reference Implementation #include #include #define AG_CONST 1/1.6467602578655 #define FIXED(X) ((long int)((X) * 32768.0)) #define FLOAT(X) ((X) / 32768.0) #define PI2 FIXED(atan(1.0) * 2.0) typedef long int fixed; /* <18,15> fixed-point */ static const fixed Angles[]={ FIXED(0.7853981633974483L), FIXED(0.4636476090008061L), FIXED(0.2449786631268641L), FIXED(0.1243549945467614L), FIXED(0.0624188099959574L), FIXED(0.0312398334302683L), FIXED(0.0156237286204768L), FIXED(0.0078123410601011L), FIXED(0.0039062301319670L), FIXED(0.0019531225164788L), FIXED(0.0009765621895593L), FIXED(0.0004882812111949L), FIXED(0.0002441406201494L), FIXED(0.0001220703118937L), FIXED(0.0000610351561742L), FIXED(0.0000305175781155L), FIXED(0.0000152587890613L), FIXED(0.0000076293945311L), FIXED(0.0000038146972656L) }; //----------------------------------------------------- void showtable() { unsigned Step; printf("Angles Table\n"); for (Step = 0; Step < 15; Step++) printf("16'h%x\n", Angles[Step]); } //----------------------------------------------------- fixed quadrant(fixed inangle) { // output: 0-3 for quadrant 0-3 // we assume step-angle is smaller than pi // so that there are always at least two samples per period (alias-free) // hence the max input angle can be (2pi + pi) unsigned q = 0; // if inangle >2pi, subtract 2pi. // This brings inangle in the range 0 - pi and keeps the same quardrant if (inangle > 4*PI2) inangle = inangle - 4*PI2; if (inangle > 3*PI2) return 3; else if (inangle > 2*PI2) return 2; else if (inangle > PI2) return 1; return 0; } //----------------------------------------------------- fixed angleadj(fixed inangle) { // input angle is 0 .. 3pi // output angle in first quadrant such that // abs(sin(inangle)) = sin(outangle) // if inangle >2pi, subtract 2pi. // This brings inangle in the range 0 - pi and keeps the same quardrant if (inangle > 4*PI2) inangle = inangle - 4*PI2; if (inangle > 3*PI2) return (4*PI2 - inangle); else if (inangle > 2*PI2) return (inangle - 2*PI2); else if (inangle > PI2) return (2*PI2 - inangle); return inangle; } //----------------------------------------------------- fixed accumulator(fixed inangle, fixed inangleadd) { inangle = inangle + inangleadd; if (inangle > 4*PI2) inangle = inangle - 4*PI2; return inangle; } //----------------------------------------------------- fixed cordicsine(fixed inangle) { fixed X, Y, TargetAngle, CurrAngle; unsigned Step; X=FIXED(AG_CONST); /* AG_CONST * cos(0) */ Y=0; /* AG_CONST * sin(0) */ TargetAngle = angleadj(inangle); CurrAngle=0; for(Step=0; Step < 16; Step++) { fixed NewX; if (TargetAngle > CurrAngle) { NewX = X - (Y >> Step); Y = (X >> Step) + Y; X = NewX; CurrAngle += Angles[Step]; } else { NewX = X + (Y >> Step); Y = -(X >> Step) + Y; X = NewX; CurrAngle -= Angles[Step]; } } if (quadrant(inangle) < 2) return Y; else return -Y; } //----------------------------------------------------- int main(void) { fixed angle, angleadd; fixed sine; unsigned i; double fsine; printf("2pi %8x\n", 4*PI2); printf("3pi/2 %8x\n", 3*PI2); printf(" pi %8x\n", 2*PI2); printf(" pi/2 %8x\n", PI2); angleadd = PI2/11; printf(" inc %8x\n", angleadd); angle = angleadd; for (i=0; i<100; i++) { sine = cordicsine(angle); if (0) printf("a %5x s %10x ( sin(%8.5f) = %8.5f ) sin %8.5f err %18.15f\n", angle, sine, FLOAT(angle), FLOAT(sine), sin(FLOAT(angle)), FLOAT(sine) - sin(FLOAT(angle))); else printf("%x\n", sine); angle = accumulator(angle, angleadd); } } CONCLUSIONS ========================================================== In homework 4 you have to design a CORDIC verilog implementation starting from a reference design in C. The assignment is simulation-only (Modelsim), but it has to match the results from the C implementation. In homework 5, we will implement this module on the DE1-SoC. Hence, the module created in homework 4 needs to be synthesizable.