ECE 4514
Lecture 20
Designing with Multiple Clocks

5:00 CLOCK SKEW IN SYNCHRONOUS CIRCUITS

Consider the following setup:

   register1 --> logic --> register2
     clk1                    clk2

The registers are characterized by:
  - tsetup
  - tcq
  - thold (we previously assumed thold = 0)

In an ideal case, clk1 and clk2 are identical copies.
What relationship can we derive with respect to the proper
operation of the circuit?

             +------------+           +-----------+
             |            |           |           |
clk1 --------+            +-----------+           +-------

             |---------------->| tcq + tlogic

             +------------+           +-----------+
             |            |           |           |
clk2 --------+            +-----------+           +-------

                                   |<-| tsetup
                                      |->| thold

We can derive two conditions:

(1) the logic should complete computation before the next edge of clk2

       T > tcq + tlogic + tsetup

(2) the output of the logic should NOT change during the hold time
    of register2

       thold < tcq + tlogic

Due to imperfections in clock distribution, the clock signal may be
slightly offset for each flop. The offset from clk1 to clk2 is called
tskew.

   With a positive skew, clk2 shifts to the right (later than clk1)
   With a negative skew, clk2 shifts to the left (earlier than clk1)

POSITIVE SKEW:

             +------------+           +-----------+
             |            |           |           |
clk1 --------+            +-----------+           +-------

             |---------------->| tcq + tlogic
             |-->| tskew

                 +------------+           +-----------+
                 |            |           |           |
clk2     --------+            +-----------+           +-------

                                       |<-| tsetup
                                          |->| thold

This affects our two conditions as follows:

(1) the logic should complete computation before the next edge of clk2

       T + tskew > tcq + tlogic + tsetup
    or
       T > tcq + tlogic + tsetup - tskew

    This seems to be a good thing. The skew relaxes the clock period
    constraint, such that we can use a smaller clock period.

(2) the output of the logic should NOT change during the hold time
    of register2

       thold + tskew < tcq + tlogic
    or
       thold < tcq + tlogic - tskew

    This is problematic. If the skew increases, then at some point
    even a very small hold time (in the limit, 0) will no longer be met.
    This means that the skew is so large that register2 captures the
    data on the CURRENT clock edge rather than the next one. In the
    circuit, it appears as if the signal 'flies right through' a register.

NEGATIVE SKEW:

Describe the situation when skew is negative:

       T > tcq + tlogic + tsetup + |tskew|     (1)
       thold < tcq + tlogic + |tskew|          (2)

  (1) -> the clock period will need to increase to cope with negative skew
  (2) -> safe. Negative skew adds hold margin, so thold will not be
         violated because of the skew.

Therefore, controlling skew is important. Unfortunately, for realistic
circuits, clock skew can be positive as well as negative. For example:

   +---------------------------------------------+
   |                                             |
   V                                             |
  register1 --> logic --> register2 --> logic2 --+
    clk1                    clk2

Assume that, in this circuit, the clock is distributed from clk1 to clk2.
Then we will see a positive skew on the path into register2, but a
negative skew on the feedback path into register1.

5:20 CLOCK DISTRIBUTION

Clock distribution is therefore an important aspect of chip design.
An FPGA or ASIC may use a dedicated clock network.

A common type of clock network is the H-tree:

   +      +      +      +
   |      |      |      |
   +---+--+      +---+--+
   |   |  |      |   |  |
   +   |  +      +   |  +
       |             |
       +------1------+
       |             |
   +   |  +      +   |  +
   |   |  |      |   |  |
   +---+--+      +---+--+
   |      |      |      |
   +      +      +      +

The distance from (1) to any leaf of the H-tree is equally long.
Therefore, registers placed in each other's neighbourhood will
experience similar skew with respect to (1).

An alternative is a grid-like or pane-like structure, which aims
to minimize the overall delay at the expense of increased power:

                1
      D  D  D  D  D  D  D  D
      |  |  |  |  |  |  |  |
   D  +--+--+--+--+--+--+--+  D
      |  |  |  |  |  |  |  |
   D  +--+--+--+--+--+--+--+  D
      |  |  |  |  |  |  |  |
1  D  +--+--+--+--+--+--+--+  D  1
      |  |  |  |  |  |  |  |
   D  +--+--+--+--+--+--+--+  D
      |  |  |  |  |  |  |  |
   D  +--+--+--+--+--+--+--+  D
      |  |  |  |  |  |  |  |
      D  D  D  D  D  D  D  D
                1

5:25 CLOCK DOMAINS

Maintaining a single-clock strategy is not always useful.
Even though we advocated single-clock design throughout the course when
we were doing RTL design, for several applications the use of multiple
clocks is useful:

  - passing data between systems that run at different clock frequencies
  - adjusting the clock frequency of each part of a system to the
    strict minimum (for low power)
  - gating clocks (turning them off) when subsystems are not used

When we say a 'clock domain', we refer to a region of a circuit
where all synchronous elements are attached to the same clock net (wire).

If we are using multiple clock inputs, and each clock input drives a
different net, then we get a circuit with multiple clock domains; two
clock inputs give 2 clock domains. Naturally, in a system with N clock
domains, there are up to N*(N-1) directed domain-to-domain transitions.

First, let's consider what can go wrong when we go from one clock domain
to the other.

   register1 --> logic --> register2
     clk1                    clk2

             +------------+           +-----------+
             |            |           |           |
clk1 --------+            +-----------+           +-------
             |
             |---| TT
                 |
                 +-------+      +-------+
                 |       |      |       |
clk2 ------------+       +------+       +-------

For register2 to capture the data correctly, we need:

       TT > tcq + tlogic + tsetup

If the two clocks are unrelated, TT changes from edge to edge, and it's
quite possible that you get a setup time violation. If that happens,
then register2 can experience metastability.

5:35 METASTABILITY

A metastable flop is a flop in an invalid logic state - neither 1 nor 0

                  +----------------------------
                  |
Din  -------------+                             data changes within
               |/////////|                      setup/hold region around clock edge
                    +-------------------
                    |
Clk  ---------------+                           may result in metastability
                    +-------------
                    ???????????                 metastable region
Dout ---------------+

Electrically, a metastable flop is 'floating' between ground and vdd.
Eventually, the flop converges to a stable 1 or a stable 0.
HOWEVER:
  1/ the time needed to recover from metastability is unknown
  2/ the eventual value (0 or 1) is unknown

In fact, metastable flops are sometimes used as a source of randomness
in random number generators!

Metastability is an adverse operating condition for digital circuits,
and should be avoided at all costs.
The classic solution to the problem of metastability is to use
a synchronizer circuit:

                       (1)             (2)
   async_signal ---> register1 ----> register2 -->
                       clk             clk

  - async_signal is asynchronous and may cause metastability on
    register1 from time to time. However, register1 has a full clock
    cycle to settle, so register2 almost always captures a valid bit.
  - As long as the metastability resolves within a clock cycle,
    the above circuit will convert the async_signal into a
    sync_signal with a latency of two cycles.

Sometimes additional registers are used (to cope with the rare case
that a metastable event extends over more than a clock cycle):

                       (1)             (2)
   async_signal ---> register1 ----> register2 --> register3 -->
                       clk             clk           clk

5:45 CLOCK DOMAIN CROSSING

When we design circuits with multiple clock regions, we need to use
techniques that help us cross clock domains.

Solution 1: synchronizers on every clock transition

   register1 ------> register2 ----> register3 ---> logic_region_2
     clk1              clk2            clk2

In this case, we minimize the chance of injecting metastable values
into the logic of region2, by inserting a synchronizer circuit at
every clock domain transition. This is, however, a rather rough
approach, since it is not clear when exactly the value will transition
from region1 to region2; there's an uncertainty of 1 clock period.
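
The synchronizer of Solution 1 could be written roughly as follows. This is only a sketch for a single-bit signal, in the same Verilog style as the rest of these notes; the module name sync_2ff and all port and register names are hypothetical, and the reset scheme is an assumption.

```verilog
// Sketch of a two-flop synchronizer for ONE bit crossing into the clk2 domain.
// Module, port and register names are hypothetical (not from the lecture).
module sync_2ff
   (
    input  wire clk2,      // clock of the receiving domain
    input  wire reset,     // asynchronous reset, active high (assumed)
    input  wire async_in,  // bit launched from another clock domain
    output wire sync_out   // same bit, two clk2 edges later
   );

   reg meta;    // first stage: may become metastable
   reg stable;  // second stage: samples meta one clk2 cycle later

   always @(posedge clk2, posedge reset)
      if (reset)
         begin
            meta   <= 1'b0;
            stable <= 1'b0;
         end
      else
         begin
            meta   <= async_in;  // may violate setup/hold of this flop
            stable <= meta;      // meta has had one full cycle to settle
         end

   assign sync_out = stable;

endmodule
```

Only sync_out should feed logic_region_2; the first-stage register should not fan out to any other logic. A multi-bit value should not be sent through one such synchronizer per bit, since the individual bits may arrive in different clk2 cycles.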
Solution 2: phase control of clocks

When we have several related clocks (e.g. a 1X and a 2X clock), it may
be possible to minimize the chance of a setup/hold violation by
carefully controlling the relative phase of the clocks.

   register1 -----> logic -----> register2
     clk1                          clk2

   clk1 --> DLL circuit --> clk2

             +------------+           +-----------+
             |            |           |           |
clk1 --------+            +-----------+           +-------

             +-----+      +----+
             |     |      |    |
clk2 --------+     +------+    +-------

Since the phase of clk2 is precisely controlled with respect to clk1,
  a/ the hold time of register2 can be met
     (logic changes after the rising edge of clk1)
  b/ the setup time of register2 can be met
     (logic is stable before the rising edge of clk2)

However, this is an expensive solution.

Solution 3: using handshake circuits

Remember the two-way handshake logic:

             +--------------------+
             |                    |
Req ---------+                    +--------------------
                  +-----------------------+
                  |                       |
Ack --------------+                       +------------

   Slow clock domain     |     Fast clock domain
                         |
              ----> data --->
   SLOW FSM   ----> req  --->   FAST FSM
              <---- ack  ----
                         |
   1/ write data         |
   2/ req = 1            |
                         |     3/ read data
                         |        ack = 1
   4/ req = 0            |
                         |     5/ ack = 0

We can build the transition from slow domain to fast domain using
synchronizer circuits:

For req, data:

   req/data ----> register1 ------> register2 ----> register3 --->
                   slowclk           fastclk         fastclk

For ack:

   ack --------> register1 ------> register2 ----> register3 -+->
                  fastclk           slowclk         slowclk   |
                                                              |
   chkack <---- register4 <---- register5 --------------------+

Challenge? The fast clock domain may remove ack too quickly, i.e.
before the slow clock domain has had a chance to read it.

Solution? The chkack signal helps the fast clock domain to detect
when the slow clock domain has received the ack signal.

6:05 CLOCK GATING

Clock gating creates a clock region where the clock can be stopped
(effectively reduced to 0). This is common in ASIC design and in
complex SoCs. It is not common in FPGAs, because FPGA chips have their
own optimized clock nets, separate from the 'compute' nets of the FPGA.
Basic idea of clock gating:

   basic clock ----+
                   +---|
   clkenable1 ---------|AND ---> clk1
                   |
                   +---|
   clkenable2 ---------|AND ---> clk2
                   |
                   +---|
   clkenable3 ---------|AND ---> clk3

Each of these clocks should be treated as a different clock region.

In an FPGA, the above circuit is not a good idea.
It is better to use flops with clock-enable inputs:

   module d_ff_en_1seg
      (
       input wire clk, reset,
       input wire en,
       input wire d,
       output reg q
      );

      // body
      // D flop with asynchronous reset and synchronous enable:
      // q is updated on a rising clock edge only when en is 1
      always @(posedge clk, posedge reset)
         if (reset)
            q <= 1'b0;
         else if (en)
            q <= d;

   endmodule

6:10 CONCLUSIONS

  - CLOCK SKEW IN SYNCHRONOUS CIRCUITS
  - CLOCK DOMAINS
  - METASTABILITY
  - CLOCK DOMAIN CROSSING
       Synchronizer
       Phase Control
       Handshake protocol
  - CLOCK GATING