Clockless design language - ilia greenblat
1. clockless design language
First steps from language to silicon
Ilia Greenblat
greenblat@mac.com
+972-54-4927322
skype: igreenblat
May 2, 2012
2. So no clocks, now what?
Let’s focus on just one operation (like multiply).
We specify and control the sequence:
1. wait till inputs become available
2. execute the operation
3. latch result, release inputs
4. send the result down the stream.
5. wait until accepted.
6. start over
The same in pseudo-Verilog:
    input passive channel [7:0] A,B;
    latch [15:0] Creg;
    output active channel [15:0] C;
    always begin
        fork
            wait(A); wait(B);
        join
        Creg = B*A;
        release A; release B;
        C = Creg;
    end
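The six steps can also be tried out in software. The sketch below is only an illustration in Python, not the author's compiler output: it assumes each channel can be modelled by a one-slot `queue.Queue`, where `task_done()` stands in for `release` and `join()` stands in for "wait until accepted". The names `multiply_stage` and `run_demo` are hypothetical.

```python
import queue
import threading


def multiply_stage(a_ch, b_ch, c_ch, count):
    """Run the six-step sequence `count` times."""
    for _ in range(count):
        a = a_ch.get()              # 1. wait till inputs become available
        b = b_ch.get()
        creg = (a * b) & 0xFFFF     # 2-3. execute, latch a 16-bit result
        a_ch.task_done()            # 3. release inputs
        b_ch.task_done()
        c_ch.put(creg)              # 4. send the result down the stream
        c_ch.join()                 # 5. wait until accepted
        # 6. start over


def run_demo(pairs):
    """Feed operand pairs to the stage and collect the products."""
    a_ch, b_ch, c_ch = (queue.Queue(maxsize=1) for _ in range(3))
    worker = threading.Thread(
        target=multiply_stage, args=(a_ch, b_ch, c_ch, len(pairs))
    )
    worker.start()
    results = []
    for a, b in pairs:
        a_ch.put(a)
        b_ch.put(b)
        results.append(c_ch.get())
        c_ch.task_done()            # consumer accepts the result
    worker.join()
    return results
```

Calling `run_demo([(3, 4), (255, 255)])` pushes two operand pairs through the stage; the stage never starts a new multiply until the previous result has been accepted.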
3. loop example
A bunch of modules communicate asynchronously using channels and tokens.
Each module gets a datum or token to work on, arbitrates for shared resources, processes the data and, when ready, passes the result down the stream.
Each building block implements the request/ack protocol.
Each module “knows” when its inputs become valid, when its own computation has completed, and when its outputs have been used and a new cycle can begin.
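The request/ack exchange between two neighbouring blocks can be traced step by step. The slides do not say which handshake variant is used; the sketch below assumes the common four-phase (return-to-zero) protocol, and `four_phase_transfer` is a hypothetical name.

```python
def four_phase_transfer(values):
    """Send `values` one at a time; return (received, trace).

    Each trace entry is a (req, ack, data) snapshot taken after one
    signal transition of an assumed four-phase handshake.
    """
    received = []
    trace = []
    req = ack = 0
    for data in values:
        req = 1                      # sender: data is valid, raise request
        trace.append((req, ack, data))
        received.append(data)        # receiver: latch the data ...
        ack = 1                      # ... and raise acknowledge
        trace.append((req, ack, data))
        req = 0                      # sender: sees ack, drops request
        trace.append((req, ack, data))
        ack = 0                      # receiver: drops ack, channel is idle
        trace.append((req, ack, data))
    return received, trace
```

Every datum costs four transitions, and the channel always returns to the idle state `(req, ack) == (0, 0)` before the next cycle begins.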
5. Why bother?
• easier implementation at advanced nodes.
• low power, low leakage, high VT cells
• clocking, power grid, OCV, sleep modes
• aging, voltage swings
• modularity & integration
6. Lots of activity
• LARD, Balsa, Tangram : CSP languages
• VerilogCSP - Verilog + macros : modeling of ..
• Handshake, Philips : out of business.
• Tiempo - SystemVerilog + Library : full flow, IP provider
• + many point tools, not a whole language+flow.
7. So what is the problem I am trying to solve?
• Good, solid, comprehensive and free
• Entry-level language
• Show a tentative flow to silicon.
8. The proposal
Use slightly extended Verilog as the clockless design entry language, because it:
expresses parallelism well.
describes hardware well.
expresses design intent.
is timing aware.
has all the needed language structures.
is hierarchical and “objective”.
A few additions make life easier.
9. Procedural Verilog
Verilog written like C describes the sequence of operations.
Each “always” is a sequential process.
    always begin
        wait (dt);
        wait (!dt);
        counter = counter + 1;
        if (counter==100) -> beep;
    end
Many “always” blocks run concurrently.
The basic storage element is the latch.
Extensions:
latch, flipflop
release, wait overload
in/out channel
passive/active channel
active latch
several $system functions
11. Arbitration
Arbitration is needed where a shared resource (like RAM) is accessed from two “always” blocks that are not directly related.
So for example:
The compiler inserts arbiters to separate reads from writes, and writes from writes, so a read value will not get ruined by a write.
Another example:
    output passive channel [15:0] readval;
    always begin
        wait readval;          // passive channel waits for request.
        readval = mcounter;    // wins arb, reads latch, drives data out
        release readval;       // when request negated, we can de-assert the data
    end
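The effect of an inserted arbiter can be imitated in software with a mutual-exclusion lock. This is only an analogy under stated assumptions: a `threading.Lock` stands in for the hardware arbiter, the shared latch is split into two halves so that an unarbitrated read could observe a half-written value, and `ArbitratedLatch` and `run_demo` are hypothetical names.

```python
import threading


class ArbitratedLatch:
    """A two-part shared value guarded by a lock acting as the arbiter."""

    def __init__(self):
        self._grant = threading.Lock()   # stands in for the arbiter
        self._lo = 0
        self._hi = 0

    def write(self, value):
        with self._grant:                # writer wins arbitration
            self._lo = value
            self._hi = value

    def read(self):
        with self._grant:                # reader wins arbitration
            return self._lo, self._hi


def run_demo(n_writes=2000):
    """Read while another thread writes; count torn (inconsistent) reads."""
    latch = ArbitratedLatch()

    def writer():
        for i in range(1, n_writes + 1):
            latch.write(i)

    thread = threading.Thread(target=writer)
    thread.start()
    torn = 0
    while thread.is_alive():
        lo, hi = latch.read()
        if lo != hi:
            torn += 1
    thread.join()
    return torn, latch.read()
```

Because every read and write holds the grant for its whole duration, the reader can never see the two halves disagree, which is the software counterpart of "read value will not get ruined by write".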
12. Usable Verilog Elements
    always begin
        for (i=0; i<100; i=i+1) begin
            ...
        end
    end

    initial begin
        ...
        #(delay_time);
        ...
    end

    fork
        -> eventA;
        -> eventB;
        -> eventC;
    join

    while (xx>10) begin
        ...
    end

    always @eventA begin
        ...
    end

    task do_something;
        ...
        wait(till_something);
    endtask

    release some;

also: if, if-else, case, assign and more
13. Tentative flow
common testbench
for all views
14. Side Stepping
Suppose we see code like this:
    always @(computeIt) begin
        data=0;
        for (i=0;i<100;i=i+1) begin
            adat = ram[i];
            #1;
            bdat = ram[i+100];
            #1;
            ram[i]=adat-bdat;
        end
    end
We can use the same extended Verilog syntax to produce clocked, synthesizable RTL.
Instead of basic Clock-Reset-Data synthesizable RTL, we get to write the design in procedural fashion.
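One way to picture how a procedural body becomes clocked behaviour is to treat every `#1` as a point where the process suspends until the next clock tick. The Python sketch below makes exactly that assumption (the slides do not state how the compiler maps delays to cycles); `compute_it` and `run_clocked` are hypothetical names, and the unused `data=0` assignment is left out.

```python
def compute_it(ram):
    """Procedural loop body; every `yield` marks one `#1` delay."""
    for i in range(100):
        adat = ram[i]
        yield                      # assumed: wait for the next clock tick
        bdat = ram[i + 100]
        yield                      # assumed: wait for the next clock tick
        ram[i] = adat - bdat


def run_clocked(ram):
    """Advance the process one step per tick; return the ticks consumed."""
    ticks = 0
    for _ in compute_it(ram):
        ticks += 1
    return ticks
```

Under this reading, the generator's suspended position plays the role of the state register that a hand-written Clock-Reset-Data state machine would spell out explicitly.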
15. goes on..
The netlist from regular adder synthesis is “dual-railed” here.
The packager creates the best hierarchy and inserts the correct breakers.
The flow assembles the timing constraints.
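For readers unfamiliar with dual-rail signalling, here is a minimal sketch of the idea in Python. It assumes the usual convention in which each bit travels on a (true rail, false rail) pair, `(1, 0)` means 1, `(0, 1)` means 0 and `(0, 0)` is the empty spacer; the encoding actually used by this flow is not given in the slides, and the function names are hypothetical.

```python
SPACER = (0, 0)   # neither rail raised: no data present yet


def encode(bits):
    """Map each bit to a (true_rail, false_rail) pair."""
    return [(1, 0) if b else (0, 1) for b in bits]


def is_complete(word):
    """Completion detection: every bit has exactly one rail raised."""
    return all(t != f for t, f in word)


def decode(word):
    """Recover the bits once the whole word is valid."""
    if not is_complete(word):
        raise ValueError("word is not yet valid")
    return [t for t, _ in word]
```

The point of the encoding is that validity is carried by the data itself: a receiver can tell that a result such as an adder output has arrived by checking `is_complete`, without any clock edge.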
16. side stepping (again)
FPGA validation needs further tricks to fool the FPGA software into creating the fastest circuit.
18. Status
• First implementation of the compiler is working.
• Cadence toolchain is used to assemble the flow to gds.
• Several modules were designed and run through the flow:
• UART, PWM, FIR, PicoBlaze CPU.
• The only validation was with SDF Verilog and fast SPICE in a tester-like setup.
• The subset appears to be powerful enough to implement these
modules. It is still evolving.
• Optimization at various stages is needed to reach the
performance.
19. What’s next?
• more code examples to verify the usefulness of the
language
• identify the kinds of designs where this flow can have the biggest advantage and impact.
• select a comprehensive proving ground project.
• add optimizations to reach performance/area/power
goals.
• cover missing validation steps in the flow