23. 3
Source
Abstraction
Level
Ease of
Impl.
Learning
curve
Document-
ation
Floating/
Fixed Pt.
Design
Exploration
Testbench
generation
Verification
Size FPGA
Slices
AccelDSP Matlab untimed ++ ++ +
Auto
conversion
+ yes ++ 2411
Agility
Compiler
SystemC
Cycle
accurate
+ - + Fixed pt + no -
AutoPilot
C, C++,
SystemC
Cycle+
untimed
++ ++ ++ yes ++ yes ++ 232
Bluespec BSV
Cycle
accurate
+/- - + no limited no - 120
Catapult C
C, C++,
SystemC
Cycle+
untimed
++ ++ ++ Fixed pt ++ yes ++ 370/560
Compaan C, Matlab untimed + + - automatic ++ no + 3398
CyberWork
Bench
SystemC,
C, BDL
Cycle +
untimed
+ + + yes ++ yes + 243/292
DK Design
Suite
HandelC
Cycle
accurate
+/- - +/- Fixed pt limited no +/- 311/945
Impluse
CoDeveloper
ImpulseC
Cycle +
unlimited
- + - Fixed pt + no + 4611
ROCCC C subst untimed - +/- + no + no -
Synphony C C, C++ untimed ++ + + Fixed pt + yes ++ 1047
いろいろな高位合成
Wim Meeus, Kristof Van Beeck, Toon Goedemé, Jan Meel, Dirk Stroobandt, "An overview of today’s high-level synthesis tools",
Design Automation for Embedded Systems, September 2012, Volume 16, Issue 3, pp 31-51
24. 4
高位合成処理系,躍進のわけ
✔ Embedded processors are in almost every SoC
✔ Huge silicon capacity
✔ Behavioral IP reuse
✔ Verification drives the acceptance of HLS Specification
✔ Accelerators and heterogeneous SoC
✔ Less pressure for formal verification
✔ Ideal for platform-based synthesis
✔ More pressure for time-to-market
✔ Accelerated or reconfigurable computing
Cong, J.; Bin Liu; Neuendorffer, S.; Noguera, J.; Vissers, K.; Zhiru Zhang, "High-Level Synthesis for FPGAs: From
Prototyping to Deployment," in Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions
on , vol.30, no.4, pp.473-491, April 2011
35. 9
Javaプログラムへのアクセス
/** Generate code and emit a class file for a given class
* @param env The attribution environment of the outermost class
* containing this class.
* @param cdef The class definition from which code is generated.
*/
JavaFileObject genCode(Env<AttrContext> env, JCClassDecl cdef) throws IOException {
synthesijer.jcfrontend.Main.newModule(env, cdef); // add hook for synthesijer
try {
if (gen.genClass(env, cdef) && (errorCount() == 0))
return writer.writeClass(cdef.sym);
} catch (ClassWriter.PoolOverflow ex) {
log.error(cdef.pos(), "limit.pool");
} catch (ClassWriter.StringOverflow ex) {
log.error(cdef.pos(), "limit.string.overflow",
ex.value.substring(0, 20));
} catch (CompletionFailure ex) {
chk.completionError(cdef.pos(), ex);
}
return null;
}
JavaCompiler.javaのgenCode内に処理をぶら下げればよい
46. 20
Synthesijer でのHDL生成: 例題
package blink_led;
public class BlinkLED {
public boolean led;
public void run(){
while(true){
led = true;
for(int i = 0; i < 5000000; i++) ;
led = false;
for(int i = 0; i < 5000000; i++) ;
}
}
}
47. 21
Synthesijer でのHDL生成: 大枠
class
entity blink_led_BlinkLED is
port (
clk : in std_logic;
reset : in std_logic;
field_led_output : out std_logic;
field_led_input : in std_logic;
field_led_input_we : in std_logic;
run_req : in std_logic;
run_busy : out std_logic
);
end blink_led_BlinkLED;
class BlinkLED {
public boolean led;
public void run(){
while(true){
…
}
}
}
module
クラス毎にHWモジュールを生成
51. 25
Synthesijer でのHDL生成: 状態遷移
状態遷移機械に相当するステートマシンをベタに作る
always @(posedge clk) begin
if(reset == 1'b1) begin
run_method <= run_method_IDLE;
run_method_delay <= 32'h0;
end else begin
case (run_method)
run_method_IDLE : begin
run_method <= run_method_S_0000;
end
run_method_S_0000 : begin
run_method <= run_method_S_0001;
run_method <= run_method_S_0001;
end
run_method_S_0001 : begin
if (run_req_flag == 1'b1) begin
run_method <= run_method_S_0002;
end
end
run_method_S_0002 : begin
if (tmp_0003 == 1'b1) begin
run_method <= run_method_S_0004;
end else if (tmp_0004 == 1'b1) begin
run_method <= run_method_S_0003;
end
end
run_method_S_0003 : begin
...
53. 27
Synthesijer でのHDL生成: 出力(データ操作)
メソッド毎にスケジュール表を生成 → 文/式相当のデータ操作
run_0000: op=METHOD_EXIT, src=, dest=, next=0001
run_0001: op=METHOD_ENTRY, src=, dest=, next=0002 (name=run)
run_0002: op=JT, src=true:BOOLEAN, dest=, next=0004, 0003
run_0003: op=JP, src=, dest=, next=0023
run_0004: op=ASSIGN, src=true:BOOLEAN, dest=class_led_0000:BOOLEAN, next=0005
run_0005: op=ASSIGN, src=0:INT, dest=run_i_0001:INT, next=0006
run_0006: op=LT, src=run_i_0001:INT, 5000000:INT, dest=binary_expr_00002:BOOLEAN, next=0007
run_0007: op=JT, src=binary_expr_00002:BOOLEAN, dest=, next=0012, 0008
run_0008: op=JP, src=, dest=, next=0013
run_0009: op=ADD, src=run_i_0001:INT, 1:INT, dest=unary_expr_00003:INT, next=0010
run_0010: op=ASSIGN, src=unary_expr_00003:INT, dest=run_i_0001:INT, next=0011
run_0011: op=JP, src=, dest=, next=0006
run_0012: op=JP, src=, dest=, next=0009
run_0013: op=ASSIGN, src=false:BOOLEAN, dest=class_led_0000:BOOLEAN, next=0014
run_0014: op=ASSIGN, src=0:INT, dest=run_i_0004:INT, next=0015
run_0015: op=LT, src=run_i_0004:INT, 5000000:INT, dest=binary_expr_00005:BOOLEAN, next=0016
run_0016: op=JT, src=binary_expr_00005:BOOLEAN, dest=, next=0021, 0017
run_0017: op=JP, src=, dest=, next=0022
run_0018: op=ADD, src=run_i_0004:INT, 1:INT, dest=unary_expr_00006:INT, next=0019
run_0019: op=ASSIGN, src=unary_expr_00006:INT, dest=run_i_0004:INT, next=0020
run_0020: op=JP, src=, dest=, next=0015
run_0021: op=JP, src=, dest=, next=0018
run_0022: op=JP, src=, dest=, next=0002
run_0023: op=JP, src=, dest=, next=0000
...
assign tmp_0010 = run_i_0001 + 32'h00000001;
...
always @(posedge clk) begin
if(reset == 1'b1) begin
run_i_0001 <= 32'h00000000;
end else begin
if (run_method == run_method_S_0004) begin
run_i_0001 <= 32'h00000000;
end else if (run_method == run_method_S_0010) begin
run_i_0001 <= unary_expr_00003;
end
end
end
...
always @(posedge clk) begin
if(reset == 1'b1) begin
unary_expr_00003 <= 0;
end else begin
if (run_method == run_method_S_0009) begin
unary_expr_00003 <= tmp_0010;
end
end
end
...
54. 28
最適化の例
べたなプログラムを用意
public class SimpleProgram{
public int sum(int a, int b, int c,
int d, int e, int f,
int g, int h, int i,
int j, int k, int l){
c = a + b;
f = d + e;
i = g + h;
l = j + k;
c = c + f;
l = l + i;
c = c + l;
return c;
}
}
60. 34
インスタンス生成
クラスのインスタンスを生成/利用できる
public class Test002 {
public static final int DEFAULT_VALUE = 0x20;
public int[] a = new int[128];
public int x, y;
public void init(){
for(int i = 0; i < a.length; i++){
a[i] = 0;
}
}
public void set(int i, int v){
a[i] = v;
}
…
}
ex. sample/test/Test002.javaとsample/test/Test003.java
public class Test003 {
private final Test002 t = new Test002();
public void test(){
t.init();
t.set(0, 100); // 0 <- 100
…
}
…
}
64. 38
インスタンス生成の例
乱数生成のハードウェアをJavaから使う
// xorshift RNG, period 2^128-1
// cf. http://www.jstatsoft.org/v08/i14/paper
module xor128
(
input wire clk, input wire reset, output reg [63:0] q
);
reg[31:0] x = 123456789, y = 362436069, z = 521288629;
reg[31:0] w = 88675123;
reg[31:0] t;
always @(posedge clk) begin
q[63:32] <= 32'h0;
if(reset == 1'b1) begin
x <= 123456789;
y <= 362436069;
z <= 521288629;
w <= 88675123;
end else begin
t = x ^ (x << 11);
x = y; y = z; z = w;
w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));
q[31:0] <= w;
end
end
endmodule
68. 42
クラスの継承も可
継承したクラスのメソッドも利用可能
public class Test012 extends Test010{
public void run(){
test();
}
}
public class Test010{
public void test(){
int a = 20;
int b = 30;
int c = 0;
c = a & b;
c = a | b;
c = a ^ b;
}
}
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;
entity Test012 is
port (
clk : in std_logic;
reset : in std_logic;
run_busy : out std_logic;
run_req : in std_logic;
test_busy : out std_logic;
test_req : in std_logic
);
end Test012;
architecture RTL of Test012 is
...
83. 7
サンプル1: 文字の入出力ルーチン
public void read() {
prompt();
for (int i = 0; i < CODESIZE; i++) {
byte b;
b = io.getchar();
//io.putchar(b);
if (b == 'n' || b == 'r') {
prog[i] = (byte) 0;
break;
} else {
prog[i] = b;
}
}
}
public void print() {
boolean flag = true;
for (int i = 0; i < CODESIZE; i++) {
byte b = prog[i];
if (b == 0) {
break;
}
io.putchar(b);
}
io.putchar((byte) 'n');
}
84. 8
サンプル1: SW版とHW版の入出力
public class IO {
private rs232c obj = new rs232c();
public void putchar(byte c) {
obj.write(c);
}
public byte getchar() {
byte b;
b = obj.read();
return b;
}
}
public class IO{
public void putchar(byte c){
System.out.print((char)c);
}
public byte getchar(){
try{
return (byte)(System.in.read());
}catch(Exception e){
throw new RuntimeException(e);
}
}
}
85. 9
サンプル2: ディスプレイに絵を書く
public class BitMapTest extends Thread{
...
public void run(){
// 座標(100, 100)の位置にサイズ20の正方形を描画
canvas.rect(100, 100, 20, 20, BitMapCanvas.RED, 0);
int c0 = 0, c = 0;
// 画面上にサイズ40の正方形を色を変えながら敷き詰める
for(int i = 0; i < 16; i++){ // (/ 640 40)16
c = c0;
for(int j = 0; j < 12; j++){ // (/ 480 40)
canvas.rect(40*i, 40*j, 40, 40, color(c), 1);
c = (c + 1) & 0x7;
}
c0 = (c0 + 1) & 0x7;
}
// 線を描画するメソッドのテスト
canvas.line(0, 0, 639, 479, color(7), 1);
canvas.line(639, 0, 0, 479, color(7), 1);
// フレームバッファ1をスクロールさせる
int v = 0, h = 0;
while(true){
canvas.set_offset(0, v, 1);
v = (v == 479) ? 0 : v + 1;
t.sleep(1); // 適当なウェイト
}
}
133. 26
RTLでの設計の場合
…
type stateType is (NONE, CHARGE_5, CHARGE_10, CHARGE_15, OK)
signal state, next_state: stateType := NONE;
…
case (state) is
when NONE =>
if nickel then
next_state <= CHARGE_5;
elsif dime then
next_state <= CHARGE_10;
else
next_state <= NONE;
end if;
when CHARGE_5 =>
if nickel then
next_state <= CHARGE_10;
elsif dime then
next_state <= CHARGE_15;
else
next_state <= NONE;
end if;
…
134. 27
ベンディングマシンの例
class VendingMachine(n:String,c:String,r:String) extends Module(n,c,r){
val nickel = inP("nickel")
val dime = inP("dime")
val rdy = outP("rdy")
val seq = sequencer("main")
val s5,s10,s15,s_ok = seq.add()
rdy <= (seq.idle, LOW)
rdy <= (s_ok, HIGH)
seq.idle -> (nickel, s5)
seq.idle -> (dime, s10)
s5 -> (nickel,s10)
s5 -> (dime, s15)
s10 -> (nickel, s15)
s10 -> (dime, s_ok)
s15 -> (nickel, s_ok)
s15 -> (dime, s_ok)
s_ok -> seq.idle
}
val m = new VendingMachine("machine", "clk", "reset")
m.genVHDL()
m.visuallize_statemachine()
138. 31
設計資産の再利用
process (data)
begin
case (data) is
when 0 => segment <= X"7e";
when 1 => segment <= X"30";
…
when 9 => segment <= X"79";
when others => null;
end case;
end process;
4
0
2
51
6
7
3
毎回このパターンを書かないといけない.
従来のHDLによるRTL設計の場合
142. 35
自分なりの実装,例
継承とオーバーライドを使って,自然に表現可能
val x0 = inP("x0")
val y0 = inP("y0")
val z0 = outP("z0")
val z1 = outP("z1")
val z2 = outP("z2")
z0 := x0 and y0
val x1c = new CARRY4Sig(x0)
val y1c = new CARRY4Sig(y0)
z1 := x1c and y1c
z2 := z0 or z1
151. 7
Microsoft Kiwi (Distributing C#)
Greaves, D.; Singh, S., "Distributing C# methods and threads over Ethernet-connected FPGAs using Kiwi," in Formal Methods and Models
for Codesign (MEMOCODE), 2011 9th IEEE/ACM International Conference on , vol., no., pp.1-9, 11-13 July 2011