Run-Time Reconfiguration for HyperTransport coupled FPGAs using ACCFS
1. Run-Time Reconfiguration for HyperTransport coupled FPGAs using ACCFS
Jochen Strunk, Andreas Heinig, Toni Volkmer,
Wolfgang Rehm, Heiko Schick
Chemnitz University of Technology
Computer Architecture Group
Prof. Wolfgang Rehm
WHTRA 2009, Heidelberg / February 12th 2009
2. Outline
1 Introduction / Goals
2 Run-Time Reconfiguration on FPGAs
3 HyperTransport Cave with Run-Time Reconfiguration Support
4 The Accelerator File System (ACCFS) as Software Framework
5 Case Study - two RTRMs
6 Conclusion
WHTRA 2009, Heidelberg Run-Time Reconfiguration for HyperTransport coupled FPGAs using ACCFS Jochen Strunk 2/23
4. Introduction
FPGAs are used as accelerators in host coupled systems.
Hot-plug functionality of plug-in cards is not supported by
most motherboards, BIOSes, or operating systems.
For continuous host link connectivity, today's plug-in cards
with FPGAs need a second chip which handles host
communication, most commonly a second FPGA.
Uploading further accelerator modules / compute kernels is
not possible during run-time although sufficient space would
be available on the FPGA.
Run-time reconfiguration (RTR)
⇒ On the other hand, FPGAs offer run-time reconfiguration support
(DPR-capable FPGAs, e.g. Xilinx Virtex, -2, -4, -5, -6).
5. Goals
Provide a solution for host coupled FPGAs
in which only one FPGA is needed in a host coupled system,
i.e. no additional chip for host communication
which allows the functionality to be changed during run-time,
e.g. for uploading further compute kernels
for which a software framework exists for user applications,
which allows easy handling without restricting the possibilities
of run-time reconfigurable FPGAs
7. Run-Time Reconfiguration on FPGAs
Dynamic partial reconfiguration (DPR) is available on Xilinx
Virtex,-2,-4,-5,-6 FPGAs.
The functionality is divided into static and dynamic parts.
Dynamic parts are called Run-Time Reconfigurable Modules
(RTRMs).
The granularity of a partially reconfigurable region (PRR) is directly
related to configuration frames.
Three different interfaces are available for reconfiguration:
JTAG, SelectMAP, ICAP.
A design flow for "Module based Partial Reconfiguration" is
applied.
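The link between PRR granularity and configuration frames can be illustrated with a small calculation. The sketch below is only an illustration and not part of the presented design flow: the helper names `prr_rows_rounded` and `prr_frame_bytes` are hypothetical, and the frame geometry (rows covered by one frame, 32-bit words per frame) is passed in as parameters because it depends on the FPGA family.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a PRR cannot be smaller than the area one
 * configuration frame covers, so the requested height is rounded up to a
 * whole number of frames. rows_per_frame is device specific (assumption:
 * supplied by the caller from the device documentation). */
static unsigned prr_rows_rounded(unsigned requested_rows, unsigned rows_per_frame)
{
    assert(rows_per_frame > 0);
    unsigned frames = (requested_rows + rows_per_frame - 1) / rows_per_frame;
    return frames * rows_per_frame;
}

/* Hypothetical helper: payload size in bytes needed to rewrite frame_count
 * frames of words_per_frame 32-bit words each. Command and padding overhead
 * of a real partial bit stream is ignored. */
static size_t prr_frame_bytes(size_t frame_count, size_t words_per_frame)
{
    return frame_count * words_per_frame * 4u;
}
```

With an assumed frame height of 16 rows, a module that needs 20 rows would occupy 32 rows, which is why PRR sizes cannot be chosen freely.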
9. HT-Cave with RTR-Support
To support RTRMs the standard HT-Cave-IP-Core must be
enhanced.
The overall design must comply with the module-based partial
reconfiguration design flow.
To ease porting the infrastructure to other interconnects, e.g.
PCIe, the functionality is divided into:
host interface specific part:
HT Cave, HT Packet Engine
host interface independent part:
RTRM, RTRM Controller, Reconfig Unit, Internal Routing
Unit
To generate an RTRM bit stream file, a framework is provided
to the user.
10. HT-Cave with RTR-Support
Infrastructure of an HT-Cave with RTR-Support:
[Figure: block diagram of the FPGA. Blocks shown: HTX host connection, HT Cave Core, HT Packet Engine, Internal Routing Unit, Reconfig Unit, RTRM Controller, and RTRMs. The design is split into a static and a dynamic part, and into a host interface specific and a host interface independent part.]
12. ACCFS-Introduction
ACCFS: Accelerator File System
Open generalized interface for integrating accelerators into Linux based systems
[Figure: position of ACCFS in the Linux kernel. Applications use the Syscall-API on top of Process Management, Virtual File System, Virtual Memory and Socket. accfs sits below the Virtual File System next to ext2, ext3, ..., vfat, proc, sysfs and character devices. Through the Vendor Interface, device handlers (ClearSpeed, AMD-Ati, Nvidia, FPGA, SPU, ...) connect accfs to the hardware (PCIe, Host Bridge, SPE, FPGA, ...). Also shown: block devices, ac97 driver, disk controller drivers and bus drivers.]
13. ACCFS-Hardware Integration Concepts
Virtualization
⇒ Optimize hardware usage
Generic interface
⇒ Establish interface based on well known standard: VFS
Separation of functionalities
⇒ Ease integration of new accelerator types in ACCFS
Host initiated DMA
⇒ Avoid page translation issues on the accelerator system
Asynchronous context execution
⇒ No need for threading when running multiple instances in parallel
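The point about asynchronous context execution can be sketched with plain POSIX calls: a single thread starts several contexts and then waits for all of them. The code below is a sketch under stated assumptions, not ACCFS code. The function name `wait_for_all` is hypothetical, and it is an assumption that a context's status descriptor can be used with `poll()`; the descriptors could just as well be read one after another with blocking `read()` calls.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

#define MAX_CTX 16

/* Waits until every descriptor in fds[0..n-1] has delivered data (or was
 * closed by the other side). Returns the number of finished descriptors,
 * or -1 on error. No additional threads are involved. */
static int wait_for_all(const int *fds, int n)
{
    struct pollfd pfd[MAX_CTX];
    int done = 0;

    assert(n > 0 && n <= MAX_CTX);
    for (int i = 0; i < n; i++) {
        pfd[i].fd = fds[i];
        pfd[i].events = POLLIN;
        pfd[i].revents = 0;
    }
    while (done < n) {
        if (poll(pfd, (nfds_t)n, -1) < 0)
            return -1;
        for (int i = 0; i < n; i++) {
            char buf[12];
            if (pfd[i].fd < 0)
                continue;                      /* already finished */
            if (pfd[i].revents & (POLLERR | POLLNVAL))
                return -1;
            if (pfd[i].revents & (POLLIN | POLLHUP)) {
                if (read(pfd[i].fd, buf, sizeof buf) < 0)
                    return -1;
                pfd[i].fd = -1;                /* poll() ignores negative fds */
                done++;
            }
        }
    }
    return done;
}
```

In an application, the descriptors would be the opened status files of the running contexts; for experimenting without hardware, the read ends of pipes behave the same way.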
14. ACCFS-FPGA Usage
Application step (call) -> Device Handler action
Create Context (sys_acc_create) -> Establish Context
    - FPGA available?
    - First initialization
    - Returns: context descriptor
Configure Context (write (ctx/config)) -> Configure FPGA
    - Validate bit stream
    - Space available?
    - Program device
Data Exchange (read / write (ctx/*)) -> Validate Request
Execute Design (sys_acc_run) -> State Transition
    - state goes into running
Wait for Finish (read (ctx/status)) -> Wait for 'STOP'
Destroy Context (close (ctx)) -> Wait for 'STOP'
...
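The sequence of steps above can be read as a small state machine for one context. The following sketch models it in C. All type, state and event names are hypothetical and chosen for illustration; the slides only name the running state and the 'STOP' status. It is also an assumption that a stopped context may be reconfigured or run again, and that closing a running context has to wait for 'STOP' first (modelled here as an invalid direct transition).

```c
#include <assert.h>

typedef enum { CTX_CREATED, CTX_CONFIGURED, CTX_RUNNING, CTX_STOPPED,
               CTX_DESTROYED, CTX_INVALID } ctx_state;
typedef enum { EV_CONFIGURE, EV_RUN, EV_STOP, EV_CLOSE } ctx_event;

/* Returns the state that follows s when event e occurs, or CTX_INVALID if
 * the step is not allowed in this model. */
static ctx_state ctx_step(ctx_state s, ctx_event e)
{
    switch (e) {
    case EV_CONFIGURE:   /* write(ctx/config) */
        return (s == CTX_CREATED || s == CTX_CONFIGURED || s == CTX_STOPPED)
                   ? CTX_CONFIGURED : CTX_INVALID;
    case EV_RUN:         /* sys_acc_run */
        return (s == CTX_CONFIGURED || s == CTX_STOPPED)
                   ? CTX_RUNNING : CTX_INVALID;
    case EV_STOP:        /* design reports 'STOP' */
        return (s == CTX_RUNNING) ? CTX_STOPPED : CTX_INVALID;
    case EV_CLOSE:       /* close(ctx), only once the design is not running */
        return (s == CTX_RUNNING || s == CTX_DESTROYED || s == CTX_INVALID)
                   ? CTX_INVALID : CTX_DESTROYED;
    }
    return CTX_INVALID;
}

/* Replays a sequence of events starting from a freshly created context. */
static ctx_state ctx_replay(const ctx_event *events, int count)
{
    ctx_state s = CTX_CREATED;
    assert(count >= 0);
    for (int i = 0; i < count; i++)
        s = ctx_step(s, events[i]);
    return s;
}
```

Replaying configure, run, stop, close ends in the destroyed state, while running an unconfigured context is rejected.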
16. Case Study
As a proof of concept we implemented two different compute
kernels as RTRMs:
a pattern matcher, which finds patterns in a byte stream
a Mersenne Twister, which is a pseudo random number
generator
A vendor device driver supporting the UoH HTX Virtex-4
XC4VFX60 FPGA Card was implemented.
A user application was implemented, which uploads and
exchanges the RTRMs during run-time using ACCFS.
17. Case Study
// Both functions use <fcntl.h>, <unistd.h>, <sys/mman.h> and <stdint.h>.
// acc_create, acc_run, configure_fpga, c_kernel_function and the
// upper-case constants are not defined on the slide.

// Pattern Matcher offload function
int matcher_run (void * search_db_in, int db_size,
                 void * patterns_in, int pattern_count,
                 void * results_out, int results_size) {
    int ret;
    char bufstatus[12];
    // create context of our static FPGA design
    int fd_ctx = (int)acc_create("example", V_ID, D_ID, 0750, NULL);
    // configure the design
    int fd_cfg = openat(fd_ctx, "config", O_WRONLY);
    configure_fpga(fd_cfg, MATCHER_RTRM_BITSTREAM);
    // open memory and status
    int fd_mem = openat(fd_ctx, "memory/FPGA MEM1", O_RDWR);
    int fd_status = openat(fd_ctx, "status", O_RDONLY);
    // fill memory with data (DMA bulk transfer)
    pwrite(fd_mem, search_db_in, db_size, DB_OFFSET);
    pwrite(fd_mem, patterns_in, 4 * pattern_count, PATTERN_OFFSET);
    // start the matcher
    acc_run(fd_ctx, 0);
    // check status
    // (wait until context execution finished)
    read(fd_status, bufstatus, 12);
    // read results of operation (DMA bulk transfer)
    ret = pread(fd_mem, results_out, results_size, RESULTS_OFFSET);
    // close files
    close(fd_mem); close(fd_status); close(fd_cfg);
    return ret;
}

// Mersenne Twister offload function
int run_compute_kernel (double * results_out, int results_count) {
    // create context of our FPGA design
    int fd_ctx = (int)acc_create("example", V_ID, D_ID, 0750, NULL);
    // configure the design
    int fd_cfg = openat(fd_ctx, "config", O_WRONLY);
    configure_fpga(fd_cfg, MERSENNE_RTRM_BITSTREAM);
    // open memory
    int fd_mem = openat(fd_ctx, "memory/FPGA MEM1", O_RDWR);
    // allocating buffer
    int32_t * buffer = (int32_t *) mmap(NULL, MEM_SIZE,
                                        PROT_READ | PROT_WRITE,
                                        MAP_SHARED, fd_mem, 0);
    int32_t * mt32_numbers = buffer + NUMBERS_OFFSET;
    // start the Mersenne twister MT32
    acc_run(fd_ctx, 0);
    // Example C function that uses random numbers
    c_kernel_function(results_out, results_count, mt32_numbers);
    // unmap buffer
    munmap((void *) buffer, MEM_SIZE);
    // close files
    close(fd_mem); close(fd_cfg);
    return 0;
}
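The slide code calls `configure_fpga` without showing it. One plausible shape for such a helper is sketched below: it streams a bit stream file into the already opened `config` descriptor and handles short writes. This is an assumption-laden sketch, not the authors' implementation. The names `configure_from_file` and `write_all` are hypothetical, the second argument is assumed to be a file path, and whether the ACCFS device handler accepts a bit stream split over several `write()` calls is not stated in the slides.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Writes all len bytes of buf to fd, retrying after short writes.
 * Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, buf, len);
        if (w <= 0)
            return -1;
        buf += w;
        len -= (size_t)w;
    }
    return 0;
}

/* Hypothetical stand-in for configure_fpga(): copies the file at
 * bitstream_path into fd_cfg. Returns 0 on success, -1 on error. */
static int configure_from_file(int fd_cfg, const char *bitstream_path)
{
    char buf[4096];
    ssize_t n;
    int fd_bit;

    assert(bitstream_path != NULL);
    fd_bit = open(bitstream_path, O_RDONLY);
    if (fd_bit < 0)
        return -1;
    while ((n = read(fd_bit, buf, sizeof buf)) > 0) {
        if (write_all(fd_cfg, buf, (size_t)n) < 0) {
            close(fd_bit);
            return -1;
        }
    }
    close(fd_bit);
    return n < 0 ? -1 : 0;
}
```

Checking the return value lets the caller detect a missing bit stream file or a rejected write before calling `acc_run`, which the slide code omits for brevity.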
18. Case Study
Placed and routed HT-Cave with RTR-support and pattern matcher as RTRM
Resource utilization of XC4VFX60:
4 clock regions for HT Cave with RTR-support
12 clock regions for RTRM
20. Conclusion
By using the ability of run-time reconfiguration of FPGAs it is
possible to build single FPGA chip solutions for host
coupled accelerators.
A design of an RTR-capable infrastructure was shown which
allows RTR modules to be managed during run-time.
The implementation was done for an FPGA directly coupled to
the HyperTransport processor bus of the host system.
The software framework ACCFS provides a generic
interface to user applications which is able to satisfy the
demand of RTR computing.
The concept provided is applicable to other processor and
peripheral bus coupled FPGAs.
21. End
The End.
Thank you for your attention!
Questions?
22. ACCFS-Project Status
ACCFS 0.5 alpha available
http://www.tu-chemnitz.de/cs/ra/projects/accfs
Features
Host support for x86 and x86_64 (ppc32/64 available soon!)
Support for recent Linux kernels
Fully operational VFS interface
Device handler support for UoM HTX Virtex-4 FPGA Card
TODO
Resource discovery interface (via proc or sysfs)
Extend vendor interface for better virtualization support
Other device handlers?: Cell/B.E., Clearspeed, ...
23. ACCFS Development Road Map 2009
[Figure: road map timeline with the quarters Q4 2008, Q1 2009, Q2 2009, Q3 2009, Q4 2009, Q1 2010. Items shown:]
Extended support for UoH HTX FPGA card
Case study? ClearSpeed / GPGPUs
Support for Virtex-5 PCIe board
IBM QS21 PCIe coupled Cell/B.E. SPE integration
Virtualization facilitating functions