
PLASTER - PYNQ-based abandoned object detection using a map-reduce approach on a multi-FPGA cluster

- Daniele Valentino De Vincenti, B.Sc. graduate in Biomedical Engineering @Politecnico di Milano
- Lorenzo Farinelli, B.Sc. graduate in Computer Science and Engineering @Politecnico di Milano

Plaster is a multi-layered, C++-based infrastructure that supports the development of multi-FPGA systems and the management of large data flows between their nodes. The goal of the project is to give the end user a set of tools (a Python library and a C++ service) to easily assign bitstreams to nodes and route data between them, in the context of a PYNQ-based cluster suited to the distributed acceleration of computation-intensive tasks. On top of this platform, an abandoned-object detection tool is implemented as a multi-FPGA distributed system that uses a hardware-accelerated version of the YOLO neural network for object detection in images.
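
The Application slide of the deck describes the detection tool as a pipeline: frames are split into chunks, the chunks are classified on several boards, a second stage analyzes the classification results, and the final results are aggregated. A minimal sketch of that map-reduce flow is given here, under loud assumptions: a thread pool stands in for the boards, each "frame" is already a list of `(label, cell)` detections instead of pixel data, and the rule for calling an object abandoned (same label in the same cell for `min_frames` consecutive frames) is an illustrative choice, not the authors' algorithm. All function names are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(frames, n_chunks):
    """Split the frame list into n_chunks contiguous chunks of near-equal size."""
    size, extra = divmod(len(frames), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(frames[start:end])
        start = end
    return chunks


def classify_chunk(chunk):
    """Map stage: stand-in for the accelerated detector running on one board.

    A real node would run the detector on image data; here every frame
    already carries its detections, so the stage only normalizes them.
    """
    return [(frame["index"], set(frame["objects"])) for frame in chunk]


def find_abandoned(detections, min_frames):
    """Reduce stage: objects seen in the same cell for min_frames frames in a row."""
    run_length = defaultdict(int)  # consecutive-frame count per (label, cell)
    abandoned = set()
    previous = set()
    for _, objects in sorted(detections, key=lambda item: item[0]):
        for obj in objects:
            run_length[obj] = run_length[obj] + 1 if obj in previous else 1
            if run_length[obj] >= min_frames:
                abandoned.add(obj)
        previous = objects
    return abandoned


def run_pipeline(frames, n_boards, min_frames):
    """Split, classify in parallel, then analyze and aggregate."""
    chunks = split_into_chunks(frames, n_boards)
    with ThreadPoolExecutor(max_workers=n_boards) as pool:
        per_chunk = list(pool.map(classify_chunk, chunks))
    detections = [item for chunk_result in per_chunk for item in chunk_result]
    return find_abandoned(detections, min_frames)


# Synthetic input: a person who moves every frame and a bag that stays put.
frames = [
    {
        "index": i,
        "objects": [("person", (i, 0))] + ([("bag", (2, 3))] if i >= 2 else []),
    }
    for i in range(9)
]
abandoned = run_pipeline(frames, n_boards=3, min_frames=5)
print(abandoned)
```

Sorting the detections by frame index in the reduce stage means the result does not depend on the order in which the boards return their chunks.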


  1. PLASTER - PYNQ-based abandoned object detection using a map-reduce approach on a multi-FPGA cluster
     Daniele Valentino De Vincenti, danielevalentino.devincenti@mail.polimi.it
     Lorenzo Farinelli, lorenzo.farinelli@mail.polimi.it
     Luca Stornaiuolo, luca.stornaiuolo@polimi.it
     Rolando Brondolin, rolando.brondolin@polimi.it
     July 19th 2020, VNGC project presentation
  2. Context
     Neural networks running on embedded devices are:
     ▪ Computationally intensive
     ▪ Strongly memory bound
     ▪ Resource hungry
     ▪ Power consuming
     Link on the slide: https://www.flaticon.com/authors/freepik
  3. Our solution
     ▪ PYNQ-based multi-FPGA cluster for the:
       ○ Flexibility of the infrastructure
       ○ Reliability and redundancy
       ○ Portability and ease of setup (e.g. events)
       ○ High computational power
       ○ Heterogeneous design
       ○ Embedded system
     ▪ Abandoned object detection using accelerated YOLO detector
     ▪ C++ node manager for fast communication
     ▪ End-user Python libraries for ease of use
     Links on the slide:
     https://www.flaticon.com/authors/eucalyp
     https://www.flaticon.com/free-icon/purchase-summary_1949624
     https://github.com/dhm2013724/yolov2_xilinx_fpga
  4. Multi-FPGA cluster
     ▪ Distributed system of accelerators
     ▪ Self-managed cluster of PYNQ-Z1s
     ▪ On-the-fly reconfiguration of FPGAs to start new tasks and jobs
  5. Cluster design
     Rendering of a three-board PYNQ-Z1 configuration of the cluster, with the boards facing outwards to improve heat dissipation. A fan will be placed on top, and a plexiglass pane will surround the sides of the cluster to force the airflow from bottom to top. The central hole will host a network switch that connects the boards.
  6. Application
     ▪ Video input is gathered and split into frames
     ▪ Frame chunks are sent to multiple boards for classification
     ▪ Results from the classification stage are sent to a second analyzing stage
     ▪ Final results are aggregated and sent back to the user
     [Diagram: User and Cluster]
  7. Node manager
     ▪ Fast underlying communication layer
     ▪ Easy reconfigurability of nodes for different tasks
     ▪ Simple rearrangement of the cluster in case of failures
     ▪ Ease of use through a set of Python APIs
     Link on the slide: https://creativemarket.com/Becris
  8. Python libraries
     ▪ Python APIs to build applications running on the cluster
     ▪ Easy configuration and assignment of bit-streams to the boards
     ▪ Dedicated functions for the communication and file exchange between nodes
     ▪ Transparent C++ management layer
  9. Results / user APIs
     UML class diagrams of the base class that represents a task of a distributed application and of the APIs that the Python library provides to interact with the cluster and manage the execution of apps. To develop a distributed application, a user only has to import PlasterTask in the app’s Python code and implement a concrete subclass that defines the custom behavior of the task. ExecutorWrapper exposes the methods that let tasks interact with the cluster (e.g. send/receive data).
  10. Results / transfer times
      File size   Plaster transfer time   Python transfer time
      50 MB       194.094 ms              1.0 s
      100 MB      300.680 ms              2.0 s
      200 MB      600.392 ms              3.976 s
      500 MB      1866.889 ms             9.679 s
      Transfer times to move a file of the specified size from one board to another. Each value is the time between the first file transfer request (from recipient to owner) and the reception (and writing to disk) of the last chunk of the file. The table compares the results obtained with the communication APIs of the cluster (left) and with a file transfer function written from scratch in Python (right).
  11. Thank you
      Lorenzo Farinelli, MSc Computer Science & Engineering, lorenzo.farinelli@mail.polimi.it
      Luca Stornaiuolo, PhD - Computer Science & Engineering, luca.stornaiuolo@polimi.it
      Daniele Valentino De Vincenti, Ba Biomedical Engineering, danielevalentino.devincenti@mail.polimi.it
      Rolando Brondolin, PhD - Information Technology, rolando.brondolin@polimi.it
      Project code available at: https://bitbucket.org/necst/pynq-cluster/
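
The "Results / user APIs" slide says that a user imports PlasterTask, implements a concrete subclass, and uses ExecutorWrapper to send and receive data. The deck does not show the actual signatures, so the sketch below defines in-memory stand-ins with the same class names. The `run` hook, the `send` and `receive` method names, and the node names are all assumptions made for illustration and may not match the real library.

```python
from abc import ABC, abstractmethod


class ExecutorWrapper:
    """Hypothetical stand-in: keeps messages in memory instead of using the cluster."""

    def __init__(self):
        self._mailboxes = {}

    def send(self, node, data):  # assumed method name
        """Queue data for the named node."""
        self._mailboxes.setdefault(node, []).append(data)

    def receive(self, node):  # assumed method name
        """Return the oldest message queued for the named node."""
        return self._mailboxes[node].pop(0)


class PlasterTask(ABC):
    """Hypothetical stand-in for the base class named on the slide."""

    def __init__(self, executor):
        self.executor = executor

    @abstractmethod
    def run(self):  # assumed hook that concrete tasks override
        """Define the custom behavior of the task."""


class CountFramesTask(PlasterTask):
    """Concrete task: receive a chunk of frames and report how many arrived."""

    def run(self):
        frames = self.executor.receive("node-0")
        self.executor.send("user", len(frames))


executor = ExecutorWrapper()
executor.send("node-0", ["frame-0", "frame-1", "frame-2"])
CountFramesTask(executor).run()
result = executor.receive("user")
print(result)
```

Keeping the cluster access behind a single wrapper object means the task code never touches sockets or bitstreams directly, which fits the slide's description of a transparent C++ management layer.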

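The transfer times on the "Results / transfer times" slide are easier to compare as throughput and speedup. The short calculation below uses only the four measurements from the slide; the variable and function names are illustrative. On these numbers the cluster APIs come out between about 5 and 7 times faster than the from-scratch Python transfer, which stays at about 50 MB/s.

```python
# (file size in MB, Plaster time in ms, Python time in s), copied from the slide.
measurements = [
    (50, 194.094, 1.0),
    (100, 300.680, 2.0),
    (200, 600.392, 3.976),
    (500, 1866.889, 9.679),
]


def summarize(rows):
    """Convert both timings to seconds, then derive throughput and speedup."""
    summary = []
    for size_mb, plaster_ms, python_s in rows:
        plaster_s = plaster_ms / 1000.0
        summary.append(
            {
                "size_mb": size_mb,
                "plaster_mb_per_s": size_mb / plaster_s,
                "python_mb_per_s": size_mb / python_s,
                "speedup": python_s / plaster_s,
            }
        )
    return summary


for row in summarize(measurements):
    print(
        f"{row['size_mb']:>4} MB  "
        f"Plaster {row['plaster_mb_per_s']:6.1f} MB/s  "
        f"Python {row['python_mb_per_s']:5.1f} MB/s  "
        f"speedup {row['speedup']:.2f}x"
    )
```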