Competing on the Global Battleground:
The Great Shift Toward Conceptual Design
June Paik
FuriosaAI
Contents
Introduction to AI chips
How do we build AI chips?
What is the right team to build great, winning AI chips?
Technology Startups Seen Through Meaning
Introduction to AI chips
Neural Network In A Minute
Poplar graph: ResNet-50 conv1. Input data tensors to the 7x7 convolution (on the right of the image, in green and
yellow) are processed by the convolution vertices into partials (light blue). Reductions (orange) process the partials
and pass them on to the non-linearity (blue). (Source: Graphcore)
The Mark I Perceptron machine
Google TPU Pod
64 2nd-gen TPUs
11.5 petaflops
4 terabytes of memory
2-D toroidal mesh network
AI Chip Scale of Computation
> 1 TOPS, > 10 TOPS, > 100 TOPS
1 TOPS = 1,000,000,000,000 operations per second
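To make the TOPS scale concrete, here is a rough sketch of how many multiply-accumulate (MAC) units would have to complete work every clock cycle to sustain each tier. The 1 GHz clock and the convention that one MAC counts as two operations are assumptions for illustration, not figures from the slides; `macs_per_cycle` is a hypothetical helper name.

```python
OPS_PER_TOPS = 10**12  # 1 TOPS = 1,000,000,000,000 operations per second


def macs_per_cycle(tops, clock_hz, ops_per_mac=2):
    """MACs that must complete every cycle to sustain `tops`.

    ops_per_mac=2 is an assumed convention (one multiply plus one add).
    """
    return tops * OPS_PER_TOPS / (clock_hz * ops_per_mac)


# Assumed 1 GHz clock, purely illustrative.
for tops in (1, 10, 100):
    print(f"{tops:>3} TOPS at 1 GHz: about {macs_per_cycle(tops, 1e9):,.0f} MACs per cycle")
```

Under these assumptions, each tenfold step in TOPS needs roughly ten times as many parallel MAC units, or a correspondingly faster clock.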
Scale of Storage: Size
Speech / vision / translation high-accuracy model: > 100 MB
Mobile model: > 1 MB
Recommendation systems: > 1 GB
Mixture of experts: > 1 TB
Scale of Storage: Bandwidth
Parameters: R = 3, W = 112, N = 64, p = 0.2
Layer | Compute | Data access | Compute / access | BW per 125 TFLOP
Fully connected | W^4 N^2 | W^2 N | W^2 N | 3 Gb/s
3 x 3 conv | W^2 R^2 N^2 | W^2 N | R^2 N | 3 Tb/s
Depthwise separable conv | W^2 R^2 N + W^2 N^2 | W^2 N | R^2 + N | 30 Tb/s
Batch norm / layer norm | 5 W^2 N | W^2 N | 5 | 800 Tb/s
(Source: Cerebras)
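The compute-to-access ratios on this slide can be checked with a short sketch. The per-layer formulas are taken from the slide with R = 3, W = 112, N = 64. The bandwidth helper is an assumption: the slide does not state the element width or how reads and writes are counted, so `bits_per_element=16` is illustrative and the absolute figures need not match the slide's rounded values. Only the ratios and their ordering are meant to carry over.

```python
R, W, N = 3, 112, 64  # kernel size, feature-map width, channels (from the slide)

# (compute operations, data accesses) per layer type, as given on the slide
layers = {
    "fully connected": (W**4 * N**2, W**2 * N),
    "3x3 conv": (W**2 * R**2 * N**2, W**2 * N),
    "depthwise separable conv": (W**2 * R**2 * N + W**2 * N**2, W**2 * N),
    "batch/layer norm": (5 * W**2 * N, W**2 * N),
}

# Operations performed per element fetched: higher means less bandwidth pressure.
ratios = {name: compute / access for name, (compute, access) in layers.items()}


def required_bits_per_second(flops_per_second, ratio, bits_per_element=16):
    """Bandwidth needed to keep the compute busy (bits_per_element is assumed)."""
    return flops_per_second / ratio * bits_per_element


for name, ratio in ratios.items():
    bw = required_bits_per_second(125e12, ratio)
    print(f"{name:<26} ratio {ratio:>9,.0f}   about {bw:.2e} bit/s at 125 TFLOP/s")
```

The point of the slide survives the uncertainty in element width: normalization layers do only a handful of operations per element and therefore demand several orders of magnitude more bandwidth per unit of compute than a fully connected layer.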
Scale of Model Diversities
1. Perceptron (P)
2. Feed Forward (FF)
3. Radial Basis Network (RBF)
4. Recurrent Neural Network (RNN)
5. Long / Short Term Memory (LSTM)
6. Gated Recurrent Unit (GRU)
7. Auto Encoder (AE)
8. Variational AE (VAE)
9. Denoising AE (DAE)
10. Sparse AE (SAE)
11. Markov Chain (MC)
12. Hopfield Network (HN)
13. Boltzmann Machine (BM)
14. Restricted BM (RBM)
15. Deep Belief Network (DBN)
16. Deep Convolutional Network (DCN)
17. Deconvolutional Network (DN)
18. Deep Convolutional Inverse Graphics Network (DCIGN)
19. Generative Adversarial Network (GAN)
20. Liquid State Machine (LSM)
21. Extreme Learning Machine (ELM)
22. Echo State Network (ESN)
23. Deep Residual Network (DRN)
24. Kohonen Network (KN)
25. Support Vector Machine (SVM)
26. Neural Turing Machine (NTM)
Scale of Services
AI Chip
What is the AI chip?
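An AI chip is a semiconductor chip built to process AI computation with the highest performance and efficiency.
An AI chip is the crystallization of future engineering, in which Application + Algorithm + Software + Hardware are
organically integrated, and it is a core technology that determines the fundamental competitiveness of the AI industry.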
Ex: Google TPU, Tesla Autopilot, Alexa AI Speaker
AI Chip Global Competition
Competition heating up: vertical & regional
The AI chip market is a global technology battleground,
and it is heading in a vertical direction, country by country and company by company.
Ex: Nvidia, Intel, Google, Amazon, Facebook, Samsung, Qualcomm, ARM, Baidu, Alibaba,
Graphcore, Cerebras, Groq, Cambricon, Horizon Robotics, Habana…
How do we build AI chips?
Weakness & Strength
How do we build AI chips?
AI chip engineering is precision engineering in which many component technologies are applied in combination.
Application
Algorithm
Software
Microarchitecture
Verification
Physical Implementation
Manufacturing
Packaging
Testing
Board Design
What is the strength of our ecosystem?
We are competitive in AI chip manufacturing. Caution: very capital intensive.
Application
Algorithm
Software
Microarchitecture
Verification
Physical Implementation 😐
Manufacturing ☺
Packaging ☺
Testing ☺
Board Design 😐
What is the weakness of ecosystem?
Our AI chip design competitiveness is very weak compared with global companies.
Application ☹
Algorithm ☹
Software ☹
Microarchitecture ☹
Verification ☹
Physical Implementation
Manufacturing
Packaging ☺
Testing ☺
Board Design ☺
Weakness example: Microarchitecture
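What does it mean to say that microarchitecture is weak?
Microarchitecture is the domain of fundamental conceptual design.
The book “축적의 시간” (Time of Accumulation), a set of proposals for the future of Korean industry, identifies fundamental conceptual design as the weakest problem area in our industry
and as a challenge that must be overcome.
Fundamental conceptual design rests on the power of intellect and is the key that leads to high-value-added products.
Furiosa is a company taking on the challenge of fundamental conceptual design.
The following slides define microarchitecture and
discuss the methodology of microarchitecture design, the essence of fundamental conceptual design.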
Illustration of physical chips
Zoom into a microchip
Microarchitecture = micro + architecture
The chip design company hands a detailed design, based on its architecture blueprint,
to the fab company, which manufactures the chip.
Great architecture needs great architect.
Microarchitecture is the domain of fundamental conceptual design.
A great building serves people, enabling the best human activities in the most humane manner possible given the
building material.
A great microarchitecture serves the computation process, enabling the best applications in the most efficient
manner possible given the silicon, power, and budget.
§ Real estate in the micro world
§ A great architect should know the ins and outs of everything and be able to implement the chip on schedule within
the given budgets
Microarchitect’s toolkit
Fundamental conceptual design must be grounded in the fundamental concepts of the field.
§ Instruction Set Architecture
§ VLIW, SIMD, VECTOR, Systolic Array
§ Superscalar, Multithreading, Dataflow
§ Pipelining
§ Virtualization
§ Prefetching, Caching
§ IO, Memory subsystem
§ Finite State Machine
§ …
Key Question:
What is the great winner architecture for
AI computation?
More important questions
How can we explore and find the best
architecture and build it?
Build the performance modeling simulator
This is a so-called cycle-accurate simulator: it simulates both the behavior and the performance of the
machine we are building at a very fine granularity and abstraction level, usually that of a
single clock cycle. This enforces the discipline of
§ Concrete and precise thinking
§ Data-driven evaluation of important trade-offs between design choices
An architect needs strong (or at least reasonable) software skills to build this simulator.
An object-oriented language and the event-driven programming paradigm are a natural fit for this job. C++ is the
standard choice.
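As a rough illustration of the event-driven style described above, the following toy model timestamps events in clock cycles and compares two design choices. It is written in Python for brevity rather than the C++ the slide recommends. Everything in it is hypothetical: the `Simulator` and `TileEngine` classes, the `simulate` helper, the latencies, and the assumption of unbounded buffering when prefetching. A production performance model would be far more detailed.

```python
import heapq


class Simulator:
    """Minimal event queue keyed by clock cycle."""

    def __init__(self):
        self.now = 0       # current cycle
        self._queue = []   # heap of (cycle, sequence, callback)
        self._seq = 0      # tie-breaker: same-cycle events run in issue order

    def schedule(self, delay, callback):
        heapq.heappush(self._queue, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run(self):
        while self._queue:
            self.now, _, callback = heapq.heappop(self._queue)
            callback()
        return self.now


class TileEngine:
    """Hypothetical unit that loads a tile from memory, then computes on it."""

    def __init__(self, sim, load_latency, compute_latency, prefetch):
        self.sim = sim
        self.load_latency = load_latency
        self.compute_latency = compute_latency
        self.prefetch = prefetch  # True: the next load overlaps with compute
        self.to_load = 0
        self.ready = 0            # loaded tiles waiting for the compute unit
        self.busy = False
        self.done = 0

    def start(self, num_tiles):
        self.to_load = num_tiles
        self._issue_load()

    def _issue_load(self):
        if self.to_load > 0:
            self.to_load -= 1
            self.sim.schedule(self.load_latency, self._on_loaded)

    def _on_loaded(self):
        self.ready += 1
        if self.prefetch:
            self._issue_load()    # assumes unbounded on-chip buffering
        self._try_compute()

    def _try_compute(self):
        if not self.busy and self.ready > 0:
            self.ready -= 1
            self.busy = True
            self.sim.schedule(self.compute_latency, self._on_computed)

    def _on_computed(self):
        self.busy = False
        self.done += 1
        if not self.prefetch:
            self._issue_load()    # serial design: load only after compute ends
        self._try_compute()


def simulate(num_tiles, load_latency, compute_latency, prefetch):
    """Return the total cycle count for processing num_tiles tiles."""
    sim = Simulator()
    engine = TileEngine(sim, load_latency, compute_latency, prefetch)
    engine.start(num_tiles)
    return sim.run()


serial_cycles = simulate(4, load_latency=10, compute_latency=5, prefetch=False)
overlapped_cycles = simulate(4, load_latency=10, compute_latency=5, prefetch=True)
print(serial_cycles, overlapped_cycles)
```

With these illustrative latencies the serial design takes 60 cycles and the prefetching design 45. A cycle-level model is meant to produce exactly this kind of data-driven comparison before any RTL is written.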
Architecture exploration takes time and experience.
Korean industry has neglected this part because we did not (or could not afford to) allocate
enough time to define and explore the design space and arrive at a solid architecture
specification. It takes time because
§ Workload characterization and prediction take time.
§ Simulation needs supercomputer-scale computation.
§ Understanding very detailed design trade-offs simply takes time.
In other words, cultivating intuition, by refining it iteratively through methodical measurement,
takes time.
Time Schedule
Suppose it takes 1.5 to 2 years to build a commercial AI chip from concept to production. We need
to allocate at least 6 to 8 months for performance modeling, which runs in parallel with the
implementation.
Performance Modeling / Architecting
RTL Implementation
Software Architecting / Implementation
Verification
Physical Design / Manufacturing
Arch Examples: Quantization (suggested by Google)
§ Aggressive operator fusion: Performing as many operations as possible in a single pass can
lower the cost of memory accesses and provide significant improvements in run-time and
power consumption
§ Compressed memory access: One can optimize memory bandwidth by supporting on-the-fly
decompression of weights (and activations). A simple way to do that is to support lower-precision
storage of weights and possibly activations.
§ Lower precision 4/8/16 bit arithmetic processing
§ Per-layer selection of bitwidths
§ Per-channel quantization
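The last two items can be illustrated with a small sketch that compares one scale for a whole layer against one scale per output channel. It uses symmetric quantization, an assumed scheme chosen for simplicity. The helper names (`scale_for`, `quantize`, `max_abs_error`) and the weight values are hypothetical and not taken from the slides.

```python
def scale_for(values, bits=8):
    """Symmetric scale mapping the largest magnitude to the top integer level."""
    qmax = 2 ** (bits - 1) - 1
    return (max(abs(v) for v in values) / qmax) or 1.0  # avoid a zero scale


def quantize(values, scale, bits=8):
    """Round each value to a signed integer level and clamp it to the range."""
    qmax = 2 ** (bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def max_abs_error(values, scale, bits=8):
    """Largest reconstruction error after a quantize/dequantize round trip."""
    levels = quantize(values, scale, bits)
    return max(abs(v - q * scale) for v, q in zip(values, levels))


# Illustrative weights: channel 0 is small, channel 1 is about 100x larger.
weights = [
    [0.010, -0.020, 0.013],
    [1.0, -2.0, 1.5],
]

# Per-layer: a single scale shared by every channel.
layer_scale = scale_for([v for channel in weights for v in channel])
per_layer_err = [max_abs_error(channel, layer_scale) for channel in weights]

# Per-channel: each channel gets its own scale.
per_channel_err = [max_abs_error(channel, scale_for(channel)) for channel in weights]

print("per-layer error by channel:  ", per_layer_err)
print("per-channel error by channel:", per_channel_err)
```

With a shared scale, the small-magnitude channel is squeezed into one or two integer levels. Giving it its own scale recovers most of its precision, and the same helpers accept `bits=4` or `bits=16` to explore the other bit widths listed above.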
Example of AI chips: Google TPU
Example of AI chips: Furiosa MadRun
Team building:
What’s the right team to build great
winner AI chips?
New Organization Essential
A new kind of organization, driven by Application (Business) + Algorithm + Software and different from
existing ones, is essential.
Any organization that designs a system… will inevitably produce a design whose structure
is a copy of the organization’s communication structure. – Conway, Cliff Young
• Large companies are at a disadvantage compared with startups.
• It is not easy for startups either.
FuriosaAI: organization structure
Application + Algorithm + Software Driven
• Application Partners: Naver, BinaryVR, Molocos, Neosapience, Seoul Smart City
Project…
• Algorithm (2)
• Software (6): Compiler, Runtime, Driver, Tool Chains
• Microarchitect (4): NPU Core, NoC, DRAM subsystem
• Logic Design (3)
• Physical Design (1): Outsourcing to SiFive or China Partners
• Manufacturing / Packaging / Board: TSMC or Samsung and Design house.
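Technology Startups Seen Through Meaning:
Furiosa Perspective
Korean History Seen Through Meaning (뜻으로 본 한국 역사, Ham Seok-heon)
Queen of Suffering
A history of overcoming humiliation, division, oppression, loss, and frustration
A Korean technology startup is a challenger's step, climbing through the steep terrain of suffering
from a rugged position in both the global and the domestic ecosystem,
and at the same time a strong will and hope to stand tall in the end despite countless failures.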
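Technology Startups Seen Through Meaning
Ssial (seed) for new creation
Historical meaning of ssial: the word ssial means "the people" (민); it is the seed for liberating ourselves from
historical evil and for cultivating, by our own effort, the qualification for new creation.
Ecosystem meaning of a technology startup: the seed for solving fundamental problems on the basis of intellect (People + AI),
liberating our ecosystem from its existing inertia, and cultivating, by our own effort, the qualification to
create innovative business models.
Keywords: autonomy, fundamentality, purity, vitality, relatedness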
Final Word: ecosystem
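We should do deep research on the local and global ecosystems.
For a technology company, actively cooperating with partners who need its technology is essential, and this
must rest on a deep understanding of the domestic and global ecosystems together with a thorough
awareness of itself.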
Thank you

[TMS 2018] Technology Development / FuriosaAI CEO June Paik (백준호): Opportunities Found on the Global Battleground

  • 1. 글로벌 격전지에서의 승부: 개념 설계 중심으로의 대전환 June Paik FuriosaAI
  • 2. Contents Introduction to AI chips How do we build AI chips? What is the right team to build the great winner AI chips? 뜻으로 보는 기술 스타트업
  • 4. Neural Network In A Minute
  • 5.
  • 6. Popular Graph: ResNet-50 conv1, input data tensors to the 7x7 convolution on the right of the image in green and yellow are processed by the convolution vertices into partials (light blue). Reductions (orange) process the partials and pass on the non-linearity (blue). (Source: Graphcore)
  • 8. Google TPU Pod 64 2nd-gen TPUs 11.5 petaflops 4 terabytes of memory 2-D toroidal mesh network
  • 9. AI Chip Scale of Computation > 1 Tops > 10 Tops > 100 Tops 1 Tops = 1,000, 000, 000, 000 OP per Second
  • 10. Scale of Storage: Size Speech/ Vision/Translation High Accuracy Model > 100MB Mobile Model > 1 MB Recommendation Systems > 1GB Mixture of Experts > 1TB
  • 11. Scale of Storage : Bandwidth R = 3, W = 112, N=64, p = 0.2 Fully Connected 3 x 3 Conv Depthwise Seperarable Conv Batch Norm Layer Norm Compute Data Access Compute / Access BW per 125 TFLOP W4 n2 W2 n W2 n 3 Gb/s 3 Tb/s 30 Tb/s 800 Tb/s R2 n R2 + n 5 W2 n W2 r2 n2 W2 r2 n+ W2 n2 W2 n W2 n 5W2 n (Source : Cerebras)
  • 12. Scale of Model Diversities 1.Perceptron (P) 2.Feed forward (FF) 3.RadialBasisNetwork(RBF) 4.RecurrentNeuralNetwork(RNN) 5.Long/ShortTermMemory(LSTM) 6.Gated RecurrentUnit(GRU) 7.AutoEncoder(AE) 8.Variational AE (VAE) 9.DenoisingAE (DAE) 10.Sparse AE (SAE) 11.Markov Chain (MC) 12.Hopfield Network(HN) 13.Boltzmann Machine (BM) 14.Restricted BM(RBM) 15.Deep Brief Network(DBN) 16.Deep ConvolutionalNetwork(DCN) 17.Deconvolutional Network(DN) 18.Deep ConvolutionalInverse Graphics Network(DCIGN) 19.Generative AdversarialNetwork(GAN) 20.Liquid State Machine (LSM) 21.Extreme LearningMachine (ELM) 22.Echostate Network(ESN) 23.Deep ResidualNetwork(DRN) 24.Kohonen Network(KN) 25.SupportVectorMachine (SVM) 26.NeuralTuringMachine (NTM)
  • 14. AI Chip What is the AI chip? AI chip은 AI computation을 가장 고성능 효율적으로 처리하기 위한 반도체칩이다. AI chip은 Application + Algorithm + Software + Hardware가 유기적으로 집약된 미래 엔지니어링의 결정체이며, AI 산업의 근본 경쟁력을 결정짓는 요소 기술이다. Ex: Google TPU, Tesla Autopilot, Alexa AI Speaker
  • 15. AI Chip Global Competition Competition heating up: vertical & regional AI chip 시장은 글로벌 기술 격전지이며, 국가별 기업별 vertical 한 방향으로 가고 있다. Ex: Nvidia, Intel, Google, Amazon, Facebook, Samsung, Qualcomm, ARM, Baidu, Alibaba, Graphcore, Cerebras, Groq, Cambricon, Horizontal Robotics, Habana…
  • 16. How do we build AI chips? Weakness & Strength .
  • 17. How do we build AI chips? AI 칩 엔지니어링은 많은 요소 기술들이 복합적으로 적용되는 정밀 공학. Application Algorithm Software Microarchitecture Verification Physical Implementation Manufacturing Packaging Testing Board Design
  • 18. What is the strength of our ecosystem? We do have competitiveness in AI chip manufacturing. Caution: very capital intensive. Application, Algorithm, Software, Microarchitecture, Verification, Physical Implementation (neutral), Manufacturing (good), Packaging (good), Testing (good), Board Design (neutral)
  • 19. What is the weakness of our ecosystem? Our AI chip design competitiveness is very weak compared with global companies. Application (poor), Algorithm (poor), Software (poor), Microarchitecture (poor), Verification (poor), Physical Implementation, Manufacturing, Packaging (good), Testing (good), Board Design (good)
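  • 20. Weakness example: Microarchitecture. What does it mean to say that microarchitecture is weak? Microarchitecture belongs to the domain of fundamental conceptual design. The book “축적의 시간” (Time of Accumulation), a set of proposals for the future of Korean industry, identifies fundamental conceptual design as the weakest problem area in our industry and as a challenge that must be overcome. Fundamental conceptual design is grounded in intellectual power and is the key that leads to high-value-added products. Furiosa is a company that takes on the challenge of fundamental conceptual design. The following slides define microarchitecture and discuss microarchitecture design methodology, the essence of fundamental conceptual design.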
  • 21. Illustration of physical chips Zoom into a microchip
  • 22. Microarchitecture = micro + architecture. The chip design company hands a detailed design, based on its architecture blueprint, to the fab company, which manufactures the chip.
  • 23. Great architecture needs a great architect. Microarchitecture belongs to the domain of fundamental conceptual design. A great building serves people, enabling the best human activities in the most humane manner possible given the building material. A great microarchitecture serves the computation process, enabling the best applications in the most efficient manner possible given the silicon / power / budget. § Real estate in the micro world § A great architect should know the ins and outs of everything and be able to implement the chip on schedule within the given budget
  • 24. Microarchitect’s toolkit. Fundamental conceptual design must be grounded in the fundamental concepts of the field. § Instruction Set Architecture § VLIW, SIMD, Vector, Systolic Array § Superscalar, Multithreading, Dataflow § Pipelining § Virtualization § Prefetching, Caching § IO, Memory subsystem § Finite State Machine § …
  • 25. Key Question: What is the great winner architecture for AI computation?
  • 26. More important questions How can we explore and find the best architecture and build it?
  • 27. Build the performance modeling simulator. This is a so-called cycle-accurate simulator, which simulates both the behavior and the performance of the machine we are building at a very fine granularity and abstraction level, usually that of the clock cycle. This enforces the discipline of § Concrete and precise thinking § Data-driven evaluation of the important trade-offs among design choices. The architect should have strong (or at least reasonable) software skills to build this simulator. An OOP language and the event-driven programming paradigm are a natural fit for this job. C++ is the standard choice.
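To make the event-driven, clock-cycle-level idea concrete, here is a minimal C++ sketch of a simulator kernel. It is an illustration only, not Furiosa's simulator: the `Simulator` class, the `simulate_ops` toy machine, and its latency parameters are all hypothetical, and the toy assumes one operation issued per cycle with unlimited load and MAC resources.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// An event is a callback that fires at a given clock cycle.
struct Event {
    std::uint64_t cycle;
    std::uint64_t seq;  // tie-breaker: preserves scheduling order within a cycle
    std::function<void()> action;
};

// Orders the priority queue so the earliest event is on top.
struct EventLater {
    bool operator()(const Event& a, const Event& b) const {
        if (a.cycle != b.cycle) return a.cycle > b.cycle;
        return a.seq > b.seq;
    }
};

class Simulator {
public:
    // Schedule `action` to run `delay` cycles after the current cycle.
    void schedule(std::uint64_t delay, std::function<void()> action) {
        queue_.push(Event{now_ + delay, next_seq_++, std::move(action)});
    }
    // Process events in time order; returns the cycle of the last event.
    std::uint64_t run() {
        while (!queue_.empty()) {
            Event ev = queue_.top();
            queue_.pop();
            now_ = ev.cycle;
            ev.action();
        }
        return now_;
    }
private:
    std::priority_queue<Event, std::vector<Event>, EventLater> queue_;
    std::uint64_t now_ = 0;
    std::uint64_t next_seq_ = 0;
};

// Toy machine: each operation is a memory load followed by a MAC.
// One operation is issued per cycle; both latencies are assumed parameters.
std::uint64_t simulate_ops(std::uint64_t num_ops, std::uint64_t mem_latency,
                           std::uint64_t mac_latency) {
    Simulator sim;
    std::uint64_t completed = 0;
    for (std::uint64_t i = 0; i < num_ops; ++i) {
        sim.schedule(i, [&sim, &completed, mem_latency, mac_latency] {
            // Issue: the load completes after mem_latency cycles.
            sim.schedule(mem_latency, [&sim, &completed, mac_latency] {
                // Load done: the MAC completes after mac_latency cycles.
                sim.schedule(mac_latency, [&completed] { ++completed; });
            });
        });
    }
    std::uint64_t last_cycle = sim.run();
    assert(completed == num_ops);
    return last_cycle;
}
```

Calling `simulate_ops(4, 10, 2)` models four pipelined operations with a 10-cycle load and a 2-cycle MAC. Changing a latency and re-running is the kind of data-driven trade-off evaluation the slide describes, although a real model would also track resource contention and queue occupancy.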
  • 28. Arch exploration takes time and experience. Korean industry has neglected this part because we didn’t (or couldn’t afford to) allocate enough time for defining and exploring the design space to arrive at a solid architecture specification. It takes time because § Workload characterization and prediction take time. § Simulation needs supercomputer-scale computation. § Understanding very detailed design trade-offs just takes time. In other words, cultivating intuition by refining it iteratively through methodical, careful measurement takes time.
  • 29. Time Schedule. Say it takes 1.5~2 years to build a commercial AI chip from concept to production. We need to allocate at least 6~8 months for performance modeling, which runs in parallel with implementation. Phases: Performance Modeling / Architecting, RTL Implementation, Software Architecting / Implementation, Verification, Physical Design / Manufacturing
  • 30. Arch Examples: Quantization (suggested by Google) § Aggressive operator fusion: Performing as many operations as possible in a single pass can lower the cost of memory accesses and provide significant improvements in run time and power consumption. § Compressed memory access: One can optimize memory bandwidth by supporting on-the-fly decompression of weights (and activations). A simple way to do that is to support lower-precision storage of weights and possibly activations. § Lower-precision 4/8/16-bit arithmetic processing § Per-layer selection of bit widths § Per-channel quantization
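The per-channel quantization item can be illustrated with a short C++ sketch. It assumes symmetric 8-bit quantization with one scale per output channel (scale = max|w| / 127); this particular scheme, the `QuantizedChannel` struct, and the function names are illustrative assumptions rather than anything specified on the slide.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// One channel of quantized weights: real value is approximately scale * q[i].
struct QuantizedChannel {
    float scale;
    std::vector<std::int8_t> q;
};

// Symmetric int8 quantization of a single channel (assumed scheme).
QuantizedChannel quantize_channel(const std::vector<float>& w) {
    float max_abs = 0.0f;
    for (float v : w) max_abs = std::max(max_abs, std::fabs(v));

    QuantizedChannel out;
    // Guard against an all-zero channel to avoid dividing by zero.
    out.scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;
    out.q.reserve(w.size());
    for (float v : w) {
        long r = std::lround(v / out.scale);
        r = std::max(-127L, std::min(127L, r));  // clamp to the int8 range used
        out.q.push_back(static_cast<std::int8_t>(r));
    }
    return out;
}

// Per-channel quantization: every output channel gets its own scale.
std::vector<QuantizedChannel> quantize_per_channel(
        const std::vector<std::vector<float>>& weights) {
    std::vector<QuantizedChannel> out;
    out.reserve(weights.size());
    for (const auto& channel : weights) out.push_back(quantize_channel(channel));
    return out;
}

// Recover the approximate real value of element i.
float dequantize(const QuantizedChannel& c, std::size_t i) {
    assert(i < c.q.size());
    return c.scale * static_cast<float>(c.q[i]);
}
```

The design point is dynamic range: if one channel has weights near 1.0 and another near 0.01, a single shared scale would map the small channel to only one or two integer levels, whereas a per-channel scale lets each channel use the full 8-bit range.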
  • 31. Example of AI chips: Google TPU
  • 32. Example of AI chips: Furiosa MadRun
  • 33. Team building: What’s the right team to build great winner AI chips?
  • 34. New Organization Essential. An organization that is Application (Business) + Algorithm + Software driven, unlike conventional organizations, is essential. “Any organization that designs a system… will inevitably produce a design whose structure is a copy of the organization’s communication structure.” (Conway, Cliff Young) • Large companies are at a disadvantage compared with startups. • It is not easy for startups either.
  • 35. FuriosaAI: organization structure Application + Algorithm + Software Driven • Application Partners: Naver, BinaryVR, Molocos, Neosapience, Seoul Smart City Project… • Algorithm (2) • Software (6): Compiler, Runtime, Driver, Tool Chains • Microarchitect (4): NPU Core, NoC, DRAM subsystem • Logic Design (3) • Physical Design (1): Outsourcing to SiFive or China Partners • Manufacturing / Packaging / Board: TSMC or Samsung and Design house.
  • 36. Technology Startups Seen Through Meaning (뜻): Furiosa Perspective
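  • 37. Korean History Seen Through Meaning (뜻으로 본 한국 역사, Ham Seok-heon). Queen of Suffering: a history of overcoming humiliation, division, oppression, loss, and frustration. Korean technology startups, standing on rugged ground in both the global and the domestic ecosystem, are a step of challenge that climbs through steep terrain of suffering and, at the same time, a strong will and hope to stand tall in the end despite countless failures.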
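  • 38. Technology Startups Seen Through Meaning. Ssial (씨알, seed) for new creation. Historical meaning of ssial: the word ssial means min (the people); it is the seed through which we free ourselves from historical evil and earn for ourselves the qualification for new creation. Ecosystem meaning of technology startups: the seed that, grounded in intelligence (People + AI), solves fundamental problems to free our ecosystem from its existing inertia and earns for itself the qualification to create innovative business models. Keywords: autonomy, fundamentality, purity, vitality, relationality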
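  • 39. Final Word: ecosystem. We should do deep research on the local and global ecosystems. For a technology company, an actively cooperative relationship with partners who need its technology is essential, and this must be grounded in a deep understanding of the domestic and global ecosystems together with a thorough awareness of itself.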