Metal
girl+sk8er	

osxdev.org
Recap
WWDC 2014
•Swift
•Yosemite
•Metal
I’m so happy that I was too lazy to learn Objective-C
Maybe, maybe not
Game Industry Trend
•C++, OOP, Design Pattern, TDD
•C, FP / PP / DOP, Fast Iteration, Immutability
Maybe, maybe not
Game Industry Trend
•C++ / Objective-C, OOP, Design Pattern, TDD
•C / Swift, FP / PP / DOP, Fast Iteration, Immutability
Seen season one before?
Explaining Metal
Season one, BAAM!
Explaining Metal
A word from the boss
http://www.bloter.net/archives/195819
This talk
•No API details
•No code (of my own)
•No demo
CPU vs GPU
[Figure: CPU and GPU block diagrams, with blocks labeled Control, Cache, ALU, and DRAM]
[…] or instruction stream sharing. While the programming model permits each shader invocation to follow a unique stream of control, in practice execution on nearby stream elements often results in the same dynamic control-flow decisions. As a result, multiple shader invocations can likely share an instruction stream. Although GPUs must accommodate situations where this is not the case, instruction stream sharing across multiple shader invocations is a key optimization in the design of GPU processing cores and is accounted for in algorithms for pipeline scheduling.
Even higher performance is possible by populating each core with multiple floating-point ALUs. This is done efficiently with SIMD processing, which uses each ALU to perform the same operation on a different piece of data. The most common implementation of SIMD processing is via explicit short-vector instructions, similar to those provided by the x86 SSE or PowerPC AltiVec ISA extensions. These extensions provide a SIMD width of four, with instructions that control the operation of four ALUs. Alternative implementations, such as NVIDIA’s 8-series architecture, perform SIMD execution by implicitly sharing […]
TABLE 1
Type  Processor            Cores/Chip  ALUs/Core [3]  SIMD width  MaxT [4]
GPUs  AMD Radeon HD 2900   4           80             64          48
GPUs  NVIDIA GeForce 8800  16          8              32          96
CPUs  Intel Core 2 Quad [1]  4         8              4           1
CPUs  STI Cell BE [2]      8           4              4           1
CPUs  Sun UltraSPARC T2    8           1              1           4
[1] SSE processing only, does not account for x86 FPU.
[2] Stream processing (SPE) cores only, does not account for PPU cores.
[3] 32-bit, floating point (all ALUs are multiply-add except the Intel Core 2 Quad)
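The "SIMD width of four" idea from the excerpt can be sketched in plain C. This is only an illustration: the `vec4` type and the `vec4_madd` and `saxpy_simd4` functions are hypothetical names, and ordinary loops stand in for real SSE or AltiVec instructions. The point is that one multiply-add "instruction" drives four lanes at once.

```c
#include <assert.h>
#include <stddef.h>

#define SIMD_WIDTH 4  /* the "SIMD width of four" from the text */

/* Hypothetical 4-lane vector; a real CPU would hold this in one register. */
typedef struct { float lane[SIMD_WIDTH]; } vec4;

/* One "instruction": the same multiply-add applied to all four lanes. */
static vec4 vec4_madd(vec4 a, vec4 b, vec4 c)
{
    vec4 r;
    for (size_t i = 0; i < SIMD_WIDTH; ++i)
        r.lane[i] = a.lane[i] * b.lane[i] + c.lane[i];
    return r;
}

/* y[i] = a * x[i] + y[i], four elements per step.
   Assumption of this sketch: n is a multiple of SIMD_WIDTH. */
static void saxpy_simd4(float a, const float *x, float *y, size_t n)
{
    assert(n % SIMD_WIDTH == 0);
    for (size_t i = 0; i < n; i += SIMD_WIDTH) {
        vec4 va, vx, vy;
        for (size_t k = 0; k < SIMD_WIDTH; ++k) {
            va.lane[k] = a;          /* broadcast the scalar to every lane */
            vx.lane[k] = x[i + k];
            vy.lane[k] = y[i + k];
        }
        vec4 r = vec4_madd(va, vx, vy);
        for (size_t k = 0; k < SIMD_WIDTH; ++k)
            y[i + k] = r.lane[k];
    }
}
```

Calling `saxpy_simd4(2.0f, x, y, 8)` on eight elements takes two vector steps instead of eight scalar ones, which is the saving the table's "SIMD width" column describes.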
Apple A7
http://www.anandtech.com/show/8116/some-thoughts-on-apples-metal-api
Why should we use a driver?
•GPU runs asynchronously
•Different address space
•Different ISA
•Display is updated frame by frame
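Painting a picture
•Lay out a sheet of drawing paper
•(Think about what to paint)
•Choose brushes and paints
•Paint the picture with the brush
•(Crumple it up and throw it away, or hang it up)
•Lay out a new sheet of paper
Painting a picture / Graphics App.
•Lay out a sheet of drawing paper / Framebuffer setup
•(Think about what to paint) / Data setup
•Choose brushes and paints / State setup
•Paint the picture with the brush / Draw call
•(Crumple it up and throw it away, or hang it up) / Update a frame
•Lay out a new sheet of paper / Framebuffer clear
The graphics driver provides the API for every step of this process
Layered structure of a graphics driver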
API Interface
State Management
Command Queue Management
I/O Controller
Shader Compiler
Why is it expensive?
What a graphics driver does
•State validation
■ Confirming API usage is valid
■ Encoding API state to hardware state
•Shader compilation
■ Run-time generation of shader machine code
■ Interactions between state and shaders
•Sending work to GPU
■ Managing resource residency
■ Batching commands
OpenGL
State validation
void glTexImage2D(GLenum target,
                  GLint level,
                  GLint internalFormat,
                  GLsizei width,
                  GLsizei height,
                  GLint border,
                  GLenum format,
                  GLenum type,
                  const GLvoid *data);
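To make "state validation" concrete, here is a minimal sketch of the kind of argument checking a driver has to repeat on every call to a function shaped like `glTexImage2D`. Everything in it is an assumption for illustration: the enums, `MAX_TEXTURE_SIZE`, and `validate_tex_image_2d` are hypothetical stand-ins, not real OpenGL types, and the rules are a simplified subset in the style of OpenGL ES 2.0.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins; not real OpenGL types or constants. */
typedef enum { FMT_RGB, FMT_RGBA, FMT_LUMINANCE } pixel_format;
typedef enum { TEX_OK, TEX_INVALID_VALUE, TEX_INVALID_OPERATION } tex_status;

#define MAX_TEXTURE_SIZE 4096   /* assumed device limit */

static int validation_calls = 0;  /* how many times the checks have run */

/* Runs on every upload: each argument is checked before any work is queued. */
static tex_status validate_tex_image_2d(int level, pixel_format internal_format,
                                        int width, int height, int border,
                                        pixel_format format)
{
    ++validation_calls;
    if (level < 0 || border != 0)
        return TEX_INVALID_VALUE;
    if (width < 0 || height < 0 ||
        width > MAX_TEXTURE_SIZE || height > MAX_TEXTURE_SIZE)
        return TEX_INVALID_VALUE;
    if (internal_format != format)   /* ES 2.0-style rule: the two must match */
        return TEX_INVALID_OPERATION;
    return TEX_OK;
}
```

With nine loosely typed parameters, every call pays for these branches again, even when the application passes the same valid combination frame after frame.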
Are you kidding?
Shader Compilation
•No standard for pre-built shader
•No standard for shader binary format
int Init(ESContext *esContext)
{
    UserData *userData = esContext->userData;

    /* Vertex shader, shipped as GLSL source text and compiled at run time */
    GLbyte vShaderStr[] =
        "attribute vec4 vPosition;   \n"
        "void main()                 \n"
        "{                           \n"
        "   gl_Position = vPosition; \n"
        "}                           \n";

    /* Fragment shader: paints every fragment solid red */
    GLbyte fShaderStr[] =
        "precision mediump float;                   \n"
        "void main()                                \n"
        "{                                          \n"
        "  gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0); \n"
        "}                                          \n";

    GLuint vertexShader;
    GLuint fragmentShader;
    GLuint programObject;
    GLint linked;
    /* ... */
Shading (陰影)
Shader
•A shader paints objects darker
courtesy of 西川善司
Copy-paste
Sending work to GPU
•Batching commands and committing
•Transferring data and texture
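"Batching commands and committing" can be sketched as a small queue that records commands cheaply on the CPU and hands them to the GPU in one go. This is a toy model under stated assumptions: `command`, `command_queue`, `BATCH_CAPACITY`, and the three functions are hypothetical names, and the "GPU" is just a counter.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_CAPACITY 64   /* assumed batch size for this sketch */

typedef struct { int draw_id; } command;

typedef struct {
    command pending[BATCH_CAPACITY];
    size_t  count;        /* commands recorded but not yet sent */
    size_t  submissions;  /* number of (expensive) hand-offs to the GPU */
    size_t  executed;     /* commands the pretend GPU has received */
} command_queue;

static void queue_init(command_queue *q)
{
    q->count = 0;
    q->submissions = 0;
    q->executed = 0;
}

/* Hand the whole batch over in one submission. */
static void queue_commit(command_queue *q)
{
    if (q->count == 0)
        return;
    q->executed += q->count;
    q->submissions += 1;
    q->count = 0;
}

/* Recording is cheap; a submission happens only when the batch is full. */
static void queue_record(command_queue *q, command c)
{
    if (q->count == BATCH_CAPACITY)
        queue_commit(q);
    q->pending[q->count++] = c;
}
```

Recording 100 commands and then committing results in two submissions rather than 100, which is where the driver saves per-submission overhead.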
Design target
Metal
•Low CPU overhead
•More predictable performance
•Better programmability
Key ideas
Metal
•Create and validate state up-front
•Shaders can be compiled offline
•Enable versatile multi-threading
•Shared memory for CPU & GPU
•Handle synchronisation explicitly
•Tile-based deferred rendering
•C++11-based shading language
•No legacy baggage
•Compute shader
But A7 only. What the x
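The first key idea, "create and validate state up-front", can be contrasted with per-draw validation in a few lines of C. This is a sketch under assumptions, not Metal's actual API: `state_desc`, `pipeline_state`, and the functions below are hypothetical, and the validity rule (sample count of 1, 2, or 4) is invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical render state description. */
typedef struct { bool blend; bool depth_test; int sample_count; } state_desc;

static int validations = 0;   /* counts how often the checks run */

/* Invented rule for this sketch: only 1, 2, or 4 samples are valid. */
static bool validate(const state_desc *d)
{
    ++validations;
    return d->sample_count == 1 || d->sample_count == 2 || d->sample_count == 4;
}

/* Style A: the state is re-checked on every draw call. */
static bool draw_validating(const state_desc *d)
{
    return validate(d);
}

/* Style B: an immutable object is built and validated once. */
typedef struct { state_desc desc; bool valid; } pipeline_state;

static pipeline_state pipeline_create(state_desc d)
{
    pipeline_state p;
    p.desc = d;
    p.valid = validate(&d);   /* the only validation this state ever gets */
    return p;
}

/* Draws reuse the stored result; no checks on the hot path. */
static bool draw_prevalidated(const pipeline_state *p)
{
    return p->valid;
}
```

Issuing 1,000 draws in style A runs the checks 1,000 times, while style B runs them once at creation, which is the CPU overhead the slide is pointing at.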
Multi-threading
Metal vs OpenGL ES
Code comparison
Low CPU overhead enables
So what?
•more draw calls
•more objects
•better physics
•better AI
•more complex logic
•lower battery usage
Use an engine or forget it
How do I start?
•Unity 5 (next year) - free / $4,500
•Unreal 4 (maybe this year) - $19/month
•Cocos2D - free
•Xcode template
Proprietary API
•Apple is a promoter of Khronos Group
•OpenCL story
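•Now that the playing field has grown, kicking away the ladder?
■ But Google is no fool (Expansion Pack)
You can get by without knowing it
Conclusion
•Low CPU overhead
•Can do something more
•A7 only (either they can't, or they can't be bothered)
•Game-changer? Maybe, maybe not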
