This document discusses using the Adobe Flash Stage3D API to build 3D browser MMOs. It outlines optimization techniques used in the XPEC Flash 3D engine like material sorting, shared buffers, and command buffering to reduce draw calls and improve performance. It also covers challenges in implementing features like particles, skinning, shadows, and post-effects within Flash's limitations and solutions developed. Future work areas discussed include Adobe Texture Format, multi-threading via workers, and cross-compilation technologies.
Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser MMOG
1. Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser MMOG
Daosheng Mu, Lead Programmer
Eric Chang, CTO
XPEC Entertainment Inc.
2. Outline
• Brief of Speakers
• Introduction of Adobe Flash Stage3D API
• XPEC Flash 3D Engine
• Optimization for Flash Program
• Future Works
• Conclusion
• Q & A
3. Brief of Speakers
• Eric Chang
– 19 years of game industry experience
– Cross-platform 3D game engine development
– PC/Console/Web
5. Brief of Speakers
• Daosheng Mu
– 4.5 years of cross-platform 3D game engine development experience
– PC/Console/Web
6. Why Flash?
Native C/C++ vs. Unity vs. Flash

                         Native C/C++   Unity   Flash
Development Difficulty   High           Low     Mid
Ease of Cross Platform   Low            High    High
Performance              High           Mid     Low
Market Popularity        Low            Mid     High (>95%)
10. Stage3D
• Stage3D includes GPU-accelerated 3D APIs
– Z-buffering
– Stencil/Color buffer
– Vertex shaders
– Fragment shaders
– Cube textures
– More…
11. Stage3D
• Pros:
– GPU accelerated API
– Relies on DirectX, OpenGL, OpenGL ES
– Programmable pipeline
• Cons:
– No support of alpha test
– No support of high-precision texture format
12. Stage3D
Resource           Number allowed   Total memory
Vertex buffers     4096             256 MB
Index buffers      4096             128 MB
Programs           4096             16 MB
Textures           4096             128 MB*
Cube textures      4096             256 MB
Draw call limits   32,768           -
*350 MB is the absolute limit for textures; 340 MB is the figure we measured
13. AGAL
• Adobe Graphics Assembly Language
– No support for ‘if-else’ statements
– No support for inline ‘constants’; values must be passed in as shader constants
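Both AGAL limitations can be illustrated with a short sketch. It assumes a `Context3D` has already been created; the class name `AgalConstants`, the register choice (`fc0`), and the 0.5 threshold are illustrative and not taken from the XPEC engine. The shader replaces an if-else with the `sge` (set-if-greater-or-equal) opcode, and the threshold is uploaded from ActionScript because AGAL cannot declare it inline.

```actionscript
package {
    import flash.display3D.Context3D;
    import flash.display3D.Context3DProgramType;

    // Hypothetical helper, for illustration only.
    public class AgalConstants {
        // Fragment shader: keep the interpolated color v0 only where its
        // red channel is >= the threshold stored in fc0.x, else output 0.
        // "sge" yields 1 or 0, so the multiply acts as a branchless if-else.
        public static const THRESHOLD_FS:String =
            "sge ft1, v0.xxxx, fc0.xxxx\n" + // ft1 = (v0.x >= fc0.x) ? 1 : 0
            "mul ft0, v0, ft1\n" +           // zero the color when the test fails
            "mov oc, ft0";                   // write the output color

        // AGAL has no literals, so the threshold travels in constant register fc0.
        public static function upload(context:Context3D):void {
            var values:Vector.<Number> = Vector.<Number>([0.5, 0.0, 0.0, 0.0]);
            context.setProgramConstantsFromVector(
                Context3DProgramType.FRAGMENT, 0, values);
        }
    }
}
```

The shader string still has to be assembled to a `ByteArray` and uploaded to a `Program3D` before use; that step is omitted here.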
15. Model Pipeline
• Action Message Format (AMF):
– Native ByteArray compression
– Native object serialization
[Diagram: 3DS Max → Exporter → Collada → Binary Converter → AMF → Engine Loader → Engine Render]
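The two native features named on this slide map onto `ByteArray.writeObject()` and `ByteArray.compress()`. The following is a minimal sketch of that round trip, not the engine's actual converter; the class name `AmfMeshCodec` and the shape of the mesh object are assumptions.

```actionscript
package {
    import flash.utils.ByteArray;

    // Hypothetical codec, for illustration only.
    public class AmfMeshCodec {
        // Serialize a plain object to AMF, then compress it (zlib by default).
        public static function encode(mesh:Object):ByteArray {
            var bytes:ByteArray = new ByteArray();
            bytes.writeObject(mesh); // native AMF object serialization
            bytes.compress();        // native ByteArray compression
            return bytes;
        }

        // Reverse of encode(). Note: this decompresses the given array in place.
        public static function decode(bytes:ByteArray):Object {
            bytes.uncompress();
            bytes.position = 0;      // read from the start of the AMF stream
            return bytes.readObject();
        }
    }
}
```

For example, `AmfMeshCodec.decode(AmfMeshCodec.encode({indices: [0, 1, 2]}))` returns an object whose `indices` array holds the same three values.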
16. XPEC Flash 3D Engine
• Application: update/render on CPU
• Command buffer: stores graphics API instructions
[Diagram: CPU side (Application → Command buffer → Driver) → GPU]
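One way to picture a command buffer is as a list of recorded draw requests that is sorted and replayed once per frame. The sketch below is a simplified assumption, not the XPEC implementation: `CommandBuffer`, `DrawCommand`, and the two callbacks are hypothetical names, and material ids are assumed to be non-negative integers. Sorting by material lets consecutive draws share state, so the material is set only when it changes.

```actionscript
package {
    // Hypothetical command buffer, for illustration only.
    public class CommandBuffer {
        private var _commands:Vector.<DrawCommand> = new Vector.<DrawCommand>();

        // Called while traversing the scene; nothing touches the GPU yet.
        public function record(materialId:int, geometryId:int):void {
            _commands.push(new DrawCommand(materialId, geometryId));
        }

        // Replays the commands grouped by material.
        // Returns how many times the material actually had to be switched.
        public function flush(setMaterial:Function, drawGeometry:Function):int {
            _commands.sort(compareByMaterial);
            var switches:int = 0;
            var current:int = -1; // assumes real material ids are >= 0
            var n:int = _commands.length;
            for (var i:int = 0; i < n; i++) {
                var cmd:DrawCommand = _commands[i];
                if (cmd.materialId != current) {
                    setMaterial(cmd.materialId);
                    current = cmd.materialId;
                    switches++;
                }
                drawGeometry(cmd.geometryId);
            }
            _commands.length = 0; // reuse the same vector next frame
            return switches;
        }

        private static function compareByMaterial(a:DrawCommand, b:DrawCommand):Number {
            return a.materialId - b.materialId;
        }
    }
}

// File-internal helper class holding one recorded draw request.
class DrawCommand {
    public var materialId:int;
    public var geometryId:int;

    public function DrawCommand(materialId:int, geometryId:int) {
        this.materialId = materialId;
        this.geometryId = geometryId;
    }
}
```

Recording materials 2, 1, 2, 1 and then flushing switches material twice instead of four times.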
17. XPEC Flash 3D Engine: Application
• Object3D
– Material
– Geometry
• Update
– UpdateDeltaTime
– UpdateTransform
• Scene management
– Scene partition
– Frustum culling
• Update
– UpdateHierarchy
• Draw
– SetMaterial
– SetGeometry
• Stage3D
– Set Stage3D APIs
18. Scene Management
• Goal: minimize draw calls as much as possible
• Indoor Scene
– BSP tree
• Outdoor Scene
– Octree/Quad tree
– Cell
– Grid
19. Scene Management: Project C4
• Grid partition
• Object3D: (MinX, MaxX), (MinY, MaxY)
[Figure: grid labeled (0, 0), (2, 2), and (4, 4) along the x and y axes]
20. Scene Management: Project C4
• Frustum: (MinX, MaxX), (MinY, MaxY)
[Figure: the same grid labeled (0, 0), (2, 2), and (4, 4) along the x and y axes, with the frustum marked (1,4),(0,4)]
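With both objects and the frustum reduced to (MinX, MaxX), (MinY, MaxY) cell ranges, culling becomes a pair of interval-overlap tests. The sketch below shows that test under the assumption that ranges are inclusive cell indices; the class and function names are hypothetical.

```actionscript
package {
    // Hypothetical grid-culling helper, for illustration only.
    public class GridCulling {
        // Map a world coordinate to the index of the grid cell containing it.
        public static function toCell(world:Number, cellSize:Number):int {
            return int(Math.floor(world / cellSize));
        }

        // Two inclusive ranges overlap when neither ends before the other starts.
        public static function rangesOverlap(minA:int, maxA:int,
                                             minB:int, maxB:int):Boolean {
            return minA <= maxB && minB <= maxA;
        }

        // An object is a draw candidate only if its cell range overlaps
        // the frustum's cell range on both axes.
        public static function isVisible(
                objMinX:int, objMaxX:int, objMinY:int, objMaxY:int,
                fruMinX:int, fruMaxX:int, fruMinY:int, fruMaxY:int):Boolean {
            return rangesOverlap(objMinX, objMaxX, fruMinX, fruMaxX)
                && rangesOverlap(objMinY, objMaxY, fruMinY, fruMaxY);
        }
    }
}
```

Using the frustum range (1,4),(0,4) from the figure, `GridCulling.isVisible(2, 3, 1, 1, 1, 4, 0, 4)` returns `true`, while an object confined to column 0, `GridCulling.isVisible(0, 0, 2, 2, 1, 4, 0, 4)`, returns `false` and is skipped.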
29. Particle System
• Each particle property is computed on the CPU every frame
– Alpha, Color, LinearForce, Size, Speed, UV
– Facing
30. Particle System
• Index buffer
– Indices will not be changed
• Vertex buffer
– Problem:
• Particle count varies from frame to frame
• Data must be uploaded to the vertex buffer frequently
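The slides that describe the authors' own solution are not part of this transcript. One common way to handle this problem, sketched here as an assumption, is to allocate the buffers once for a maximum particle count, upload the unchanging quad indices a single time, and re-upload only the live vertices each frame. The class name, the 5-float vertex layout, and the attribute slots are illustrative.

```actionscript
package {
    import flash.display3D.Context3D;
    import flash.display3D.Context3DVertexBufferFormat;
    import flash.display3D.IndexBuffer3D;
    import flash.display3D.VertexBuffer3D;

    // Hypothetical particle buffer wrapper, for illustration only.
    public class ParticleBuffers {
        private static const FLOATS_PER_VERTEX:int = 5; // x, y, z, u, v (assumed)
        private var _vertices:VertexBuffer3D;
        private var _indices:IndexBuffer3D;
        private var _maxParticles:int;

        // maxParticles * 4 must stay within the Stage3D per-buffer vertex limit.
        public function ParticleBuffers(context:Context3D, maxParticles:int) {
            _maxParticles = maxParticles;
            var vertexCount:int = maxParticles * 4;
            _vertices = context.createVertexBuffer(vertexCount, FLOATS_PER_VERTEX);
            _indices = context.createIndexBuffer(maxParticles * 6);

            // Fill the whole vertex buffer once so every vertex has defined data.
            _vertices.uploadFromVector(
                new Vector.<Number>(vertexCount * FLOATS_PER_VERTEX), 0, vertexCount);

            // Quad indices never change: build and upload them a single time.
            var data:Vector.<uint> = new Vector.<uint>();
            for (var i:int = 0; i < maxParticles; i++) {
                var v:uint = i * 4;
                data.push(v, v + 1, v + 2, v, v + 2, v + 3);
            }
            _indices.uploadFromVector(data, 0, data.length);
        }

        // vertexData holds liveCount * 4 vertices computed on the CPU this frame.
        // The caller is expected to have set the program and its constants.
        public function draw(context:Context3D, vertexData:Vector.<Number>,
                             liveCount:int):void {
            if (liveCount <= 0) return;
            if (liveCount > _maxParticles) liveCount = _maxParticles;
            _vertices.uploadFromVector(vertexData, 0, liveCount * 4);
            context.setVertexBufferAt(0, _vertices, 0, Context3DVertexBufferFormat.FLOAT_3);
            context.setVertexBufferAt(1, _vertices, 3, Context3DVertexBufferFormat.FLOAT_2);
            context.drawTriangles(_indices, 0, liveCount * 2); // two triangles per quad
        }
    }
}
```

Only the first `liveCount` quads are drawn, so stale data left further along in the buffer is never rendered.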
41. Toon Shading
• Single pass
– Problem: depends on the number of faces
• Two passes
– Scale the vertex position along the vertex normal
– Does not depend on the number of faces
[Figure: v is the view vector, N is the vertex normal, and θ is the angle between them; if θ > threshold, draw the toon color]
42. Toon Shading
• Toon pass
– Enable back face
– Scale vertex position
– Draw color
• General pass
– Enable front face
– Draw material
• Result
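The two passes above could be issued roughly as follows. This is a sketch under several assumptions: vertex buffers and the transform matrix (in `vc0`..`vc3`) are already bound, both `Program3D` objects are already assembled and uploaded, and `vc4.x` is free to hold the outline width. Depending on the mesh winding order, the two `Context3DTriangleFace` constants may need to be swapped.

```actionscript
package {
    import flash.display3D.Context3D;
    import flash.display3D.Context3DProgramType;
    import flash.display3D.Context3DTriangleFace;
    import flash.display3D.IndexBuffer3D;
    import flash.display3D.Program3D;

    // Hypothetical two-pass outline renderer, for illustration only.
    public class ToonTwoPass {
        // Outline vertex shader: va0 = position, va1 = normal,
        // vc0..vc3 = model-view-projection matrix, vc4.x = outline width.
        public static const OUTLINE_VS:String =
            "mul vt0.xyz, va1.xyz, vc4.xxx\n" + // normal * outline width
            "add vt0.xyz, vt0.xyz, va0.xyz\n" + // push the vertex outward
            "mov vt0.w, va0.w\n" +
            "m44 op, vt0, vc0";                 // project to clip space

        // Reused every frame to avoid allocating a new Vector per draw.
        private static const OUTLINE_CONSTANTS:Vector.<Number> =
            Vector.<Number>([0, 0, 0, 0]);

        public static function draw(context:Context3D, outline:Program3D,
                                    material:Program3D, indices:IndexBuffer3D,
                                    outlineWidth:Number):void {
            // Toon pass: cull front faces so only the enlarged back faces show.
            OUTLINE_CONSTANTS[0] = outlineWidth;
            context.setCulling(Context3DTriangleFace.FRONT);
            context.setProgram(outline);
            context.setProgramConstantsFromVector(
                Context3DProgramType.VERTEX, 4, OUTLINE_CONSTANTS);
            context.drawTriangles(indices);

            // General pass: cull back faces and draw the normal material on top.
            context.setCulling(Context3DTriangleFace.BACK);
            context.setProgram(material);
            context.drawTriangles(indices);
        }
    }
}
```

The cost is one extra draw call per toon-shaded object.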
43. Alpha Test
• Problem:
– Stage3D has no alpha test
– Workaround: the “kil” opcode in AGAL
• Performance penalty on mobile devices
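A fragment shader that emulates alpha test with `kil` might look like the sketch below. The register assignments (`v0` for UVs, `fs0` for the texture, `fc0.x` for the cutoff) and the 0.5 cutoff are assumptions for illustration. `kil` discards the fragment when its scalar operand is negative.

```actionscript
package {
    import flash.display3D.Context3D;
    import flash.display3D.Context3DProgramType;

    // Hypothetical alpha-test emulation, for illustration only.
    public class AlphaTestShader {
        public static const FRAGMENT:String =
            "tex ft0, v0, fs0 <2d,linear,repeat>\n" + // sample the texture
            "sub ft1.x, ft0.w, fc0.x\n" +             // alpha minus cutoff
            "kil ft1.x\n" +                           // discard if alpha < cutoff
            "mov oc, ft0";                            // otherwise output the texel

        // The cutoff must be uploaded as a constant; AGAL has no literals.
        public static function uploadCutoff(context:Context3D,
                                            cutoff:Number = 0.5):void {
            context.setProgramConstantsFromVector(
                Context3DProgramType.FRAGMENT, 0,
                Vector.<Number>([cutoff, 0, 0, 0]));
        }
    }
}
```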
44. Alpha Test
• Solution: replace alpha test with alpha blend

Configuration        Render loop time (ms)   Total time (ms)
6600GT alpha test    17~19                   47
6600GT alpha blend   18~19                   65~67
8800GT alpha test    0.16                    37
8800GT alpha blend   0.3                     36

• 304 draw calls
• Alpha-test performance is better on desktop
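Switching a material from alpha test to alpha blend is mostly a render-state change. The sketch below shows the standard Stage3D calls involved; the class name is hypothetical, and disabling depth writes for blended geometry is a common convention rather than something stated in the slides.

```actionscript
package {
    import flash.display3D.Context3D;
    import flash.display3D.Context3DBlendFactor;
    import flash.display3D.Context3DCompareMode;

    // Hypothetical render-state helper, for illustration only.
    public class AlphaBlendState {
        // Standard "source over" blending, used instead of a kil-based shader.
        public static function enable(context:Context3D):void {
            context.setBlendFactors(Context3DBlendFactor.SOURCE_ALPHA,
                                    Context3DBlendFactor.ONE_MINUS_SOURCE_ALPHA);
            // Keep depth testing but stop writing depth for translucent pixels.
            context.setDepthTest(false, Context3DCompareMode.LESS_EQUAL);
        }

        // Restore opaque rendering for the remaining materials.
        public static function disable(context:Context3D):void {
            context.setBlendFactors(Context3DBlendFactor.ONE,
                                    Context3DBlendFactor.ZERO);
            context.setDepthTest(true, Context3DCompareMode.LESS_EQUAL);
        }
    }
}
```

Blended geometry generally needs to be drawn after opaque geometry, and in back-to-front order where overlap matters.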
48. Optimization for Flash Program
• Problem:
– “for each” is slow
• Use a for loop instead
– Memory management
• Recycle manager
• Strengthen garbage collection
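The loop replacement is mechanical. The sketch below puts the two styles side by side on a `Vector.<Number>`; the class and function names are hypothetical, and no timing figures are claimed here.

```actionscript
package {
    // Hypothetical comparison of the two loop styles, for illustration only.
    public class LoopStyles {
        // The pattern the slides recommend replacing.
        public static function sumForEach(values:Vector.<Number>):Number {
            var total:Number = 0;
            for each (var v:Number in values) {
                total += v;
            }
            return total;
        }

        // Indexed for loop with typed counters and the length read once.
        public static function sumFor(values:Vector.<Number>):Number {
            var total:Number = 0;
            var n:int = values.length;
            for (var i:int = 0; i < n; i++) {
                total += values[i];
            }
            return total;
        }
    }
}
```

Both functions return the same sum; only the iteration style differs.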
49. Optimization for Flash Program
• Solution:
– Recycle manager
• Reduces garbage collection load
• Saves object initialization time
• public function recycleObject3D( obj:IObject3D ):void
• public function requestObject3D( classType:int, searchKey:*, renderHandle:int = 0 ):*
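The idea behind a recycle manager can be sketched as a simple object pool. This version is deliberately simplified and does not reproduce the engine's API: it keys idle objects by `Class` rather than by `classType:int` and `searchKey`, and the name `SimpleRecyclePool` is hypothetical.

```actionscript
package {
    import flash.utils.Dictionary;

    // Hypothetical object pool, for illustration only.
    public class SimpleRecyclePool {
        // Maps a Class to an Array of idle instances of that class.
        private var _free:Dictionary = new Dictionary();

        // Hand out an idle instance if one exists; otherwise construct a new one.
        public function request(type:Class):Object {
            var list:Array = _free[type] as Array;
            if (list != null && list.length > 0) {
                return list.pop();
            }
            return new type();
        }

        // Return an instance to the pool instead of leaving it for the GC.
        // The caller should reset the object's state before or after recycling.
        public function recycle(type:Class, obj:Object):void {
            var list:Array = _free[type] as Array;
            if (list == null) {
                list = [];
                _free[type] = list;
            }
            list.push(obj);
        }
    }
}
```

After `pool.recycle(Sprite, s)`, the next `pool.request(Sprite)` returns the same instance `s`, so no allocation or constructor cost is paid.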
50. Optimization for Flash Program
• Solution:
– Strengthen garbage collection
• Avoid inner functions
• Force dereferencing of function pointers
• Dereference attributes in the object destructor
51. • Avoid inner functions
• Force dereferencing of function pointers
[Figure: “Without inner function” vs. “Use inner function”]
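An inner function (closure) used as an event listener captures its enclosing scope and is hard to remove later, which can keep objects reachable. The sketch below shows the alternative: a method listener plus an explicit cleanup routine. AS3 has no real destructors, so `dispose()` stands in for the "object destructor" mentioned above; the class and field names are hypothetical.

```actionscript
package {
    import flash.events.Event;
    import flash.events.IEventDispatcher;

    // Hypothetical listener owner, for illustration only.
    public class LevelWatcher {
        private var _source:IEventDispatcher;
        private var _levelData:Object;

        public function LevelWatcher(source:IEventDispatcher, levelData:Object) {
            _source = source;
            _levelData = levelData;
            // A method reference, not an inner function: it can be removed
            // later with exactly the same reference.
            _source.addEventListener(Event.COMPLETE, onComplete);
        }

        private function onComplete(event:Event):void {
            // React to the event using _levelData here.
        }

        // Explicit cleanup: drop the listener and every held reference
        // so the garbage collector can reclaim this object and its data.
        public function dispose():void {
            if (_source != null) {
                _source.removeEventListener(Event.COMPLETE, onComplete);
            }
            _source = null;
            _levelData = null;
        }
    }
}
```

Calling `dispose()` when a level is unloaded removes the dispatcher's reference to the watcher and the watcher's reference to the level data.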
52. Optimization for Flash Program
• Experiment: before vs. after
– Switching among levels
[Figure: before improvement vs. after improvement]
54. Rapid loading
• Streaming
– Data compression
• PNG: SWF compression: 20%~55%
• Package: ZIP compression: 25%~30%
– Batch loading
• Split resources into several packages
• Download only what is actually needed
56. Rapid loading
Stage                            Elapsed time at 5 Mb/s (sec)   Content loaded
Enter avatar stage               15                             game code, UI
Enter game stage                 6                              game scene
After loading picture finished   12                             scene textures
57. Future Works
• Adobe Texture Format (ATF)
– Support for compressed/mipmapped textures across different GPU chipsets
• FlasCC
– C++ to AS3 compilation
• AS3 Workers
– Multi-thread support
• MovieClip
– Replace with a Stage3D UI framework, e.g. Starling
2011.2 release
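Benefits of building on Flash Player: high market share, a large user base, and a large commercial market.
3D API: cross-platform (browsers, mobile devices), called Stage3D. A single Flash Player can use up to four Stage3D instances, so an application can drive four views at once.
One codebase runs across all browsers. Every application built on Flash Player works in every browser, with little difference in performance.
When making a game with HTML5 WebGL today, the most discussed issue is how performance differs from browser to browser, because each browser handles it with its own implementation.
All Flash code runs inside a single Flash Player, so the performance of a 3D game built on Stage3D does not vary much.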
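Stage3D provides the features a typical 3D API should have:
Alpha test: the render state that a typical desktop 3D API provides is not supported; we have to add extra instructions in the pixel shader.
RGBA: no high-precision texture format is provided; each channel stores only 8 bits. When developing shadow maps, we had to use a specially encoded texture to store depth information.
The lowest common denominator for texture memory is 128 MB, on mobile devices; on PC it is about 340 MB.
Still not enough for an MMOG.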
Assembly -> ByteArray -> Program3D
Branching (if…) cannot be used. Constants must be passed in through shader constants and cannot be declared directly inside the shader.
Pixel Bender 3D is readable and high-level, but AGAL is the better route to good performance.
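Show the difference between PCF and non-PCF shadows.
If Vexel.depth > shadowMapResult, the color is black.
If Vexel.depth < shadowMapResult, the color is white.
Why we chose shadow maps: the terrain has varying elevation, the depth projection is correct, and self-shadowing can be turned off when appropriate.
No impact on overall performance.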
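The notes do not say which encoding the engine uses to store depth in an 8-bit-per-channel texture. One straightforward possibility, shown here purely as an assumption, is to split a 32-bit fixed-point depth across the four RGBA channels. In the engine this arithmetic would run in AGAL; the CPU-side sketch below only illustrates the encoding and the black/white comparison described above. The `bias` parameter is also an assumption, added to show where a depth offset would go.

```actionscript
package {
    // Hypothetical depth packing, for illustration only.
    public class DepthPacking {
        // Split a depth in [0, 1] into four 8-bit channel values (R, G, B, A),
        // most significant byte first.
        public static function pack(depth:Number):Vector.<uint> {
            var clamped:Number = Math.max(0, Math.min(1, depth));
            var fixed:Number = Math.round(clamped * 4294967295); // 32-bit fixed point
            var r:Number = Math.floor(fixed / 16777216);
            fixed -= r * 16777216;
            var g:Number = Math.floor(fixed / 65536);
            fixed -= g * 65536;
            var b:Number = Math.floor(fixed / 256);
            var a:Number = fixed - b * 256;
            return Vector.<uint>([r, g, b, a]);
        }

        // Rebuild the depth from the four channel values.
        public static function unpack(c:Vector.<uint>):Number {
            return (c[0] * 16777216 + c[1] * 65536 + c[2] * 256 + c[3]) / 4294967295;
        }

        // Shadow test: white (lit) when the fragment is not farther from the
        // light than the stored depth, black (shadowed) otherwise.
        public static function isLit(fragmentDepth:Number, storedDepth:Number,
                                     bias:Number = 0.0):Boolean {
            return fragmentDepth <= storedDepth + bias;
        }
    }
}
```

A single 8-bit channel can distinguish only 256 depth levels, which is why spreading the value across all four channels matters for shadow maps.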
The game has a comic style.
Two-pass performance is worse.
Why we chose two passes: our face count is too low.
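Alpha test can be removed; on mobile, use alpha blend instead.
To get our demo running on the iPad, we changed every material that used alpha test to alpha blend, and the frame rate went from 1x to 2x fps.
Alpha blend actually has the heavier impact on desktop.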
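Our engine has some further effects, listed below, that we do not have time to cover in detail today.
Saves the cost of computing real-time lighting and shadows.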
Memory leak: some objects are still considered referenced, so the GC does not reclaim them even though we expect them to be reclaimed, and memory piles up.
Memory leak: inner function, dereference, recycle
Loops: use a for loop in place of for each.