5. Objectives
Automation
Service life-cycle management, for example changes,
monitoring, fault handling, and so on.
Elastic resource utilization.
Standardization
Process
Instance standards
System environment, runtime, framework
Unification
Integrate third-party services, for example DB, cache, log, FS,
and so on.
Integration with other system platforms
TRANSLATED VERSION
8. 2. Practice and Reform (Part 1)
Java, based on CF 1.0
9. Java Apps
• Number of product categories: > 100
• Apps: > 200
• Instances: > 2000
• Average memory per instance: 10 GB
• Average total daily PV: > 1 billion
• App developers and testers: > 700
• Tomcat 5/6/7, JDK 1.5/1.6, standalone
10. Implementation and Preparation
• Modifications for running on CentOS
ü Deploy each CF component independently
⁺ Analyzed BOSH and Chef; implemented on physical machines
ü OS environment initialization
⁺ Replaced apt-get with yum
ü Ported Ubuntu commands to CentOS
⁺ DEA (v1.0): agent.rb, secure.rb
yum install -y make gcc gcc-c++ kernel-devel.x86_64 openssl-devel.x86_64 \
    libxml2.x86_64 libxml2-devel.x86_64 libxslt.x86_64 libxslt-devel.x86_64 \
    git.x86_64 sqlite.x86_64 ruby-sqlite3.x86_64 sqlite-devel.x86_64 \
    unzip.x86_64 zip.x86_64 ruby-devel.x86_64 ruby-mysql.x86_64 \
    mysql-devel.x86_64 curl-devel.x86_64 postgresql-libs.x86_64 \
    postgresql-devel.x86_64 zlib-devel.x86_64 readline-devel.x86_64 \
    ImageMagick.x86_64 ImageMagick-devel.x86_64 php-magickwand.x86_64
11. Cluster capacity assessment
• Instance count and NATS capacity assessment
ü The number of instances hosted by a single DEA (< 100) has little effect on the load
on the NATS-Server
ü By a conservative estimate, a single NATS-Server can host 330 DEAs, each running
5 to 30 instances
ü Multiple NATS-Servers can be added to scale out
[Figure: delay (ms) versus number of DEAs (10 ~ 340) and number of instances per DEA (5 ~ 30); critical line at 330 DEAs]
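As a rough worked example of the figures above, the range of instances one NATS-Server could cover is the DEA ceiling multiplied by the instances per DEA. This is only a sketch: the function name is hypothetical, and the 330-DEA ceiling is the conservative estimate quoted on the slide.

```python
DEAS_PER_NATS = 330          # conservative estimate quoted on the slide
INSTANCES_PER_DEA = (5, 30)  # range of instances per DEA from the slide


def instance_capacity(deas=DEAS_PER_NATS, per_dea=INSTANCES_PER_DEA):
    """Return (min, max) instances one NATS-Server could cover."""
    low, high = per_dea
    return deas * low, deas * high


print(instance_capacity())   # (1650, 9900)
```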
12. In-cluster component redundancy and LB design
• NATS
ü Cluster: multiple NATS servers with synchronized heartbeats
ü The client caches messages; if the network is cut, it keeps
trying to reconnect
ü Load balancing across multiple NATS servers (client > 0.5.beta.6)
[Diagram: NATS-Client (caching messages) connects to NATS-Server1 or NATS-Server2, chosen from a random list]
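The client-side behaviour described above (cache while disconnected, pick a server from a random list, flush after reconnecting) can be sketched with a toy client. This is not the NATS client API: the class, the `connect` callback, and the message names are hypothetical, and the sketch does not model detecting a dropped connection.

```python
import random
from collections import deque


class BufferingClient:
    """Toy client: buffers messages while disconnected, then flushes."""

    def __init__(self, servers, connect):
        self.servers = list(servers)
        self.connect = connect      # hypothetical callable(server) -> bool
        self.current = None
        self.pending = deque()
        self.delivered = []

    def reconnect(self):
        # Try servers in random order so clients spread across the cluster.
        for server in random.sample(self.servers, len(self.servers)):
            if self.connect(server):
                self.current = server
                self._flush()
                return True
        self.current = None
        return False

    def publish(self, message):
        if self.current is None and not self.reconnect():
            self.pending.append(message)   # cache until a server is back
            return
        self.delivered.append((self.current, message))

    def _flush(self):
        # Send cached messages in their original order.
        while self.pending:
            self.delivered.append((self.current, self.pending.popleft()))


up = set()
client = BufferingClient(["NATS-Server1", "NATS-Server2"], lambda s: s in up)
client.publish("heartbeat-1")      # no server reachable: message is cached
up.add("NATS-Server2")
client.publish("heartbeat-2")      # reconnects, flushes the cache, then sends
print(client.delivered)
```

Both messages end up delivered through NATS-Server2, with the cached one first.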
13. Multi-cluster redundancy design
• Multiple independent, logically separate clusters
ü First-layer switch: modify the DNS A record; all domain names that CNAME to this A
record switch to a different cluster together
ü Second-layer switch: modify the “interface layer” (at the application layer it can be
thought of simply as an Nginx reverse proxy)
ü Ensure the (stateless) apps have enough capacity, or can expand quickly, to prevent overload when
traffic switches back
[Diagram: two clusters, each with Baidu GateWay, Front End, Router, and app1; the formal domain names CNAME to one shared A record]
www.baidu.com CNAME www.a.shifen.com.
www.baidu.cn CNAME www.a.shifen.com.
www.a.shifen.com. A 119.75.218.77
www.a.shifen.com. A 119.75.217.56
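The first-layer switch can be illustrated with a toy resolver over the records above. It is a minimal sketch rather than a real DNS implementation: the `resolve` function is hypothetical, and `192.0.2.10` is a documentation-range address standing in for a second cluster.

```python
def resolve(name, cnames, a_records):
    """Follow CNAMEs until an A record is found; return its addresses."""
    seen = set()
    while name in cnames:
        if name in seen:
            raise ValueError("CNAME loop at " + name)
        seen.add(name)
        name = cnames[name]
    return a_records.get(name, [])


cnames = {
    "www.baidu.com": "www.a.shifen.com.",
    "www.baidu.cn": "www.a.shifen.com.",
}
a_records = {"www.a.shifen.com.": ["119.75.218.77", "119.75.217.56"]}

# First-layer switch: change only the shared A record.
a_records["www.a.shifen.com."] = ["192.0.2.10"]   # hypothetical second cluster

print(resolve("www.baidu.com", cnames, a_records))   # ['192.0.2.10']
print(resolve("www.baidu.cn", cnames, a_records))    # ['192.0.2.10']
```

Because both formal names CNAME to the same record, a single change moves them together.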
16. New features
• RPC support: a single instance with multiple
ports
ü One instance can open multiple ports, and an API returns its
IP and ports in real time
ü Integration with the “name service”: the dynamic IP/port-to-name
mapping is kept in sync
ü The RPC caller connects to the instance directly by name
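The name-to-endpoint mapping described above can be sketched as a small in-memory registry. The class, method names, service name, and addresses are all hypothetical; the actual name service and its API are not described in the slides.

```python
class NameService:
    """Toy registry mapping a service name to its live ip:port endpoints."""

    def __init__(self):
        self._endpoints = {}

    def sync(self, name, endpoints):
        # Called whenever the platform reports the instance's current ports.
        self._endpoints[name] = list(endpoints)

    def lookup(self, name):
        # The RPC caller asks by name and connects to what comes back.
        return list(self._endpoints.get(name, []))


ns = NameService()
ns.sync("app1.rpc", ["10.0.0.5:61001", "10.0.0.5:61002"])
ns.sync("app1.rpc", ["10.0.0.9:61010"])   # instance moved: mapping replaced
print(ns.lookup("app1.rpc"))              # ['10.0.0.9:61010']
```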
17. DEA server
RPC support: single instance with multiple ports
[Diagram: the DEA server hosts Instance01:port and Instance02:port; an API bridge passes each ip:port to the NS server, which holds them as TXT records; the RPC caller uses an NS client to resolve a domain to its ip:port entries]
ip_local_port_range: 10000 ~ 60000
Port pool (there is a freeze period after allocation): 61000 ~ 65000
18. New features
• Support JMX
ü An API returns the IP and JConsole port in real time, so that
JMX data can be collected in real time
20. New features
• Health monitoring enhancements
ü Layer 7 (application-layer) checks
ü File handle count checks
21. DEA Server
Enhancement to health monitor
[Diagram: on the DEA server, agent.rb checks each instance's HTTP availability and file handles in addition to CPU, MEM, DISK, ..., and reports to the Health Manager]
22. DEA (v1.0) logic enhancements
• Port management
ü Description
⁺ A single DEA assigns ports to multiple instances and starts them in parallel with no
critical section, so instances can compete for the same port
ü Solution
⁺ Follow the DEA (v2.0) logic (note: this is DEA_NG, which is not compatible with CF 1.0)
⁺ Set ip_local_port_range to 10000~61000 as the dynamic port range
⁺ Reserve 61001~65000 for ports assigned by DEA scheduling
⁺ For each assigned port, keep a “[release time, port number]” record
⁺ Delaying the release of a port resolves the port competition
ü Note
⁺ CF 2.0 resolves this problem with the same method.
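The delayed-release idea above can be sketched as a port pool that keeps a `[release time, port]` record and only reuses a port once its freeze period has passed. This is an illustration, not the DEA code: the class name, the 60-second freeze, and the explicit `now` argument are assumptions, and a real implementation would also need locking for parallel starts.

```python
import heapq


class PortPool:
    """Hands out ports and keeps released ports frozen for a while."""

    def __init__(self, first=61001, last=65000, freeze_seconds=60):
        self.free = list(range(first, last + 1))
        heapq.heapify(self.free)
        self.frozen = []                 # heap of [release_time, port]
        self.freeze_seconds = freeze_seconds

    def _thaw(self, now):
        # Move ports whose freeze period has ended back to the free heap.
        while self.frozen and self.frozen[0][0] + self.freeze_seconds <= now:
            _, port = heapq.heappop(self.frozen)
            heapq.heappush(self.free, port)

    def allocate(self, now):
        self._thaw(now)
        if not self.free:
            raise RuntimeError("no free ports")
        return heapq.heappop(self.free)

    def release(self, port, now):
        heapq.heappush(self.frozen, [now, port])


pool = PortPool(first=61001, last=61002, freeze_seconds=60)
a = pool.allocate(now=0)      # 61001
pool.release(a, now=10)
b = pool.allocate(now=20)     # 61002, because 61001 is still frozen
pool.release(b, now=25)
c = pool.allocate(now=70)     # 61001 again: released at 10, thawed at 70
print(a, b, c)                # 61001 61002 61001
```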
23. DEA (v1.0) logic enhancements
• Instance resource information management
ü Description
⁺ The du command takes a long time to calculate disk usage, so the results of the
commands that follow it are inconsistent
⁺ The CPU utilization calculation does not take the number of cores into account
ü Solution
⁺ Reorder the related commands
⁺ Divide the calculated CPU utilization by the number of
cores
ü Notes
⁺ CF 2.0 has resolved this problem.
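The core-count correction above amounts to one extra division. A minimal sketch, assuming CPU time and elapsed time are both measured in seconds over the same sampling window (the function name and the sample numbers are hypothetical):

```python
def cpu_percent(cpu_seconds_delta, wall_seconds_delta, cores):
    """CPU use as a share of the whole machine, from 0 to 100."""
    if wall_seconds_delta <= 0 or cores <= 0:
        raise ValueError("elapsed time and core count must be positive")
    return 100.0 * cpu_seconds_delta / (wall_seconds_delta * cores)


# A process that used 8 CPU-seconds in 2 wall-clock seconds on 8 cores.
print(cpu_percent(8.0, 2.0, 1))   # 400.0 without the core correction
print(cpu_percent(8.0, 2.0, 8))   # 50.0 once divided by the core count
```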
24. New features
(Integration with peripheral systems)
• File persistence
ü Use MFS (Moose File System)
ü The DEA deploys MFS-Client and mounts /mfs/path for instances to use
ü The MFS service provides an HTTP interface for retrieving data
• URL-based routing to distinguish apps
ü foo.baidu.com/app1 → app1.foo.baidu.com
ü foo.baidu.com/app2 → app2.foo.baidu.com
• Monitoring integration
ü Across the app life cycle, call the external monitoring system’s API so that
monitoring items are updated automatically
• SDK
ü Automatic release (wraps vmc)
ü File viewing
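The URL-based routing rule above (first path segment becomes a host prefix) can be sketched as a pure function. The function name is hypothetical, and stripping the app segment from the forwarded path is an assumption; the slides only show the host mapping.

```python
def route_host(host, path):
    """Map host + first path segment to a per-app host name."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        return host, "/"
    app, rest = segments[0], segments[1:]
    # Assumption: the app segment is removed from the forwarded path.
    return app + "." + host, "/" + "/".join(rest)


print(route_host("foo.baidu.com", "/app1"))         # ('app1.foo.baidu.com', '/')
print(route_host("foo.baidu.com", "/app2/index"))   # ('app2.foo.baidu.com', '/index')
```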
25. Summary of key reform points (CF v1.0)
• CentOS-related modifications
• NATS-Cluster usage; NATS-Client retry and caching
• RPC support: single instance with multiple ports
• Dynamic JMX and JConsole support
• Enhanced health monitoring
• Port management
• Instance resource information management
• Peripheral components: file persistence, monitoring integration,
URI routing, SDK
26. 2. Practice and Reform (Part 2)
C/C++, based on CF 2.0
27. Several key problems of C/C++ Apps
• Container runtime and resource isolation
ü Kernel/GNU
ü Resource isolation
ü Snapshots, core dumps
• Single instance, multiple processes
ü Health monitoring
ü Process execution order
ü Communication within an instance and among processes
ü Multiple ports
ü Keeping multiple instances homogeneous
28. Several key problems of C/C++ Apps
• Large instances
ü Large number of instances (100,000)
ü Large data volume (2 TB per instance)
ü High memory usage (100 GB per instance)
ü Long startup time (30 minutes)
ü Heavy traffic (200 million total daily PV per instance)
ü Avoiding insufficient resources when an instance drifts
• App communication
ü Network-layer communication: authorization and flow control
ü Output files must be retrievable from outside
ü Input files must be pushed in from outside
ü RPC uses non-HTTP protocols that carry no PATH information, so requests cannot be routed
29. Instance OS-level
environment preparation
• Container runtime environment
ü The kernel is the same as the host machine’s
ü Build the container’s file system environment
warden/warden/root/linux/rootfs/setup.sh
# Hand over to the CentOS-specific setup script when the host is CentOS.
if grep -q -i centos /etc/issue
then
  exec "$(dirname "$0")/centos.sh" "$@"
fi
30. Relationship between container and host machine
Warden: networking (bridge / NAT / firewall / flow control)
DEA
Each container (two are shown on the host) has:
• its own process tree under init
• read-only mounts of usr/, lib/, etc/
• a read-write mount of xxx/
• a network interface (subnet)
• cgroup limits for CPU / MEM
• its own namespace
31. Package management
• Buildpack API
ü Detect: check
ü Compile: environment preparation
⁺ Directory structure
⁺ Program files and supporting programs
⁺ Startup script, which ensures the startup order of the processes …
⁺ Monitor script, which runs periodically and checks the health of the whole instance
ü Release: information to publish
ü Procfile: parameter passing (e.g. port)
ü .profile.d: environment variables
32. Health monitoring enhancement points
• Custom monitor scripts
ü A custom monitor script is published together with the instance and periodically
updates the content of stat_file
ü The DEA checks stat_file periodically
[Diagram: inside the instance, monitor.sh checks process-1 and process-2 and updates stat_file; the DEA and HM sit outside the instance]
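The DEA-side check of stat_file could look roughly like the sketch below. The slides do not specify the file's format, so the `ok` content, the freshness window, and the function name are all assumptions made for illustration.

```python
import os
import tempfile
import time


def instance_healthy(stat_file, max_age_seconds, now=None):
    """Healthy if stat_file exists, says 'ok', and was updated recently."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(stat_file)
        with open(stat_file) as handle:
            status = handle.read().strip()
    except OSError:
        return False                     # missing or unreadable file
    return status == "ok" and age <= max_age_seconds


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "stat_file")
    with open(path, "w") as handle:
        handle.write("ok\n")             # what monitor.sh is assumed to write
    fresh = instance_healthy(path, max_age_seconds=30)
    stale = instance_healthy(path, max_age_seconds=30, now=time.time() + 120)
    missing = instance_healthy(os.path.join(workdir, "absent"), 30)
print(fresh, stale, missing)             # True False False
```

Checking the modification time as well as the content means a monitor script that has stopped running is also treated as unhealthy.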
33. App reforms
• RPC: support the NS client
ü A dynamic configuration file replaces routing
ü Port management with a freeze time
• Input/output files
ü Input files must be actively fetched from outside
ü Output files are pushed to an intermediate store (e.g. cloud storage) or served through an NS-based service
• Multi-process management and startup scripts
ü Control the startup order of multiple processes
ü Process control
• File persistence
ü Remote logging
ü Use cloud storage
35. Reform summary (CF v2.0)
• CentOS-related modifications
• Container environment customization
• Buildpack customization
• RPC support: single instance with multiple ports
• Enhanced health monitoring
• Peripheral components: file persistence, monitoring integration, URI routing, SDK
37. Working Process Description
Review
• Standard
• Capacity
• SLA
Access
• Org relationship
• Name info
• Operation info
Process approval
• Authorization application
• Name application
• Release operation
Release update
• Pre-release
• Gray-scale release
• Rollback
Failure handling
• Availability
• Security
• Issue management
38. Standard and Capacity Example
• Standard information collection
ü App name and contact people (R&D, QA, operations,
the responsible manager, and so on)
ü Runtime is decoupled from the container version
ü Stateless, RPC, URI Route
ü Dynamic and static files are separated
ü File persistence
• Capacity information collection
ü PV, QPS
ü Per-instance CPU, memory, disk, bandwidth, and restart time
ü Number of instances
39. SLA examples
• Service objects
ü Java applications (“apps” for short below)
ü Apps that conform to the standard
• Service hours
ü 24×365, all year round
• Communication channels
ü Email, phone, contact person information
• Stability indicators
ü Core components: availability > 99.99% (monthly), MTTR < 20 minutes,
MTBF > 5 days
ü Control services: availability > 99.95% (yearly)
ü The platform itself will not degrade an app’s own SLA
ü Note: problems in the app itself are outside the scope of the SLA, for example
bugs, capacity forecast errors, and failures of external systems (e.g. DB, cache)
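To make the availability targets above concrete, they can be converted into a downtime budget. This is plain arithmetic, assuming a 30-day month and a 365-day year; the function name is hypothetical.

```python
def downtime_budget_minutes(availability_percent, days):
    """Maximum downtime allowed over the period for a given availability."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - availability_percent) / 100.0


monthly = downtime_budget_minutes(99.99, days=30)    # core components
yearly = downtime_budget_minutes(99.95, days=365)    # control services
print(round(monthly, 2), round(yearly, 1))
```

Under these assumptions the budgets come to about 4.32 minutes per month for core components and about 262.8 minutes per year for control services.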
40. Organization, Layer
• Product line(Org)
• Module(Space)
• Group(APP)
• Version (APP-*)
[Diagram: Product line-1 (Org) contains Module-1 (Space), which contains Group-1 (A) and Group-2 (B). Group-1 has Instance v1 (A-1, APP-1-1) and Instance v2 (A-2, APP-1-2); Group-2 has Instance v1 (B-1, APP-2-1) and Instance v2 (B-2, APP-2-2). Product line-2 and Module-2 are shown alongside. Each dashed frame is one app with multiple instances.]
41. Further encapsulation of CC
Product line (Org): OrgName
Module (Space): OrgName_SpaceName
Module group: OrgName_SpaceName_GroupTag
Module version: OrgName_SpaceName_GroupTag_VersionTag
Instance (unique ID): OrgName_SpaceName_GroupTag_VersionTag_Index
42. GroupTag, VersionTag
• GroupTag
• Distinguishes groups along different dimensions: configuration number, data center, rack …
• VersionTag
• Distinguishes versions of the program, data, configuration files, and so on
• Consists of a four-part version number and a timestamp
• Full instance name, for example:
• Org_Space_GroupA_1-1-1-1-438249600_1
• Org_Space_GroupB_1-1-1-1-438249600_1
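The naming scheme above can be sketched as two small helpers that reproduce the example names. The function names are hypothetical, and the dash-joined layout of the VersionTag is inferred from the examples rather than stated outright.

```python
def version_tag(version, timestamp):
    """Join a four-part version number and a timestamp with dashes."""
    if len(version) != 4:
        raise ValueError("expected a four-part version number")
    return "-".join(str(part) for part in (*version, timestamp))


def instance_name(org, space, group_tag, version, timestamp, index):
    """Build OrgName_SpaceName_GroupTag_VersionTag_Index."""
    return "_".join(
        [org, space, group_tag, version_tag(version, timestamp), str(index)]
    )


print(instance_name("Org", "Space", "GroupA", (1, 1, 1, 1), 438249600, 1))
# Org_Space_GroupA_1-1-1-1-438249600_1
```

Since the underscore is the separator, this sketch only round-trips cleanly if the individual parts contain no underscores themselves.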
43. Examination, approval and release
• Release request form and approval
ü App information (program version, capacity information, related
instructions, and so on)
ü Approval (the responsible managers and the people who need to be informed)
ü Operator and operation time
ü Monitoring information (monitoring and control strategy,
contact people, and so on)
• Carry out the release and add
monitoring
ü Before release, the related approval processes must have passed
ü Operator, program version, MD5, time information, and so on
must match the approval
ü The release proceeds only if everything matches and the
approvals have passed
ü After a successful release, add the monitoring
[Flow: release request form → approval → release app → add monitoring]
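The consistency check between the approved form and the actual release can be sketched as a field-by-field comparison. The field names, the sample values, and the dictionary layout are assumptions for illustration; only the idea of matching operator, version, MD5, and time against the approval comes from the slide.

```python
import hashlib

# Assumed field names for the items the slide says must match.
CHECKED_FIELDS = ("operator", "version", "md5", "release_time")


def package_md5(data):
    """MD5 hex digest of the package bytes."""
    return hashlib.md5(data).hexdigest()


def can_release(approval, request):
    """Allow release only if approved and every checked field matches."""
    if not approval.get("approved"):
        return False
    return all(approval.get(f) == request.get(f) for f in CHECKED_FIELDS)


package = b"app-build-bytes"
approval = {
    "approved": True,
    "operator": "alice",
    "version": "1.1.1.1",
    "md5": package_md5(package),
    "release_time": "2013-06-01T10:00",
}
request = dict(approval, md5=package_md5(b"different-build"))
print(can_release(approval, dict(approval)), can_release(approval, request))
# True False
```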
45. Basic gray-scale release
1. Point one formal domain name at multiple apps at the same time.
2. Adjust the ratio of instance counts across the apps to adjust the
ratio of gray-scale traffic.
[Diagram: app.baidu.com points at app_v1 (instance01, instance02; app_v1.paas.baidu.com), app_v2 (instance01, instance02, instance03), and app_v3 (instance01, instance02)]
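The relationship between instance counts and gray-scale traffic can be sketched as a simple ratio, assuming the router spreads requests evenly over all instances behind the domain name. The function name and the instance counts are hypothetical.

```python
def traffic_share(instance_counts):
    """Expected traffic fraction per app version, assuming even balancing."""
    total = sum(instance_counts.values())
    if total == 0:
        raise ValueError("no instances are running")
    return {app: count / total for app, count in instance_counts.items()}


print(traffic_share({"app_v1": 9, "app_v2": 1}))   # {'app_v1': 0.9, 'app_v2': 0.1}
print(traffic_share({"app_v1": 5, "app_v2": 5}))   # {'app_v1': 0.5, 'app_v2': 0.5}
```

Moving from the first call to the second corresponds to widening the gray-scale release from 10% to 50% of traffic.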
46. “The path of evangelism”:
popularizing the platform
• The medal: who owns the other half?
ü App support
⁺ New services need to follow the PaaS standards and way of thinking
⁺ Old services need R&D to rework them and QA to run regression tests
ü Peripheral support
⁺ DB, cache, storage, interfaces, security, monitoring, and so on
• Make the benefits clear and build a win-win ecosystem
ü Deliver faster, save more resources, and keep things simpler
ü One-stop, all-in-one service; popularize the platform hand in hand
47. Some solutions:
• Give users (app developers) the royal
treatment
ü For important apps, provide dedicated service
ü For important managers, provide complete and timely communication, such
as reports
ü The principle is “capitalism”, rather than “socialism”
• Event “marketing”
ü E.g. the “Struts2 0-day”
⁺ Actively cooperate with R&D and QA on identifying the issue, fixing it, and
rolling out the fix
⁺ Actively report progress and manage the event
⁺ Afterwards, actively drive and take part in the follow-up discussions and decisions,
for example with the security and architecture groups
⁺ The principle is “win-win”, rather than shirking responsibility
49. Reform operation
“NoOps”
PaaS (and IaaS) overall functionality
>= traditional operations work
[Diagram: stack layers (Applications, Data, Runtime, Middleware, O/S, Virtualization, Servers, Storage, Networking) shown against OP (SRE) operations and PaaS (and IaaS)]
50. How to reform: example
• Automatic fault recovery
ü Add a health monitoring mechanism on top of
traditional monitoring
ü Instances restart and “drift” automatically
ü Reduce traditional alarms and manpower
⁺ Alarm only when automatic recovery fails
[Diagram: the monitor tracks the whole instance name_1 and obtains its ip:port through the health monitor API, whether that is the real instance_1 or the instance after drifting]
• “Drift” is normal and does not raise an alarm
• An alarm is needed only when “drift” fails
• Instance monitoring is refined: on every check, the monitor looks the instance up by
name and gets back its current ip:port
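The monitoring rule above (resolve by name on every check, alarm only when recovery fails) can be sketched as follows. The `resolve`, `probe`, and `alarm` callbacks and the addresses are hypothetical stand-ins, and a real monitor would likely retry before alarming.

```python
def check_instance(name, resolve, probe, alarm):
    """Resolve the name to its current ip:port on every check, then probe it.

    Drift only changes what resolve() returns, so it raises no alarm.
    """
    endpoint = resolve(name)
    if endpoint is None or not probe(endpoint):
        alarm(name + ": automatic recovery failed")
        return False
    return True


locations = {"name_1": "10.0.0.5:61001"}
alarms = []
healthy = {"10.0.0.5:61001", "10.0.0.9:61003"}

check_instance("name_1", locations.get, healthy.__contains__, alarms.append)
locations["name_1"] = "10.0.0.9:61003"     # instance drifted to a new host
check_instance("name_1", locations.get, healthy.__contains__, alarms.append)
locations["name_1"] = None                 # drift failed: nothing to resolve
check_instance("name_1", locations.get, healthy.__contains__, alarms.append)
print(alarms)   # ['name_1: automatic recovery failed']
```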
51. How to reform: example
• More agile
ü Let developers forget about servers and think in terms of resources instead
ü Complete configuration management and automatic
deployment
ü Release, pre-release, and rollback are extremely simple and need no
extra complex deployment tools
ü Elastic scaling is extremely simple
ü Use Buildpack to compile in the cloud and run directly
• All-in-one, one-stop experience
ü From submitting the form through release to updating the monitoring, the workflow is
fully automatic
ü Integrate third-party services behind a unified management entry point
53. Future plans
• Giving back to the community
• For private-cloud features, wrap the native components (based on
CF 2.0) wherever possible, then open-source the new components
• Where changes affect the native components, try to get them merged into the master branch
• Write more documentation and tips, and take an active part in community discussion
• Development directions
• Large applications (big instances)
• Intelligent scheduling
• Information security
• Further continuous integration
• UI
54. We are hiring!
@Weiyu Wang(王炜煜)
weibo.com/wwy1640
Thanks