IaaS Cloud Architecture Design

IaaS Cloud Conceptual Design

2012.04.01
Terry.Cho
bwcho75@gmail.com

Table Of Contents

1. Overview

2. Architecture Principles

3. Patterns of Cloud Architecture

4. Infrastructure Domain Model

5. User & Service Domain Model

6. Domain Model Mapping

7. Software architecture

8. User Interface reference scenario

9. Infrastructure Architecture

Overview

• This document defines the architecture, and the basic software and hardware design, needed to build a general-purpose IaaS cloud.

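Design Strategy

• Strategy
  - Provide service features beyond those of Amazon EC2 (multicast, VLAN, fast network access between VMs)
  - Define a domain model suited to serving enterprises, rather than a public cloud model
  - Provide views that match an enterprise's organizational structure or its accounting and settlement structure
  - Design with global rollout in mind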

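Usecase

• The IaaS use-case model is as follows.

[Figure: use-case diagram with the actors CLOUD ADMIN, ACCOUNT ADMIN, INFRA ADMIN and USER. Labels in the figure: resource allocation + infrastructure management; consolidated billing; consolidated monitoring; company or business unit; service unit; service definition and assignment; service component unit; service development and operation; IaaS; Fundamental Service.]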

Architecture Principles

• Fundamental principles that drive cloud architecture design
1. Infinite capacity
2. Continuous availability
3. Predictability
4. Take a service provider's approach to delivering infrastructure
5. Resiliency over redundancy mindset
6. Minimize human involvement
7. Optimize resource usage
8. Incentivize desired resource consumption behavior

1. Infinite Capacity

• From the customer's perspective, the cloud service appears to have infinite capacity
• Objectives
  - Drive a change of thinking in EA, Service Delivery and Operations teams
  - Place high emphasis on capacity planning

2. Continuous Availability

• From the customer's perspective, the cloud service should never exhibit any interruption to service, even if failures occur within the cloud environment
• Objectives
  - Drive a change of thinking in EA, Service Delivery and Operations teams
  - New approach to resiliency/redundancy
  - Place high emphasis on availability planning

3. Predictability

• Remove as much variation from the environment as possible to increase predictability
• Objectives
  - Increased predictability translates to lower costs and higher quality
  - Reduce variations across infrastructure, system management and operations

4. Take a service provider's approach to delivering infrastructure

• SDS should adopt a service provider model, where the provider delivers infrastructure on demand
• Objectives
  - Drive a change of thinking in EA, Service Delivery and Operations teams
    - Think consciously about re-usable, on-demand services as opposed to project-oriented services
    - Provider and consumer have different perspectives and needs
  - New approach to budgeting
    - Blended central and project-based budgets

5. Resiliency over redundancy mindset

• From the provider's perspective, focus on maintaining service availability through resiliency rather than redundancy
• Objectives
  - Reduce redundancy at the infrastructure level, which is highly costly
  - Eliminate the duplicated redundancy that is typical across several layers of the stack
  - Move toward a resilience model, which is more cost effective
  - Push resilience up the stack; designing applications for resilience is less costly than redundant infrastructure

• Availability through redundancy
  - Aim: avoid hardware component failure
  - Redundancy at hardware layers
  - Fewer failures, but greater impact
  - Measured by Mean-Time-Between-Failures
• Availability through resiliency
  - Aim: avoid service failure
  - Automated detection-and-response
  - Resiliency through software fail over
  - More failures, but less impact
  - Measured by Mean-Time-to-Restore-Service

6. Minimize human involvement

• A highly automated environment is required to achieve resiliency
• Objectives
  - Well-defined, mature procedures can be automated
  - Move further up the automation continuum
  - A high-fidelity end-to-end health model is also required for automation
  - Automation is necessary to achieve resiliency

7. Optimize resource usage

• From the provider's perspective, resources should be optimized to maximize utilization and minimize waste
• Objectives
  - Provide the highest ROI by maximizing resource utilization
  - Drive efficiency and reduce cost

8. Incentivize desired resource consumption behavior

• Leverage cost, quality and agility to influence consumer behavior in ways that support the cloud architecture principles
• Objectives
  - Avoid unlimited consumption
  - Get consumers to release resources when they are no longer needed
  - Exposing the cost of the resources allocated to a consumer allows consumers to act responsibly

3. Patterns of Cloud Architecture

• Concepts that support the principles and enable IaaS
1. Homogenization of physical infrastructure
2. Provisioning on demand
3. Cloud Management
4. Consumption based pricing
5. Virtualized Infrastructure
6. Server classification
7. Holistic approach to availability
8. Compute resource decay
9. Elastic Infrastructure
10. Partitioning of shared resource

1. Homogenization of physical infrastructure

• Eliminate hardware variation
  - Reduces the complexity of the environment
  - VMs get a consistent experience from all hosts
  - Simplifies automation
  - Cost savings may be achieved through bulk purchase
• Blockers
  - Might not be realistic for many specific services. Some services require high-performance or unique hardware configurations such as GPUs
  - Hardware vendors replace product models frequently

2. Provisioning on Demand

• Provide agility to the service consumer
  - Infrastructure is provided to the consumer on demand
  - The provisioning process is automated and completes within a few minutes
  - Basic software installation (DBMS, middleware) can also be supported later

3. Cloud Management

• Manage workloads on a pool of compute, network and storage resources
  - Virtualized infrastructure alone is insufficient
    - No concept of fault domain
    - Depends upon redundancy for availability
  - Need to manage a group of VMs which collectively provide a service (e.g. database servers, email servers, J2EE WAS servers)
    - Ensure deployment across fault domains (racks, groups of racks)

4. Consumption based pricing

 Charge service fee based on actual resource usage

5. Virtualized Infrastructure

• Abstraction at multiple levels enables step changes in service design and delivery
  - Virtualize network, storage and compute independently
  - Decouples the consumer from provider resources
  - New approach to resilience, less dependent on redundancy
  - Any virtualized service can run and function identically on any physical server
  - Necessary for achieving resiliency
  - Virtualization alone is not enough for cloud

6. Server classification

• The definition of a service and its non-functional characteristics
  - Limit the number of service classifications, in the same way the number of server specifications should be limited
  - Analyze existing workloads to determine the classifications
  - Split stateful and stateless workloads into separate classifications (stateless costs a lot less)
  - Expose the actual cost of each classification to incentivize consumer behavior

7. Holistic approach to availability

• Availability is achieved through resiliency, redundancy and application design
  - Traditional availability is delivered via redundancy, which requires expensive hardware
  - Availability should be considered across the whole stack: infrastructure, cloud platform, application and data
  - Designing an application to expect and handle failure reduces infrastructure costs
  - Resiliency (Mean-Time-to-Restore-Service) minimizes the need for redundant hardware
  - Even if the number of outages increases, the duration of each outage is very low, maintaining a high-availability experience for the user
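
The last point can be checked with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). The sketch below compares a redundancy-oriented profile (rare failures, slow restore) with a resiliency-oriented one (frequent failures, fast automated restore). All figures are made-up assumptions for illustration, not measurements from this design.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures, for illustration only.
# Redundancy-oriented: one failure every 2000 hours, 4 hours to restore.
redundant = availability(mtbf_hours=2000.0, mttr_hours=4.0)
# Resiliency-oriented: failures ten times as frequent, 3 minutes to restore.
resilient = availability(mtbf_hours=200.0, mttr_hours=0.05)

print(f"redundancy-oriented: {redundant:.5f}")
print(f"resiliency-oriented: {resilient:.5f}")
```

With these assumed numbers the resiliency-oriented profile comes out ahead even though it fails ten times as often, because the restore time shrinks by a much larger factor.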

8. Compute Resource Decay

• Compute failures decrease capacity rather than cause an incident
  - Virtualization allows workloads to be moved around, so VM outages are short-lived or, if the VM can be moved proactively, non-existent
  - The concept depends upon the homogenization and pool-of-resources concepts

9. Elastic Infrastructure

• Ability to expand and contract capacity on demand
  - A deep understanding of the business is required to ensure maximum capacity requirements are maintained efficiently
  - Requires triggers for when to scale out and when to scale back
  - Triggers may be automated or request-driven
  - Scaling down is important to avoid waste
  - Automatic scale out needs a deep technical understanding of the specific application or middleware type
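
An automated scale-out / scale-back trigger can be sketched as a simple threshold rule. The thresholds, the VM limits and the names `ScalePolicy` and `decide_vm_count` are assumptions for illustration; a real trigger would be tuned per application or middleware type, as noted above.

```python
from dataclasses import dataclass


@dataclass
class ScalePolicy:
    scale_out_above: float = 0.75  # average utilization that triggers scale out
    scale_in_below: float = 0.30   # average utilization that triggers scale back
    min_vms: int = 2               # never shrink below this many VMs
    max_vms: int = 10              # never grow beyond this many VMs


def decide_vm_count(current_vms: int, utilizations: list[float], policy: ScalePolicy) -> int:
    """Return the VM count requested by the trigger, moving one VM at a time."""
    if not utilizations:
        return current_vms  # no samples, no decision
    average = sum(utilizations) / len(utilizations)
    if average > policy.scale_out_above:
        return min(current_vms + 1, policy.max_vms)
    if average < policy.scale_in_below:
        return max(current_vms - 1, policy.min_vms)
    return current_vms


policy = ScalePolicy()
print(decide_vm_count(3, [0.90, 0.80, 0.85], policy))  # busy: scale out
print(decide_vm_count(3, [0.10, 0.20, 0.10], policy))  # idle: scale back
```

Stepping one VM at a time and keeping a gap between the two thresholds avoids oscillating between scale out and scale back.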

4. Infrastructure Domain Model

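Data Center

• Definition of the Data Center concept
  - Data Center
    - The hardware resources and staff needed to deliver cloud services, together with the physical building that houses them
    - The size and location of a data center are chosen according to performance factors such as network segment speed, legal regulation of the service, and so on
  - Classification
    This architecture uses a hierarchical data center model. The central center is responsible for centralized control and management, and each regional center delivers the service.
    - RC (Regional Center)
      - Delivers the cloud service
      - Provides center-level OSS and BSS
      - Forwards monitoring and operational information to the MC
    - GC (Global Center)
      - Includes all RC functions
      - Includes the Self Service Portal system for users
      - Provides integrated OSS + BSS based on the information supplied by each RC
※ LC (Local Center): controlled by an RC and managed as an external Zone of that RC. It is used for regulation-related systems such as user information, or for CDN edge nodes, and the architecture can be extended later as needed.

[Figure: hierarchy of one GC, several RCs and their LCs.]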

Data Center

• Integration between Data Centers
  - Integration types
    Integration between data centers falls broadly into two patterns: API calls and data integration.
  - API integration
    - API requests originate only through the Portal interface located in the GC.
    - Each request is routed to the target center through the API Routing layer located in each center.
  - Data integration
    - Monitoring data for operations and business data for billing and similar purposes are stored in each center and collected asynchronously into the GC through the Data Bus.
※ The network between centers uses a dedicated line or VPN-based tunneling to prevent data leakage.

[Figure: Global Center and Regional Center. Labels: Self Service Portal, Cloud Admin Portal, API Routing (Proxy or ESB), Cloud Controller, Cloud Infrastructure, Operational Data (Monitoring History etc), Business Data (Billing etc), <<Global>> Operational Data and Business Data, Data Bus (ETL, CDC etc), API Flow, Data Flow, VPN or Dedicated Network.]

Zone

• Definition of the Zone concept
  - Definition: a Zone is a building block of a Data Center and is a set of hardware server racks.
  - A Zone consists of the following elements.
    Server Rack
    - Contains multiple physical servers for hosting virtual machines. Each rack has a SAN switch for connecting to storage and a network switch for connecting to networks outside the Zone.
    Storage Rack
    - Provides SAN-based shared storage.
    Network Rack
    - Contains the network switch that acts as the backbone for the networks coming out of the racks. This switch is connected to the internet through the Data Center's router.
    - Contains a load balancer and provides load balancing between VMs.
    - Contains the SAN switch that acts as the SAN backbone for the SAN connections coming from each Server Rack.
• Reasons for organizing into Zones
  - Apart from the router for the internet connection, a Zone has its own network and storage functions, so it can represent one independent logical service unit.
  - When a Data Center is split into multiple Zones, a failure in one Zone does not affect the others, which limits failure propagation.
  - Switches and storage are not shared across the whole data center, so the physical scaling limits of that equipment can be overcome.
  - Dedicated Zones can be offered depending on the service type or the customer.
• Notes from the figure
  - Multicast is allowed (only within a VLAN).
  - SAN (10G TCP).
  - Management Network: for central control of the cloud infrastructure within a center, the Management Network is shared by all Zones in the center.

[Figure: a Zone made up of Compute Racks (NW Switch, SAN Switch, Physical Servers), a Network Rack (Load Balancer, NW Switch, SAN Switch), a Storage Rack (SAN Controller, Disk Array) and the Management Network.]

Zone

• Types of Zone
  • Public Zone
    - A Zone for cloud services aimed at general customers
  • Private Zone
    - A Zone that serves a specific company or organization
    - On customer request, provides a separate firewall, or a dedicated network or VPN connection to the customer's on-premise environment
    - Inter zone connection: for Private Zones, a dedicated high-speed network between Zones is provided according to customer requirements, so that Zones can be integrated (in a structure where failures do not propagate)
  • Fault Zone
    - A standby Zone that can host VMs when a Zone fails
    - A Fault Zone must be able to accommodate every Rack Type used in the other Zones
    - Costly
    - Alternative: Fault Rack in Zone

As Data Center capacity grows, the number of Zones is increased.

[Figure: Zone 1 (Public Service), Zone 2 (Public Service 2), Zone 3 (Customer 1), Zone 4 (Customer 1 (DR)), Zone 5 (Customer 2), Fault Zone; inter zone connection; Cloud VPN to Customer 1 (On Premise).]

Rack

• Definition of the Rack concept
  - RACK: a set of physical servers that host VMs; it refers to a physical 19" server hardware rack.
  - RACK TYPE: all physical server hardware in a rack has the same spec, while the server hardware spec may differ from rack to rack. This per-rack spec is defined as the RACK TYPE.
• Scale Out Unit
  - The unit by which cloud hardware is expanded is called the Scale Out Unit. In this architecture the Scale Out Unit is a Rack.
  - The Scale Out Unit is a standard expansion unit whose technology and performance are verified in advance, so purchasing can proceed quickly when capacity has to grow.
  (Note: if the Scale Out Unit is too small, purchases become frequent and bulk-buy discounts are lost. If it is too large, unused hardware is bought in advance and wasted. An appropriately sized Scale Out Unit is therefore needed.)

[Figure: existing RACKs plus one additional RACK (NW Switch, SAN Switch, Physical Servers) added as the SCALE OUT UNIT.]

Rack

• Rack Type
  • Definition of Rack Type
    In this architecture all server hardware in a Rack has the same spec. Individual Racks, however, can be of several kinds depending on two factors, performance and fault tolerance. These kinds are defined as Rack Types.
  • Kinds of Rack Type
    Rack Types are classified along two dimensions: performance and fault tolerance.
    Classification by performance
    - NIC interface speed
    - SAN interface speed
    - CPU clock, etc.
    - RAID and disk options, etc.
    Classification by fault tolerance
    - Whether the NW switch and SAN switch are redundant
    - Whether the NIC and SAN interfaces are redundant
  • Examples
    Rack Type A (High Performance & High Availability)
    - 2.6 GHz CPU, 1:4 density
    - 10G NIC * 2 (Teaming)
    - 10G NIC iSCSI * 2 (MPIO)
    - Storage: RAID 1+0, RAID 5 option; Dynamic, Static and Pass-through disk
    Rack Type B (Mid Performance & Low Availability)
    - 2.6 GHz CPU, 1:8 density
    - 10G NIC
    - 10G NIC iSCSI
    - Storage: RAID 5, Dynamic Disk only
  • Considerations when designing Rack Types
    VMs in a Rack can be moved to another Rack on failure, but they must move to servers with the same hardware spec. A system therefore needs at least two Racks of each Rack Type.
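
The "at least two Racks per Rack Type" rule is easy to check mechanically. The sketch below is a minimal illustration; the function name and the rack inventory are hypothetical.

```python
from collections import Counter


def rack_types_without_failover_target(racks: dict[str, str]) -> list[str]:
    """racks maps a rack name to its Rack Type.

    Returns the Rack Types that have fewer than two racks, i.e. types whose
    VMs would have no same-spec rack to move to on failure.
    """
    counts = Counter(racks.values())
    return sorted(rack_type for rack_type, n in counts.items() if n < 2)


# Hypothetical zone inventory: two Type A racks, one Type B rack.
zone = {"rack-1": "A", "rack-2": "A", "rack-3": "B"}
print(rack_types_without_failover_target(zone))  # ['B']
```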

Resource Pool

• Resource Pool
  - Definition: a Resource Pool is a group of physical servers that defines the range within which a VM can move through Live Migration or a fail-over architecture. The concept stems from the limits of hypervisor clustering. If the cloud implementation supports Live Migration and fail over without using hypervisor clustering, the Resource Pool concept is not needed.
  - Scope: because a Resource Pool defines where a VM is restarted on fail over, it must span different Racks, and those Racks must have the same Rack Type as the original Rack.
  - Number of physical servers: as described above, a Resource Pool maps to one hypervisor cluster, so the number of servers is governed by the hypervisor's clustering feature.

※ VMs in the same Service Role are deployed in the same Resource Pool.

[Figure: Rack #1 and Rack #2 (NW Switch, SAN Switch, Physical Servers) with Resource Pool 1 and Resource Pool 2 each spanning both racks.]

Infrastructure concept summary

• Data center
  - A data center is a physical building which hosts physical infrastructure
  - Data centers are located in multiple regions
• Zone
  - It is a logical unit
  - A set of physical server racks
  - Multicast is allowed within the same zone
  - 1 zone has 1 SAN and 1 network rack (SAN, NW switch)
  - Load balancing between VMs can be done within a Zone
• Rack
  - It is a physical server rack
  - It contains physical servers for compute
  - A rack has a type for performance and redundancy, ex)
    TYPE 1 (High performance & redundancy): 10G NIC * 2, 10G iSCSI * 2
    TYPE 2 (Low IO & no redundancy): 10G NIC * 1, 1G iSCSI * 1
• Resource Pool
  - Logical unit of a physical server pool
  - One Resource Pool consists of one or more physical servers which reside in different Racks
  - VMs in the same Service Role are deployed in the same Resource Pool
  - Load balancing and failover occur within the Resource Pool boundary

[Figure: Data Center #1 containing Zone #1 and Zone #2. Labels: Network Rack for Data Center (Router, NAT, VPN, Firewall, IPS, NW Switch); Rack #1 to Rack #3 (NW Switch, SAN Switch, Physical Servers); Resource Pool 1 and 2; VLAN for Service A in Infra A (maximum is single Zone); Service boundary; Service Role boundary; Network Rack and Storage Rack for Zone #1 (Load Balancer, NW Switch, SAN Switch, SAN Controller, Disk Array); Management Network.]

Infrastructure Architecture

• Reference Architecture

• Infrastructure Hierarchy
  Cloud Service
    - Central Cloud Mgmt
  DataCenter
    - Cloud Management
    - Fundamental Service
  Zone
    - Network Rack: Load Balancer, NW Switch, SAN Switch
    - Compute Rack (Rack Type): SAN Switch, NW Switch, Resource Pool, Physical Servers
    - Storage Rack: SAN Controller, Disk Array (RAID)
  Network Gateway
    - Router, NAT, VPN, Firewall, IPS


 Component Description
Level | Component | Description | Notes
Cloud Service | Central Cloud Management | Cloud management system spanning data centers in multiple regions. Only the central cloud management system has an end-user interface (portal). | Includes BSS
Data center | Cloud Management | Individual management system for each data center. Communicates with Central Cloud Management through a remote API. | Includes DHCP, DNS, Cloud OS, OSS, etc.
Data center | Fundamental Service | Additional functional services such as RDBMS service, Blob Storage, Map & Reduce, Notification, etc. |
Zone | Storage Rack | Storage for EBS, attached to a VM as its main repository. One zone has only one logical storage rack (it can be multiple racks physically). | SAN is preferred
Network Rack | Load Balancer | Load balancer that balances incoming load across VMs | Software L4 is preferred
Network Rack | NW Switch | Simple network switch that aggregates network traffic from the individual compute racks | VLAN support required; L2 is preferred
Network Rack | SAN Switch | SAN switch that aggregates SAN traffic from the individual compute racks | Storage virtualization should be considered. An intelligent switch can be used.
Compute Rack | SAN Switch | Connects physical servers to the SAN as the SAN backbone within the rack |
Compute Rack | NW Switch | Connects the physical servers' network as the network backbone within the rack |
Compute Rack | Resource Pool | Logical unit of a set of physical servers. Within the resource pool boundary, VMs can be moved for fail over. | Depends on the hypervisor
Compute Rack | Physical Server | Physical server which hosts VMs |
Storage Rack | SAN Controller | SAN controller | Storage virtualization should be considered
Storage Rack | Disk Array | RAID-based disk array | An IO segregation architecture should be considered


 Component Description
Level | Component | Description | Notes
Network Gateway | Router | Routes network traffic between the internet and the cloud's internal network |
Network Gateway | NAT | Network address translator |
Network Gateway | VPN | Provides secure access to internal VM instances |
Network Gateway | Firewall | Permits or denies network transmissions based on a set of rules; frequently used to protect networks from unauthorized access while permitting legitimate communications to pass |
Network Gateway | IPS (Intrusion Prevention System) | Monitors network and/or system activities for malicious activity. Its main functions are to identify malicious activity, log information about it, attempt to block or stop it, and report it. |
Network Gateway | NW Switch | Network backbone. Aggregates traffic from Zones. | L3 is preferred

5. User & Service Domain Model

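Usecase

• ACCOUNT & INFRA
  - ACCOUNT ADMIN: full authority over the ACCOUNT, plus billing and settlement tasks
  - INFRA ADMIN: composes and configures an INFRA
  - USER: configures and uses the resources inside an INFRA
  - CLOUD ADMIN

[Figure: an ACCOUNT containing several INFRAs, with the actors CLOUD ADMIN, ACCOUNT ADMIN, INFRA ADMIN and USER.]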

 SERVICE & SERVICE ROLE

INFRA domain concept definitions

ACCOUNT
• Billing unit
• An independent business unit, such as a company or a business division
• One ACCOUNT can have 1..N INFRAs
INFRA
• A logical group of hardware; the concept of a logical data center
• 1..N SERVICEs can be deployed in one INFRA
SERVICE
• A business system (e.g. ERP, CRP, blog service)
• One SERVICE consists of 1..N SERVICE ROLEs
※ VLANs are configured per SERVICE. Different SERVICEs can still be given the same VLAN ID if they are in the same INFRA (to accommodate communication between SERVICEs).
SERVICE ROLE
• A detailed functional component of a business system (e.g. web server, DBMS server, CMS component, IDM component)
• One SERVICE ROLE consists of 1..N VMs and 0..1 LOAD BALANCER that groups them
  CF. Shared_IP_Group in OpenStack
VM
• Virtual machine
• Represents one virtualized server

[Figure: Services made up of Service Roles (ex. Web Front End, DBMS), each with a Load Balancer and VM1 to VM3. ACCOUNT ADMIN creates the INFRA, INFRA ADMIN creates the SERVICE, and USER creates, configures, and starts/stops resources.]
 DOMAIN 계층 구조

[Figure: domain hierarchy. Account > Infrastructure (Firewall Policy; ex. Production, Dev/Test) > Service (VLAN ID) > Service Role (LB Policy, Rack Type; ex. Web Front End, Reporting Service) > Virtual Machines behind Load Balancing. Note in figure: a VLAN can be shared between Services in the same Infrastructure.]

Domain concept definitions

ACCOUNT
• Billing unit
• An independent business unit, such as a company or a business division
• One ACCOUNT can have 1..N INFRAs
INFRA
• A logical group of hardware; the concept of a logical data center
• 1..N SERVICEs can be deployed in one INFRA
SERVICE
• A business system (e.g. ERP, CRP, blog service)
• One SERVICE consists of 1..N SERVICE ROLEs
※ VLANs are configured per SERVICE. Different SERVICEs can still be given the same VLAN ID if they are in the same INFRA (to accommodate communication between SERVICEs).
SERVICE ROLE
• A detailed functional component of a business system (e.g. web server, DBMS server, CMS component, IDM component)
• One SERVICE ROLE consists of 1..N VMs and 0..1 LOAD BALANCER that groups them
• Service Roles are divided by Rack Type according to the performance and fault tolerance of the physical servers and networking
※ The Service Role concept groups identical VMs so that scale out is convenient.
VM
• Virtual machine
• Represents one virtualized server
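
The hierarchy and its multiplicities can be written down as plain data classes. This is a minimal sketch of the domain model as described on these slides, not an API of CloudStack or OpenStack; every class and field name is an assumption chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VM:
    name: str


@dataclass
class LoadBalancer:
    name: str


@dataclass
class ServiceRole:
    name: str
    rack_type: str                                 # Rack Type the role is placed on
    vms: list[VM] = field(default_factory=list)    # 1..N VMs
    load_balancer: Optional[LoadBalancer] = None   # 0..1 load balancer


@dataclass
class Service:
    name: str
    vlan_id: int                                   # VLAN is assigned per service
    roles: list[ServiceRole] = field(default_factory=list)


@dataclass
class Infra:
    name: str
    firewall_policy: str
    services: list[Service] = field(default_factory=list)


@dataclass
class Account:
    name: str                                      # billing unit
    infras: list[Infra] = field(default_factory=list)


def vm_names(account: Account) -> list[str]:
    """Walk Account > Infra > Service > Service Role > VM and list full VM paths."""
    return [
        f"{infra.name}/{service.name}/{role.name}/{vm.name}"
        for infra in account.infras
        for service in infra.services
        for role in service.roles
        for vm in role.vms
    ]


web = ServiceRole("web-front-end", "A", [VM("vm1"), VM("vm2")], LoadBalancer("lb-web"))
db = ServiceRole("dbms", "A", [VM("vm1")])  # a role without a load balancer
cms = Service("cms", vlan_id=2, roles=[web, db])
account = Account("overseas-branch", [Infra("production", "default-deny", [cms])])
print(vm_names(account))
```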

Usecase

 Domain Model Sample

[Figure: sample domain model. A CLOUD ADMIN configures and manages the IaaS cloud. Accounts shown: Production Management (Legacy Account), Overseas Marketing Partner, Overseas Branch and Group Affiliate. Each Account has an ACCOUNT ADMIN (full authority over the Account, plus billing and settlement tasks) and contains Domains, each with a DOMAIN ADMIN and DOMAIN USERs. Example services: CMS Service (Service Roles: Web Front End, DBMS) and Push Service (Service Roles: Push Server, MySQL Cluster), each Service Role with a Load Balancer and VM1 to VM3.]

Provisioning Scenario

 VM Provisioning End 2 End work flow

1. Create Infra and assign it into Business Unit
2. Create Service and associated Service Role
3. Assign the Service Roles to User
4. Create VM in the Service Role
5. Configure Load balancer in the Service Role
6. Metering & Charging

[Figure: end-to-end flow. ACCOUNT ADMIN creates the Infra (with Firewall), INFRA ADMIN creates the Service (with VLAN) and its Service Roles (Front End, DB Server; each with a Rack Type and Load Balancer), and USER creates the VMs.]
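
The six steps can be sketched as a sequence of permission-checked actions. The actor-to-action mapping below is an assumption read off the figure, and all names (`PERMISSIONS`, `run_step`, `provision`) are hypothetical; metering here is reduced to counting the VMs created.

```python
class ProvisioningError(Exception):
    """Raised when an actor attempts an action outside its role."""


# Assumed mapping of actors to the actions they may perform.
PERMISSIONS = {
    "ACCOUNT_ADMIN": {"create_infra"},
    "INFRA_ADMIN": {"create_service", "assign_role"},
    "USER": {"create_vm", "configure_lb"},
}


def run_step(log: list, actor: str, action: str, target: str) -> None:
    """Record one workflow step after checking the actor's permission."""
    if action not in PERMISSIONS.get(actor, set()):
        raise ProvisioningError(f"{actor} may not {action}")
    log.append((actor, action, target))


def provision(log: list) -> int:
    """Run steps 1 to 5 and return the metered VM count (step 6)."""
    run_step(log, "ACCOUNT_ADMIN", "create_infra", "production")            # step 1
    run_step(log, "INFRA_ADMIN", "create_service", "cms/web-front-end")     # step 2
    run_step(log, "INFRA_ADMIN", "assign_role", "web-front-end -> user-1")  # step 3
    for n in (1, 2, 3):
        run_step(log, "USER", "create_vm", f"vm{n}")                        # step 4
    run_step(log, "USER", "configure_lb", "web-front-end")                  # step 5
    return sum(1 for _, action, _ in log if action == "create_vm")          # step 6


audit_log: list = []
print(provision(audit_log), "VMs metered")
```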

Domain Concept Mapping

• Concept boundary mapping

[Figure: mapping between the INFRASTRUCTURE CONCEPT and the DOMAIN CONCEPT.
 Infrastructure concept: Cloud Service (Central Cloud Mgmt); DataCenter (Cloud Management, Fundamental Service); Zone with Network Rack (Load Balancer, NW Switch, SAN Switch), Compute Rack (Rack Type, SAN Switch, NW Switch, Resource Pool, Physical Servers) and Storage Rack (SAN Controller, Disk Array, RAID); Network Gateway (Router, NAT, VPN, Firewall, IPS).
 Domain concept: Account; Infrastructure (Firewall Policy); Service (VLAN ID); Service Role (LB Policy, Rack Type); Virtual Machines.
 Actors: ACCOUNT ADMIN, INFRA ADMIN and USER, with Create, Configure and Manage actions on the domain objects.]

Software Architecture

 Level 1. Conceptual Architecture

[Figure: Level 1 conceptual architecture. Components: User Portal; Cloud Admin Portal; Platform Service <<PaaS>>; OSS; BSS; IDM; Orchestration; Virtual Machine Manager; Monitoring; Configuration Manager; Backup Manager; Fundamental Services (RDS, Blob Storage, NoSQL etc); Infrastructure (high cost, high reliability, redundancy support); Infrastructure (low cost, low reliability, no redundancy support).]


 Level 1. Conceptual Architecture and Priority

[Figure: the Level 1 conceptual architecture (OSS, BSS, IDM, Orchestration, Virtual Machine Manager, Monitoring, Configuration Manager, Backup Manager, and the two Infrastructure tiers) with each component annotated with a priority value: 1, 1.5, 2 or 3.]


 Level 2. Conceptual Architecture


[Figure: Level 2 conceptual architecture.
 Portals (user and cloud admin): CLI, Reporting, Web based Management.
 OSS: NMS, SMS.
 BSS: Metering, Charging, Billing, Payment.
 IDM: User Profile, Role Management, Authentication & Authorization, Sync (Propagation).
 Orchestration: Service Bus, Work Flow Engine, Adapter.
 Virtual Machine Manager: Domain Manager, VM Manager, Network Manager, Storage Manager.
 Monitoring: Monitoring Interface, Memory Grid <<Cache>>, Event Trigger, Alert & Notification.
 Configuration Manager: Server Profile Management <<Bare metal Server>>, VM Profile Management, Patch Management, Software Asset Management, Software Install Engine, Bare Metal Provisioning.
 Back Up Manager: Snapshot Management.
 Networking: Router, Switch, Load Balancer, Fire Wall, NAT, VPN.
 Storage: SAN Switch, SAN Controller, Disk Array.
 Server: Physical Server.]

Component Description

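Level 1 | Level 2 | Description
User Portal | | Provides the interface for cloud users.
User Portal | CLI | Provides a management interface for VMs through a command line interface; each user or group connects to the VMs allocated to them.
User Portal | Reporting | Provides reporting on KPIs such as cumulative usage and response time, using OLAP-based multidimensional analysis. CF. BI; the Excel-based BI reference in Microsoft MS-SQL is recommended.
User Portal | Web Based Management | User management and permission management. Creation and management of infra (VM, storage, network, etc.).
Cloud Admin Portal | | Provides the interface for cloud system administrators (exposes part of the OSS functionality).
Cloud Admin Portal | CLI | Provides a management interface for the whole cloud system (VMs + host servers) through a command line interface.
Cloud Admin Portal | Reporting | Provides reports on the services needed for monitoring, and business-oriented reports such as revenue and usage.
Cloud Admin Portal | Web Based Management | Provides a web-based interface for cloud management and monitoring.
Orchestration | | Acts as the hub for API communication within the cloud system. Combines infrastructure (network, storage, server) to implement the operations behind cloud-management business processes (provisioning, patching, etc.).
Orchestration | Service Bus | Acts as the hub for APIs and performs mediation, routing and message transformation. CF. the Enterprise Service Bus in SOA (e.g. Oracle Service Bus).
Orchestration | Work Flow Engine | Uses workflows to integrate infrastructure (network, storage, server), BSS and OSS, and implements business processes that span multiple underlying systems. CF. the BPM engine in SOA; the provisioning engine in cloud (e.g. Microsoft Opalis, HP Matrix).
Orchestration | Adapter | Converts the interfaces of heterogeneous infrastructure (network, storage, server) and of BSS and OSS into the standardized protocol used inside the cloud system (REST or SOAP/HTTP). ※ Optional layer: if the Virtual Machine Manager already performs this function, the two overlap. CF. the Storage and Network Manage API of Cloud Stack.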



Virtual Machine Manager | | Abstracts the infrastructure and exposes it externally as an API. Manages VMs.
Virtual Machine Manager | Domain Manager | Maps the infrastructure concept and the domain concept onto actual users and hardware infrastructure, and manages that mapping.
Virtual Machine Manager | VM Manager | Manages VMs on physical servers and exposes this as an open API.
  • Template Manager: manages the templates used to create VMs
  • Live Motion: moves a VM between physical hosts without stopping it
  • Life Cycle Management: manages the VM life cycle from creation to disposal
  • VM Mgmt: exposes VM control functions (start, stop, etc.)
  • VM Locator: assigns the best server to host a VM, taking the domain model and current infra utilization into account
Virtual Machine Manager | Network Manager | Abstracts the functions of networking devices and exposes them as an open API.
  • VPN Management
  • IP Pool
  • NAT Management
  • Router Management
  • Firewall Management
  • Load Balancer Management
  • VLAN Management
  ※ External device integration is required.
Virtual Machine Manager | Storage Manager | Abstracts the functions of SAN storage devices and exposes them as an open API.
  • Volume Management

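Monitoring | | Collects information through the cloud components and monitors it. Invokes specific actions for events raised in the system, according to predefined rules.
Monitoring | Monitoring Interface | Collects monitoring information from heterogeneous infrastructure over various protocols, and from BSS, OSS and internal cloud components.
Monitoring | Memory Grid | Stores the collected monitoring information in clustered memory. CF. Oracle Coherence, open-source memcache, Microsoft Windows Server AppFabric Cache.
Monitoring | Event Trigger | Raises specific events during monitoring (failure, performance degradation, etc.) according to predefined rules. The Event Trigger is the point at which fail over and scale out are initiated.
Monitoring | Alert & Notification | Sends an alert or notification message to the designated recipients in response to an Event Trigger.
Back Up | | Supports VM backup based on VM snapshots.
Back Up | Snapshot Management | Takes, manages and restores VM snapshots.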



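Configuration Management | | Handles software and patch installation on physical servers and VMs.
Configuration Management | Server Profile Management | Manages physical servers (registration, per-server spec, installed software list, patch list, etc.).
Configuration Management | VM Profile Management | Manages VMs (registration, per-VM spec, installed software list, patch list, etc.).
Configuration Management | Patch Management | Maintains the patch list per OS and per software product, and applies patches to VMs and physical servers.
Configuration Management | Software Asset Management | Manages software installation images and software licenses.
Configuration Management | Software Install Engine | Performs the actual installation of software and patches on physical servers and VMs.
Configuration Management | Bare Metal Provisioning | Handles OS installation on physical servers.
IDM (Identity Management) | | Manages user accounts and permissions, and synchronizes them between centers with global deployment in mind.
IDM | User Profile Management | User account management.
IDM | Role Management | Role definition and management.
IDM | Authentication & Authorization | Authentication and authorization handling.
IDM | Sync (Propagation) | Synchronizes accounts and permissions between centers. Integrates account and permission management across heterogeneous solutions.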

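OSS (Operation Support System) | | Performs integrated monitoring and control of the hardware infrastructure. The Monitoring Interface component takes the cloud administrator's point of view, whereas OSS supports fine-grained management from the device and center point of view: network management, infrastructure management, storage devices, and so on.
OSS | NMS | Network Management System. Integrated control and monitoring of the network infrastructure.
OSS | SMS | System Management System. Integrated control and monitoring of hardware systems. CF. HP OpenView, CA Unicenter.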



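Networking | | Defines the network infrastructure used in the cloud. ※ The cloud's shared infrastructure (backbone, etc.) uses hardware-based devices; for network devices allocated to users, software-based devices are considered depending on requirements.
Networking | Router | Network router.
Networking | Switch | Network switch (VLAN support required).
Networking | Load Balancer | Load balancer. CF. L4 or L7.
Networking | Fire Wall | Firewall.
Networking | NAT | Network Address Translator.
Networking | VPN | VPN.
Storage | | Storage that provides the disk space for hosting VMs and volumes comparable to Amazon EC2 EBS.
Storage | SAN Switch | Switch that connects physical servers to SAN storage.
Storage | SAN Controller | Controller that controls the disk array (iSCSI controller).
Storage | Disk Array | Disk array.
Server | | Physical servers for hosting VMs.
Server | Physical Server | Physical server.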

Architecture concept

• VM Manager / VM Locator

[Figure: RACK #1, RACK #2 and RACK #3 hosting VM #1, VM #2 and VM #3 of the ERP Web Front End Service Role (rack load shown as 70 VM / 90 and 80 VM / 90 total hosted). On a VM provisioning request, the VM Placer (in VMM) finds another rack in the resource pool and then the most idle server in the resource pool.]

• The Resource Pool concept
  - Hypervisors provide a clustering model for management purposes and to offer various features, and one cluster is managed by a single managed node.
  - Hypervisor clustering typically groups 16 to 64 physical servers into one cluster. In a cloud that needs unlimited capacity growth, this cluster structure can become a scaling problem. Because VMs can only be moved and managed within a cluster, each cluster is defined and managed as one Resource Pool.
  - If the cloud design does not use the cluster concept, a single Zone can be treated as one Resource Pool in the design.
• VM Placement Policy
  - VMs used by the same user may need to be deployed to a particular Rack or server for failure or performance reasons. (For example, if VMs of the same Service Role are deployed on the same server, fail over is impossible when that server fails.)
  - The VM Placement Policy is defined and managed by each Infra Admin.
  - An example VM Placement Policy:
    - VMs of the same Service Role are deployed to different Server Racks.
    - Within the other Server Rack, the VM is deployed to the most idle physical server.
    - If no other Rack has space, a physical server in the same Rack is used.
    - If the same Rack has no space either, an error is reported (insufficient capacity).
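
The example policy can be sketched as a small placement function. This is an illustrative reading of the four rules, not the VM Locator of any real product: "most idle" is assumed to mean the server with the most free VM slots, and the names `Server`, `CapacityError` and `place_vm` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    rack: str
    capacity: int                                   # maximum number of VMs (assumed unit)
    vms: list[str] = field(default_factory=list)    # Service Role of each hosted VM

    @property
    def free(self) -> int:
        return self.capacity - len(self.vms)


class CapacityError(Exception):
    """Raised when the resource pool has no room left."""


def place_vm(role: str, pool: list[Server]) -> Server:
    """Place one VM of `role` in the resource pool following the example policy."""
    used_racks = {s.rack for s in pool if role in s.vms}
    with_space = [s for s in pool if s.free > 0]
    # Rules 1 and 2: prefer racks that do not yet host this Service Role.
    other_racks = [s for s in with_space if s.rack not in used_racks]
    # Rule 3: otherwise fall back to racks that already host the role.
    candidates = other_racks or with_space
    if not candidates:
        # Rule 4: no capacity anywhere in the pool.
        raise CapacityError("insufficient capacity in resource pool")
    target = max(candidates, key=lambda s: s.free)  # most idle server
    target.vms.append(role)
    return target


pool = [Server("s1", "rack-1", 2), Server("s2", "rack-1", 2), Server("s3", "rack-2", 1)]
for _ in range(3):
    chosen = place_vm("erp-web", pool)
    print(chosen.name, chosen.rack)
```

In this run the first VM lands in rack-1, the second is pushed to rack-2, and the third falls back to rack-1 because rack-2 is full.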


 Orchestration / Service Bus
• Generic Proxy Pattern
This is an Enterprise Service Bus layer grounded in the traditional SOA approach. It is designed on the Generic Proxy Pattern described below and acts as the hub for the APIs of the whole cloud system.

[Figure: Service Consumer calling through the Generic Proxy (Edge Proxy, Common Proxy, Local Proxy, Business Service), with IDM alongside. Components listed: Logging, Tracing, Auditing; Orchestration (Orchestration Logic, Transformation, Compensation); SLA (Alert, Throttling); Exception Handling (Ignore, Auto retry, Human Error handling); Monitoring; Reporting.]

- Edge Proxy
  • The entry point for APIs. Converts between protocols (REST-XML, REST-JSON, SOAP/HTTP, etc.).
  • Authenticates and authorizes API requests where needed.
- Common Proxy
  • The common area for APIs. Performs shared functions such as logging.
  • Routes between centers where needed.
- Local Proxy
  • Where the content of a service has changed for a given service, performs mediation, message transformation and similar work.
- Business Service
  • Acts as the delegator to the actual service and performs SLA functions (throttling, alert).
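
The four proxy layers can be sketched as a chain of handlers, each wrapping the next. This is a toy illustration under stated assumptions, not the API of any ESB product: the token check stands in for IDM, a call counter stands in for throttling, and all function names and the `vm_name` field are hypothetical.

```python
import json
from typing import Callable

Handler = Callable[[dict], dict]


def business_service(max_calls: int) -> Handler:
    """Delegates to the real service and enforces a simple throttling SLA."""
    calls = {"count": 0}

    def handle(request: dict) -> dict:
        calls["count"] += 1
        if calls["count"] > max_calls:
            return {"status": 429, "body": "throttled"}
        return {"status": 200, "body": f"started {request['vm']}"}

    return handle


def local_proxy(next_handler: Handler) -> Handler:
    """Service-specific mediation: maps the public field name to the internal one."""
    def handle(request: dict) -> dict:
        return next_handler({"vm": request["vm_name"]})

    return handle


def common_proxy(next_handler: Handler, log: list) -> Handler:
    """Cross-cutting concerns shared by all APIs; here, logging only."""
    def handle(request: dict) -> dict:
        log.append(("request", request))
        response = next_handler(request)
        log.append(("response", response["status"]))
        return response

    return handle


def edge_proxy(next_handler: Handler, valid_tokens: set) -> Callable[[str, str], dict]:
    """Entry point: authenticates the caller and converts the JSON payload."""
    def handle(token: str, raw_json: str) -> dict:
        if token not in valid_tokens:
            return {"status": 401, "body": "unauthorized"}
        return next_handler(json.loads(raw_json))

    return handle


access_log: list = []
api = edge_proxy(
    common_proxy(local_proxy(business_service(max_calls=2)), access_log),
    valid_tokens={"secret"},
)
print(api("secret", '{"vm_name": "vm1"}'))
```

Composing the layers as closures keeps each concern in one place, which mirrors the reason the pattern separates edge, common, local and business-service proxies.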

Components


 Orchestration / Work flow engine
• Work flow Engine
An intermediate layer that combines the various cloud resources (physical infrastructure, etc.) to implement the processes needed for cloud management (provisioning, patching, resiliency, etc.).
- UI Based Work flow
  • Workflows are designed in a GUI.
- Work flow Runtime
  • Runs the workflows authored in the UI Based Work flow tool and tracks progress and failure status per process.
- Adapter
  • To provide interfaces to a variety of devices and software servers, exposes device control and management functions in an abstracted form such as an open API. (CF. EAI Legacy Adapter)
- Why it is needed
  • Agility
    » When developing cloud processes, the work consists only of wiring the various infrastructure components together in a GUI, so changes can be implemented quickly.
  • Flexibility
    » When a process changes, only the process definition has to be updated rather than code, which allows a flexible response.
    » Because hardware infrastructure and the like are abstracted, infrastructure changes can be absorbed flexibly.

[Figure: sample workflow designer, Microsoft Opalis.]

Solution Candidate

• Virtual Machine Manager: Cloud Stack
  - Open source cloud OS
  - Supported by Cloud.Com (subscription model: software subscription, patches and bug fixes, 24x7 tech support, professional service)
  - Characteristics
    - Well-defined domain model
    - Well-defined open API set
    - Software-based networking devices (firewall, NAT, router, LB, etc.)
    - Storage tiering
    - Depends on the hypervisor cluster feature (resource pool size restriction)
• Virtual Machine Manager: OpenStack / Nova
  - Open source cloud OS
  - Sponsored by RackSpace.com
  - Characteristics
    - Very simple and not yet mature. Focused on the VM provisioning scenario only.
    - External networking device integration is required. (See Zeus.)
    - The storage architecture needs further research. (No IO segregation or storage tiering feature.)
    - Not enough features to realize this architecture. (Considerable customization and enhancement is required.)
  - Recommendation: RackSpace.com Professional Service is mandatory for delivery.

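Solution Candidate

Comparison of CloudStack and OpenStack Nova (Cactus) by component. Ratings: ○ supported, △ partially supported, X not supported.

Virtual Machine Manager / Domain Manager
  - CloudStack (○): has its own fine-grained domain concepts such as Domain, Account, Security Group, Zone and Pod (research needed on whether they can be mapped)
  - OpenStack Nova (X): no domain model of its own (customization is judged to be possible; review needed)
Virtual Machine Manager / VM Manager
  - CloudStack (○): template management; ISO image management; VM management is limited to the cluster; live migration available
  - OpenStack Nova (△): the template concept is defined as a Flavor (only memory and disk size are specified, research needed; CPU has only a priority-time concept, with no virtual core concept); image management available (extensible through Glance); mostly limited to functions related to VM creation
Virtual Machine Manager / Network Manager
  - CloudStack (○): open API for router, VPN, load balancer, NAT, firewall and VLAN; integration with external load balancers and firewalls available
  - OpenStack Nova (△): IP management only
Virtual Machine Manager / Storage Manager
  - CloudStack (○): storage pool and volume management available (whether IO segregation is possible needs review); storage tiering available (per feature list)
  - OpenStack Nova (△): no separate storage management API; can be handled through the CLI
Orchestration / Service Bus
  - CloudStack (X): N/A
  - OpenStack Nova (X): N/A
Orchestration / Work Flow Engine
  - CloudStack (X): N/A
  - OpenStack Nova (X): N/A
Orchestration / Adapter
  - CloudStack (X): N/A
  - OpenStack Nova (X): N/A
Back up / Snapshot Management
  - CloudStack (○): has its own snapshot management
  - OpenStack Nova (X): schedule-based backup exists, but no snapshot management for individual VMs
BSS / Metering
  - CloudStack (○): metering API available
  - OpenStack Nova (X): no usage metering API (development required)
IDM
  - CloudStack (○): has its own authentication model (detailed review needed on whether it can be linked to IDM)
  - OpenStack Nova (○): SSO integration available
Networking / Software networking device support
  - CloudStack (○): provides its own software-based router, VPN, load balancer, NAT, firewall and VLAN (performance verification needed)
  - OpenStack Nova (X): external solution required
Comment
  - CloudStack: live migration is possible (listed in the feature list but not in the API list). Has cluster constraints (it has a Resource Pool concept; whether VMs can move outside a Resource Pool needs to be checked).
  - OpenStack Nova: focused only on basic VM provisioning (lacks features). Very high risk without a reference implementation from a production deployment such as Rackspace.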

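Conclusion
• Given the short implementation time and the high quality requirements, CloudStack is recommended as the solution that can satisfy most requirements of the current design. If OpenStack Nova is used, collaboration with Rackspace, which has production deployment experience, is recommended.
• For additional features such as monitoring and configuration management, separate products are recommended.
• CloudStack's software networking is understood to have had capacity problems at KT, so reviewing alternative solutions after measuring capacity and performance is recommended.
• The constraints of CloudStack's hypervisor-based clustering need to be addressed and a workaround prepared.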
