OpenStack at NTT Resonant:
Lessons Learned in Web Infrastructure
Tomoya Hashimoto, Business Platform Division, NTT Resonant Inc.
Kazuhiro Tooriyama, Business Platform Division, NTT Resonant Inc.
Toshikazu Ichikawa, NTT Software Innovation Center, NTT Corporation
Presentation Video
This slide deck was presented at OpenStack Summit Tokyo 2015.
The video recording of the presentation is available at:
https://www.openstack.org/summit/tokyo-2015/videos/presentation/openstack-at-ntt-resonant-lessons-learned-in-web-infrastructure
2
Speakers
Tomoya Hashimoto
Kazuhiro Tooriyama
2010 – 2014
NTT Communications Development of ISP NW (OCN)
2014 - current
NTT Resonant Engineer of server platform
2001 - 2012
NTT Resonant goo blog, oshiete goo (Q&A service)
Development and operation of core services
2012 - current
NTT Resonant Architect of server platform
3
Toshikazu Ichikawa
2011 – 2014
Verio (NTT America) Development of cloud service “Cloudn”
and managed hosting service
2014 - current
NTT Development of cloud service platform
1. About NTT Resonant
2. OpenStack Infrastructure Design
3. VM setup by Puppet with OpenStack
4. Monitoring OpenStack and VMs
5. Current Issues and Future Plan
4
Agenda
1. About NTT Resonant
5
1. About NTT Resonant
6
[Figure labels: Regional Communications Business; Long Distance and International Communications Business; Mobile Communications Business; Data Communications Business; R&D]
$112 billion in total revenue
240,000 employees worldwide
#1 in Data Center floor space
#2 in Global IP Backbone
Source: TeleGeography
All facts and figures accurate as of March 2014
1. About NTT Resonant
B2C services
Platform and B2B2C services
Portal Site
Smartphone applications: goo milk feeder, goo disaster prevention application
7
Services of Customers
Healthcare
Disaster prevention/response solutions
Phone Cloud / Developer support
e-commerce site for communications devices
NTT Resonant’s Business Area
1. About NTT Resonant
8
Dictionaries, ZIP codes, Laboratory, Bodycloud, Housing and real estate, Search, Baby-care, Movies, Maps, Navigation, Horoscopes, Rankings, Car and bike, News, Weather, Healthcare, Smartphone applications, Blogs, Job search, Love and marriage, Online store, Travel
Providing 60+ services including
• Web search
• Blogging
• News
• Oshiete! goo Q&A site
Launched in 1997
18 years old
Web portal site “goo”
http://www.goo.ne.jp/
1. About NTT Resonant
9
How large is goo?
The 3rd largest web portal in Japan
Yahoo! Google
Rakuten
MSN
Scale of web portal “goo”
170 million unique browsers per month
1 billion page views per month
Source: 2015.02 NetRatings
2. OpenStack Infrastructure Design
10
2. OpenStack Infrastructure Design
• Migrate to another data center within a limited timeframe
–The termination date of the existing data center (DC) contract is fixed. We must migrate our systems from the existing DC to another DC by that date.
• Shorten the lead time for service releases
–Speed up VM creation and management by replacing manual operations
–Comparable to public cloud services such as AWS
• Support the entire workflow for providing a service
–Not just introducing OpenStack, but building an infrastructure for web services
–Not only VM creation, but also installation and configuration of software inside VMs
What was required of us in the OpenStack deployment
11
Service Teams
Platform Team
2. OpenStack Infrastructure Design
Organization and Formation
NTT Resonant
Service Developer
60+ services
Platform service
Operation
Partners
Operator
Outsourcing
12
~10 engineers
300+ engineers
…
Service Developer
Service Developer
Service Developer
Service Developer
Operator
Operator
Design Team
NTT R&D
OpenStack Community
…
Joint
experiment
Contribution
Distribution
2. OpenStack Infrastructure Design
• The decision was made to migrate our services to another data center.
–2014/3: Project started; design and deployment of the OpenStack installation begins
–2014/10: OpenStack is ready and in production
–2014/10 to 2015/01 (4 months): 70 services and 1,300 VMs started
OpenStack deployment timeline with our services
13
[Timeline figure, March 2014 to January 2015: Requirement Definition; OpenStack Installation: Design / Deployment (about 6 months); ★ OpenStack started, in production; Migration of services from the old existing environment; ★ Migration completed; ★ Old existing environment closed]
2. OpenStack Infrastructure Design
• Using OpenStack as Private Cloud
• In production since October 2014
• As of now, it supports
–80+ services
–1 billion page views per month
• With
–400 hypervisors
•2 Nova cells
–4,800 physical cores
–1,800+ virtual servers
OpenStack Scale at main data center of NTT Resonant
14
Launch
Dashboard
2. OpenStack Infrastructure Design
OpenStack Components (Icehouse Release)
15
Horizon
Neutron (Network): Virtual Router and LAN, Virtual Load Balancer
Nova (Hypervisor): Virtual server
Glance (Image): VM Template, Image snapshot
Cinder (Block Storage): Virtual volume
Swift (Object Storage): RESTful file store, Replication
Keystone (Identity)
Trove (Database services)
Heat (Orchestration)
Ceilometer (Telemetry)
Legend: What we use
[Diagram: three VMs, each running an APP on an OS; designed by Freepik]
• Distribution
–RDO with CentOS 6
–Icehouse version
• Automation
–Puppet for Configuration Management
•Thanks to the RDO community for the Puppet manifests
16
2. OpenStack Infrastructure Design
Deployment
• Provider network with VLAN
–No control of L3 and above, including router, NAT, load balancer, and firewall
• Using ML2, Linuxbridge agent
–We are familiar with it
• Service Model
–An administrator prepares networks and subnets per tenant
–A tenant is not allowed to create/delete a network
• Close to “Scenario: Provider networks with Linux bridge” in the
“OpenStack Networking Guide” [1]
17
[1] http://docs.openstack.org/networking-guide/scenario_provider_lb.html
Neutron
L4-7: Load Balancer, VPN
L3: Router, NAT
L2: Network, Port
What we use
2. OpenStack Infrastructure Design
Networking with Neutron
18
Node Type             | OpenStack Components                     | RabbitMQ (MQ) and MariaDB (Database) | HAProxy (LB) and Pacemaker (HA cluster)
Top cell Controller   | Nova, Glance, Keystone, Neutron, Horizon | RabbitMQ mirrored queue              | Nova, Keystone, Neutron, Horizon, DB, MQ
Child cell Controller | Nova                                     | RabbitMQ mirrored queue              | MQ
Database              | N/A                                      | MariaDB Galera Cluster               | N/A
Swift Proxy           | Swift, Glance                            | N/A                                  | Swift, Glance
Swift Storage         | Swift                                    | N/A                                  | N/A
Compute               | Nova, Neutron                            | N/A                                  | N/A
Node types and HA (High Availability) strategy
2. OpenStack Infrastructure Design
Contribution to Community related to this project
19
• This bug was a show-stopper for the project until we fixed it
–Bug Fix [1]:
•The shelve function did not work in the Icehouse release with a nova-cell deployment
•We use shelve/unshelve for hypervisor maintenance
• Other bugs we found and fixed
–Security Bug Fix [2]:
•This was recently announced as OSSA 2015-017
–8 bug fixes in addition to the above
[1] "shelve api does not work in the nova-cell environment"
https://bugs.launchpad.net/nova/+bug/1338451
[2] "Deleting instance while resize instance is running leads to unuseable compute nodes"
https://bugs.launchpad.net/nova/+bug/1392527
2. OpenStack Infrastructure Design
• We modified code to enforce our operational rules
–We modified only Horizon
•Users come through Horizon, not the API
•What we implemented
–Server naming restriction
–Access limit to the security group function
–And about 40 items
–No modification to other components except bug fix backports
•This minimizes the maintenance cost
20
Server creation dialog of Horizon
Customization on Dashboard
3. VM setup by Puppet with OpenStack
21
3. VM setup by Puppet with OpenStack
Issue of VM setup (installation and configuration)
22
• Only 4 months from VM creation to service migration
– Time for VM setup is limited
– 1,300 VMs need to be migrated onto OpenStack
– Automate procedures as much as possible
• The key is the Puppet manifests used at the existing data center (DC)
– We used Puppet manifests to set up VMs at the existing DC
• Building a bridge between OpenStack and Puppet
– The goal is to set up our services on top of OpenStack quickly and easily
We resolved this issue by integrating Puppet with OpenStack
OpenStack
3. VM setup by Puppet with OpenStack
How we use puppet with OpenStack
23
• Our Puppet design
– Individual Puppet master per tenant
– Manages Linux accounts, middleware, config files, etc.
– Single manifest repository
• What is required to use Puppet
– The host name can be resolved with DNS
– The host group is defined in LDAP
– The Puppet manifest has an entry for the host group
[Diagram labels: Tenant A (Tenant A user, VM-A, VM-B); SVN; Puppet Master; DNS and LDAP (necessary)]
OpenStack
3. VM setup by Puppet with OpenStack
How we use puppet with OpenStack
24
• Synchronization tool
– Polls the Nova API to detect new VMs
– Registers each new VM in DNS, LDAP, and the Puppet manifest
– Runs the steps above every 5 minutes
• An OpenStack user can apply Puppet manifests easily and quickly right after creating a VM
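The polling step described above can be sketched as a diff between the servers Nova reports and the hosts already registered. This is a minimal illustration, not the tool NTT Resonant built: `plan_registrations`, the dictionary shape of a server record, and the registry names are hypothetical, and the actual calls to the Nova API, DNS, LDAP, and the manifest repository are deliberately left out.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical record shape: one dict per VM, as the result of a Nova
# "list servers" call might be reduced to. Only "name" and "ip" are used.
Server = Dict[str, str]

# The three places the slide says a new VM must be registered.
REGISTRIES = ("dns", "ldap", "puppet_manifest")


def plan_registrations(
    servers: Iterable[Server], registered: Set[str]
) -> List[Tuple[str, str, str]]:
    """Return (registry, hostname, ip) actions for VMs not yet registered."""
    actions = []
    for server in sorted(servers, key=lambda s: s["name"]):
        if server["name"] in registered:
            continue  # already handled in an earlier polling cycle
        for registry in REGISTRIES:
            actions.append((registry, server["name"], server["ip"]))
    return actions


if __name__ == "__main__":
    polled = [
        {"name": "vm-b", "ip": "192.0.2.11"},
        {"name": "vm-a", "ip": "192.0.2.10"},
    ]
    # vm-a was registered in a previous cycle, so only vm-b needs entries.
    for action in plan_registrations(polled, registered={"vm-a"}):
        print(action)
```

In a real deployment a scheduler would run this every five minutes, as the slide states, and hand each action to code that writes the DNS record, the LDAP host-group entry, or the manifest entry.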
[Diagram: Tenant A (Tenant A user, VM-A, VM-B), SVN, Puppet Master; the synchronization tool polls the Nova API and adds entries to DNS and LDAP]
Outcome from VM setup framework with OpenStack
25
• Drastically shortened timeline and more efficient workflow
–Services on 1,000 VMs deployed within 1 month
–Only 30 minutes from VM creation to service start
•This took 5 business days without OpenStack
–Reducing manual operation eliminated the work of two operators
• Common process for building a service environment
–Service engineers no longer need to worry about the environment and can focus on the business
3. VM setup by Puppet with OpenStack
4. Monitoring OpenStack and VMs
26
4. Monitoring OpenStack and VMs
• Two monitoring environments
1. For cloud infrastructure
•Network, physical servers, and OpenStack itself
2. For web services
•Providing standard service monitoring methods on the private cloud
• Tools and roles
– Zabbix
•Semi-automatic VM monitoring
– Redmine and Wiki
•Used as an issue (ticket) management system
•Automatically issues one ticket per incident
– Operation Center
•24/7 monitoring and escalation by phone
•First response to simple incidents
27
Overview of our monitoring environment
[Diagram labels: Operation Center (watching 24/7); Automatic Issuing; Web Service Teams; Infra Team (us); "In case of trouble or provisioning"; "In case of serious situation"; designed by Freepik]
4. Monitoring OpenStack and VMs
• In order of severity
– API monitoring
•keystone-api, nova-api, neutron-api, horizon GUI, glance-api, swift-proxy
•Failures here are quite serious
– Process failure detection
•nova-*, swift-*, keystone-*, rabbitmq-server, mysqld (MariaDB), etc.
– Process performance monitoring
•Depends on the middleware
•e.g., the number of MySQL connections
– Log monitoring
•At first, treat any log message at ERROR level or above as “trouble”
–Without enough knowledge, every message looks suspicious
•Filter out problem-free messages day by day
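The log-monitoring policy above (alert on everything at ERROR or above, then whittle down a list of known-harmless messages) can be sketched as a small filter. This is only an illustration of the idea, not the team's Zabbix configuration: `find_trouble`, the sample log lines, and the benign pattern are all made up for the example, and the level names are the standard Python logging ones that OpenStack services emit.

```python
import re
from typing import Iterable, List

# Severity ranking using the standard Python logging level names.
LEVELS = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40, "CRITICAL": 50}


def find_trouble(
    lines: Iterable[str],
    known_benign: List[re.Pattern],
    threshold: str = "ERROR",
) -> List[str]:
    """Return lines at or above `threshold` that match no benign pattern."""
    minimum = LEVELS[threshold]
    trouble = []
    for line in lines:
        # The level is taken to be the first level name appearing as a word,
        # which works when the level precedes the message text.
        level = next((w for w in line.split() if w in LEVELS), None)
        if level is None or LEVELS[level] < minimum:
            continue
        if any(p.search(line) for p in known_benign):
            continue  # reviewed earlier and judged problem-free
        trouble.append(line)
    return trouble


if __name__ == "__main__":
    # Made-up sample lines, not real OpenStack output.
    sample = [
        "2015-07-01 17:00:00 host-1 INFO example.component routine report",
        "2015-07-01 17:00:01 host-1 ERROR example.component sample failure",
        "2015-07-01 17:00:02 host-1 ERROR example.component harmless retry",
    ]
    benign = [re.compile(r"harmless retry")]
    for line in find_trouble(sample, benign):
        print(line)
```

Growing `known_benign` over time corresponds to the "day by day" filtering on the slide: each pattern records a message that was investigated once and found harmless.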
28
1. For cloud infrastructure (OpenStack monitoring)
4. Monitoring OpenStack and VMs
• What’s this?
– Log messages from OpenStack when launching one virtual machine
29
223 lines, 119698 characters
(only 24 lines without DEBUG level)
Problems: complicated logs
(icehouse release)
4. Monitoring OpenStack and VMs
• Analyzing without DEBUG logs
– When creating a new instance fails
30
2015-07-XX 17:00:YY TopCellController INFO nova.osapi_compute.wsgi.server 172.X.X.X "GET <API_URL>/servers/<VM-UUID> HTTP/1.1" status: 200
-> The request to create a new instance is accepted
2015-07-XX 17:00:YY TopCellController INFO nova.scheduler.filter_scheduler Attempting to build 1 instance(s)
-> Just reporting
2015-07-XX 17:00:YY ChildCellController WARNING nova.scheduler.driver [instance:<VM-UUID>] Setting instance to ERROR state.
-> The beginning of a sleepless night
2015-07-XX 17:00:YY ChildCellController INFO nova.filters Filter DiskFilter returned 0 hosts
-> Lack of free disk? Where is the processing sequence?
(Icehouse release)
Problems: complicated logs
They are not friendly to newcomers.
4. Monitoring OpenStack and VMs
31
• Analyzing DEBUG logs
– When creating a new instance fails
2015-07-XX 17:00:YY ChildCellController DEBUG nova.filters Filter RamFilter returned 88 host(s) get_filtered_objects /usr/lib/python2.6/site-packages/nova/filters.py:88
-> Report: enough memory
2015-07-XX 17:00:YY ChildCellController DEBUG nova.scheduler.filters.disk_filter (<hypervisor-name>) ram:46581 disk:731136 io_ops:0 instances:3 does not have 1433600 MB usable disk, it only has 731136.0 MB usable disk.
-> Report: not enough disk (repeated 88 times)
2015-07-XX 17:00:YY ChildCellController INFO nova.filters Filter DiskFilter returned 0 hosts
-> Lack of free disk space. We need to add more disk quickly.
(Icehouse release)
Problems: complicated logs
DEBUG logs show the internal processing, but they are quite messy.
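Because the useful numbers are buried in the DEBUG text, a small parser can pull them out. The sketch below assumes the message wording quoted on the slide and that the whole message sits on one line; `disk_shortfall` is a hypothetical helper written for this example, not part of Nova.

```python
import re
from typing import Optional, Tuple

# Matches the wording of the disk filter DEBUG message quoted on the slide.
DISK_MSG = re.compile(
    r"does not have (?P<requested>\d+(?:\.\d+)?) MB usable disk, "
    r"it only has (?P<usable>\d+(?:\.\d+)?) MB usable disk"
)


def disk_shortfall(message: str) -> Optional[Tuple[float, float, float]]:
    """Return (requested_mb, usable_mb, shortfall_mb), or None if no match."""
    match = DISK_MSG.search(message)
    if match is None:
        return None
    requested = float(match.group("requested"))
    usable = float(match.group("usable"))
    return requested, usable, requested - usable


if __name__ == "__main__":
    line = (
        "(<hypervisor-name>) ram:46581 disk:731136 io_ops:0 instances:3 "
        "does not have 1433600 MB usable disk, "
        "it only has 731136.0 MB usable disk."
    )
    print(disk_shortfall(line))  # prints (1433600.0, 731136.0, 702464.0)
```

For the host on the slide, the request is about 702,464 MB larger than the usable disk, which is why the filter rejected it.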
4. Monitoring OpenStack and VMs
• New function to trace logs easily, even across components
– Targets: nova, cinder, glance, neutron, keystone, etc.
• Current
– Each component issues its own request ID
– Request IDs must be mapped to each other to trace logs
– The IDs are hard to find
– e.g., creating a new volume from an image (cinder calls the glance API)
• NTT’s suggestion
– Log the request ID mapping on a single line in each caller
– Approved as a cross-project spec, to be implemented
• https://review.openstack.org/#/c/156508
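Once each caller logs its own request ID and the callee's ID on one line, following a request across components becomes a simple graph walk. The sketch below illustrates that idea only: `build_call_map` and `trace` are hypothetical helpers, the sample lines are simplified, and the rule "first ID on the line is the caller, later IDs are callees" is an assumption about the log layout, not something the spec defines here.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Set

# Request IDs in the sample logs look like "req-A" or "req-<uuid>".
REQ_ID = re.compile(r"req-[0-9A-Za-z-]+")


def build_call_map(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each caller request ID to the callee IDs logged on the same line.

    Assumes the first ID on a line is the caller's own (the bracketed
    context) and any later ID on that line belongs to a called service.
    """
    calls: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        ids = REQ_ID.findall(line)
        for callee in ids[1:]:
            if callee != ids[0]:
                calls[ids[0]].add(callee)
    return calls


def trace(lines: List[str], start: str) -> List[str]:
    """Return every line that mentions `start` or an ID reachable from it."""
    calls = build_call_map(lines)
    seen: Set[str] = set()
    stack = [start]
    while stack:
        rid = stack.pop()
        if rid not in seen:
            seen.add(rid)
            stack.extend(calls.get(rid, ()))
    return [ln for ln in lines if seen.intersection(REQ_ID.findall(ln))]


if __name__ == "__main__":
    logs = [
        "DEBUG cinder.volume.manager [req-A admin] image download from glance req-B",
        "DEBUG glance.registry.client.v1.client [req-B user] Registry request GET",
        "INFO example.other.service [req-C admin] unrelated request",
    ]
    for line in trace(logs, "req-A"):
        print(line)
```

Starting from `req-A`, the walk picks up `req-B` from the single mapping line and returns the cinder and glance lines while skipping the unrelated `req-C` line.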
32
Log Request ID mapping
cinder-volume:
2015-10-08 16:14:33.498 DEBUG cinder.volume.manager [req-A admin] image down load from glance req-B
2015-10-08 16:14:33.521 DEBUG glanceclient.common.http [req-A admin]
HTTP/1.1 200 OK
content-length: 0
x-image-meta-status: active
x-image-meta-owner: 46e99ee00fd14957b9d75d997cbbbcd8
…
x-openstack-request-id: req-B (Buried deep!)
…
x-image-meta-disk_format: ami log_http_response /usr/local/lib/python2.7/dist-packages/glanceclient/common/http.py:136
…
glance-api:
2015-10-08 16:14:33.517 11610 DEBUG glance.registry.client.v1.client [req-B 924515e485e846799215a0c9be9789cf 46e99ee00fd14957b9d75d997cbbbcd8 - - -] Registry request GET /images/c95a9731-77c8-4da7-9139-fedd21e9756d HTTP 200 request id req-req-5cb606e5-ea1c-4afc-a626-a4deb83c56a1 do_request /opt/stack/glance/glance/registry/client/v1/client.py:124
2015-10-08 16:14:33.520 11610 INFO eventlet.wsgi.server [req-B 924515e485e846799215a0c9be9789cf 46e99ee00fd14957b9d75d997cbbbcd8
…
4. Monitoring OpenStack and VMs
• We have been providing a standard monitoring system inside our company
– Standardized monitoring workflow for internal service developers
• Standard monitoring item sets and rules
• Alert thresholds for parameters
– Monitoring was configured in Zabbix (or Nagios) by hand
• Rethinking the monitoring scheme with OpenStack
– Over 1,000 virtual machines are born, and they can also die suddenly
– Configure all of them by hand?
– We gave our Zabbix a new function
• It detects new VMs and starts monitoring them semi-automatically
• Before you start working with OpenStack...
– Examine your current workflow more deeply to make operation efficient
33
2. For web services - Changing operation work-flow
5. Current Activity and Future Plan
34
5. Current Activity and Future Plan
• Changing sizing and improving VM density
• The initial flavors were designed with a focus on the migration project
•Compatibility with the old DC took priority over resource efficiency
•VM specs identical to the old DC were best for the migration plan
• Current usage
–Disk capacity is over-provisioned
•Design: 37 GB of disk per 1 GB of memory
•Actual usage: 7 GB of disk per 1 GB of memory
• Providing new flavors based on actual usage, and asking users to return unused disk capacity
• Doubling the physical memory of servers
• Aiming to increase VM density by 1.3 to 2 times
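The disk-to-memory ratio decides whether a host runs out of disk or memory first, which is why the plan combines smaller-disk flavors with more physical memory. The arithmetic can be sketched as below. Only the 37 and 7 GB-per-GB ratios come from the slide; the host sizes and the 4 GB VM are made-up numbers, and `vms_per_host` is a hypothetical helper, so the results do not describe NTT Resonant's actual hardware.

```python
def vms_per_host(
    host_mem_gb: float,
    host_disk_gb: float,
    vm_mem_gb: float,
    disk_per_mem_gb: float,
) -> int:
    """Number of identical VMs that fit when both memory and disk must fit."""
    vm_disk_gb = vm_mem_gb * disk_per_mem_gb
    by_memory = host_mem_gb // vm_mem_gb
    by_disk = host_disk_gb // vm_disk_gb
    return int(min(by_memory, by_disk))


if __name__ == "__main__":
    # Hypothetical host: 64 GB memory, 4000 GB disk, running 4 GB VMs.
    print(vms_per_host(64, 4000, 4, 37))   # prints 16 (memory-bound)
    # Doubling memory alone makes the designed disk ratio the limit.
    print(vms_per_host(128, 4000, 4, 37))  # prints 27 (disk-bound)
    # Flavors sized to the actual ratio let the extra memory be used.
    print(vms_per_host(128, 4000, 4, 7))   # prints 32 (memory-bound)
```

In this made-up case, doubling memory without resizing disks stalls at 27 VMs, while doing both reaches 32, which shows why the two measures are planned together.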
35
Current Activity
5. Current Activity and Future Plan
• Upgrade OpenStack
–Load Balancer as a Service (LBaaS) is desired
•Current: load balancers are operated manually
•LBaaS API v1 is not enough
•Waiting for our vendor's driver for LBaaS API v2
–Establish an upgrade procedure
•Need to apply our patches
•Need to develop and test these patches
•These prevent us from upgrading frequently
–Mitaka release
•NTT R&D is located in “Mitaka”
36
Future Plan
37
Summary
1. About NTT Resonant, which operates the web portal site “goo”
• 170 million unique browsers and 1 billion page views per month
2. OpenStack Infrastructure Design
• It increased our business speed and agility
• We successfully deployed 400 hypervisors in 6 months
• Stable in production for more than 1 year
3. VM setup by Puppet with OpenStack
• We could start 70+ services on 1,300 VMs in 4 months
• It shortened the time to deploy a service from 5 days to 30 minutes
4. Monitoring both OpenStack and VMs with Zabbix
5. Current Activity and Future Plans
• Current: Sizing to improve VM density
• Future: Upgrade, LBaaS, and more toward the Mitaka release
Appendix: Our monitoring environment
TIPS) Semi-automatic monitoring setup
Zabbix polls VMs, reads monitoring.conf, and then applies the specified templates.
(1) The Zabbix server polls the IP segments of OpenStack VMs (X.Y.Z.0/24) to find Zabbix agents
=> Registers each one as a monitoring target
(2) Gets the monitoring definition via the agent
(3) Applies the monitoring templates that correspond to monitoring.conf
(4) When trouble is detected, kicks the script for automatic ticket issuing
=> Sends a request to the Redmine API
[Diagram labels: new OpenStack VM with ZabbixAgent; servers; Zabbix; Redmine]
monitoring.conf
apache_prod
mysql_prod
linux_prod
alert_on
Examples
apache_prod = Apache in production monitor
apache_dev = Apache in development monitor
linux_prod = Linux OS in production monitor
alert_on = sending alert to the VM users
alert_off = maintenance(silent) mode
…
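Step (3) above amounts to turning the entries of monitoring.conf into a list of templates plus an alerting switch. A minimal sketch of that parsing step follows. It assumes one entry per line, as drawn on the slide, and treats alerting as off when neither switch is present; both points, and the name `parse_monitoring_conf`, are assumptions for illustration rather than the team's actual script.

```python
from typing import Iterable, List, Tuple


def parse_monitoring_conf(lines: Iterable[str]) -> Tuple[List[str], bool]:
    """Split monitoring.conf entries into template names and an alert flag.

    `alert_on` and `alert_off` switch alerting; every other non-empty
    entry is treated as a monitoring template name.
    """
    templates: List[str] = []
    alert = False  # assumed default when no switch is present
    for raw in lines:
        entry = raw.strip()
        if not entry:
            continue
        if entry == "alert_on":
            alert = True
        elif entry == "alert_off":
            alert = False
        else:
            templates.append(entry)
    return templates, alert


if __name__ == "__main__":
    conf = ["apache_prod", "mysql_prod", "linux_prod", "alert_on"]
    print(parse_monitoring_conf(conf))
```

Switching a VM to maintenance mode then only requires replacing `alert_on` with `alert_off` in its monitoring.conf; the templates stay untouched.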
[Diagram labels: Script; Polling VMs (Auto Discovery); Trouble Ticket Issuing (New!)]
38

How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

OpenStack Summit Tokyo2015 Presentation NTTResonant

  • 1. OpenStack at NTT Resonant: Lessons Learned in Web Infrastructure Tomoya Hashimoto, Business Platform Division, NTT Resonant Inc. Kazuhiro Tooriyama, Business Platform Division, NTT Resonant Inc. Toshikazu Ichikawa, NTT Software Innovation Center, NTT Corporation
  • 2. Presentation Video This slide deck was presented in OpenStack Summit Tokyo 2015. You will find our video-recorded presentation at the URL https://www.openstack.org/summit/tokyo- 2015/videos/presentation/openstack-at-ntt-resonant-lessons-learned- in-web-infrastructure 2
  • 3. Speakers Tomoya Hashimoto Kazuhiro Tooriyama 2010 – 2014 NTT Communications Development of ISP NW (OCN) 2014 - current NTT Resonant Engineer of server platform 2001 - 2012 NTT Resonant goo blog, oshiete goo (Q&A service) Development and operation of core services 2012 - current NTT Resonant Architect of server platform 3 Toshikazu Ichikawa 2011 – 2014 Verio (NTT America) Development of cloud service “Cloudn” and managed hosting service 2014 - current NTT Development of cloud service platform
  • 4. 1. About NTT Resonant 2. OpenStack Infrastructure Design 3. VM setup by Puppet with OpenStack 4. Monitoring OpenStack and VMs 5. Current Issues and Future Plan 4 Agenda
  • 5. 1. About NTT Resonant 5
  • 6. 1. About NTT Resonant 6 Regional Communications Business Long Distance and International Communications Business Mobile Communications Business Data Communications Business $112 billion in total revenue 240,000 employees worldwide #1 in Data Center floor space #2 in Global IP Backbone Source: TeleGeography All facts and figures accurate as of March 2014 R&D
  • 7. 1. About NTT Resonant B2C services Platform and B2B2C services Portal Site Smartphone application goo milk feeder goo disaster prevention application 7 Services of Customers Healthcare Disaster prevention/ response solutionsPhone Cloud / Developer support e-commerce site for communications devices NTT Resonant’s Business Area
  • 8. 1. About NTT Resonant 8 Dictionaries ZIP codes Laboratory Bodycloud Housing and real estate Search Baby-care Movies Maps Navigation Horoscopes Rankings Car and bike News Weather Healthcare Smartphone applications Blogs Job search Love and marriage Online store Travel Providing 60+ services including • Web search • Blogging • News • Oshiete! goo Q&A site Launched in 1997 18 years old Web portal site “goo” http://www.goo.ne.jp/
  • 9. 1. About NTT Resonant 9 How large is goo? The 3rd largest web portal in Japan Yahoo! Google Rakuten MSN Scale of web portal “goo” 170 million unique browsers per month 1 billion page views per month Source: 2015.02 NetRatings
  • 11. 2. OpenStack Infrastructure Design • Migrate to another data center within a limited timeframe –The termination date of the existing data center (DC) contract is fixed. We need to migrate our system from the existing DC to another DC by that date. • Shorten the lead time for service release –Speed up by replacing the manual operations used to create and manage VMs –Comparable to a public cloud service such as AWS • Support all workflows needed to provide a service –Not just introducing OpenStack, but using it as an infrastructure for web services –Not only VM creation, but also installation and configuration of software inside VMs What was required of us at OpenStack deployment 11
  • 12. Service Teams Platform Team 2. OpenStack Infrastructure Design Organization and Formation NTT Resonant Service Developer 60+ services Platform service Operation Partners Operator Outsourcing 12 ~10 engineers 300+ engineers … Service Developer Service Developer Service Developer Service Developer Operator Operator Design Team NTT R&D OpenStack Community … Joint experiment Contribution Distribution
  • 13. 2. OpenStack Infrastructure Design • It was decided to migrate our services to another data center. –2014/3 Project started; design and deployment of the OpenStack installation begins –2014/10 OpenStack is ready, in production –2014/10-2015/01 (4 months) 70 services, 1300 VMs started OpenStack deployment timeline with our services 13 [Timeline figure, March 2014 to January 2015: Requirement Definition; OpenStack Installation: Design / Deployment (about 6 months); ★OpenStack started, In Production; Migration of services from old existing environment; ★Migration Completed; ★Old existing environment Closed]
  • 14. 2. OpenStack Infrastructure Design • Using OpenStack as a private cloud • In production since October 2014 • As of now, it supports –80+ services –1 billion page views per month • With –400 hypervisors •2 Nova cells –4,800 physical cores –1,800+ virtual servers OpenStack Scale at main data center of NTT Resonant 14 Launch
  • 15. Dashboard 2. OpenStack Infrastructure Design OpenStack Components (Icehouse Release) 15 Horizon Neutron Nova Glance Cinder Swift Keystone Network Hypervisor Image Block Storage Object Storage Identity Virtual Router and LAN Virtual Load Balancer Virtual server VM Template Image snapshot Virtual volume RESTful file store Replication What we use VM VM VM APP OS APP OS APP OS designed by Freepik Trove Heat Orchestration Database services Ceilometer Telemetry
  • 16. • Distribution –RDO with CentOS 6 –Icehouse version • Automation –Puppet for Configuration Management •Thanks to the RDO community for the Puppet manifests 16 2. OpenStack Infrastructure Design Deployment
  • 17. • Provider network with VLAN –No control L3+ including Router, NAT, Load Balancer, Firewall • Using ML2, Linuxbridge agent –We are familiar with it • Service Model –An administrator prepares networks and subnets per tenant –A tenant is not allowed to create/delete a network • Close to “Scenario: Provider networks with Linux bridge” in the “OpenStack Networking Guide” [1] 17 [1] http://docs.openstack.org/networking-guide/scenario_provider_lb.html Neutron L4-7: Load Balancer, VPN L3: Router, NAT L2: Network, Port What we use 2. OpenStack Infrastructure Design Networking with Neutron
  • 18. 18 Node Type OpenStack Components RabbitMQ (MQ) and MariaDB (Database) HAProxy (LB) and Pacemaker (HA cluster) Top cell Controller Nova, Glance, Keystone, Neutron, Horizon RabbitMQ Mirrored queue Nova, Keystone, Neutron, Horizon, DB, MQ Child cell Controller Nova RabbitMQ Mirrored queue MQ Database N/A MariaDB Galera Cluster N/A Swift Proxy Swift, Glance N/A Swift, Glance Swift Storage Swift N/A N/A Compute Nova, Neutron N/A N/A Node types and HA(High Availability) strategy 2. OpenStack Infrastructure Design
  • 19. 2. OpenStack Infrastructure Design Contribution to Community related to this project 19 • This bug was a show-stopper for the project until we fixed it –Bug Fix [1]: •the Shelve function didn't work in the Icehouse release with a nova-cell deployment •We use shelve/unshelve for hypervisor maintenance • Some bugs we found and fixed –Security Bug Fix [2]: •This was announced as OSSA 2015-017 recently –8 bug fixes other than the above [1] “shelve api does not work in the nova-cell environment” https://bugs.launchpad.net/nova/+bug/1338451 [2] “Deleting instance while resize instance is running leads to unuseable compute nodes” https://bugs.launchpad.net/nova/+bug/1392527
  • 20. 2. OpenStack Infrastructure Design • We modified code to enforce our operation rules –We modified only Horizon •Users come through Horizon, not the API •What we implemented –Server naming restriction –Access limit to the security group function –And about 40 other items –No modification to other components except bug fix backports •Minimizes the cost of maintenance 20 Server creation dialog of Horizon Customization on Dashboard
  • 21. 3. VM setup by Puppet with OpenStack 21
  • 22. 3. VM setup by Puppet with OpenStack Issue of VM setup (installation and configuration) 22 • Only 4 months from VM creation to service migration – Time is limited for VM setup – 1,300 VMs need to be migrated onto OpenStack – Automate procedures as much as possible • The key is the Puppet manifest used at the existing data center (DC) – We used Puppet manifests to set up VMs at the existing DC • Making a bridge between OpenStack and Puppet – The goal is to set up our services on top of OpenStack quickly and easily We resolved this issue by using Puppet integrated with OpenStack
  • 23. OpenStack 3. VM setup by Puppet with OpenStack How we use puppet with OpenStack 23 • Our puppet design – Individual puppet master per tenant – Linux account, middleware, config file etc. – Single manifest repository • What is required to use Puppet – Host name can be resolved with DNS – Host group is defined in LDAP – Puppet manifest has the entry for a host group Tenant: A Tenant:A User VM-A VM-B SVN Puppet Master DNS LDAP Necessary
  • 24. OpenStack 3. VM setup by Puppet with OpenStack How we use puppet with OpenStack 24 • Synchronization tool – Polling on Nova API to detect a new VM – VM registration to DNS, LDAP and puppet manifest – Complete above steps every 5 minutes • OpenStack user is able to apply puppet manifest easily and quickly right after a VM creation Tenant: A Tenant:A User VM-A VM-B SVN Puppet Master Synchronization tool Polling NovaAPI DNS LDAP Add entry Add entry
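The polling step on slide 24 can be pictured with a minimal sketch. The slides do not show the tool's code, so everything here is an assumption made for illustration: the `Server` record, the `list_servers` callable (a stand-in for a Nova API query), and the registrar hooks (stand-ins for the DNS, LDAP and Puppet manifest updates) are hypothetical names.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set


@dataclass(frozen=True)
class Server:
    """Hypothetical, minimal view of a VM as reported by the Nova API."""
    uuid: str
    name: str
    ip: str
    tenant: str


@dataclass
class Synchronizer:
    # Hypothetical callable that returns the current VMs (would wrap a Nova API call).
    list_servers: Callable[[], Iterable[Server]]
    # Hypothetical hooks, e.g. add a DNS record, an LDAP host-group entry, a manifest entry.
    registrars: List[Callable[[Server], None]]
    # UUIDs of VMs that were already registered in an earlier polling cycle.
    known: Set[str] = field(default_factory=set)

    def poll_once(self) -> List[Server]:
        """Register every VM not seen before and return the newly found VMs."""
        new_servers = [s for s in self.list_servers() if s.uuid not in self.known]
        for server in new_servers:
            for register in self.registrars:
                register(server)
            self.known.add(server.uuid)
        return new_servers


def run(sync: Synchronizer, interval_seconds: int = 300) -> None:
    """Poll forever; 300 seconds matches the 5-minute cycle mentioned on the slide."""
    while True:
        sync.poll_once()
        time.sleep(interval_seconds)
```

Keeping the set of known UUIDs makes each cycle idempotent: a VM that already has its DNS, LDAP and manifest entries is skipped on later polls.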
  • 25. Outcome from VM setup framework with OpenStack 25 • Drastically shortened timeline and efficient workflow –1000 VMs service deployment within 1 month –Only 30 minutes from VM creation to service start •It took 5 business days without OpenStack –Eliminated the tasks of two operators by reducing manual operation • Common process to build service environment –Service engineers don’t need to worry about the environment and can focus on business 3. VM setup by Puppet with OpenStack
  • 27. 4. Monitoring OpenStack and VMs • Two monitoring environments 1. For cloud infrastructure •NW, Physical servers, OpenStack itself 2. For web services •Providing standard service monitoring methods on the private cloud • Tools and Situations – Zabbix •Semi-auto VM monitoring – Redmine and Wiki •As an issue (ticket) management system •Auto issuing 1 ticket / 1 trouble – Operation Center •24/7 monitoring and calling via TEL •1st response to simple occasions 27 Overview of our monitoring environment Operation Center Web Service Teams Automatic Issuing Infra Team (us) Watching 24/7 In case of trouble or provisioning In case of serious situation designed by Freepik
  • 28. 4. Monitoring OpenStack and VMs • Severity order – API monitoring •keystone-api, nova-api, neutron-api, horizon GUI, glance-api, swift-proxy •Quite serious trouble – Process failure detection •nova-*, swift-*, keystone-*, rabbitmq-server, mysqld(MariaDB) etc. – Process performance monitoring •Depending on middleware •e.g. MySQL connection number etc. – Log monitoring •Treat any log message at ERROR level or above as “trouble” from the beginning –Lack of knowledge leads to doubt •Filtering problem-free logs day by day 28 1. For cloud infrastructure (OpenStack monitoring)
  • 29. 4. Monitoring OpenStack and VMs • What’s this? – Log messages from the OpenStack launching one virtual machine 29 223 lines, 119698 characters (only 24 lines without DEBUG level) Problems: complicated logs (icehouse release)
  • 30. 4. Monitoring OpenStack and VMs • Analyzing without DEBUG logs – In a case of failure to create new instance 30 2015-07-XX 17:00:YY TopCellController INFO nova.osapi_compute.wsgi.server 172.X.X.X "GET <API_URL>/servers/<VM-UUID> HTTP/1.1" status: 200 ->Accepting the request of creating new one 2015-07-XX 17:00:YY TopCellController INFO nova.scheduler.filter_scheduler Attempting to build 1 instance(s) ->Just reporting 2015-07-XX 17:00:YY ChildCellController WARNING nova.scheduler.driver [instance:<VM-UUID>] Setting instance to ERROR state. ->The beginning of sleepless night 2015-07-XX 17:00:YY ChildCellController INFO nova.filters Filter DiskFilter returned 0 hosts -> Lack of free disks? Where is the processing sequence? (icehouse release) Problems: complicated logs For newbies, it’s not friendly.
  • 31. 4. Monitoring OpenStack and VMs 31 2015-07-XX 17:00:YY ChildCellController DEBUG nova.filters Filter RamFilter returned 88 host(s) get_filtered_objects /usr/lib/python2.6/site-packages/nova/filters.py:88 ->report: enough memory 2015-07-XX 17:00:YY ChildCellController DEBUG nova.scheduler.filters.disk_filter (<hypervisor-name>) ram:46581 disk:731136 io_ops:0 instances:3 does not have 1433600 MB usable disk, it only has 731136.0 MB usable disk. ->report: not enough disk * 88 times 2015-07-XX 17:00:YY ChildCellController INFO nova.filters Filter DiskFilter returned 0 hosts ->Lack of free disk space. We need to add more disk rapidly. • Analyzing DEBUG logs – In a case of failure to create new instance (icehouse release) Problems: complicated logs DEBUG log shows internal processing, but it’s quite scruffy.
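One way to cut through the scheduler logs shown on slides 30 and 31 is to extract only the "Filter ... returned N host(s)" lines and report the first filter that left no candidates. The following is a minimal sketch, not the team's tooling; the regular expression is written against the two message shapes quoted on the slides, and the function names are hypothetical.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Matches both "Filter RamFilter returned 88 host(s)" and
# "Filter DiskFilter returned 0 hosts", as quoted on the slides.
FILTER_RESULT = re.compile(r"Filter (\w+) returned (\d+) host")


def filter_results(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (filter name, surviving host count) pairs in log order."""
    results = []
    for line in lines:
        match = FILTER_RESULT.search(line)
        if match:
            results.append((match.group(1), int(match.group(2))))
    return results


def first_rejecting_filter(lines: Iterable[str]) -> Optional[str]:
    """Name of the first filter that returned zero hosts, or None."""
    for name, count in filter_results(lines):
        if count == 0:
            return name
    return None


sample = [
    "2015-07-XX 17:00:YY ChildCellController DEBUG nova.filters "
    "Filter RamFilter returned 88 host(s)",
    "2015-07-XX 17:00:YY ChildCellController INFO nova.filters "
    "Filter DiskFilter returned 0 hosts",
]
print(first_rejecting_filter(sample))
```

For the two sample lines, the sketch points at `DiskFilter`, which is the same conclusion the slide reaches by reading the DEBUG output by hand.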
  • 32. 4. Monitoring OpenStack and VMs • New function to trace logs easily even across components – Target) nova, cinder, glance, neutron, keystone, etc. • Current – Each component, each request ID – Need to map request IDs for tracing logs – Difficulty of finding IDs – i.e.) Create new volume from image (cinder calls glance api) • NTT’s suggestion – Log request ID mapping within 1 line in each caller – Approved as a cross-project spec, To be implemented • https://review.openstack.org/#/c/156508 32 Log Request ID mapping glance-api cinder-volume 2015-10-08 16:14:33.498 DEBUG cinder.volume.manager [req-A admin] image down load from glance req-B 2015-10-08 16:14:33.521 DEBUG glanceclient.common.http [req-A admin] HTTP/1.1 200 OK content-length: 0 x-image-meta-status: active x-image-meta-owner: 46e99ee00fd14957b9d75d997cbbbcd8 … x-openstack-request-id: req-B … x-image-meta-disk_format: ami log_http_response /usr/local/lib/python2.7/dist-packages/glanceclient/common/http.py:136 … 2015-10-08 16:14:33.517 11610 DEBUG glance.registry.client.v1.client [req-B 924515e485e846799215a0c9be9789cf 46e99ee00fd14957b9d75d997cbbbcd8 - - -] Registry request GET /images/c95a9731-77c8-4da7-9139-fedd21e9756d HTTP 200 request id req-req-5cb606e5-ea1c-4afc-a626-a4deb83c56a1 do_request /opt/stack/glance/glance/registry/client/v1/client.py:124 2015-10-08 16:14:33.520 11610 INFO eventlet.wsgi.server [req-B 924515e485e846799215a0c9be9789cf 46e99ee00fd14957b9d75d997cbbbcd8 … Buried deep!
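If each caller logs its own request ID and the callee's request ID on one line, as slide 32 proposes, a trace across components reduces to following that mapping. Below is a minimal sketch under that assumption. The log shape (`[req-A ...] ... req-B`) follows the slide's example, while the regular expressions and the function names `build_mapping` and `trace` are hypothetical and are not part of any OpenStack library.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Request ID that opens the bracketed context, e.g. "[req-A admin]".
CONTEXT_ID = re.compile(r"\[(req-[0-9A-Za-z][0-9A-Za-z-]*)")
# Any request ID token elsewhere in the message.
ANY_ID = re.compile(r"req-[0-9A-Za-z][0-9A-Za-z-]*")


def build_mapping(lines: List[str]) -> Dict[str, Set[str]]:
    """Map each caller request ID to the callee IDs logged on the same line."""
    mapping: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        context = CONTEXT_ID.search(line)
        if context is None:
            continue
        caller = context.group(1)
        for callee in ANY_ID.findall(line[context.end():]):
            if callee != caller:
                mapping[caller].add(callee)
    return mapping


def trace(lines: List[str], root: str) -> List[str]:
    """Return every log line whose context ID is reachable from the root ID."""
    mapping = build_mapping(lines)
    reachable = {root}
    pending = [root]
    while pending:
        for callee in mapping.get(pending.pop(), ()):
            if callee not in reachable:
                reachable.add(callee)
                pending.append(callee)
    selected = []
    for line in lines:
        context = CONTEXT_ID.search(line)
        if context is not None and context.group(1) in reachable:
            selected.append(line)
    return selected
```

Calling `trace(lines, "req-A")` on a merged log would return the cinder lines tagged `req-A` together with the glance lines tagged `req-B`, without the operator having to dig the callee ID out of an HTTP response header.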
  • 33. 4. Monitoring OpenStack and VMs • We’ve been providing a standard monitoring system inside our company – Standardized monitoring workflow for internal service developers • Standard monitoring item sets and rules • Parameter thresholds for alerts – Monitoring configuration entered into Zabbix (or Nagios) by hand • Think about the monitoring scheme with OpenStack – Over 1,000 virtual machines are born, and also suddenly die – By hand? – We gave our Zabbix a new function • Detecting new VMs and starting monitoring semi-automatically • Before getting along with OpenStack... – Consider your current workflow more deeply for an efficient operation 33 2. For web services - Changing operation workflow
  • 34. 5. Current Activity and Future Plan 34
  • 35. 5. Current Activity and Future Plan • Changing sizing and improving VM density • Initial flavors were designed with a focus on the migration project •Compatibility with the old DC rather than resource efficiency •Keeping the same VM specs as the old DC was best for the migration plan • Current usage –Disk capacity is too large •Design: 37 Gbytes of disk per 1 Gbyte of memory •Actual usage: 7 Gbytes of disk per 1 Gbyte of memory • Providing new flavors based on actual usage, and asking users to return unused disk capacity • Doubling the physical memory per server • Aiming to increase VM density by 1.3 – 2 times 35 Current Activity
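The gap between the designed and the observed disk-to-memory ratio on slide 35 can be made concrete with a little arithmetic. The two ratios come from the slide; the 4 GB flavor used below is a hypothetical example, not one of NTT Resonant's actual flavors.

```python
DESIGN_DISK_GB_PER_MEM_GB = 37   # from the slide: designed ratio
ACTUAL_DISK_GB_PER_MEM_GB = 7    # from the slide: observed usage


def disk_for_flavor(memory_gb: int, disk_gb_per_mem_gb: int) -> int:
    """Disk size (GB) implied by a memory size and a disk-per-memory ratio."""
    return memory_gb * disk_gb_per_mem_gb


def unused_fraction(design_ratio: int, actual_ratio: int) -> float:
    """Share of the designed disk capacity that is not actually used."""
    return 1 - actual_ratio / design_ratio


# Hypothetical 4 GB-memory flavor, for illustration only.
designed = disk_for_flavor(4, DESIGN_DISK_GB_PER_MEM_GB)   # 148 GB
used = disk_for_flavor(4, ACTUAL_DISK_GB_PER_MEM_GB)       # 28 GB
print(designed, used, round(unused_fraction(37, 7), 2))
```

Under these figures roughly four fifths of the designed disk capacity sits idle, which is consistent with the slide's plan to offer smaller-disk flavors.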
  • 36. 5. Current Activity and Future Plan • Upgrade OpenStack –Load Balancer as a Service (LBaaS) is desired •Current: Manual operation of the load balancer •LBaaS API v1 is not enough •Waiting for our vendor's driver for LBaaS API v2 –Establish an upgrade operation •Need to apply our patches •Need to develop and test these patches •These prevent us from upgrading frequently –Mitaka release •NTT R&D is located in “Mitaka” 36 Future Plan
  • 37. 37 Summary 1. About NTT Resonant, which operates the web portal site “goo”. • 170 million unique browsers and 1 billion page views per month 2. OpenStack Infrastructure Design • It increased our business speed and agility • We successfully deployed 400 hypervisors in 6 months • Stable in production for more than 1 year 3. VM setup by Puppet with OpenStack • We could start 70+ services on 1,300 VMs in 4 months • It shortened the time to deploy a service from 5 days to 30 minutes 4. Monitoring both OpenStack and VMs with Zabbix 5. Current Activity and Future Plans • Current: Sizing to improve VM density • Future: Upgrade, LBaaS and more toward the Mitaka release
  • 38. OpenStack new VM Appendix: Our monitoring environment TIPS) Semi-auto monitoring setting Zabbix polls VMs and reads monitoring.conf, and then applies the specified template. Server Zabbix Server Server Server Server Redmine (2) Getting the monitoring definition via the agent. (1) Zabbix server polls the IP segments of OpenStack VMs to find Zabbix agents => Registers each as a monitoring target (3) Applying the monitoring template that corresponds to monitoring.conf (4) When trouble is detected, kick the script for auto-issuing => Sending request to Redmine API X.Y.Z.0/24 Monitoring.conf apache_prod mysql_prod linux_prod alert_on ZabbixAgent Examples apache_prod = Apache in production monitor apache_dev = Apache in development monitor linux_prod = Linux OS in production monitor alert_on = sending alert to the VM users alert_off = maintenance(silent) mode … Script Polling VMs (Auto Discovery) Trouble Ticket Issuing New! 38
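Step (3) of the appendix, turning a VM's monitoring.conf into a set of templates plus an alert mode, might look like the sketch below. The slide lists the keywords but not the file syntax, so the one-keyword-per-line format, the `#` comment handling, the silent default when neither `alert_on` nor `alert_off` appears, and the names `MonitoringPlan` and `parse_monitoring_conf` are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MonitoringPlan:
    templates: List[str]     # template keywords such as "apache_prod"
    alerts_enabled: bool     # True for alert_on, False for alert_off


def parse_monitoring_conf(text: str) -> MonitoringPlan:
    """Parse an assumed one-keyword-per-line monitoring.conf."""
    templates: List[str] = []
    alerts_enabled = False   # assumed default: stay silent unless alert_on is set
    for raw_line in text.splitlines():
        keyword = raw_line.strip()
        if not keyword or keyword.startswith("#"):
            continue         # skip blank lines and (assumed) comments
        if keyword == "alert_on":
            alerts_enabled = True
        elif keyword == "alert_off":
            alerts_enabled = False
        else:
            templates.append(keyword)
    return MonitoringPlan(templates, alerts_enabled)


example = """apache_prod
mysql_prod
linux_prod
alert_on
"""
plan = parse_monitoring_conf(example)
print(plan.templates, plan.alerts_enabled)
```

Separating the alert switch from the template list mirrors the slide's distinction between what is monitored and whether the VM's users are notified, so a VM can be placed in maintenance mode without changing its templates.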
