Public, private, hybrid - the CMDB challenge
1. Private, Public, Hybrid: Unique Challenges with OpenStack
Gathering Info for a CMDB
Ryszard Chojnacki, an OPS CMDB blueprint worker
27 October, 2015
2. Covering what?
• Define a scenario for collecting data
• Show actual payload headers used in one application
• Set the stage for a CMDB blueprint direction
3. ETL vs. Federation approaches
Extract, Transform & Load
Send data as appropriate
• Allows for
– Complex and low-cost queries
– Can be built to accommodate loss
– History: what changed last week?
Federation
Access the sources of information in “real time”
• Allows for
– What is the situation NOW!
– Works well for OpenStack APIs, depending on use-case
5. Imagine this Scenario
You have hardware, data and applications spread over multiple locations – how can you aggregate metadata into one place?
6. Local source:
provisioning as example
{
"fqdn": "compute-0001.env1.adomain.com",
"serial": "USE1234567",
"os_vendor": "Ubuntu",
"os_release": "12.04",
"role": "compute-hypervisor"
}
Suppose provision systems are created such that there is one for each environment
• Each system has a limited scope
• Each system must be uniquely identifiable to permit data aggregation
7. Global source:
asset management
Payload type rack_info:
{
"rack_id": "r00099",
"datacenter": "Frankfurt",
"tile": "0404"
}
Payload type rack_contents:
{
"in_rack": "r00099",
"serial": "USE1234567",
"u": 22,
"pid": "qy799a"
}
Suppose that there is a single asset management tool that covers all environments
• Scope is global
• Unique ID still employed
• The example has more than one type of data for:
• Each rack – rack_info
• Each asset – rack_contents
9. Message formats
payload
{
"payload": { . . . }
}
Separate logically, by encapsulating data into a payload document
For example, put here:
• Provisioning data
• Rack data
• Asset data
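The encapsulation above can be sketched in a few lines of Python; the function name and example field values here are illustrative assumptions, not from the deck:

```python
import json

# Minimal sketch: wrap source-specific data under a "payload" key,
# keeping the envelope separate from the provider's own document.
def make_message(payload):
    """Encapsulate provisioning/rack/asset data in a payload document."""
    return json.dumps({"payload": payload})

msg = make_message({"fqdn": "compute-0001.env1.adomain.com",
                    "role": "compute-hypervisor"})
```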
10. Message formats
version
{
"version": {
"major": 1
// provider extension possible here; minor, tiny, sha1, …
},
"payload": { . . . }
}
“Schema” version for the payload
• The same major version indicates no incompatible changes have been made to the schema
• Where versions are compatible, the snapshot→live process will occur
Note: Documents don’t have schemas, but there must be some required, plus optional, key/value pairs, so that consumers of the data can rely on it programmatically
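A consumer-side check of the “major must match” rule could look like this; the helper name is an assumption:

```python
# Sketch of the compatibility rule: only the major version must match;
# provider extensions (minor, tiny, sha1, ...) are ignored.
def is_compatible(expected_major, message):
    return message.get("version", {}).get("major") == expected_major
```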
11. Message formats
when was that?
{
"batch_ts": <epoch_time>,
"record_ts": <epoch_time>,
"batch_isodate": "2015-01-16 16:07:21.503680", . . .
"version": {
"major": 1
},
"payload": { . . . }
}
Useful for understanding how old the data I’m seeing is
• Batch timestamp must be constant for all records in the same batch
• Record timestamp is when the record was exported/message created – may be the same as batch
• Updated timestamp, if available, is when the data was last changed in the source system
Note the human-readable _isodate forms are not used in processing
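Producing both forms at once might look like this sketch (function name assumed); only the epoch field is meant for processing, the _isodate field is for eyeballing:

```python
import time
from datetime import datetime, timezone

# Sketch: emit the machine form (epoch, used in processing) and the
# human-readable _isodate form side by side.
def batch_stamp():
    now = time.time()
    iso = datetime.fromtimestamp(now, tz=timezone.utc)
    return {
        "batch_ts": now,
        "batch_isodate": iso.strftime("%Y-%m-%d %H:%M:%S.%f"),
    }
```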
12. Message formats
{
"source": {
"system": "rackreader.adomain.net",
"type": "rack_info",
"location": "env"
},
"batch_ts": <epoch_time>,
"record_ts": <epoch_time>,
"import_ts": <epoch_time>,
"version": {
"major": 1
},
"payload": { . . . }
}
Provides where the data came from and the type of data
• System: usually the FQDN of the source system
• Location: the scope of the system
• Type: describes the payload content, and is tied to the schema
13. Message formats
{
"record_id": "r00099-rackreader.adomain.net",
"msg_type": "snapshot",
"source": {
"system": "rackreader.adomain.net",
"type": "rack_info",
"location": "env"
},
"batch_ts": <epoch_time>,
"record_ts": <epoch_time>,
"import_ts": <epoch_time>,
"version": {
"major": 1
},
"payload": { . . . }
}
Mark the content with a unique ID for that record, and with how to process it
• A combination of an identifier in the source system plus an FQDN makes for a globally unique value
• This value is the primary key for all data operations on the record
• This is a “snapshot”; how that is processed is described shortly
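The record_id construction is simple enough to show directly; the helper name is an assumption:

```python
# Sketch: a source-local identifier plus the source system's FQDN
# yields a practically globally unique primary key.
def make_record_id(source_local_id, system_fqdn):
    return "{0}-{1}".format(source_local_id, system_fqdn)

# e.g. make_record_id("r00099", "rackreader.adomain.net")
#   -> "r00099-rackreader.adomain.net"
```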
15. Philosophy employed
• Operating at large scale, expect to have issues
– Small % error × a big number = some degree of loss
• Tolerant of loss
• Considerate of resources
• Wanted history
– Need easy access to the very latest
• Need a flexible [document] schema – this is JSON
– Provider/Agent is the owner of the schema for its data
– Need a way to converge; communities of practice
16. Snapshot versus event-based updates
Example
• Snapshot updates every 8h
– Larger data set but not very frequent
• Live updates as they occur
– Tiny data as they occur
• Result
– Minimal network utilization
– Small overhead on source
• Use the combination that best suits the need
We run 2 collections:
• Snapshot – has history
• Live – has only the latest
Snapshots update Live
17. Message type overview
Snapshot
• Snapshot
– Defines a Snapshot record
• Batch record count
– Defines how many items are in a batch
– Only if the sizes match is Live updated
• Required to know what to delete
Live
• Overwrite
– Overwrites in Live the complete doc for a single record
• Delete
– Deletes a single record from Live
– Never affects Snapshot
18. Message formats
snapshot_size
{
"msg_type": "snapshot_size",
"source": {
"system": "rackreader.adomain.net",
"type": "rack_info",
"location": "env"
},
"size": 3,
"batch_ts": <epoch_time>
}
If the consumer receives the number of messages indicated, then the update of Live is possible
Any records received are always placed in the snapshot collection
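The consumer-side rule could be sketched as follows; function name, collection types and record shapes are assumptions for illustration:

```python
# Sketch: every record always lands in Snapshot, but Live is only
# refreshed when the received count matches the announced size.
def apply_snapshot_batch(records, announced_size, snapshot, live):
    snapshot.extend(records)            # always keep history
    if len(records) != announced_size:  # incomplete batch: leave Live alone
        return False
    for rec in records:
        live[rec["record_id"]] = rec    # safe to refresh Live
    return True
```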
19. Message formats
overwrite
{
"msg_type": "overwrite",
"record_id": "r00099-rackreader.adomain.net",
"source": {
"system": "rackreader.adomain.net",
"type": "rack_contents",
"location": "env"
},
"version": {
"major": 1
},
"record_ts": <epoch_time>,
"payload": { . . . }
}
• Separate the header info from the payload data
20. Message formats
delete
{
"msg_type": "delete",
"record_id": "r00099-rackreader.adomain.net",
"source": {
"system": "rackreader.adomain.net",
"type": "rack_contents",
"location": "env"
},
"version": {
"major": 1
},
"record_ts": <epoch_time>
}
• Separate the header info from the payload data
22. Noteworthy
• If we lose an event we catch up in the batch update
• If we lose a batch, data is just 1 batch cycle stale
• Several companies have arrived at this position
• Records are fairly small
– RabbitMQ friendly
– Easy to search in your data store
23. CMDB blueprint
• Set the stage for a CMDB blueprint direction:
– Collect
– Store
– Query
• Focus on the Collection framework
• Community of Practice
– Share common stuff; hopefully an ever-expanding domain
– Permit ad-hoc sources, for what you have now
25. Message processing
There are 2 collections of data: snapshot and live
• Snapshot always keeps growing
• Live only has 1 entry per record
A live update to record “B” goes straight into Live, and is later updated again by the snapshot
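The whole two-collection flow can be sketched in one small class; the class name and message shapes are illustrative assumptions:

```python
# Sketch of the model above: Snapshot is append-only history;
# Live holds exactly one entry per record_id.
class MessageProcessor:
    def __init__(self):
        self.snapshot = []  # keeps growing
        self.live = {}      # latest state only

    def handle(self, msg):
        mtype = msg["msg_type"]
        if mtype == "snapshot":
            self.snapshot.append(msg)
            self.live[msg["record_id"]] = msg      # snapshots update Live
        elif mtype == "overwrite":
            self.live[msg["record_id"]] = msg      # live update, straight in
        elif mtype == "delete":
            self.live.pop(msg["record_id"], None)  # never touches Snapshot
```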