Topics covered in this webinar, which took place on July 24, 2015:
* Track blocked event loops and capture the function calls causing the trouble
* Trace 100% of Node.js transactions, 100% of the time
* Detect anomalies in the system and application behavior
* Go to a historical time slot and inspect calls and call stacks
* Flamegraphs and a code breakdown of each Node.js function, down to nanosecond resolution
Node.js Transaction Tracing & Root Cause Analysis with StrongLoop Arc
1. TRANSACTION TRACING & ROOT
CAUSE ANALYSIS WITH
STRONGLOOP ARC
Jordan Kasper | Developer Evangelist
2. STEP ONE
The first step in monitoring, profiling, and tracing your
Node application is to run it in a process manager!
3. BUILD YOUR APP WITH SLC
~$ npm install -g strongloop
~/myapp$ slc build
...
~/myapp$ ls
... ... myapp-0.1.0.tgz
4. INSTALL AND RUN STRONG PM
On your deployment machine...
~$ npm install -g strong-pm
~$ sl-pm-install
5. DEPLOY TO STRONG PM
From our development machine (or staging, etc)...
~/myapp$ slc deploy http+ssh://myserver.com:8701
6. RUNNING LOCALLY
If you need to profile things locally (your machine or a
staging/testing server), run slc start from your app directory:
~/myapp$ slc start
Process Manager is attempting to run app `.`.
To confirm it is started: slc ctl status tracing-example-app
To view the last logs: slc ctl log-dump tracing-example-app
...
Then start the Arc UI:
~/myapp$ slc arc
11. WHAT DO I LOOK FOR?
CPU Usage is pretty obvious: just watch for high points!
With Heap Memory Usage you want to see a "sawtooth"
chart; each drop indicates garbage collection. No drop is
bad!
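To watch for the sawtooth outside of Arc, you can log the
same numbers from inside your app. A minimal sketch (the
one-second interval here is an arbitrary choice):
// Log RSS and V8 heap usage once per second; with healthy
// garbage collection, heapUsed rises and falls in a sawtooth.
setInterval(function () {
  var mem = process.memoryUsage();
  console.log('rss: ' + (mem.rss / 1048576).toFixed(1) + ' MB, ' +
    'heapUsed: ' + (mem.heapUsed / 1048576).toFixed(1) + ' MB');
}, 1000);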
13. WHAT DO I LOOK FOR?
The two Event Loop metrics are opposed. You want the
loop count to remain high under normal load (more ticks
per metrics cycle is good). Any dips may be bad.
The Loop timing, on the other hand, indicates how long
event loop ticks are taking. Any spikes here are bad!
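You can approximate the Loop timing metric yourself by
scheduling a timer and measuring how late it fires. A rough
sketch (the interval and the 20ms threshold are arbitrary):
var INTERVAL = 1000; // ms between checks
var last = Date.now();
setInterval(function () {
  var now = Date.now();
  var lag = now - last - INTERVAL; // how late the timer fired
  if (lag > 20) {
    console.log('event loop was blocked for ~' + lag + 'ms');
  }
  last = now;
}, INTERVAL);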
14. SETUP METRICS COLLECTION
On our production machine, with strong-pm installed,
simply set the collection location:
~$ export STRONGLOOP_METRICS="log:/path/to/api-metrics.log"
~$ export STRONGLOOP_METRICS="syslog"
~$ export STRONGLOOP_METRICS="statsd://mylogserver.com:1234"
~$ export STRONGLOOP_METRICS="graphite://mylogserver.com:1234"
~$ export STRONGLOOP_METRICS="splunk://mylogserver.com:1234"
15. SETUP METRICS COLLECTION
Alternatively, on the production machine you can run:
~$ sl-pm-install --metrics <url>
Or during runtime:
~$ slc ctl env-set myapp STRONGLOOP_METRICS=<url>
17. PROFILING
We can spot issues using the metrics being monitored, but
now we need to find the cause of those issues.
Profiling CPU usage and memory is the way to do this.
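Arc drives the profiling for you, but for reference, a CPU
profile can also be captured programmatically. A sketch using
the third-party v8-profiler module (an assumption here, not
part of the slc workflow); the output loads into Chrome
DevTools:
var fs = require('fs');
var profiler = require('v8-profiler'); // npm install v8-profiler
profiler.startProfiling('sample');
setTimeout(function () {
  var profile = profiler.stopProfiling('sample');
  profile.export(function (err, result) {
    fs.writeFileSync('sample.cpuprofile', result);
    profile.delete(); // free the profile's memory in V8
  });
}, 5000); // capture 5 seconds of activity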
21. PROGRAMMATIC MEMORY MONITORING
If we have memory issues, it may be helpful to monitor
memory usage dynamically.
~$ npm install heapdump --save
var heapdump = require('heapdump');
var THRESHOLD = 500; // RSS limit, in MB

// Every 5 minutes, write a heap snapshot if RSS exceeds the threshold
setInterval(function () {
  var memMB = process.memoryUsage().rss / 1048576; // bytes to MB
  if (memMB > THRESHOLD) {
    process.chdir('/path/to/writeable/dir');
    heapdump.writeSnapshot();
  }
}, 60000 * 5);
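Each snapshot is written as a .heapsnapshot file, which can be
loaded into the Memory tab of Chrome DevTools and compared
against earlier snapshots to see what is growing.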
22. MEMORY MONITORING
Caution: Taking a heap snapshot is not trivial; it costs
significant CPU and memory.
If you already have a memory problem, this could kill your
process!
Unfortunately, sometimes you have no alternative.
23. SMART PROFILING
How can we use the monitoring data to drive profiling?
"smart profiling" based on event loop blockage
~$ slc ctl cpu-start 1.1.49408 20
1. Monitors a specific worker (1.1.49408)
2. Starts a CPU profile when the event loop has been blocked for more than 20ms
3. Stops profiling once the event loop resumes
24. FINDING THE WORKER ID
~$ slc ctl status
Service ID: 1
Service Name: myapp
Environment variables:
No environment variables defined
Instances:
Version Agent version Cluster size
4.1.0 1.5.1 4
Processes:
ID         PID    WID  Listening Ports  Tracking objects?  CPU profiling
1.1.49401  49401  0
1.1.49408  49408  1    0.0.0.0:3001
1.1.49409  49409  2    0.0.0.0:3001
1.1.49410  49410  3    0.0.0.0:3001
1.1.49411  49411  4    0.0.0.0:3001
28. ANOMALY INSPECTION
See something off?
Click on that point in the resource usage chart.
(The orange triangles at the bottom identify anomalies
beyond three-sigma deviations.)
32. FLAME CHARTS
The flame chart identifies each function in the call stack,
organized in color by module.
The size of the bar represents the total resource
consumption for that function and all of its function calls.
Clicking on a function shows that function's resource usage.
33. LOOKING FOR MORE?
Check out our blog post on Transaction Tracing and
identifying a DoS attack!
http://bit.ly/arc-tracing