2. Maciej Szymczyk
• Big Data Engineer / Software Developer
• Professional Soldier / National Cyber Security Centre
• Teacher at Military University of Technology
• Blogger – Wiadro Danych – https://wiadrodanych.pl
• Amateur Cyclist & Triathlete (½ Ironman Finisher)
3. My first webapp
• MVC – All logic in Controllers
• Zero logging
• Zero metrics
• Asking customers what to do to reproduce bugs
4. Complex apps
• Microservices
• Many instances
• Polyglot persistence
• Web servers Apache/Nginx/IIS
• Many types of clients (Android / iOS / Web)
• Docker/Docker Swarm/Kubernetes
• External dependencies (GCM / APNS)
5. What can we log?
• Usage
• Audit
• Statistics
• Performance
• Errors
• Debug
6. When/What can we log?
• Messages
• Requests
• Body
• Query String Values
• Headers
• Methods
• Parameters
• User Id/IP
• Client type/version
• Service/Machine Name
• Correlation/Trace Id (Guid)
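The fields listed above can be attached to every log line as structured data. Below is a minimal sketch in Python (the talk itself targets .NET, so the language is only a stand-in). The names `JsonFormatter`, `CONTEXT_FIELDS`, `new_correlation_id` and `build_logger` are hypothetical, chosen for illustration.

```python
import json
import logging
import uuid
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    # Hypothetical context fields, mirroring the slide's list.
    CONTEXT_FIELDS = (
        "correlation_id", "user_id", "client_ip",
        "client_type", "service", "machine",
    )

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Copy only the context fields that the caller actually supplied.
        for field in self.CONTEXT_FIELDS:
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)


def new_correlation_id() -> str:
    """Create a GUID that ties together all log lines of one request."""
    return str(uuid.uuid4())


def build_logger(name: str = "demo") -> logging.Logger:
    """Return a logger that writes JSON lines to the console."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    request_context = {
        "correlation_id": new_correlation_id(),
        "user_id": 42,
        "client_type": "android/1.4.0",
        "service": "orders-api",
    }
    log.info("GET /orders", extra=request_context)
```

Passing the same `correlation_id` to every service that handles a request is what makes it possible to follow that request across many instances.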
9. Where should I put my logs?
• Console
• Log Files
• Relational Database (MSSQL)
• Elasticsearch / Splunk / Loki
• HDFS / S3 / ADLS
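The first two destinations in the list, console and log files, can be combined in one logger. The sketch below uses Python's standard `logging` module; `configure_sinks`, the size limit and the backup count are assumptions made for illustration, not settings from the talk.

```python
import logging
import logging.handlers
from pathlib import Path


def configure_sinks(log_path: Path, name: str = "app") -> logging.Logger:
    """Send INFO and above to the console, DEBUG and above to a rotating file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False

    console = logging.StreamHandler()
    console.setLevel(logging.INFO)

    # Rotate at roughly 1 MB and keep three old files (illustrative values).
    file_sink = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=1_000_000, backupCount=3, encoding="utf-8"
    )
    file_sink.setLevel(logging.DEBUG)

    formatter = logging.Formatter(
        "%(asctime)s %(levelname)s %(name)s %(message)s"
    )
    for handler in (console, file_sink):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

Files written this way are also a convenient input for a shipper that forwards them to a central store.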
LET’S USE RDB
OK, YOU WILL BUILD DASHBOARDS
11. When your only tool is a hammer, everything looks like a nail
12. Elasticsearch
• Full text search engine based on Apache Lucene
• REST API
• Open Source
• Scalable
• Multiple ways of ingestion
• REST
• Beats (Filebeat, Metricbeat, Heartbeat, etc)
• Logstash
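For the REST route, log entries are usually sent in batches through the `_bulk` endpoint, which expects newline-delimited JSON: an action line followed by the document itself. The sketch below only builds the request body; `build_bulk_body` and the index name `app-logs` are hypothetical.

```python
import json
from typing import Iterable, Mapping


def build_bulk_body(index: str, docs: Iterable[Mapping]) -> str:
    """Build an NDJSON body for the Elasticsearch _bulk endpoint."""
    lines = []
    for doc in docs:
        # Action line: index the document that follows into `index`.
        lines.append(json.dumps({"index": {"_index": index}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body(
        "app-logs",
        [
            {"level": "INFO", "message": "GET /orders"},
            {"level": "ERROR", "message": "payment failed"},
        ],
    )
    print(body, end="")
```

The resulting string would be sent with `POST /_bulk` and the `Content-Type: application/x-ndjson` header. In practice Beats or Logstash do this batching for you.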
13. Elastic Stack use cases
• Logs/metrics aggregation
• Search engine
• Realtime Business Analytics
• SIEM – Security Information and Event Management
25. Shard
• Part of Elasticsearch index
• Shards are distributed across cluster
• You can’t change the number of primary shards after creating an index (short of reindexing or using the split/shrink APIs)
• Each shard can have 0 to n replicas
[Diagram: an index divided into SHARD 1 … SHARD N, with corresponding REPLICA 1 … REPLICA N]
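Why the shard count is fixed becomes clearer from how a document is placed: in simplified form, the target shard is a hash of the routing key modulo the number of primary shards. The sketch below illustrates the idea only. Elasticsearch uses a Murmur3 hash; `zlib.crc32` is a stand-in, and `route_to_shard`, `moved_documents` and `INDEX_SETTINGS` are hypothetical names.

```python
import zlib
from typing import Iterable, List

# Example body for creating an index with 3 primary shards and 1 replica
# of each (values are illustrative).
INDEX_SETTINGS = {
    "settings": {"number_of_shards": 3, "number_of_replicas": 1}
}


def route_to_shard(routing_key: str, primary_shards: int) -> int:
    """Pick a shard as hash(routing key) modulo the primary shard count.

    crc32 is a stand-in for the Murmur3 hash that Elasticsearch uses.
    """
    if primary_shards < 1:
        raise ValueError("primary_shards must be at least 1")
    return zlib.crc32(routing_key.encode("utf-8")) % primary_shards


def moved_documents(
    keys: Iterable[str], old_shards: int, new_shards: int
) -> List[str]:
    """Return the keys that would land on a different shard after a resize."""
    return [
        key
        for key in keys
        if route_to_shard(key, old_shards) != route_to_shard(key, new_shards)
    ]


if __name__ == "__main__":
    keys = [f"doc-{i}" for i in range(1000)]
    moved = moved_documents(keys, old_shards=5, new_shards=6)
    print(f"{len(moved)} of {len(keys)} documents would change shard")
```

Because most documents would end up on a different shard, changing the count means rewriting the index rather than flipping a setting. Replicas are copies of existing shards, so their number can be changed at any time.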
34. Too lazy to…
• Implement wrapper…
• …and config for it…
• …and convince others to use it.
• Replace existing ILogger<> usages
• Wrap every piece of code in try/catch for errors
• Use Stopwatch all the time
• Look for username in logs from controller
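To show the kind of boilerplate this list describes, here is a minimal sketch of handling timing and error logging in one place instead of repeating it in every method. It is written in Python purely for illustration (the slide refers to .NET's `ILogger<>` and `Stopwatch`), and the decorator name `instrumented` is hypothetical.

```python
import functools
import logging
import time

logger = logging.getLogger("instrumentation")


def instrumented(func):
    """Log the duration of each call and any exception it raises."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            elapsed_ms = (time.perf_counter() - started) * 1000
            # logger.exception records the traceback at ERROR level.
            logger.exception(
                "%s failed after %.1f ms", func.__qualname__, elapsed_ms
            )
            raise
        elapsed_ms = (time.perf_counter() - started) * 1000
        logger.info("%s took %.1f ms", func.__qualname__, elapsed_ms)
        return result

    return wrapper


@instrumented
def load_order(order_id: int) -> dict:
    """Hypothetical business method used only to demonstrate the decorator."""
    return {"id": order_id, "status": "paid"}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    load_order(7)
```

Even this still has to be applied to each method by hand, which is the gap an APM agent is meant to close by instrumenting the application automatically.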
38. Application Performance Monitoring (APM)
• Distributed tracing
• APM agent streams application performance metrics to APM server
• Detect anomalous response times
• Java, .NET, Node.js, Django, Flask, Rails, Rack, RUM – JS, Go
39. Elasticsearch takeaways
• Do not exceed 64 GB of RAM per data node (about 31 GB for the JVM heap)
• SSD > HDD
• Avoid NAS/SAN
• Use ECS (Elastic Common Schema)
• Use ILM (Index Lifecycle Management)
• Hot/Warm/Cold Architecture
• Define rules for when to perform certain actions
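The ILM rules mentioned above are expressed as a policy made of phases, each with a minimum age and a set of actions. The sketch below builds such a policy body as a Python dictionary; the function name `build_ilm_policy` and all the ages are assumptions for illustration, not recommendations from the talk.

```python
import json


def build_ilm_policy(
    rollover_age: str = "7d",
    warm_after: str = "30d",
    delete_after: str = "90d",
) -> dict:
    """Build an ILM policy body with hot, warm and delete phases."""
    return {
        "policy": {
            "phases": {
                # Hot: keep writing until the index reaches the rollover age.
                "hot": {
                    "actions": {"rollover": {"max_age": rollover_age}}
                },
                # Warm: the index is read-mostly, so merge its segments.
                "warm": {
                    "min_age": warm_after,
                    "actions": {"forcemerge": {"max_num_segments": 1}},
                },
                # Delete: drop the index once it is past retention.
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_ilm_policy(), indent=2))
```

The printed JSON would be sent as the body of `PUT _ilm/policy/<policy-name>`; the policy name is left open here.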