Efficient logging in multithreaded C++ server
www.chenshuo.com
2012/06 Shuo Chen
2. Show me the code
C++ logging library in muduo 0.5.0
http://code.google.com/p/muduo (release)
github.com/chenshuo/muduo (latest)
github.com/chenshuo/recipes/tree/master/logging
Performance on i5-2500
1,000,000+ log messages per second
Max throughput 100+MiB/s
1us~1.6us latency per message for async logging
3. Two meanings of log
Diagnostic log: log4j, logback, slf4j, log4cxx, log4cpp, log4cplus, glog, g2log, Pantheios, ezlogger
Transaction log: Write-ahead log, Binlog, redo log, Journaling, Log-structured FS
Here we mean diagnostic log/logging
Textual, human readable
grep/sed/awk friendly
4. Features of common logging library
Multiple log levels, enable/disable at runtime
TRACE, DEBUG, INFO, WARN, ERROR, FATAL
Not all of this flexibility is necessary!
Appenders, layouts, filters
Many possible destinations? i.e. Appender
There is one true destination, really
Configure log line format at runtime? i.e. Layout
You won’t change the format during the lifetime of a project
Muduo logging is configured at compile time
No xml config file, but a few lines of code in main()
5. What to log in distributed system
Everything! All the time!
http://highscalability.com/log-everything-all-time
Log must be fast
Tens of thousands of messages per sec is normal
without noticeable performance penalty
Should never block normal execution flow
Log data could be big
up to 1GB per minute, for one process on one host
An efficient logging library is a prerequisite for
any non-trivial server-side program
6. Frontend and backend of log lib
Frontend formats log messages
Backend sends log messages to destination (file)
The interface between could be as simple as
void output_log(const char* msg, int len)
However, in a multithreaded program, the most
difficult part is neither frontend nor backend, but
transferring log data from frontend to backend
Multiple producers (frontend), one consumer
Low latency / low CPU overhead for frontend
High throughput for backend
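A minimal sketch of that frontend/backend split, assuming hypothetical names (OutputFunc, setOutput, logLine); only the (const char*, int) shape of the interface comes from the slide, and this is not muduo's actual code:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical names; only the (const char*, int) shape comes from the slide.
typedef void (*OutputFunc)(const char* msg, int len);

// Default backend: write the bytes to stdout.
void defaultOutput(const char* msg, int len) {
  std::fwrite(msg, 1, static_cast<std::size_t>(len), stdout);
}

static OutputFunc g_output = defaultOutput;

// Swap the backend, e.g. for a file writer or an async queue.
void setOutput(OutputFunc f) { g_output = f; }

// Frontend: format one line, then hand the raw bytes to the backend.
void logLine(const char* level, const std::string& text) {
  char buf[512];
  int n = std::snprintf(buf, sizeof buf, "%s %s\n", level, text.c_str());
  if (n < 0) return;  // formatting failed
  if (n >= static_cast<int>(sizeof buf)) {
    n = static_cast<int>(sizeof buf) - 1;  // output was truncated
  }
  g_output(buf, n);
}
```

The frontend never knows where the bytes go, so the same call site can feed stdout, a file, or an asynchronous queue.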
7. Frontend should be easy to use
Two styles in C++, C/Java function vs. C++ stream
printlog("Received %d bytes from %s", len, client);
LOG << "Received " << len << " bytes from " << client;
print*() can be made type safe, but cumbersome
You can’t pass non-POD objects as (...) arguments
Pantheios uses overloaded function templates
LOG is easier to use IMO, no placeholder in fmt str
When logging level is disabled, the whole statement
can be made a nop, almost no runtime overhead at all
http://www.drdobbs.com/cpp/201804215
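One way to get the "nop when disabled" behaviour, as a sketch under assumptions: LineLogger, LOG_AT and g_level are hypothetical names, and std::ostringstream is used only to keep the example short (the next slide explains why a real library would avoid it):

```cpp
#include <iostream>
#include <sstream>

enum LogLevel { TRACE, DEBUG, INFO, WARN, ERROR, FATAL };

static LogLevel g_level = INFO;  // hypothetical global threshold

// Temporary object: collects one message, emits it in the destructor.
class LineLogger {
 public:
  explicit LineLogger(std::ostream& out) : out_(out) {}
  ~LineLogger() { out_ << stream_.str() << '\n'; }
  std::ostringstream& stream() { return stream_; }

 private:
  std::ostream& out_;
  std::ostringstream stream_;
};

// When the level is disabled the `if` is false, so the temporary is never
// constructed and the operands of << are never evaluated.
#define LOG_AT(level, out) \
  if (g_level <= (level)) LineLogger(out).stream()
```

Usage would look like LOG_AT(INFO, std::cout) << "Received " << len << " bytes"; a disabled level costs one integer comparison.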
8. Why muduo::LogStream ?
std::ostream is too slow and not thread-safe
One ostringstream object per log msg is expensive
LogStream is fast because
No formatting, no manipulators, no i18n or l10n
Output integers, doubles, pointers, strings
Fmt class is provided, though
Fixed-size buffer allocated on stack, no malloc call
Also limits the max log msg to 4000 bytes, same as glog
See benchmark result at
www.cnblogs.com/Solstice/archive/2011/07/17/2108715.html
muduo/base/tests/LogStream_bench.cc
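A rough sketch of the fixed-size, no-malloc idea. TinyLogStream is a hypothetical class, not muduo's LogStream; it handles only C strings and ints, and it silently drops data that does not fit, which is an assumption of this sketch:

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical sketch, not muduo's LogStream.
class TinyLogStream {
 public:
  static constexpr std::size_t kCapacity = 4000;  // max message size from the slide

  TinyLogStream() : len_(0) {}

  TinyLogStream& operator<<(const char* s) {
    append(s, std::strlen(s));
    return *this;
  }

  TinyLogStream& operator<<(int v) {
    char tmp[32];
    int n = std::snprintf(tmp, sizeof tmp, "%d", v);
    if (n > 0) append(tmp, static_cast<std::size_t>(n));
    return *this;
  }

  const char* data() const { return buf_; }
  std::size_t length() const { return len_; }

 private:
  // Copies into the in-object buffer; data that does not fit is dropped.
  void append(const char* p, std::size_t n) {
    if (n <= kCapacity - len_) {
      std::memcpy(buf_ + len_, p, n);
      len_ += n;
    }
  }

  char buf_[kCapacity];  // lives wherever the stream object lives, e.g. the stack
  std::size_t len_;
};
```

Declared as a local variable, the whole buffer sits on the stack, so building a message involves no heap allocation.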
9. Log line format is also fixed
You don’t want to change output format at
runtime, do you?
One log message per line, easy for grep
date time(UTC) thread-id level message source
20120603 08:02:46.125770Z 23261 INFO Hello - test.cc:51
20120603 08:02:46.126926Z 23261 WARN World - test.cc:52
20120603 08:02:46.126997Z 23261 ERROR Error - test.cc:53
No overhead of parsing format string all the time
Furthermore, the date-time string is cached and reformatted at most once per second
TRACE and DEBUG levels are disabled by default
Turn them on with an environment variable
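The once-per-second caching of the date-time string could look roughly like this. TimestampFormatter is a hypothetical helper written for illustration; it assumes POSIX gmtime_r, and one instance per thread (e.g. thread_local) so that no locking is needed:

```cpp
#include <cstdio>
#include <string>
#include <time.h>  // gmtime_r (POSIX)

// Hypothetical helper: produces "YYYYMMDD HH:MM:SS.uuuuuuZ" in UTC.
// The date/time part is re-formatted only when the second changes.
class TimestampFormatter {
 public:
  TimestampFormatter() : valid_(false), cachedSecond_(0), reformatCount_(0) {
    cached_[0] = '\0';
  }

  std::string format(time_t seconds, int micros) {
    if (!valid_ || seconds != cachedSecond_) {
      struct tm t;
      gmtime_r(&seconds, &t);
      std::snprintf(cached_, sizeof cached_, "%04d%02d%02d %02d:%02d:%02d",
                    t.tm_year + 1900, t.tm_mon + 1, t.tm_mday,
                    t.tm_hour, t.tm_min, t.tm_sec);
      cachedSecond_ = seconds;
      valid_ = true;
      ++reformatCount_;
    }
    char out[96];
    std::snprintf(out, sizeof out, "%s.%06dZ", cached_, micros);
    return std::string(out);
  }

  // How many times the expensive date/time formatting actually ran.
  int reformatCount() const { return reformatCount_; }

 private:
  bool valid_;
  time_t cachedSecond_;
  int reformatCount_;
  char cached_[64];
};
```

At tens of thousands of messages per second, almost every call takes the cheap path and only formats the microseconds.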
10. The one true destination: LogFile
Local file, with rolling & timestamp in log filename
It is a joke to write large amounts of log messages to
SMTP, Database, Network (FTP, “log server”)
The purpose of logging is to investigate what has
happened in case of system failure or malfunction
Network could fail, how to log that event?
Logging to the network may also double bandwidth usage
Beware of logging to a network-mapped file system
What if disk fails? The host is not usable anyway
Check dmesg and other kernel logs
11. Performance requirement
Suppose PC server with SATA disks, no RAID
110MB/s 7200rpm single disk
Logging to local disk, 110 bytes per log message
1000k log messages per second with IO buffer
Target for a “high-performance” logging library
For a busy server with 100k qps, logging every
request should not take too much CPU time
A library limited to 100k log msgs per second can only serve 10k qps
The log library should be able to write 100MB/s
And sustain that for seconds (at peak time, of course)
13. Other trick for postmortem
Log output must be buffered
can’t afford fflush(3)ing every time
What if the program crashes? There will be some
unwritten log messages in the core file; how to find them?
Put a cookie (a sentry value) at the beginning of
each message/buffer
The cookie can be the address of a function, so it is unique
Also set the cookie to some other function in the dtor
Identify messages with gdb find command
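A sketch of the cookie trick, assuming a hypothetical CookieBuffer class. The cookie is the first member, so its value sits right before the log data in memory; note that an optimizing compiler may elide the store in the destructor, which this sketch does not guard against:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical sketch of a buffer tagged with a function-address cookie.
class CookieBuffer {
 public:
  typedef void (*Cookie)();

  CookieBuffer() : cookie_(cookieStart), len_(0) {}
  ~CookieBuffer() { cookie_ = cookieEnd; }  // marks the buffer as dead

  void append(const char* p, std::size_t n) {
    if (n <= sizeof data_ - len_) {
      std::memcpy(data_ + len_, p, n);
      len_ += n;
    }
  }

  Cookie cookie() const { return cookie_; }
  std::size_t length() const { return len_; }

  // Never called; only their addresses are used as sentry values.
  static void cookieStart();
  static void cookieEnd();

 private:
  Cookie cookie_;   // first member, so it sits right before the data
  char data_[4000];
  std::size_t len_;
};

void CookieBuffer::cookieStart() {}
void CookieBuffer::cookieEnd() {}
```

Searching the core file for the address of cookieStart then locates live buffers, while cookieEnd marks buffers that were already destroyed.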
14. Logging in multithreaded program
Logging must be thread-safe (no interleaving)
And efficient
Better to log to one file per process, not per thread
Easier to read the log, no jumping between files
The OS kernel has to serialize writing anyway
Thread-safe is easy, efficient is not that easy
Global lock and blocking writing are bad ideas
One background thread gathers log messages and
writes them to disk, a.k.a. asynchronous logging.
15. Asynchronous logging is a must
A.k.a. non-blocking logging
Disk IO can block for seconds, occasionally
Causes timeouts in distributed systems
and cascading effects, e.g. false alarms of deadlock
Absolutely no disk IO in normal execution flow
Very important for non-blocking network programming,
check my other slides
We need a “queue” to pass log data efficiently
Not necessarily a traditional blocking queue,
no need to notify consumer every time there is
something to write
16. What if messages queue up?
Program writes log faster than disk bandwidth?
It queues first in the OS cache
Then in the process, memory usage increases rapidly
In case of overload, the logging library should
not crash or OOM, drop messages instead
Send alerts via network if necessary
Not a problem in synchronous logging
Blocking-IO is easy/good for bandwidth throttling
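The drop-instead-of-OOM policy might be sketched as below. dropOnOverload is a hypothetical helper, std::string stands in for a filled log buffer, and the thresholds in the usage note (25 pending buffers, keep the oldest 2) are illustrative assumptions, not values from the slides:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, called by the log thread before writing to disk.
// If too many buffers are pending, keep only the oldest `keep` buffers,
// append a notice line, and return how many buffers were dropped.
std::size_t dropOnOverload(std::vector<std::string>& pending,
                           std::size_t maxPending, std::size_t keep) {
  if (pending.size() <= maxPending) return 0;  // normal case: nothing to drop
  if (keep > maxPending) keep = maxPending;
  const std::size_t dropped = pending.size() - keep;
  pending.resize(keep);
  pending.push_back("Dropped " + std::to_string(dropped) +
                    " buffers of log data\n");
  return dropped;
}
```

For example, dropOnOverload(buffersToWrite, 25, 2) would bound memory use and leave a trace in the log file that messages were lost.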
17. Double buffering for the queue
Basic idea: two buffers, swap them when one is full
All business threads write to one buffer, memcpy only
Log thread writes the other buffer to disk
Improvement: four buffers, no waiting in most cases
Critical code in critical sections, next two pages
typedef boost::ptr_vector<Buffer> BufferVector;
typedef BufferVector::auto_type BufferPtr;
muduo::MutexLock mutex_;
muduo::Condition cond_;
BufferPtr currentBuffer_;
BufferPtr nextBuffer_;
BufferVector buffers_;
18. AsyncLogging::append()
void AsyncLogging::append(const char* logline, int len)
{
  muduo::MutexLockGuard lock(mutex_);
  if (currentBuffer_->avail() > len)
  { // most common case: buffer is not full, copy data here
    currentBuffer_->append(logline, len);
  }
  else // buffer is full, push it, and find next spare buffer
  {
    buffers_.push_back(currentBuffer_.release());
    if (nextBuffer_) // if there is one already, use it
    {
      currentBuffer_ = boost::ptr_container::move(nextBuffer_);
    }
    else // allocate a new one
    {
      currentBuffer_.reset(new Buffer); // Rarely happens
    }
    currentBuffer_->append(logline, len);
    cond_.notify();
  }
}
19. In log thread
BufferPtr newBuffer1(new Buffer);
BufferPtr newBuffer2(new Buffer);
boost::ptr_vector<Buffer> buffersToWrite(16);
while (running_)
{
  // swap out what needs to be written, keep the critical section short
  {
    muduo::MutexLockGuard lock(mutex_);
    if (buffers_.empty()) // nothing queued yet: wait for a notify or the flush interval
    {
      cond_.waitForSeconds(flushInterval_);
    }
    buffers_.push_back(currentBuffer_.release());
    currentBuffer_ = boost::ptr_container::move(newBuffer1);
    buffersToWrite.swap(buffers_);
    if (!nextBuffer_)
    {
      nextBuffer_ = boost::ptr_container::move(newBuffer2);
    }
  }
  // output buffersToWrite, re-fill newBuffer1/2
}
// final note: bzero() each buffer initially to avoid page faults
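For readers without muduo and Boost at hand, the same double-buffering idea can be sketched with the standard library only. DoubleBufferLog is a hypothetical class; std::string stands in for the large fixed buffer, an in-memory string stands in for the disk, and the spare-buffer optimization (nextBuffer_, newBuffer1/2) is left out, so this version allocates on every swap:

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch of double buffering with std:: primitives.
class DoubleBufferLog {
 public:
  explicit DoubleBufferLog(std::size_t bufferSize)
      : bufferSize_(bufferSize), running_(true) {
    current_.reserve(bufferSize_);
    thread_ = std::thread(&DoubleBufferLog::threadFunc, this);
  }
  ~DoubleBufferLog() { stop(); }

  // Frontend: any thread may call this; only a short copy under the lock.
  void append(const char* line, std::size_t len) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (current_.size() + len > bufferSize_ && !current_.empty()) {
      full_.push_back(std::move(current_));  // buffer is full, queue it
      current_.clear();
      current_.reserve(bufferSize_);
      cond_.notify_one();  // notify only when a whole buffer is ready
    }
    current_.append(line, len);
  }

  // Flushes what is left and joins the log thread; safe to call twice.
  void stop() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      if (!running_) return;
      running_ = false;
    }
    cond_.notify_one();
    thread_.join();
  }

  // Everything the backend has "written"; read it only after stop().
  const std::string& written() const { return written_; }

 private:
  void threadFunc() {
    std::vector<std::string> toWrite;
    bool keepGoing = true;
    while (keepGoing) {
      {
        std::unique_lock<std::mutex> lock(mutex_);
        if (full_.empty() && running_) {
          cond_.wait_for(lock, std::chrono::milliseconds(100));
        }
        full_.push_back(std::move(current_));  // take partial data too
        current_.clear();
        current_.reserve(bufferSize_);
        toWrite.swap(full_);  // keep the critical section short
        keepGoing = running_;
      }
      for (const std::string& buf : toWrite) written_ += buf;  // "disk" write
      toWrite.clear();
    }
  }

  const std::size_t bufferSize_;
  bool running_;
  std::mutex mutex_;
  std::condition_variable cond_;
  std::string current_;
  std::vector<std::string> full_;
  std::string written_;
  std::thread thread_;
};
```

The 100 ms wait plays the role of flushInterval_: even a half-empty buffer reaches the backend within that interval, and producers never touch the disk.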
20. Alternative solutions?
Use normal muduo::BlockingQueue<string> or
BoundedBlockingQueue<string> as the queue
Allocate memory for every log message, you need a
good malloc that is optimized for multithreading
Replace stack buffer with heap buffer in LogStream
Instead of copying data, passing pointer might be faster
But as I tested, copying data is 3x faster for small msgs ~4k
That’s why muduo only provides one AsyncLogging class
More buffers, reduce lock contention
Like ConcurrentHashMap, buckets hashed by thread id
21. Conclusion
Muduo logging library is fairly speedy
It provides the most fundamental features
No bells and whistles, no unnecessary flexibility
Frontend: LogStream and Logging classes
LOG_INFO << "Hello";
20120603 08:02:46.125770Z 23261 INFO Hello - test.cc:51
Backend: LogFile class, rolling & timestamped
logfile_test.20120603-144022.hostname.3605.log
For multithreaded programs, use the AsyncLogging class
Check examples in muduo/base/tests for how to use