1. The “Evils” of Optimization, Or: Performance Anxiety Can Cause Premature Optimization James Hare – Application Architect, Scottrade
2. What is Optimization? Definition from freedictionary.com: “The procedure or procedures used to make a system or design as effective or functional as possible, especially the mathematical techniques involved” This is a general term applied to the optimization of any process or system. What does it mean for us?
3. Ada Lovelace “In almost every computation a great variety of arrangements for the succession of the processes is possible, and various considerations must influence the selection amongst them for the purposes of a Calculating Engine. One essential object is to choose that arrangement which shall tend to reduce to a minimum the time necessary for completing the calculation.” - Ada Byron’s Notes on Charles Babbage's Analytical Engine, 1842
4. Software Optimization Software can be optimized in several areas: In the design of the system: choice of appropriate techniques and structures. In the code that implements the system: choice of logic that implements a feature. In the compiler that builds the system: improvements the compiler bakes into assemblies. In the runtime that executes the system: choices the CLR/JIT can make in executing assemblies.
5. So Optimization is Good, Right? Well, yes and no… Yes: Time optimizing design is nearly always well spent. Optimizing a known bottleneck will increase speed. Compiler and CLR optimizations are already there and do not impact readability or maintainability. No: “Slower” is nearly always better than wrong. Optimizing code before you have a measured need can quickly become an anti-pattern.
6. William Wulf “More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason – including blind stupidity.” "A Case Against the GOTO," Proceedings of the 25th National ACM Conference, August 1972, pp. 791-97.
7. Is an Anti-Pattern a Design Pattern? No. Most of us are familiar with the term Design Pattern: a general, reusable solution to a commonly occurring problem in software design. It is not a finished design, but a general description or template for how to solve a problem. The concept became very popular after the book Design Patterns: Elements of Reusable Object-Oriented Software was released in 1994 by the “Gang of Four”.
8. Okay, So What’s an Anti-Pattern? It is the antithesis of a Design Pattern. It is a pattern that tends to be commonly used, but which usually turns out to be ineffective and/or counterproductive. Term was coined by Andrew Koenig in 1995 in his article Patterns and Antipatterns. Made popular in 1999 with the book AntiPatterns: Refactoring Software, Architectures, and Projects in Crisis.
9. How Do Anti-Patterns Relate to Optimization? Anti-Patterns cover a wide range: Organizational (Analysis Paralysis, etc.) Project Management (Death March, etc.) Analysis (Bystander Apathy) Software Design (Big Ball of Mud, etc.) Object-Oriented Design (God Object, etc.) Programming (Spaghetti Code, etc.) Configuration Management (Dependency Hell, etc.) Methodological (Premature Optimization, etc.)
10. Donald Knuth “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.” “Structured Programming with Goto Statements”, Computing Surveys 6:4 (1974), 261–301.
11. Premature Optimization? Performance anxiety disproportionately affects the design of a piece of code. In premature optimization, the developer often has no measured data showing that the code needs optimization. This can result in code optimizations that have no net impact on overall performance. It often happens through misapplied optimization “rules of thumb.”
12. Premature Optimization Can Introduce Bugs Code may become overly complicated since the programmer is distracted by the perceived need to optimize. Bugs may be immediately introduced during an incorrect optimization. Since code becomes less maintainable, future enhancements are more likely to cause bugs. Remember: A slightly slower program is better than an erroneous program.
13. Premature Optimization Can Be a Waste of Time Remember Knuth: Ignore small efficiency gains 97% of the time, but pay attention to the critical 3% once it has been identified. Optimizing code that is not a bottleneck is almost always a waste of time. It slows the entire process for little benefit, or worse yet may add maintenance costs or bugs. Correct code delivered faster is often better than the fastest incorrect code delivered slowly!
14. Rob Pike “Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you have proven that's where the bottleneck is.” “Notes on Programming in C”, February 21, 1989
15. So Is Optimization Always Bad? Not at all, only optimizing for the sake of optimizing is bad. Good places for optimization are: Early in system and component design. When you find measurable bottlenecks. In those rare, but necessary, “real-time” systems. When your group has “standard” optimizations.
16. Best Place to Catch Bottlenecks? source: http://en.wikipedia.org/wiki/Waterfall_model
17. Design Optimizations Most system bottlenecks are from bad design. When designing systems: Use multithreading correctly when appropriate. Use best algorithms/collections for the problem. Avoid designing a single component/process that will become a bottleneck to the whole system. Choose the best communication protocol and methodology for the problem. Prefer to design web methods to be chunky. Cache infrequently changing remote data.
18. Design: Multi-Threading Wisely Use the .NET 4.0 Concurrent Collections wherever appropriate. Strongly consider the .NET 4.0 TPL over traditional multi-threading. Use locking judiciously: keep locks as small in scope as possible and avoid holding multiple locks at the same time. Avoid serial thread processing (handing off from bucket to bucket to bucket…).
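As a minimal sketch of the first two points, the hypothetical `WordCounter` below (not from the original slides) combines the TPL's `Parallel.ForEach` with a `ConcurrentDictionary`, so no explicit lock is needed. It assumes the work per item is independent of the other items.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class WordCounter
{
    // Counts occurrences of each word in parallel.
    public static ConcurrentDictionary<string, int> Count(IEnumerable<string> words)
    {
        var counts = new ConcurrentDictionary<string, int>();

        // The TPL partitions the input across worker threads.
        Parallel.ForEach(words, word =>
        {
            // AddOrUpdate retries internally until its update is applied,
            // so concurrent increments of the same key are not lost.
            counts.AddOrUpdate(word, 1, (key, current) => current + 1);
        });

        return counts;
    }
}
```

For a tiny input like a handful of words, the parallel overhead would likely outweigh any gain; the pattern only pays off once a measurement says the loop is worth parallelizing.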
19. Design: Use Best Algorithms Favor keeping algorithmic complexity low:
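To illustrate what lower algorithmic complexity can look like in practice, here is a small sketch with two hypothetical duplicate checks (the names are assumptions, not from the slides). Both return the same answer; the first compares every pair, the second relies on `HashSet<T>.Add` returning `false` for an item that is already present.

```csharp
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class DuplicateCheck
{
    // O(n^2): compares every pair of items.
    public static bool HasDuplicateQuadratic(IList<int> items)
    {
        for (int i = 0; i < items.Count; i++)
            for (int j = i + 1; j < items.Count; j++)
                if (items[i] == items[j]) return true;
        return false;
    }

    // O(n) on average: one hash lookup per item.
    public static bool HasDuplicateLinear(IEnumerable<int> items)
    {
        var seen = new HashSet<int>();
        foreach (int item in items)
            if (!seen.Add(item)) return true; // Add returns false if already present
        return false;
    }
}
```

This is a design-time choice rather than a micro-optimization: the linear version is no harder to read, and the difference grows with input size.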
20. Design: Use Best Algorithms Know and use your LINQ algorithms: Finds: Any(), First(), FirstOrDefault(), etc. Queries: Select(), Where(), etc. Grouping: GroupBy() Sorting: OrderBy() Etc. Already written, optimized, and unit tested. Can make code much easier to read and maintain.
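A short sketch of how a few of these operators compose. The `Order` type and the `OrderQueries` helper are hypothetical names invented for this example; only the LINQ operators themselves come from the framework.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical data type, for illustration only.
public class Order
{
    public string Customer { get; set; }
    public decimal Total { get; set; }
}

public static class OrderQueries
{
    // Sorted, de-duplicated names of customers with an order above the threshold.
    public static List<string> BigSpenders(IEnumerable<Order> orders, decimal threshold)
    {
        return orders
            .Where(o => o.Total > threshold)   // filter
            .Select(o => o.Customer)           // project
            .Distinct()                        // remove repeats
            .OrderBy(name => name)             // sort
            .ToList();
    }

    // Any() stops at the first match instead of scanning the whole sequence.
    public static bool AnyOver(IEnumerable<Order> orders, decimal threshold)
    {
        return orders.Any(o => o.Total > threshold);
    }
}
```

The query reads as a description of the result, which is usually the maintainability gain the slide refers to.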
22. Design: Use Best Collection Know each collection’s strengths and weaknesses. In general choose based on: Ordering: Do you need to maintain order? Lookup: Is fast lookup the ultimate goal? Insert/Deletes: Where are inserts/deletes performed? Are they needed? Synchronization: Do you need multi-threaded access of a mutable collection?
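As a rough sketch, the questions above might map onto the standard generic collections as follows. The variable names and sample data are made up for illustration, and the mapping is a starting point rather than a rule.

```csharp
using System.Collections.Generic;

// Hypothetical demo, for illustration only.
public static class CollectionChoices
{
    public static string Demo()
    {
        // Ordering matters and index access is needed: List<T>.
        var steps = new List<string> { "parse", "validate", "save" };

        // Fast lookup by key is the goal: Dictionary<TKey, TValue>.
        var priceBySymbol = new Dictionary<string, decimal>
        {
            { "ABC", 10.5m },
            { "XYZ", 20m }
        };

        // Fast membership test with no duplicates: HashSet<T>.
        var seenIds = new HashSet<int> { 1, 2, 3 };

        // Inserts at one end, removals at the other: Queue<T>.
        var pending = new Queue<string>();
        pending.Enqueue("first");
        pending.Enqueue("second");

        return steps[1] + "," + priceBySymbol["XYZ"] + ","
            + seenIds.Contains(2) + "," + pending.Dequeue();
    }
}
```

When several threads must mutate the collection, the `System.Collections.Concurrent` counterparts (for example `ConcurrentDictionary` or `ConcurrentQueue`) are the ones to look at instead.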
24. Design: Avoid Bottlenecks If one process or component in a system has to serialize and process all items slowly, it doesn’t matter how parallel the rest of your system is.
25. Design: Use Best Communication Know the best method for the problem: Do you need broadcast communication, or can you tolerate some dropped packets? Consider UDP. Do you need connection-oriented, reliable, synchronous communication? Consider TCP. Do you need to be able to process messages asynchronously? Consider message queues. Do you need to be able to make cross-platform calls? Consider web methods or message queues.
26. Design: Web Methods Good OO design ≠ good distributed design. Prefer “chunky” to “chatty” web methods as they will have less network overhead. “chunky”: one method returns all data needed. “chatty”: many small methods that return pieces. Do not return DataSet! Very, very, very large! Return array or List<T> of custom type instead.
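The difference between the two styles can be sketched with a pair of hypothetical service contracts. All type and member names below are assumptions made for the example, and the in-memory fake merely counts calls to stand in for network round trips.

```csharp
using System.Collections.Generic;

// Hypothetical custom type returned by the chunky call (instead of a DataSet).
public class CustomerSummary
{
    public string Name { get; set; }
    public string Email { get; set; }
    public List<string> RecentOrderIds { get; set; }
}

// Chatty: the caller needs three round trips to build one screen.
public interface IChattyCustomerService
{
    string GetName(int customerId);
    string GetEmail(int customerId);
    List<string> GetRecentOrderIds(int customerId);
}

// Chunky: one round trip returns everything the caller needs.
public interface IChunkyCustomerService
{
    CustomerSummary GetSummary(int customerId);
}

// In-memory stand-in that counts simulated round trips.
public class FakeRemoteService : IChattyCustomerService, IChunkyCustomerService
{
    public int RoundTrips { get; private set; }

    public string GetName(int customerId) { RoundTrips++; return "Pat"; }
    public string GetEmail(int customerId) { RoundTrips++; return "pat@example.com"; }
    public List<string> GetRecentOrderIds(int customerId)
    {
        RoundTrips++;
        return new List<string> { "A-1", "A-2" };
    }

    public CustomerSummary GetSummary(int customerId)
    {
        RoundTrips++;
        return new CustomerSummary
        {
            Name = "Pat",
            Email = "pat@example.com",
            RecentOrderIds = new List<string> { "A-1", "A-2" }
        };
    }
}
```

In a real deployment each counted call would carry network latency and serialization cost, which is why the chunky contract tends to win even though the chatty one looks like cleaner object-oriented design.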
27. Design: Caching When remote data never changes, cache it! Can use unsynchronized Dictionary. Much faster than network lookup. When remote data changes infrequently, cache with expiration or refresh. Consider ConcurrentDictionary. In ASP.NET use Application and Session caches. Consider distributed cache when appropriate. AppFabric (Velocity), Coherence, etc.
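A minimal sketch of the "cache with expiration" idea, built on `ConcurrentDictionary`. The `ExpiringCache` class and its `load` delegate are hypothetical; the delegate stands in for whatever remote lookup is being cached.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical cache, for illustration only.
public class ExpiringCache<TKey, TValue>
{
    private class Entry
    {
        public TValue Value;
        public DateTime ExpiresUtc;
    }

    private readonly ConcurrentDictionary<TKey, Entry> _entries =
        new ConcurrentDictionary<TKey, Entry>();
    private readonly Func<TKey, TValue> _load;   // stands in for the remote call
    private readonly TimeSpan _timeToLive;

    public ExpiringCache(Func<TKey, TValue> load, TimeSpan timeToLive)
    {
        _load = load;
        _timeToLive = timeToLive;
    }

    public TValue Get(TKey key)
    {
        Entry entry;
        if (_entries.TryGetValue(key, out entry) && entry.ExpiresUtc > DateTime.UtcNow)
            return entry.Value; // fresh hit: no remote call

        // Missing or stale: reload. Two threads may both reload the same key;
        // the last write wins, which is acceptable for slowly changing data.
        var fresh = new Entry
        {
            Value = _load(key),
            ExpiresUtc = DateTime.UtcNow + _timeToLive
        };
        _entries[key] = fresh;
        return fresh.Value;
    }
}
```

A usage example would be `new ExpiringCache<string, decimal>(LookupPrice, TimeSpan.FromMinutes(5))`, where `LookupPrice` is an assumed method wrapping the remote call. The sketch never evicts entries, so it suits a bounded key space only.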
28. Measurable Bottlenecks Most bottlenecks can be avoided in design. However, some bottlenecks only become apparent after testing, or possibly even later in production after growth in users or data size. Measure, Improve, and Re-Measure: Never guess at the location or cause! Profile the code to find the bottleneck. Solve the revealed bottleneck and document it. Re-profile to make sure the code is streamlined.
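A profiler is the right tool for locating a bottleneck. For a quick before-and-after comparison of one suspect method, a coarse timing helper along these lines can serve as a first check. The `Timing` class is a hypothetical sketch built on `System.Diagnostics.Stopwatch`.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, for illustration only.
public static class Timing
{
    // Returns the average elapsed milliseconds per call of the action.
    public static double AverageMilliseconds(Action action, int iterations)
    {
        action(); // warm-up call so JIT compilation is not counted

        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();

        return watch.Elapsed.TotalMilliseconds / iterations;
    }
}
```

Record the number before the change and again after it; if the second figure is not clearly better, the "optimization" has not earned its added complexity.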
31. “Real-time” Systems Obviously, there are some rare cases where performance is paramount: Flight control systems. Algorithmic trading. Gaming. In these cases, you can always code with an eye towards performance. The decisions on how to optimize, though, should be well known and uniform for the group.
32. “Standard” Optimizations These are a double-edged sword because they can often be misapplied or incorrectly understood. Framework changes may alter the efficiency of an operation/method to the point where the net gain is nil. Any such optimizations should be decided on as a group and standardized. In general, these should be avoided except for Microsoft-recommended best practices.
33. Microsoft Performance Recommendations Throw fewer exceptions. Make chunky calls. Use value types for small immutable data. Use AddRange() when possible. Trim your Working Set. Use for loops for string iteration (careful!). Use StringBuilder for complex String manipulation. Use jagged arrays vs. rectangular arrays. Etc.
34. “Standard” Misuse Example: Someone hears that StringBuilder is more efficient than concatenation and sacrifices readability for “performance”:
35. “Standard” Misuse String concatenation (+) is faster for a single concatenation expression with two or more arguments: it knows the size of the arguments in advance and computes the correct buffer size. StringBuilder uses a default buffer size and then re-allocates if it needs more space. String concatenation of literals is done at compile time and has no run-time impact. Don’t apply “standard” rules without fully knowing the ramifications.
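The kind of misuse described here might look like the following sketch. The `Greeting` class is a hypothetical illustration, not the code from the original slide. All three methods are correct; the point is which one is readable and where `StringBuilder` actually fits.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical example, for illustration only.
public static class Greeting
{
    // Misapplied rule: StringBuilder for a fixed handful of pieces.
    public static string WithBuilder(string first, string last)
    {
        var sb = new StringBuilder();
        sb.Append("Hello, ");
        sb.Append(first);
        sb.Append(" ");
        sb.Append(last);
        sb.Append("!");
        return sb.ToString();
    }

    // Same result in one expression; the C# compiler typically turns a chain
    // of + operators like this into a single String.Concat call.
    public static string WithConcat(string first, string last)
    {
        return "Hello, " + first + " " + last + "!";
    }

    // A reasonable fit for StringBuilder: an unknown number of appends in a loop.
    public static string JoinLines(IEnumerable<string> lines)
    {
        var sb = new StringBuilder();
        foreach (string line in lines)
        {
            sb.Append(line).Append('\n');
        }
        return sb.ToString();
    }
}
```

Whether either form matters for performance in a given program is still a question for the profiler, not for the rule of thumb.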
36. Summary Optimizing design and known bottlenecks is nearly always a beneficial activity. Sacrificing maintainability for unmeasured performance gain is problematic: Code that is harder to maintain is more likely to have initial bugs or bugs in modification. Most of the time micro-optimizations have no effect on overall system performance and just create a time sink.
37. Michael A. Jackson “There are two rules for when to optimize: 1. Don't do it. 2. (For experts only) Don't do it yet.” “Principles of Program Design”, Academic Press, London and New York, 1975.