SlideShare une entreprise Scribd logo
1  sur  17
Télécharger pour lire hors ligne
5 coding hacks to reduce GC
overhead
www.takipi.com
1. Avoid Implicit String Allocations
Strings are an integral part of almost every data structure we manage.
Heavier than other primitive values, they have a very strong impact on memory usage.
Strings are immutable - they cannot be modified after allocation.
Operators such as “+” allocate a new String to hold the resulting value.
In addition, an implicit StringBuilder is allocated to do the actual work of combining them.
For this statement:
a = a + b; //a and b are Strings
The compiler generates comparable code behind the scenes:
StringBuilder temp = new StringBuilder(a).
temp.append(b);
a = temp.toString(); // a new String is allocated. The previous “a” is now garbage.
But it can get worse. These simple statements makes 5 implicit allocations (3 Strings and 2 StringBuilders) –
String result = foo() + arg;
result += boo();
System.out.println(“result = “ + result);
The Solution
Reducing implicit String allocations in high-scale code by proactively using StringBuilders.
This example achieves the same result , while allocating only 1 StringBuilder + 1 String, instead of the original 5
allocations.
StringBuilder value = new StringBuilder(“result = “);
value.append(foo()).append(arg).append(boo());
System.out.println(value);
2. Plan List Capacities
Collections such as ArrayLists, HashMaps and TreeMaps are implemented using underlying Object[] arrays.
Like Strings (themselves wrappers over char[] arrays), arrays are also immutable.
The obvious question then becomes - how can we add/put items in collections if their underlying array’s size is
immutable?
The answer is obvious as well - by allocating more implicit arrays.
Let’s look at this example -
List<Item> items = new ArrayList<Item>();
for (int i = 0; i < len; i++)
{
Item item = readNextItem();
items.add(item);
}
The value of len determines the ultimate size of items once the loop finishes.
len is unknown to the constructor of the ArrayList which allocates an initial Object[]with a default size.
Whenever the capacity of the array is exceeded, it’s replaced with a new array of sufficient length, making the
previous array garbage.
If the loop executes > 100 times you may be forcing multiple arrays allocations + collections.
The Solution
Whenever possible, allocate lists and maps with an initial capacity:
List<MyObject> items = new ArrayList<MyObject>(len);
This ensures no unnecessary allocations and collection of arrays occur at runtime.
If you don’t know the exact size, go with an estimate (e.g. 1024, 4096) of what an average size would be, and add
some buffer to prevent accidental overflows.
3. Use Efficient Primitive Collections
Java currently support s collections with a primitive key or value through the use of “boxing” - wrapping a
primitive value in a standard object which can be allocated and collected by the GC.
For each key / value entry added to a HashMap an internal object is allocated to hold the pair.
A necessary evil, this means an extra allocation and possible deallocation every time you put an item in a map.
Use Efficient Primitive Collections (2)
A very common case is to hold a map between a primitive value (such as an Id) and an object.
Since Java’s HashMap is built to hold object types (vs. primitives), for every insertion into the map can
potentially allocate yet another object to hold the primitive value (“boxing” it).
This can potentially more than triple GC overhead for the map.
For those coming from a C++ background this can really be troubling news, where STL templates solve this
problem very efficiently.
The Solution
Luckily, this problem is being worked on for next versions of Java.
Until then, it’s been dealt with quite efficiently by some great libraries which provide primitive trees, maps and
lists for each of Java’s primitive types.
We strongly recommend Trove, which we’ve worked with for quite a while and found that can really reduce GC
overhead in high scale code.
4. Use Streams Instead of In-memory Buffers
Most of the data we manipulate in server applications comes to us as files or data streamed over the network
from another web service or DB.
In most cases, the data is in serialized form, and needs to be deserialized objects before we can begin operating
on it.
This stage is very prone to large implicit allocations.
Use Streams Instead of In-memory Buffers (2)
A common thing to do is read the data into memory using a ByteArrayInputStream or ByteBuffer and pass that
on to the deserialization code.
This can be a bad move, as you’d need to allocate and later deallocate room for that data in its entirety while
constructing objects out of it .
Since the size of the data can be of unknown size, you’ll have to allocate and deallocate internal byte[] arrays
to hold the data as it grows beyond the initial buffer’s capacity.
The Solution
Most persistence libraries such as Java’s native serialization, Google’s Protocol Buffers, etc. are built to
deserialize data directly from the incoming file or network stream.
This is done without ever having to keep it in memory, and without having to allocate new internal byte arrays
to hold the data as it grows.
If available, go for that approach vs. loading binary data into memory.
5. Aggregate Lists
Immutability is a beautiful thing, but in some high scale situations it can have some serious drawbacks.
When returning a collection from a function, it’s usually advisable to create the collection (e.g. ArrayList) within
the method, fill it and return it in the form of a Collection<> interface.
A noticeable cases where this doesn’t work well is when collections are aggregated from multiple method calls
into a final collection.
This can mean massive allocation of interim collections.
The Solution
Aggregate values into a single collection that’s passed into those methods as a parameter.
Example 1 (inefficient) -
List<Item> items = new ArrayList<Item>();
for (FileData fileData : fileDatas)
{
items.addAll(readFileItem(fileData)); //Each invocation creates a new interim list
//with internal interim arrays
}
Example 2 (efficient):
List<Item> items = new ArrayList<Item>(fileDatas.size() * averageFileDataSize * 1.5);
for (FileData fileData : fileDatas)
{
readFileItem(fileData, items); //fill items inside, no interim allocations
}
Additional reading -
String interning - http://plumbr.eu/blog/reducing-memory-usage-with-string-intern
Efficient wrappers - http://vanillajava.blogspot.co.il/2013/04/low-gc-coding-efficient-listeners.html
library/-trove-collections-types-performance.info/primitive-http://java-Using Trove
www.takipi.com
God Mode in Production Code
@takipid

Contenu connexe

Tendances

Keras on tensorflow in R & Python
Keras on tensorflow in R & PythonKeras on tensorflow in R & Python
Keras on tensorflow in R & PythonLonghow Lam
 
Basic ideas on keras framework
Basic ideas on keras frameworkBasic ideas on keras framework
Basic ideas on keras frameworkAlison Marczewski
 
TensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache SparkTensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache SparkDatabricks
 
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16MLconf
 
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaDeep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaSpark Summit
 
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...MLconf
 
Chapter 10: hashing data structure
Chapter 10:  hashing data structureChapter 10:  hashing data structure
Chapter 10: hashing data structureMahmoud Alfarra
 
4.4 external hashing
4.4 external hashing4.4 external hashing
4.4 external hashingKrish_ver2
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersAlbert Bifet
 
Sql Connection and data table and data set and sample program in C# ....
Sql Connection and data table and data set and sample program in C# ....Sql Connection and data table and data set and sample program in C# ....
Sql Connection and data table and data set and sample program in C# ....Hari Haran
 
Docopt, beautiful command-line options for R, user2014
Docopt, beautiful command-line options for R,  user2014Docopt, beautiful command-line options for R,  user2014
Docopt, beautiful command-line options for R, user2014Edwin de Jonge
 
K-means Clustering with Scikit-Learn
K-means Clustering with Scikit-LearnK-means Clustering with Scikit-Learn
K-means Clustering with Scikit-LearnSarah Guido
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLFlink Forward
 
Dma
DmaDma
DmaAcad
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016MLconf
 
Fast Perceptron Decision Tree Learning from Evolving Data Streams
Fast Perceptron Decision Tree Learning from Evolving Data StreamsFast Perceptron Decision Tree Learning from Evolving Data Streams
Fast Perceptron Decision Tree Learning from Evolving Data StreamsAlbert Bifet
 
TensorFlow and Keras: An Overview
TensorFlow and Keras: An OverviewTensorFlow and Keras: An Overview
TensorFlow and Keras: An OverviewPoo Kuan Hoong
 

Tendances (20)

Keras on tensorflow in R & Python
Keras on tensorflow in R & PythonKeras on tensorflow in R & Python
Keras on tensorflow in R & Python
 
Basic ideas on keras framework
Basic ideas on keras frameworkBasic ideas on keras framework
Basic ideas on keras framework
 
TensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache SparkTensorFrames: Google Tensorflow on Apache Spark
TensorFrames: Google Tensorflow on Apache Spark
 
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
Braxton McKee, CEO & Founder, Ufora at MLconf NYC - 4/15/16
 
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaDeep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
 
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...
Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/2...
 
Chapter 10: hashing data structure
Chapter 10:  hashing data structureChapter 10:  hashing data structure
Chapter 10: hashing data structure
 
User biglm
User biglmUser biglm
User biglm
 
Hadoop map reduce concepts
Hadoop map reduce conceptsHadoop map reduce concepts
Hadoop map reduce concepts
 
arrays
arraysarrays
arrays
 
4.4 external hashing
4.4 external hashing4.4 external hashing
4.4 external hashing
 
Efficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream ClassifiersEfficient Online Evaluation of Big Data Stream Classifiers
Efficient Online Evaluation of Big Data Stream Classifiers
 
Sql Connection and data table and data set and sample program in C# ....
Sql Connection and data table and data set and sample program in C# ....Sql Connection and data table and data set and sample program in C# ....
Sql Connection and data table and data set and sample program in C# ....
 
Docopt, beautiful command-line options for R, user2014
Docopt, beautiful command-line options for R,  user2014Docopt, beautiful command-line options for R,  user2014
Docopt, beautiful command-line options for R, user2014
 
K-means Clustering with Scikit-Learn
K-means Clustering with Scikit-LearnK-means Clustering with Scikit-Learn
K-means Clustering with Scikit-Learn
 
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSLSebastian Schelter – Distributed Machine Learing with the Samsara DSL
Sebastian Schelter – Distributed Machine Learing with the Samsara DSL
 
Dma
DmaDma
Dma
 
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
Tom Peters, Software Engineer, Ufora at MLconf ATL 2016
 
Fast Perceptron Decision Tree Learning from Evolving Data Streams
Fast Perceptron Decision Tree Learning from Evolving Data StreamsFast Perceptron Decision Tree Learning from Evolving Data Streams
Fast Perceptron Decision Tree Learning from Evolving Data Streams
 
TensorFlow and Keras: An Overview
TensorFlow and Keras: An OverviewTensorFlow and Keras: An Overview
TensorFlow and Keras: An Overview
 

Similaire à 5 Coding Hacks to Reduce GC Overhead

Best practices in Java
Best practices in JavaBest practices in Java
Best practices in JavaMudit Gupta
 
Apache Cassandra 2.0
Apache Cassandra 2.0Apache Cassandra 2.0
Apache Cassandra 2.0Joe Stein
 
Probabilistic breakdown of assembly graphs
Probabilistic breakdown of assembly graphsProbabilistic breakdown of assembly graphs
Probabilistic breakdown of assembly graphsc.titus.brown
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Martin Odersky
 
Data structures cs301 power point slides lecture 01
Data structures   cs301 power point slides lecture 01Data structures   cs301 power point slides lecture 01
Data structures cs301 power point slides lecture 01shaziabibi5
 
Clustering_Algorithm_DR
Clustering_Algorithm_DRClustering_Algorithm_DR
Clustering_Algorithm_DRNguyen Tran
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinaloscon2007
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinaloscon2007
 
Data structure and algorithm.
Data structure and algorithm. Data structure and algorithm.
Data structure and algorithm. Abdul salam
 
‘go-to’ general-purpose sequential collections - from Java To Scala
‘go-to’ general-purpose sequential collections -from Java To Scala‘go-to’ general-purpose sequential collections -from Java To Scala
‘go-to’ general-purpose sequential collections - from Java To ScalaPhilip Schwarz
 
computer notes - Data Structures - 1
computer notes - Data Structures - 1computer notes - Data Structures - 1
computer notes - Data Structures - 1ecomputernotes
 
Kaggle Otto Challenge: How we achieved 85th out of 3,514 and what we learnt
Kaggle Otto Challenge: How we achieved 85th out of 3,514 and what we learntKaggle Otto Challenge: How we achieved 85th out of 3,514 and what we learnt
Kaggle Otto Challenge: How we achieved 85th out of 3,514 and what we learntEugene Yan Ziyou
 
An Engineer's Intro to Oracle Coherence
An Engineer's Intro to Oracle CoherenceAn Engineer's Intro to Oracle Coherence
An Engineer's Intro to Oracle CoherenceOracle
 
C optimization notes
C optimization notesC optimization notes
C optimization notesFyaz Ghaffar
 
Lab 1 Recursion  Introduction   Tracery (tracery.io.docx
Lab 1 Recursion  Introduction   Tracery (tracery.io.docxLab 1 Recursion  Introduction   Tracery (tracery.io.docx
Lab 1 Recursion  Introduction   Tracery (tracery.io.docxsmile790243
 

Similaire à 5 Coding Hacks to Reduce GC Overhead (20)

Best practices in Java
Best practices in JavaBest practices in Java
Best practices in Java
 
Apache Cassandra 2.0
Apache Cassandra 2.0Apache Cassandra 2.0
Apache Cassandra 2.0
 
Java performance
Java performanceJava performance
Java performance
 
Probabilistic breakdown of assembly graphs
Probabilistic breakdown of assembly graphsProbabilistic breakdown of assembly graphs
Probabilistic breakdown of assembly graphs
 
Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009Scala Talk at FOSDEM 2009
Scala Talk at FOSDEM 2009
 
Data structures cs301 power point slides lecture 01
Data structures   cs301 power point slides lecture 01Data structures   cs301 power point slides lecture 01
Data structures cs301 power point slides lecture 01
 
Clustering_Algorithm_DR
Clustering_Algorithm_DRClustering_Algorithm_DR
Clustering_Algorithm_DR
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
 
Data structure and algorithm.
Data structure and algorithm. Data structure and algorithm.
Data structure and algorithm.
 
15 Jo P Mar 08
15 Jo P Mar 0815 Jo P Mar 08
15 Jo P Mar 08
 
Scalax
ScalaxScalax
Scalax
 
Any Which Array But Loose
Any Which Array But LooseAny Which Array But Loose
Any Which Array But Loose
 
‘go-to’ general-purpose sequential collections - from Java To Scala
‘go-to’ general-purpose sequential collections -from Java To Scala‘go-to’ general-purpose sequential collections -from Java To Scala
‘go-to’ general-purpose sequential collections - from Java To Scala
 
CPP Homework Help
CPP Homework HelpCPP Homework Help
CPP Homework Help
 
computer notes - Data Structures - 1
computer notes - Data Structures - 1computer notes - Data Structures - 1
computer notes - Data Structures - 1
 
Kaggle Otto Challenge: How we achieved 85th out of 3,514 and what we learnt
Kaggle Otto Challenge: How we achieved 85th out of 3,514 and what we learntKaggle Otto Challenge: How we achieved 85th out of 3,514 and what we learnt
Kaggle Otto Challenge: How we achieved 85th out of 3,514 and what we learnt
 
An Engineer's Intro to Oracle Coherence
An Engineer's Intro to Oracle CoherenceAn Engineer's Intro to Oracle Coherence
An Engineer's Intro to Oracle Coherence
 
C optimization notes
C optimization notesC optimization notes
C optimization notes
 
Lab 1 Recursion  Introduction   Tracery (tracery.io.docx
Lab 1 Recursion  Introduction   Tracery (tracery.io.docxLab 1 Recursion  Introduction   Tracery (tracery.io.docx
Lab 1 Recursion  Introduction   Tracery (tracery.io.docx
 

Plus de Takipi

Advanced Production Debugging
Advanced Production DebuggingAdvanced Production Debugging
Advanced Production DebuggingTakipi
 
AppDynamics VS New Relic – The Complete Guide
AppDynamics VS New Relic – The Complete GuideAppDynamics VS New Relic – The Complete Guide
AppDynamics VS New Relic – The Complete GuideTakipi
 
The Dark Side Of Lambda Expressions in Java 8
The Dark Side Of Lambda Expressions in Java 8The Dark Side Of Lambda Expressions in Java 8
The Dark Side Of Lambda Expressions in Java 8Takipi
 
Java 9 – The Ultimate Feature List
Java 9 – The Ultimate Feature ListJava 9 – The Ultimate Feature List
Java 9 – The Ultimate Feature ListTakipi
 
7 New Tools Java Developers Should Know
7 New Tools Java Developers Should Know7 New Tools Java Developers Should Know
7 New Tools Java Developers Should KnowTakipi
 
JVM Performance Magic Tricks
JVM Performance Magic TricksJVM Performance Magic Tricks
JVM Performance Magic TricksTakipi
 
JVM bytecode - The secret language behind Java and Scala
JVM bytecode - The secret language behind Java and ScalaJVM bytecode - The secret language behind Java and Scala
JVM bytecode - The secret language behind Java and ScalaTakipi
 

Plus de Takipi (7)

Advanced Production Debugging
Advanced Production DebuggingAdvanced Production Debugging
Advanced Production Debugging
 
AppDynamics VS New Relic – The Complete Guide
AppDynamics VS New Relic – The Complete GuideAppDynamics VS New Relic – The Complete Guide
AppDynamics VS New Relic – The Complete Guide
 
The Dark Side Of Lambda Expressions in Java 8
The Dark Side Of Lambda Expressions in Java 8The Dark Side Of Lambda Expressions in Java 8
The Dark Side Of Lambda Expressions in Java 8
 
Java 9 – The Ultimate Feature List
Java 9 – The Ultimate Feature ListJava 9 – The Ultimate Feature List
Java 9 – The Ultimate Feature List
 
7 New Tools Java Developers Should Know
7 New Tools Java Developers Should Know7 New Tools Java Developers Should Know
7 New Tools Java Developers Should Know
 
JVM Performance Magic Tricks
JVM Performance Magic TricksJVM Performance Magic Tricks
JVM Performance Magic Tricks
 
JVM bytecode - The secret language behind Java and Scala
JVM bytecode - The secret language behind Java and ScalaJVM bytecode - The secret language behind Java and Scala
JVM bytecode - The secret language behind Java and Scala
 

Dernier

Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-pyJamie (Taka) Wang
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfJamie (Taka) Wang
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 

Dernier (20)

Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
20230202 - Introduction to tis-py
20230202 - Introduction to tis-py20230202 - Introduction to tis-py
20230202 - Introduction to tis-py
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
20150722 - AGV
20150722 - AGV20150722 - AGV
20150722 - AGV
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 

5 Coding Hacks to Reduce GC Overhead

  • 1. 5 coding hacks to reduce GC overhead www.takipi.com
  • 2. 1. Avoid Implicit String Allocations Strings are an integral part of almost every data structure we manage. Heavier than other primitive values, they have a very strong impact on memory usage. Strings are immutable - they cannot be modified after allocation. Operators such as “+” allocate a new String to hold the resulting value. In addition, an implicit StringBuilder is allocated to do the actual work of combining them.
  • 3. For this statement: a = a + b; //a and b are Strings The compiler generates comparable code behind the scenes: StringBuilder temp = new StringBuilder(a). temp.append(b); a = temp.toString(); // a new String is allocated. The previous “a” is now garbage. But it can get worse. These simple statements makes 5 implicit allocations (3 Strings and 2 StringBuilders) – String result = foo() + arg; result += boo(); System.out.println(“result = “ + result);
  • 4. The Solution Reducing implicit String allocations in high-scale code by proactively using StringBuilders. This example achieves the same result , while allocating only 1 StringBuilder + 1 String, instead of the original 5 allocations. StringBuilder value = new StringBuilder(“result = “); value.append(foo()).append(arg).append(boo()); System.out.println(value);
  • 5. 2. Plan List Capacities Collections such as ArrayLists, HashMaps and TreeMaps are implemented using underlying Object[] arrays. Like Strings (themselves wrappers over char[] arrays), arrays are also immutable. The obvious question then becomes - how can we add/put items in collections if their underlying array’s size is immutable? The answer is obvious as well - by allocating more implicit arrays.
  • 6. Let’s look at this example - List<Item> items = new ArrayList<Item>(); for (int i = 0; i < len; i++) { Item item = readNextItem(); items.add(item); } The value of len determines the ultimate size of items once the loop finishes. len is unknown to the constructor of the ArrayList which allocates an initial Object[]with a default size. Whenever the capacity of the array is exceeded, it’s replaced with a new array of sufficient length, making the previous array garbage. If the loop executes > 100 times you may be forcing multiple arrays allocations + collections.
  • 7. The Solution Whenever possible, allocate lists and maps with an initial capacity: List<MyObject> items = new ArrayList<MyObject>(len); This ensures no unnecessary allocations and collection of arrays occur at runtime. If you don’t know the exact size, go with an estimate (e.g. 1024, 4096) of what an average size would be, and add some buffer to prevent accidental overflows.
  • 8. 3. Use Efficient Primitive Collections Java currently support s collections with a primitive key or value through the use of “boxing” - wrapping a primitive value in a standard object which can be allocated and collected by the GC. For each key / value entry added to a HashMap an internal object is allocated to hold the pair. A necessary evil, this means an extra allocation and possible deallocation every time you put an item in a map.
  • 9. Use Efficient Primitive Collections (2) A very common case is to hold a map between a primitive value (such as an Id) and an object. Since Java’s HashMap is built to hold object types (vs. primitives), for every insertion into the map can potentially allocate yet another object to hold the primitive value (“boxing” it). This can potentially more than triple GC overhead for the map. For those coming from a C++ background this can really be troubling news, where STL templates solve this problem very efficiently.
  • 10. The Solution Luckily, this problem is being worked on for next versions of Java. Until then, it’s been dealt with quite efficiently by some great libraries which provide primitive trees, maps and lists for each of Java’s primitive types. We strongly recommend Trove, which we’ve worked with for quite a while and found that can really reduce GC overhead in high scale code.
  • 11. 4. Use Streams Instead of In-memory Buffers Most of the data we manipulate in server applications comes to us as files or data streamed over the network from another web service or DB. In most cases, the data is in serialized form, and needs to be deserialized objects before we can begin operating on it. This stage is very prone to large implicit allocations.
  • 12. Use Streams Instead of In-memory Buffers (2) A common thing to do is read the data into memory using a ByteArrayInputStream or ByteBuffer and pass that on to the deserialization code. This can be a bad move, as you’d need to allocate and later deallocate room for that data in its entirety while constructing objects out of it . Since the size of the data can be of unknown size, you’ll have to allocate and deallocate internal byte[] arrays to hold the data as it grows beyond the initial buffer’s capacity.
  • 13. The Solution Most persistence libraries such as Java’s native serialization, Google’s Protocol Buffers, etc. are built to deserialize data directly from the incoming file or network stream. This is done without ever having to keep it in memory, and without having to allocate new internal byte arrays to hold the data as it grows. If available, go for that approach vs. loading binary data into memory.
  • 14. 5. Aggregate Lists Immutability is a beautiful thing, but in some high scale situations it can have some serious drawbacks. When returning a collection from a function, it’s usually advisable to create the collection (e.g. ArrayList) within the method, fill it and return it in the form of a Collection<> interface. A noticeable cases where this doesn’t work well is when collections are aggregated from multiple method calls into a final collection. This can mean massive allocation of interim collections.
  • 15. The Solution Aggregate values into a single collection that’s passed into those methods as a parameter. Example 1 (inefficient) - List<Item> items = new ArrayList<Item>(); for (FileData fileData : fileDatas) { items.addAll(readFileItem(fileData)); //Each invocation creates a new interim list //with internal interim arrays } Example 2 (efficient): List<Item> items = new ArrayList<Item>(fileDatas.size() * averageFileDataSize * 1.5); for (FileData fileData : fileDatas) { readFileItem(fileData, items); //fill items inside, no interim allocations }
  • 16. Additional reading - String interning - http://plumbr.eu/blog/reducing-memory-usage-with-string-intern Efficient wrappers - http://vanillajava.blogspot.co.il/2013/04/low-gc-coding-efficient-listeners.html library/-trove-collections-types-performance.info/primitive-http://java-Using Trove
  • 17. www.takipi.com God Mode in Production Code @takipid