This presentation introduces the main concepts of the Java Collections Framework:
- Collections and iterators
- Linked list and array list
- Hash set and tree set
- Maps
- The collection framework
The presentation is taken from the Java course I teach in the bachelor-level informatics curriculum at the University of Padova.
2. Programmazione concorrente e distribuita
SUMMARY
Introduction
Collections and iterators
Linked list
Array list
Hash set
Tree set
Maps
Collections framework
2Riccardo Cardin
INTRODUCTION
Data structures can make a big difference in programming
Do you need to search quickly? Do you need to insert and remove elements rapidly? ...
The first version of Java was supplied with a very
small set of classes
Vector, Stack, Hashtable, BitSet and Enumeration
Java needed a serious data structure library
Without the complexity of C++ STL
With the benefit of «generic algorithms» of C++ STL
The collections framework satisfies those needs
INTRODUCTION
The Java collections framework separates interfaces and implementations
It defines a set of interfaces that represent abstract data structures
Each interface is then implemented in different ways, each focusing on particular aspects
For example, fast insertion, fast search, fixed memory consumption, ...
Using interfaces, you can change the implementation by modifying a single statement
// ArrayList optimizes random access, LinkedList optimizes insertion and removal
List<Integer> list = new ArrayList<>();
list = new LinkedList<>();
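As a minimal sketch of the point above, the method below is written against the List interface only, so it works unchanged with either implementation. The names ListSwapDemo and sum are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class ListSwapDemo {
    // Depends only on the List interface, not on a concrete class
    static int sum(List<Integer> numbers) {
        int total = 0;
        for (int n : numbers) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        // Only the constructor call names the implementation
        List<Integer> arrayBacked = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linked = new LinkedList<>(arrayBacked);
        System.out.println(sum(arrayBacked)); // prints 6
        System.out.println(sum(linked));      // prints 6
    }
}
```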
COLLECTIONS AND ITERATORS
The fundamental interface for collections is the Collection interface
The add method adds an element to the collection
It returns true if the collection was changed by the addition
Iterators are used to visit the elements in the collection
They implement the Iterator design pattern
public interface Collection<E> {
    boolean add(E element);
    Iterator<E> iterator();
    // ...
}
public interface Iterator<E> {
    E next();
    boolean hasNext();
    void remove();
}
COLLECTIONS AND ITERATORS
Using iterators, we decouple a collection from its traversal policy
The order in which elements are visited depends on the collection type
Using the next method, the collection is visited one element at a time
If the end of the collection has been reached, a NoSuchElementException is thrown
Use the hasNext method to check whether the collection has more elements to visit
Think of Java iterators as being between elements
Each call to next makes the iterator jump over the next element and return it
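The behaviour at the end of a collection can be sketched as follows. The class and method names (IteratorEndDemo, throwsAtEnd) are made up for this example; only the java.util calls are real API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.NoSuchElementException;

class IteratorEndDemo {
    // Returns true if calling next() past the last element throws
    static boolean throwsAtEnd(Collection<String> c) {
        Iterator<String> iter = c.iterator();
        while (iter.hasNext()) {
            iter.next(); // jump over the next element
        }
        try {
            iter.next(); // nothing is left to jump over
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsAtEnd(Arrays.asList("a", "b"))); // prints true
    }
}
```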
COLLECTIONS AND ITERATORS
Think of Java iterators as being between elements
While moving, the iterator returns the element it has just jumped over
COLLECTIONS AND ITERATORS
Using next and hasNext it is possible to traverse the collection
As of Java 5, the for each loop is an elegant shortcut for looping over a collection with an iterator
To use the for each loop, the data structure must implement the Iterable<E> interface
Collection<String> c = /* ... */;
Iterator<String> iter = c.iterator();
while (iter.hasNext()) {
    String element = iter.next();
    // do something with element
}
// Equivalent for each loop
for (String element : c) {
    // do something with element
}
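To illustrate the Iterable<E> requirement, here is a small sketch of a custom type that can be used in a for each loop. The Range class is hypothetical and exists only for this example.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical type: the integers from 'from' (inclusive) to 'to' (exclusive)
class Range implements Iterable<Integer> {
    private final int from;
    private final int to;

    Range(int from, int to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int current = from;

            @Override
            public boolean hasNext() {
                return current < to;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current++;
            }
        };
    }

    public static void main(String[] args) {
        int sum = 0;
        for (int n : new Range(1, 4)) { // visits 1, 2, 3
            sum += n;
        }
        System.out.println(sum); // prints 6
    }
}
```

The anonymous iterator does not override remove; since Java 8 the default implementation throws UnsupportedOperationException.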
COLLECTIONS AND ITERATORS
Using an iterator, it is possible to remove elements from a collection
The remove method removes the element that was returned by the last call to next
There is a strong dependency between next and remove: it is illegal to call remove if it wasn’t preceded by a call to next
This is the only safe way to modify a collection while an iterator over it is in use
for (Iterator<String> iterator = list.iterator();
        iterator.hasNext();) {
    // Jump over the next element, so that remove has an element to act on
    String string = iterator.next();
    iterator.remove();
}
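A slightly fuller sketch of the same idea: removing only some elements, and checking that remove is rejected when no call to next precedes it. IteratorRemoveDemo and its methods are illustrative names, not part of the original slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class IteratorRemoveDemo {
    // Removes every empty string through the iterator
    static void removeEmpty(List<String> list) {
        for (Iterator<String> it = list.iterator(); it.hasNext();) {
            String s = it.next();
            if (s.isEmpty()) {
                it.remove(); // removes the element just returned by next()
            }
        }
    }

    // Returns true if remove() without a preceding next() is rejected
    static boolean removeBeforeNextFails(List<String> list) {
        Iterator<String> it = list.iterator();
        try {
            it.remove();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("a", "", "b", ""));
        removeEmpty(words);
        System.out.println(words);                        // prints [a, b]
        System.out.println(removeBeforeNextFails(words)); // prints true
    }
}
```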
CONCRETE COLLECTIONS
All concrete collections implement the Collection or Map interfaces
We will introduce only a few of them
Collection type    Description
ArrayList          An indexed sequence that grows and shrinks dynamically
LinkedList         An ordered sequence that allows efficient insertion and removal at any location
HashSet            An unordered collection that rejects duplicates
TreeSet            A sorted set
HashMap            A data structure that stores key/value associations
LINKED LIST
A LinkedList<E> is an ordered data structure that stores each object in a link
Each link stores a reference to the next link in the sequence
In Java, a linked list is always doubly linked, so each link also stores a reference to its predecessor
Implements the List<E> interface
LINKED LIST
Very efficient for add and remove operations
These operations are performed through an Iterator
The other elements in the list do not have to be repositioned after an element is removed
LINKED LIST
The add operation is also performed efficiently, through an iterator of type ListIterator<E>
Use the List.listIterator method to get one
The new element is added before the current position of the iterator
Be careful of concurrent modifications when using iterators
Linked lists are very inefficient at random access
interface ListIterator<E> extends Iterator<E> {
    void add(E element);
    E previous();
    boolean hasPrevious();
    // ...
}
// Inefficient on a LinkedList: every get(i) walks the links from one end of the list
for (int i = 0; i < list.size(); i++) {
    // do something with list.get(i);
}
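The following sketch shows insertion through a ListIterator and what "concurrent modification" means in practice when two iterators share a list. The class name, method names and sample data are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class ListIteratorAddDemo {
    // Inserts "Bob" between "Amy" and "Carl" through a ListIterator
    static List<String> insertSecond() {
        List<String> staff = new LinkedList<>(Arrays.asList("Amy", "Carl"));
        ListIterator<String> iter = staff.listIterator();
        iter.next();     // jump over "Amy"
        iter.add("Bob"); // added before the iterator position
        return staff;
    }

    // Returns true if an iterator notices a change made through another one
    static boolean staleIteratorFails() {
        List<String> list = new LinkedList<>(Arrays.asList("a", "b"));
        ListIterator<String> first = list.listIterator();
        ListIterator<String> second = list.listIterator();
        first.next();
        first.remove(); // structural change through the first iterator
        try {
            second.next(); // the second iterator is now out of date
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(insertSecond());       // prints [Amy, Bob, Carl]
        System.out.println(staleIteratorFails()); // prints true
    }
}
```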
LINKED LIST
Adding a new element
changes at most two
references
ARRAY LIST
An ArrayList<E> is an ordered data structure that is very efficient at random access operations
It encapsulates a dynamically reallocated array of objects
Adding and removing elements is less efficient
The elements after the insertion or removal point must be moved
Use ArrayList instead of Vector
It is more efficient because its methods are not synchronized
HASH SET
A Set<E> is a data structure that doesn’t care about the ordering of its elements
A set contains each element only once
The search operation is very efficient
A set is a Collection
A HashSet<E> uses hash codes to distinguish among elements
A hash code is an integer that is derived from the object's data
You must provide a hash function (the hashCode method) for your classes
The function must be compatible with the equals method: equal objects must have equal hash codes
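One possible way to keep hashCode compatible with equals is to derive both from the same fields, as in this sketch. GridPoint is a hypothetical value class introduced only for the example.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class used only for this illustration
class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof GridPoint)) return false;
        GridPoint p = (GridPoint) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return Objects.hash(x, y); // uses the same fields as equals
    }

    public static void main(String[] args) {
        Set<GridPoint> points = new HashSet<>();
        points.add(new GridPoint(1, 2));
        System.out.println(points.add(new GridPoint(1, 2)));      // prints false
        System.out.println(points.contains(new GridPoint(1, 2))); // prints true
    }
}
```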
HASH SET
Hash codes have to be computed quickly
Hash functions can produce collisions: different objects with the same hash code
With a good hash function, collisions should be rare
Hash tables are implemented as an array of linked lists
Using a hash function, the bucket in which a new element is inserted is hash(element) % (number of buckets)
If the hash function produces values that are randomly distributed, collisions should be rare
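The bucket formula can be sketched directly in Java. This is a simplified illustration of the idea, not the code of java.util.HashMap, which uses its own bit-based variant; BucketIndexDemo and bucketFor are invented names.

```java
class BucketIndexDemo {
    // Maps an element's hash code to a bucket index in [0, bucketCount)
    static int bucketFor(Object element, int bucketCount) {
        // floorMod keeps the index non-negative even for negative hash codes,
        // which the plain % operator would not
        return Math.floorMod(element.hashCode(), bucketCount);
    }

    public static void main(String[] args) {
        System.out.println(bucketFor(17, 16)); // prints 1
        System.out.println(bucketFor(-1, 16)); // prints 15
    }
}
```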
HASH SET
A HashSet is implemented using a hash table
The contains method is very efficient, because it has to look up the element in only one bucket
An iterator over a hash set visits each bucket in turn
Because of scattering, the elements are visited in a seemingly random order
The add method adds an element only if it is not already present
Don’t mutate an element of a set once it has been inserted
If the hash code of an element were to change, the element would no longer be in the correct position
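The warning about mutation can be demonstrated with a small experiment. The sketch uses a mutable ArrayList as the set element, purely as an assumption for illustration, because its hash code depends on its contents.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MutateInSetDemo {
    // Returns whether the set still finds an element mutated after insertion
    static boolean foundAfterMutation() {
        Set<List<String>> set = new HashSet<>();
        List<String> element = new ArrayList<>();
        element.add("a");
        set.add(element);
        element.add("b"); // the hash code changes while the element is in the set
        return set.contains(element);
    }

    public static void main(String[] args) {
        // The element is still stored, but the lookup uses the new hash code
        System.out.println(foundAfterMutation()); // prints false
    }
}
```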
TREE SET
A TreeSet<E> is a sorted set
While iterating over the collection, the elements are presented in sorted order
It uses a red-black tree to store data
Insertion is slower than insertion in a hash table, but it is still much faster than checking for duplicates in an array or linked list
...but a tree set automatically sorts the elements ;)
The type of the elements must implement Comparable<T>, unless a Comparator is supplied
public interface Comparable<T> {
    // It returns a value that is < 0, = 0 or > 0
    int compareTo(T other);
}
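As a sketch of how an element type might implement Comparable so that a TreeSet can sort it, consider the hypothetical PartItem class below, ordered by part number. The class and its sample data are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical element type, ordered by part number
class PartItem implements Comparable<PartItem> {
    private final String description;
    private final int partNumber;

    PartItem(String description, int partNumber) {
        this.description = description;
        this.partNumber = partNumber;
    }

    String getDescription() {
        return description;
    }

    @Override
    public int compareTo(PartItem other) {
        // < 0 if this comes first, 0 if equal, > 0 if other comes first
        return Integer.compare(partNumber, other.partNumber);
    }

    // Adds items in arbitrary order and reads them back sorted
    static List<String> sortedDescriptions() {
        TreeSet<PartItem> parts = new TreeSet<>();
        parts.add(new PartItem("Modem", 9912));
        parts.add(new PartItem("Toaster", 1234));
        parts.add(new PartItem("Widget", 4562));
        List<String> result = new ArrayList<>();
        for (PartItem item : parts) {
            result.add(item.getDescription());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortedDescriptions()); // prints [Toaster, Widget, Modem]
    }
}
```

Note that a TreeSet treats two elements as duplicates when compareTo returns 0, so in this sketch two items with the same part number could not both be stored.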
TREE SET
What if the elements do not implement Comparable, or if you need more than one comparison algorithm?
Provide a Comparator during set construction
The compare method acts like the compareTo method
It is a function object (lambda anyone?!)
Using a HashSet or a TreeSet?
A TreeSet requires a total ordering to be defined on the elements
You also have to implement Comparable or provide a Comparator
Do you need the elements to be sorted?
public interface Comparator<T> {
    // Defines how two elements of type T have to be compared
    int compare(T a, T b);
}
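A Comparator can be supplied as a lambda when the set is constructed. The sketch below orders strings by length, which is not their natural order; ComparatorDemo and byLength are illustrative names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ComparatorDemo {
    // Sorts words by length; ties are broken alphabetically so that
    // different words of equal length are not treated as duplicates
    static List<String> byLength(String... words) {
        Comparator<String> byLengthThenAlpha = (a, b) -> {
            int lengthOrder = Integer.compare(a.length(), b.length());
            return lengthOrder != 0 ? lengthOrder : a.compareTo(b);
        };
        Set<String> sorted = new TreeSet<>(byLengthThenAlpha);
        for (String word : words) {
            sorted.add(word);
        }
        return new ArrayList<>(sorted);
    }

    public static void main(String[] args) {
        // prints [fig, kiwi, pear, apple]
        System.out.println(byLength("pear", "fig", "apple", "kiwi"));
    }
}
```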
MAPS
A Map<K,V> is a data structure that allows you to search for an element using a key
A map stores key/value pairs
Keys must be unique: if you call the put method twice with the same key, the second value replaces the first one
If no value is associated with a key, get returns null
In a set, by contrast, you must have a copy of an element to find it
A HashMap hashes the keys
// HashMap implements Map
Map<String, Employee> staff = new HashMap<>();
Employee harry = new Employee("Harry Hacker");
staff.put("987-98-9996", harry);
String s = "987-98-9996";
Employee e = staff.get(s); // gets harry
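The replacement and null-lookup rules can be checked with a runnable sketch. To stay self-contained it stores plain strings instead of Employee objects; the second identifier and the names are made-up sample data.

```java
import java.util.HashMap;
import java.util.Map;

class MapBasicsDemo {
    // Values are plain strings here, standing in for Employee objects
    static Map<String, String> buildStaff() {
        Map<String, String> staff = new HashMap<>();
        staff.put("987-98-9996", "Harry Hacker");
        staff.put("567-24-2546", "Amy Lee");
        staff.put("987-98-9996", "Harry H."); // same key: replaces the first value
        return staff;
    }

    public static void main(String[] args) {
        Map<String, String> staff = buildStaff();
        System.out.println(staff.size());             // prints 2
        System.out.println(staff.get("987-98-9996")); // prints Harry H.
        System.out.println(staff.get("000-00-0000")); // prints null
    }
}
```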
MAPS
The collections framework does not consider a map itself to be a Collection
It is possible to obtain views of the map
If you are interested in both the keys and the values of a map, loop over its entry set
Views cannot be used to add elements to the map
Calling add on a view throws an UnsupportedOperationException
Set<K> keySet() // Set of keys
Collection<V> values() // Collection of values
Set<Map.Entry<K, V>> entrySet() // Set of pairs (key,value)
for (Map.Entry<String, Employee> entry : staff.entrySet()) {
    String key = entry.getKey();
    Employee value = entry.getValue();
    // do something with key, value
}
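The behaviour of map views can be sketched as follows: removal through a view writes through to the map, while addition is rejected. MapViewsDemo and its sample data are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

class MapViewsDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> scores = new HashMap<>();
        scores.put("a", 1);
        scores.put("b", 2);
        return scores;
    }

    // Removing a key through the key view removes the entry from the map
    static int sizeAfterKeyRemoval() {
        Map<String, Integer> scores = sample();
        scores.keySet().remove("a");
        return scores.size();
    }

    // Returns true if adding through the key view is rejected
    static boolean addThroughViewFails() {
        Map<String, Integer> scores = sample();
        try {
            scores.keySet().add("c");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterKeyRemoval()); // prints 1
        System.out.println(addThroughViewFails()); // prints true
    }
}
```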
THE FRAMEWORK
A framework is a set of classes that form the basis for building advanced functionality
The collections framework defines classes to implement collections
The main interfaces are Collection and Map
The insertion methods differ between the two types
To get the elements from a Collection, just iterate over it
To get a value from a Map, use the get method
// To insert an element in a Collection
boolean add(E element)
// To store a key/value pair in a Map
V put(K key, V value)
// To read the value stored for a key in a Map
V get(Object key)
THE FRAMEWORK
A List is an ordered collection
Each element has a position
A list provides random access methods
Lists provide a specialized iterator, ListIterator
A Set is a Collection with no duplicates
The add method rejects a value if it is already present
The equals and hashCode methods are used to identify elements inside the set
// List: positional access
void add(int index, E element)
E get(int index)
E remove(int index)
// Set: returns false if the element is already present
boolean add(E element)
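The difference between positional List methods and Set insertion can be sketched like this. ListVsSetDemo is an illustrative name; ArrayList and HashSet are chosen here only as example implementations.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ListVsSetDemo {
    // Uses the positional methods of List
    static List<String> positional() {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("c");
        list.add(1, "b"); // insert at index 1: [a, b, c]
        list.remove(0);   // remove by position: [b, c]
        return list;
    }

    // Returns the result of adding the same value to a Set twice
    static boolean secondAddAccepted() {
        Set<String> set = new HashSet<>();
        set.add("a");
        return set.add("a"); // the duplicate is rejected
    }

    public static void main(String[] args) {
        System.out.println(positional());        // prints [b, c]
        System.out.println(secondAddAccepted()); // prints false
    }
}
```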
THE FRAMEWORK
The framework has some companion classes
The Arrays class can transform an array into a List
The returned list is a view on the array: it is not possible to change the size of the list, and the elements are shared with the array
The Collections class has a bunch of utilities
The nCopies method builds an immutable list that only appears to hold n copies
The object is stored only once
The singleton method returns an immutable set with one element
And so on...
static <T> List<T> asList(T... a) // defined in Arrays
List<String> settings = Collections.nCopies(100, "DEFAULT");
Collections.singleton(anObject);
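The view and immutability properties described above can be checked with a short sketch. CompanionDemo and its method names are invented for this illustration; the Arrays and Collections calls are standard API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CompanionDemo {
    // Writing through the list view changes the backing array
    static String writeThrough() {
        String[] array = {"x", "y"};
        List<String> view = Arrays.asList(array);
        view.set(0, "z");
        return array[0];
    }

    // Returns true if the fixed-size view refuses to grow
    static boolean resizeFails() {
        List<String> view = Arrays.asList("x", "y");
        try {
            view.add("w");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Returns true if the nCopies list refuses modification
    static boolean nCopiesIsImmutable() {
        List<String> settings = Collections.nCopies(100, "DEFAULT");
        try {
            settings.set(0, "OTHER");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThrough());       // prints z
        System.out.println(resizeFails());        // prints true
        System.out.println(nCopiesIsImmutable()); // prints true
    }
}
```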