
- A collection in data structure terms, is a group of elements. It includes Sets and Lists. It also includes Maps which don’t implement Collection.
- Collection is the interface implemented by the collection classes such as ArrayList, TreeSet and HashSet implement. It defines the methods common to them all.
- Collections is a class of utility methods for working on collections, e.g. sorting and binary search, even though they only work on Lists.
In Java version 1.5 or later the Collection classes are generified to keep track of the class of the elements permitted to be contained in them.
All classes that implement Collection automatically implement the superinterface Iterable, allowing you to iterate over all their elements.
Gotchas
- A common programming error is to use the interface Collection when you meant the class Collections and vice versa.
- The sort method is not part of the Collection interface. It lives in the Collections class. So you cannot say myCollection. sort(). You must sayCollections.sort( myCollection ); However, sort will not work on Collections generally, only Lists.
- Maps such as HashMap are not Collections, i.e. they don’t implement the Collection interface.
Examples of Collections
Some types of collections that implement Collection include the following:Collection Classes and Interfaces | ||||
---|---|---|---|---|
Class | Implements or Extends Collection Interface | Interface or Collection | Synchronised | Notes |
AbstractCollection | abstract class | an interface to act as skeleton to build most classes that implement theCollection, interface. | ||
java.lang.reflect.Array | class | Reflection classes used to dynamically create Array objects where you don’t know anything about them at compile time. You probably were looking for Arraysinstead. | ||
java.sql.Array | interface | A class for storing and retrieving arrays in an SQL database. You probably were looking for Arrays instead. | ||
ArrayList | class | A non-thread-safe List that is marginally faster than thread-safe Vector. In versions prior to Java version 1.4, the difference was more pronounced — 3 to 4 times faster. Rolling your own non-Collection array is about 4 times faster thanVector. In earlier non-optimising JVMs, the difference was more like 40 times faster. The main overhead of using ArrayList and Vector is the casting needed to fish out the elements. ArrayList is like a [] array that automatically grows. Lookup by dense ints. Considerably faster than a HashMap. | ||
Arrays | class | static methods for sorting and searching ordinary arrays, e.g. binarySearch,fill, sort. asList converts an array into a List Collection you can then feed to methods expecting a Collection or List. Don’t confuse this with the two Array classes. | ||
Collection | interface | The interface that all collections implement that guarantee you can always add,remove, clear, size, iterator, contains, toArray etc. in a standard way. Don’t confuse this with the Collections class. toArray can be made smarter if you pass it an empty array to fill just the right size. Then you have an array of specific classes, not just generic Objects. Use it like this:Dogs[] dogs = ( Dogs [] )dogArrayList.toArray ( new Dogs [dogArrayList.size()] );The Collection. toString method implemented by all Collections is a great debugging tool to dump out the entire contents of a Collection in a reasonably human-readable format enclosing each element in []. | ||
Collections | class | static methods for operating on collections, e.g. sort, binarySearch, min, max,shuffle. Don’t confuse the Collections class with the Collectioninterface. | ||
javax.swing.tree.DefaultTreeModel | class | This was written to support Swing’s JTree, but it will suffice as a generic tree collection. | ||
HashMap | class | a Map, an unsynchronized Hashtable. Allows unordered lookup by key, usually a String. It is thread safe if you only do lookups. Otherwise you must synchronise externally, or use a Hashtable. No duplicates allowed. | ||
HashSet | class | a Set, a collection of unique objects, without ability to lookup by key, usually a String. Not ordered. Often used to maintain a list of legal values, e.g. state abbreviations. You can quickly find out if a given value is in the legal list. | ||
Hashtable | class | a Map, a synchronised HashMap. Allows unordered lookup by key, usually aString. Note, this is spelled Hashtable not Hash Table. | ||
IdentityHashMap | class | It is like a Hashtable. Used in manipulating graphs where you have to keep track of which nodes have already been visited. In Java version 1.4 or later. It compares with == instead of equals. | ||
LinkedHashMap | class | a HashMap that also threads the objects together, usually in insertion order. This way you can retrieve the elements in insertion order with almost no more overhead than a HashMap. In Java version 1.4 or later. | ||
LinkedList | class | a List that behaves like Vector, but implemented as a doubly linked chain of objects. Fast for insert/delete. Slow for indexing. | ||
List | interface | specifies an ordered sequence of elements. Implemented by ArrayList, Vectorand LinkedList. | ||
Map | interface | supports lookup of objects by unique key. Keys may or may not be ordered. | ||
Queue | interface | Java version 1.5 or later. Used for enqueing objects waiting to be processed. Has some very complex multi-thread abilities. It might be looked on more as threading synchronisation package than a collection. | ||
RandomAccess | interface | An interface does not do anything. It has no methods. It is just a marker that lets you know random access to a List is efficient, e. g. LinkedList would not implement it, whereas ArrayList would. In Java version 1.4 or later. | ||
Set | interface | a mathematical set of elements, with no duplication. No lookup by key. | ||
SortedMap | interface | Map interface that guarantees sorted keys, usually Strings. implemented byTreeMap. | ||
SortedSet | interface | Set interface that guarantees iteration will be in ascending order. Implemented byTreeSet | ||
Stack | class | pushdown stack, a LIFO (Last In First Out) queue. You can also fudge a simple LIFO stack with LinkedList using use only addFirst, removeFirst. | ||
TreeMap | class | a SortedMap using a red-black tree. Allows ordered lookup by key, usuallyStrings. Does not allow duplicate keys, unless you cheat to fool it into thinking they are not really duplicates. Slower than HashMap. | ||
TreeSet | class | a SortedSet using a red-black tree. Allows ordered access, but no access by key. Slower than HashSet. | ||
Vector | class | a List. This is the same old Vector you are familiar with, now implementingCollection. It acts like a []- style array that automatically grows as needed. It is a synchronised ArrayList, i.e. suitable for use when more than one thread accesses the Vector. |
Missing Collections
Collections that you would expect to find in Java prior to 1.5 but which are not included are:- FIFOQueue, a FIFO (First In First Out) buffering queue.
- FIFOPriorityQueue, a FIFO queue with N levels of priority, so that the top priority items are normally taken first, but it is possible to service the lower priorities less frequently, e.g. serve four from top level, for every two from next level for eveyone from the level below, or with priority bumping if an item waits in a queue past a given threshold. You could use it for applications like airline boarding, emergency waiting rooms, computer task scheduling or even your personal todo list. You can fudge a simple FIFO queue with LinkedList using use only addFirst, removeLast.
- SortedList that gives the illusion of always automatically staying sorted. I wrote one.
- Cached collections that automatically drop elements not used in a long time to keep the collection from growing too large.
- Persistent collections that automatically load from disk and save back to disk.
- For other missing collections check out these sources:AmberArcher
Jakarta Common Collections (includes bidirectional map)
PCJ
Trove4J
Walt Disney Tea Trove (includes: FlyweightSet, PropertyMap, SortedArrayList)
Duplicate Keys
Nearly all the collections prevent you from storing duplicate keys. There are several ways around that:- Write your own collection that permit duplicates.
- Invent tie breaker logic to create auxiliary tail ends to the key to artificially make all keys unique, e. g. First key "rover" goes in as "rover" and the next as "rover2" Before inserting, you must alway check if the key has already been added, and if so, try a tie breaker until you find one that has not.
- Instead of storing data objects in your map, store a FIFOQueue of data objects that share the same key. This preserves FIFO order on duplicates. When there are no duplicate for a given keys, store a data object directly. You could similarly handle the problem with ArrayLists of duplicates instead of FIFOQueues. FIFOQueues have the advantage of not needing to estimate the number of duplicates for each key ahead of time.