The Set Interface
- A set is a collection that does not contain duplicate elements.
- The Set interface models a mathematical set.
- The Set interface extends Collection and contains no instance methods other than those inherited from Collection (the static factory methods Set.of and Set.copyOf were added in Java 9 and Java 10). It adds the requirement that no duplicates are allowed.
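The no-duplicates requirement shows up in the behavior of the inherited add method: it returns false, and leaves the set unchanged, when the element is already present. The following is a minimal sketch of that behavior; the class name SetBasics and the helper toSet are made up for illustration and are not part of the JDK.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; only HashSet, Set, and add() come from the JDK.
class SetBasics {
    // Adds each word to a new set; repeated words are ignored by the set.
    static Set<String> toSet(String... words) {
        Set<String> set = new HashSet<String>();
        for (String w : words) {
            set.add(w); // returns false if w is already in the set
        }
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<String>();
        System.out.println(set.add("hello")); // true: the element was added
        System.out.println(set.add("hello")); // false: duplicate, set unchanged
        System.out.println(set.size());       // 1
    }
}
```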
Two general-purpose Set implementations in the JDK:
- HashSet - stores its elements in a hash table (typically the best performance)
- TreeSet - stores its elements in a red-black tree, which keeps them in sorted order
Example: Let c be a collection. Create another collection that contains the same elements, but which does not contain duplicate elements.
Collection<Integer> c = new ArrayList<Integer>();
// assume elements have been added to c
Collection<Integer> noDups = new HashSet<Integer>(c);
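For readers who want to run the example, here is one possible self-contained version. The class name NoDups, the helper removeDups, and the sample values are assumptions added for illustration; the technique itself is the same HashSet copy constructor used above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

// Hypothetical wrapper class around the example above.
class NoDups {
    // Returns a new collection with the same elements as c, minus duplicates.
    static <E> Collection<E> removeDups(Collection<E> c) {
        return new HashSet<E>(c);
    }

    public static void main(String[] args) {
        // Sample data chosen for illustration only.
        Collection<Integer> c = new ArrayList<Integer>(Arrays.asList(3, 1, 3, 2, 1));
        Collection<Integer> noDups = removeDups(c);
        System.out.println(c.size());      // 5
        System.out.println(noDups.size()); // 3
    }
}
```

The original collection c is not modified; the HashSet constructor copies its elements.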
Exercise: Write a program that takes words on the command line and prints out any duplicate words, the number of distinct words, and a list of the words without duplicates.
Note: A HashSet doesn't guarantee any ordering of the elements in a set. Use a TreeSet instead of a HashSet in the previous exercise if you want the words in the set printed in alphabetical order.
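The ordering difference in the note can be seen with a small sketch. The class name SortedWords and the helper sortedDistinct are hypothetical. A TreeSet of strings uses String's natural ordering, which is alphabetical for words of the same case (uppercase letters sort before lowercase ones).

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; not a solution to the exercise above.
class SortedWords {
    // Returns the distinct words in their natural (sorted) order.
    static Set<String> sortedDistinct(String... words) {
        return new TreeSet<String>(Arrays.asList(words));
    }

    public static void main(String[] args) {
        System.out.println(sortedDistinct("hello", "world", "hello", "again"));
        // prints [again, hello, world]
    }
}
```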
Exercise: Write a program that takes words on the command line and prints the words that occur only once, and then prints the words that occur more than once.
Sample Run:
java RemoveDuplicates hello world hello again
unique words: [world, again]
duplicated words: [hello]