Sets in vanilla Java and io.vavr.collection.Set

Hi there! With this post I would like to continue my series of articles on Vavr library. This time we will explore sets data structures and compare “vanilla” Java implementation from Java Collection Framework with one that is offered by Vavr. As always, we start with a generic definition of a data structure and continue with a review of common operations and how they are performed for java.util.Set as well for Vavr.

Table of Contents

Set as a data structure

As it was with arrays and queues before, we will start our journey from the basic definition of sets as a data structure. In computer programming, set is an implementation of the mathematical concept of (finite) set and is defined as a collection of elements with no duplicates and no ordering (reference). Although, for some concrete implementations it is true that they have ordered elements.

Sets in Java

Java includes sets as a part of Java Collections Framework. java.util.Set is a a collection that contains to duplicates and at most 1 null element. Only one null element is allowed because, logically, one null is unique by its nature. So, how Java understands if an element is unique? For unique element e1 is true that there is no element e2 in set already that e1.equals(e2) == true.

Let observe concrete implementations of sets in Java – the graph below demonstrates them:

Graph 1. java.util.Set and its subclasses

These types are differ on following basis: does this set allow null elements and is it sorted? We can sum up these parameters into the following table:

HashSetLinkedHashSetTreeSet
Sorted elementsNoYesYes
Allows null elementsYesYesNo

NB that the behavior of a set is not sprecified if the value of an object is changed in a manner that affects equals() method compararisions while the object is in set. Now, let review common set operations permitted by java.util.Set and its implementations.

Add a new element

Sets use add (E e) method as any other collection. This operation returns a boolean value that is true if element was inserted successfully, or false is element presents already. To validate an uniqueness of an element, Java uses equals() logic for an element. Let have a quick example:

@Test
public void addToSetTest(){
    Set<Integer> numbers = new HashSet<>();
    numbers.add(1);
    numbers.add(2);
    numbers.add(3);
    numbers.add(2); // allows no duplicates
    int length = numbers.size();
    assertEquals(3, length);
}

You can observe here, that while we add 4 elements, the size of numbers is 3, because one element (2) is already presented. Therefore:

boolean result = numbers.add(2);
assertFalse(result);

Remove an element

To remove an element we use a method remove(Object obj) that is also common for all collections. As same with the aforesaid case of an addition, delete operation uses equals() to determine an existence of an element in the set and returns boolean value, that is true is the element obj was removed. So, set will not contain the element once the call returns. Take a look on the code snippet below:

@Test
public void removeFromSetTest(){
    Set<Integer> numbers = new HashSet<>();
    numbers.add(1);
    numbers.add(2);
    numbers.add(3);
    boolean removed = numbers.remove(3);
    assertTrue(removed);
    assertEquals(2, numbers.size());
}

NB pay your attention on the method signature: it accepts Object, rather than type. So, ClassCastException may be thrown in case when the argument does not match its type to the type of set.

Replace an element

Unlike lists, that allows to replace an arbitary element, sets do not allow such behavior. By default, if set already holds an element, new addition will not be added and will not replace the existing one. However, we can write such logic manually:

if (set.add(e) == false){
	set.remove(e);
	set.add(e);
}

In this code we first check if set does have an element e. We can do it using add() method, that, as we mentioned already, returns false is an element already there. In this case we first remove the existing element and then add replacing element.

Get an element from set

We previously discussed, that java.util.Collection itself does not offer ways to access the specific element, as it is strictly depends on the logic of concrete data structures. For example, lists and arrays permit developers to get a value based on numeric index [0; length – 1]. Set does not have these methods, but we implement it in two ways:

  1. Using iterator
  2. Using stream API

Let have an example:

TreeSet<Integer> numbers = new TreeSet<>();
numbers.add(50);
numbers.add(10);
numbers.add(15);
numbers.add(8);

We create TreeSet that contains integer numbers. NB, that TreeSet has sorted elements. Than, we can use iterator to get the first element:

Iterator<Integer> iterator = numbers.iterator();
Integer first = iterator.next();
assertEquals(Integer.valueOf(8), first);

If we need specific element we can iterate through the set and find the desired one like this:

if (iterator.hasNext()){
	Integer value = iterator.next();
	if (value == e){
		// we found it!
	}
}

Another approach is usage of streams. We assume that set is ordered (in this case, Person entity implements Comparable)

TreeSet<Person> people = new TreeSet<>();

people.add(new Person("Jana", "Cermakova"));
people.add(new Person("Tereza", "Vodickova"));
people.add(new Person("Zuzana", "Novakova"));
people.add(new Person("Jana", "Vojtechova"));

Person query = new Person("Zuzana", "Novakova");

Optional<Person> result = people.stream().filter(person -> person.equals(query)).findFirst();
Person person = result.get();

That was about vanilla Java. As we know, Java collections are mutable. For this, it is better to use immutable data structures. Vavr library also offers sets, but they are immutable, unlike Java ones. Let review them.

Sets in Vavr

In Vavr, sets are defined by io.vavr.collection.Set class that is part of Vavr Collection API. The following diagram demonstrates the place of sets among other Vavr data structures:

Graph 2. Sets in Vavr

NB that sets are Traversable, but corresponds to a separate group and do not inherit Seq as arrays. Also, generally, sets in Vavr permit null elements (as long as only one null presents in a set).

Add a new element

Vavr collections are immutable, so any data manipulation returns a new instance and the existing is not affected. In the example code below we create a new set using factory ofAll method (used to create Vavr traversables from primitives) and after adding a new number, we actually create a new set:

@Test
public void addToSetTest(){
    Set<Integer> numbers = HashSet.ofAll(15, 25, 10, 19, 93);
    assertEquals(5, numbers.size());

    // vavr collections are immutable

    Set<Integer> result = numbers.add(100);
    assertEquals(5, numbers.size());
    assertEquals(6, result.size());
}

We can clearly assert that numbers is immutable, as its length is not affected after adding 100.

Remove an element

When we delete an element from the Vavr set, we as in the case with addition, do not affect existing collection. For this use remove(Element e) method from Traversable. Rather, we create a new one. Take a look on the code snippet below:

@Test
public void removeFromSetTest(){
    Set<Integer> numbers = HashSet.ofAll(15, 25, 10, 19, 93);

    // vavr collections are immutable

    Set<Integer> results = numbers.remove(25);
    assertFalse(results.contains(25));
    assertTrue(numbers.contains(25));
}

s in the previous example, remove() did not changed numbers values, but returned a new set. We can assert this, due to the fact, that results does not contain 25, but numbers does.

Replace an element

In vanilla Java we have to implement replacement manually, however Vavr includes such method out of the box. For this, let say we have a set of cars, where we need to replace car (they are compared based on license plate) because its fuel level was decreased. For this, use replace(E old, E one) method.

@Test
public void replaceTest(){
    Set<FuelableCar> cars = HashSet
       .of(new FuelableCar(1234, "Skoda rapid", 100), 
                new FuelableCar(2314, "VW Gold", 100),
                new FuelableCar(8543, "Audi A6", 100));

    System.out.println(cars);
    assertEquals(3, cars.size());

        // now, reduce fuel level in audi

    FuelableCar audi = new FuelableCar(8543, "Audi A6", 85);
    Set<FuelableCar> result = cars.replace(cars.last(), audi);
    System.out.println(result);
}

We use last() method here to refer to the last element. As cars are sorted based on their license plate numbers, 8543 is the last value in set. Now, we can replace it with the new element with reduced fuel level. When you run this code, you will see that car was replaced:

Graph 3. Result of replacing an element in Vavr

Conclusion

Let sum up. In this post we explored sets implementations available for Java developers. Set is an implementation of the mathematical concept of a finite set, therefore is a data structure that does not allow duplicate elements. Java Collections Framework includes its implementation; another [better] approach – is immutable Set collection from Vavr library.

References

  • Grzegorz Piwowarek. Vavr Collections and Java Stream API Collectors (2017) 4Comprehension, read here

Leave a Reply

Your email address will not be published. Required fields are marked *