By Pankaj Kumar
The Java Stream distinct() method returns a new stream consisting of the distinct elements of the original stream, compared using equals(). It’s useful for removing duplicate elements from a collection before processing them.
Let’s see how to use stream distinct() method to remove duplicate elements from a collection.
jshell> List<Integer> list = List.of(1, 2, 3, 4, 3, 2, 1);
list ==> [1, 2, 3, 4, 3, 2, 1]
jshell> List<Integer> distinctInts = list.stream().distinct().collect(Collectors.toList());
distinctInts ==> [1, 2, 3, 4]
Since distinct() is an intermediate operation, we can chain a terminal operation such as forEach() after it to process only the unique elements.
jshell> List<Integer> list = List.of(1, 2, 3, 4, 3, 2, 1);
list ==> [1, 2, 3, 4, 3, 2, 1]
jshell> list.stream().distinct().forEach(x -> System.out.println("Processing " + x));
Processing 1
Processing 2
Processing 3
Processing 4
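The outputs above keep the first occurrence of each value, in the order the list supplies them. For an ordered stream such as one created from a List, the API documentation describes this selection as stable. The following is a minimal sketch of that behavior with strings; the class name DistinctOrderSketch and the helper distinctInOrder are made up for illustration and are not part of the original example.

```java
import java.util.List;
import java.util.stream.Collectors;

public class DistinctOrderSketch {

    // Hypothetical helper: removes duplicates while keeping the first
    // occurrence of each element in encounter order.
    static List<String> distinctInOrder(List<String> input) {
        return input.stream()
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("b", "a", "b", "c", "a");
        // "b" and "a" appear twice; only their first occurrences survive.
        System.out.println(distinctInOrder(words)); // prints [b, a, c]
    }
}
```

For unordered or parallel streams the documentation makes no such stability guarantee, so this sketch should only be read as describing the sequential, ordered case.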
Let’s look at a simple example of using distinct() to remove duplicate elements from a list of custom objects.
package com.journaldev.java;

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class JavaStreamDistinct {

    public static void main(String[] args) {
        List<Data> dataList = new ArrayList<>();
        dataList.add(new Data(10));
        dataList.add(new Data(20));
        dataList.add(new Data(10));
        dataList.add(new Data(20));
        System.out.println("Data List = " + dataList);
        List<Data> uniqueDataList = dataList.stream().distinct().collect(Collectors.toList());
        System.out.println("Unique Data List = " + uniqueDataList);
    }
}

class Data {
    private int id;

    Data(int i) {
        this.setId(i);
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    @Override
    public String toString() {
        return String.format("Data[%d]", this.id);
    }
}
Output:
Data List = [Data[10], Data[20], Data[10], Data[20]]
Unique Data List = [Data[10], Data[20], Data[10], Data[20]]
The distinct() method didn’t remove the duplicate elements. It’s because we didn’t implement the equals() method in the Data class. So the superclass Object equals() method was used to identify equal elements. The Object class equals() method implementation is:
public boolean equals(Object obj) {
    return (this == obj);
}
The Data objects had the same ids, but they were different object instances, so they were considered not equal. That’s why it’s very important to implement the equals() method if you are planning to use the stream distinct() method with custom objects. Note that hash-based collections such as HashSet and HashMap use both hashCode() and equals() to check whether two objects are equal, and objects that are equal must return the same hash code. So it’s better to provide an implementation for both of them.
@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + id;
    return result;
}

@Override
public boolean equals(Object obj) {
    System.out.println("Data equals method");
    if (this == obj)
        return true;
    if (obj == null)
        return false;
    if (getClass() != obj.getClass())
        return false;
    Data other = (Data) obj;
    return id == other.id;
}
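Tip: You can easily generate the equals() and hashCode() methods using the “Eclipse > Source > Generate equals() and hashCode()” menu option. The output after adding the equals() and hashCode() implementations is shown below. The “Data equals method” message appears only twice because a hash-based lookup compares hash codes first and calls equals() only for an element whose hash code matches one already seen.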
Data List = [Data[10], Data[20], Data[10], Data[20]]
Data equals method
Data equals method
Unique Data List = [Data[10], Data[20]]
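To see all the pieces together, here is a compact, self-contained sketch of the same idea. It is a variation rather than the article’s exact code: the class name DistinctDataSketch and the helper unique are made up for illustration, the field is final, the logging in equals() is dropped, and hashCode() is assumed to delegate to java.util.Objects.hash() instead of the hand-written prime arithmetic.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class DistinctDataSketch {

    // Illustrative stand-in for the Data class: two instances are equal
    // when their ids are equal.
    static final class Data {
        private final int id;

        Data(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof Data)) {
                return false;
            }
            return id == ((Data) obj).id;
        }

        @Override
        public int hashCode() {
            // Equal ids always produce equal hash codes.
            return Objects.hash(id);
        }

        @Override
        public String toString() {
            return String.format("Data[%d]", id);
        }
    }

    // Hypothetical helper: returns the distinct elements in encounter order.
    static List<Data> unique(List<Data> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Data> dataList = List.of(new Data(10), new Data(20), new Data(10), new Data(20));
        // prints: Unique Data List = [Data[10], Data[20]]
        System.out.println("Unique Data List = " + unique(dataList));
    }
}
```

Using instanceof instead of getClass() is a design choice that is safe here only because the sketch’s Data class is final; for a class that can be subclassed, the getClass() comparison shown earlier avoids asymmetric equality.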
Reference: Stream distinct() API Doc
Java and Python Developer for 20+ years, Open Source Enthusiast, Founder of https://www.askpython.com/, https://www.linuxfordevices.com/, and JournalDev.com (acquired by DigitalOcean). Passionate about writing technical articles and sharing knowledge with others. Love Java, Python, Unix and related technologies. Follow my X @PankajWebDev
Do you know how ‘dataList.stream().distinct().collect(Collectors.toList());’ this line actually works? I need to loop to check if data from two array matches one another where one array contains 80K data. And the two loops are taking too much time. So i need to know any mechanism how i can loop through two arraylist and reduce my time. Thank You.
- Taslima