How do I get the intersection between two arrays as a new array?

JavaC++CAlgorithm

Java Problem Overview


I faced this problem many times during various situations. It is generic to all programming languages although I am comfortable with C or Java.

Let us consider two arrays (or collections):

char[] A = {'a', 'b', 'c', 'd'};
char[] B = {'c', 'd', 'e', 'f'};

How do I get the common elements between the two arrays as a new array? In this case, the intersection of array A and B is char[] c = {'c', 'd'}.

I want to avoid the repeated iteration of one array inside the other array which will increase the execution time by (length of A times length of B) which is too much in the case of huge arrays.

Is there any way we could do a single pass in each array to get the common elements?

Java Solutions


Solution 1 - Java

foreach element e in array A
    insert e into hash table H

foreach element e in array B
    if H contains e 
        print e
  

This algorithm is O(N) in time and O(N) in space.

To avoid the extra space, you can use the sorting based approach.

Solution 2 - Java

The lower bound on efficiency is O(n) - you need to at least read all the elements. Then there are several apporaches:

Dumb simplest approach

Search for every element from array one in array two. Time complexity O(n^2).

Sorting approach

You need to sort only array one, then search for elements from array two using binary search. Time complexity: sorting O(nlogn), searching O(n * logn) = O(nlogn), total O(nlogn).

Hash approach

Create a hash table from array one elements. Search for elements form second table in the hash table. The time complexity depends on the hash function. You can achieve O(1) for searches in the optimal case (all elements will have different hash value), but O(n) in the worst case (all elements will have the same hash value). Total time complexity: O(n^x), where x is a factor of hash function efficiency (between 1 and 2).

Some hash functions are guaranteed to build a table with no collisions. But the building no longer takes strictly O(1) time for every element. It will be O(1) in most cases, but if the table is full or a collision is encountered, then the table needs to be rehashed - taking O(n) time. This happens not so often, much less frequently than clean adds. So the AMORTISED time complexity is O(1). We don't care about some of the adds taking O(n) time, as long as the majority of adds takes O(1) time.

But even so, in an extreme case, the table must be rehashed every single insertion, so the strict time complexity would be O(n^2)

Solution 3 - Java

There are a few methods in some languages that I'm aware of that do exactly what you want, have you considered looking at some of these implementations?

PHP - http://php.net/manual/en/function.array-intersect.php">array_intersect()</a>

$array1 = array("a" => "green", "red", "blue");
$array2 = array("b" => "green", "yellow", "red");
$result = array_intersect($array1, $array2);
print_r($result);

>> green
   red

Java - List.retainAll

Collection listOne = new ArrayList(Arrays.asList("milan","dingo", "elpha", "hafil", "meat", "iga", "neeta.peeta"));
Collection listTwo = new ArrayList(Arrays.asList("hafil", "iga", "binga", "mike", "dingo"));

listOne.retainAll( listTwo );
System.out.println( listOne );

>> dingo, hafil, iga

Solution 4 - Java

Since this looks to me like a string algorithm, I'll assume for a moment that its not possible to sort this sequence (hence string) then you can use Longest Common Sequence algorithm (LCS)

Assuming the input size is constant, then the problem has a complexity of O(nxm), (length of the two inputs)

Solution 5 - Java

    public static void main(String[] args) {
        char[] a = {'a', 'b', 'c', 'd'};
        char[] b = {'c', 'd', 'e', 'f'};
        System.out.println(intersect(a, b));
    }

    private static Set<Character> intersect(char[] a, char[] b) {
        Set<Character> aSet = new HashSet<Character>();
        Set<Character> intersection = new HashSet<Character>();
        for (char c : a) {
            aSet.add(c);
        }
        for (char c : b) {
            if (aSet.contains(c)) {
                intersection.add(c);
            }
        }
        return intersection;
    }

Solution 6 - Java

int s[256] // for considering all ascii values, serves as a hash function

for(int i=0;i<256;i++)
s[i]=0;

char a[]={'a','b','c','d'};
char b[]={'c','d','e','f'};

for(int i=0;i<sizeof(a);i++)
{
   s[a[i]]++;
 }

 for(int i=0;i<sizeof(b);i++)//checker function
 {
     if(s[b[i]]>0)
       cout<<b[i]; 
  }


  complexity O(m+n);
  m- length of array a
  n- length of array b

Solution 7 - Java

Google Guava

There are already many good answers to this, but if you want the one-liner approach using a library for lazy-coding, I'd go with Google Guava (for Java) and its Sets.intersection method.

(no compiler at hand, bear with me)

char[] A = {'a', 'b', 'c', 'd'};
char[] B = {'c', 'd', 'e', 'f'};

Set<Character> intersection = Sets.intersection(
    Sets.newHashSet<Character>(Chars.asList(a)),
    Sets.newHashSet<Character>(Chars.asList(b))
);

Obviously, this is assuming both arrays wouldn't have duplicates, in which case using a set data structure would make more sense and allow for this sort of operation more efficiently, especially if you don't start from an array of primitives from the start.

May or may not fit your use case, but sort of the no-brainer approach for the general case.

Solution 8 - Java

  1. Sort both the arrays.
  2. Then do loop until they have have elements common Or one of the arrays reaches its end.

Asymptotically, this takes the complexity of sorting. i.e. O(NlogN) where N is the length of longer input array.

Solution 9 - Java

If you care about duplicates, use a hash map to index list A, with the key being the element, and the value being a number of how many times that element has been seen.

You iterate through the first and for every element in A and if it does not exist in the map, put it in there with a value of 1, if it already exists in the map, add one to that value.

Next, iterate through B, and if the value exists, subtract 1. If not, put -1 in the value on the table for that element.

Finally, iterate through the map and for any element that has a value != 0, print out as a difference.

private static <T> List<T> intersectArrays(List<T> a, List<T> b) {
    Map<T, Long> intersectionCountMap = new HashMap<T, Long>((((Math.max(a.size(), b.size()))*4)/3)+1);
    List<T> returnList = new LinkedList<T>();
    for(T element : a) {
        Long count = intersectionCountMap.get(element);
        if (count != null) {
            intersectionCountMap.put(element, count+1);
        } else {
            intersectionCountMap.put(element, 1L);
        }
    }
    for (T element : b) {
        Long count = intersectionCountMap.get(element);
        if (count != null) {
            intersectionCountMap.put(element, count-1);
        } else {
            intersectionCountMap.put(element, -1L);
        }            
    }
    for(T key : intersectionCountMap.keySet()) {
        Long count = intersectionCountMap.get(key);
        if (count != null && count != 0) {
            for(long i = 0; i < count; i++) {
                returnList.add(key);
            }
        }
    }
    return returnList;
}

This should run in O(n), as we're only iterating the Lists each once, and the Map once. The Data structures here used in Java should be efficient, as the HashMap is constructed with a capacity that can handle the largest size of the lists.

I'm using a LinkedList for the return as it provides us a way of adding and iterating through a list for our unknown sized intersection.

Solution 10 - Java

The best way is not to start with arrays at all. Arrays are optimal for random access to elements, but not optimal for searching (which is what finding the intersection is all about). As you are talking about intersection, you must be regarding the arrays as sets. So use a more appropriate data structure (in Java, a Set). Then the task is much more efficient.

Solution 11 - Java

You can use tree, but time will be O(n(log n)) and elements must be comparable

Solution 12 - Java

First, sort the two arrays using best sorting algorithm.
Then, with linear search, you can get the common elements.

If an extra space is provided then we can use hash table to do that.

Solution 13 - Java

in ruby you can just say

a = ['a', 'b', 'c', 'd']
b = ['c', 'd', 'e', 'f']
c = a & b

c contains ['c','d']

Solution 14 - Java

Sort two arrays first, then iterate them, if they are the same element, add to to be returned array.

Code is here:

public static void printArr(int[] arr){
    for (int a:arr){
        System.out.print(a + ", ");
    }
    System.out.println();
}

public static int[] intersectionOf(int[] arr1, int[] arr2){
    Arrays.sort(arr1);
    Arrays.sort(arr2);

    printArr(arr1);
    printArr(arr2);

    int i=0, j=0, k=0;
    int[] arr = new int[Math.min(arr1.length, arr2.length)];

    while( i < arr1.length && j < arr2.length){
        if(arr1[i] < arr2[j]){
            i++;
        } else if(arr1[i] > arr2[j]){
            j++;
        } else {
            arr[k++] = arr1[i++];
            j++;
        }
    }
    return Arrays.copyOf(arr, k);
}

public static void main(String[] args) {
    int[] arr1 = {1, 2, 6};
    int[] arr2 = {10, 2, 5, 1};
    printArr(intersectionOf(arr1,arr2));
}

outputs:

arr1: 1, 2, 6, 
arr2: 1, 2, 5, 10, 
arr: 1, 2, 

Solution 15 - Java

Assuming you are dealing with ANSI characters. The approach should be similar for Unicode, just change the range.

char[] A = {'a', 'b', 'c', 'd'};
char[] B = {'c', 'd', 'e', 'f'};
int[] charset = new int[256]

for(int i=0; i<A.length; i++) {
  charset[A[i]]++;
}

Now iterate over the B and you can check if the corresponding charset value for the character being iterated is greater than 0. You can store them in a list or any other collection.

This approach takes O(n) time complexity and a constant space for your checks not taking into account your new array/list being used to hold the common elements.

This is better than the HashSet/Hashtable approach in terms of space complexity.

Solution 16 - Java

You could use HashSet in .NET 3.5 or later. Example c# code:

HashSet<int> set1 = new HashSet<int>(new int[]{8, 12, 13, 15});
   
HashSet<int> set2 = new HashSet<int>(new int[] { 15, 16, 7, 8, 9 });
 
set1.IntersectWith(set2);

foreach (int i in set1)

   Console.Write(i+ " ");
  
  

//output: 8 15

Solution 17 - Java

Sort one of the arrays (m Log(m) ) now Pick each element from other array and do a binary search in first array(the sorted one) ->n Log(m)

Total Time Complexity :- (n+m)Log(m).

Solution 18 - Java

I hope the following would be useful. These are two different approached:

  • Simple Intersection where you compare all the elements from one array to another array.

  • Sorting and searching based approach which sorts one array and search second array element in first array using binary search.

//

public class IntersectionOfUnsortedArrays {
	public static void main(String[] args) {
		int[] arr1 = { 12, 4, 17 };
		int[] arr2 = { 1, 12, 7, 17 };
		System.out.println("Intersection Using Simple Comparision");
		printArray(simpleIntersection(arr1, arr2));
		System.out.println("Intersection Using Sort and Binary Search");
		printArray(sortingBasedIntersection(arr1, arr2));
	}

	/*
	 * Simple intersection based on the comparison without any sorting.
	 * Complexity O(n^2)
	 */
	public static int[] simpleIntersection(int[] a, int[] b) {
		int minlen = a.length > b.length ? b.length : a.length;
		int c[] = new int[minlen];
		int k=0;
		for(int i=0;i<a.length;i++){
			for(int j=0;j<b.length;j++){
				if(a[i]==b[j]){
					c[k++]=a[i];
				}
			}
		}
		int arr[] = new int[k];
		// copy the final array to remove unwanted 0's from the array c
		System.arraycopy(c, 0, arr, 0, k);
		return arr;
	}

	/*
	 * Sorting and Searching based intersection.
	 * Complexity Sorting O(n^2) + Searching O(log n)
	 */
	
	public static int[] sortingBasedIntersection(int[] a, int[] b){
		insertionSort(a);
		int minlen = a.length > b.length ? b.length : a.length;
		int c[] = new int[minlen];
		int k=0;
		for(int i=0;i<b.length;i++){
			int result = binarySearch(a,0,a.length,b[i]);
			if(result > -1){
				c[k++] = a[result];
			}
		}
		int arr[] = new int[k];
		// copy the final array to remove unwanted 0's from the array c
		System.arraycopy(c, 0, arr, 0, k);
		return arr;
	}
	
	public static void insertionSort(int array[]) {
		for (int i = 1; i < array.length; i++) {
			int j = i;
			int b = array[i];
			while ((j > 0) && (array[j - 1] > b)) {
				array[j] = array[j - 1];
				j--;
			}
			array[j] = b;
		}
	}
	
	static int binarySearch(int arr[], int low, int high, int num) {
		if (high < low)
			return -1;
		int mid = (low + high) / 2;
		if (num == arr[mid])
			return mid;
		if (num > arr[mid])
			return binarySearch(arr, (mid + 1), high, num);
		else
			return binarySearch(arr, low, (mid - 1), num);
	}
	
	public static void printArray(int[] array) {
		for (int value : array) {
			System.out.print(" "+value);
		}
		System.out.println("\n");
	}
}

Solution 19 - Java

If the collections are already sorted, as shown in the question, then the best solution (not mentioned yet) is a merge-sort-like algorithm that runs in O(n+m).

Compare the first elements of each collection. If they're the same, add the element to the intersection set and pop both elements from their collections. If the elements are different, pop the element that's greater, in comparison, to the other element. Repeat until one collection is empty.

Solution 20 - Java

Using Java 8 features, here is an algorithm which honours duplicates within a list instead of turning a list into a set. No sorting, so no n log n.

  1. Convert one of the lists into a map, with value being number of occurrences (cost: O(n)).
  2. For each item in the other list, if the item exists in the map, decrease occurrence by one (cost: O(n)).

Therefore, overall cost is O(n). Code:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class Dup {
  public static void main(String[] args) {
	List<Integer> listA = Arrays.asList(3, 1, 4, 1, 9, 5, 9);
	List<Integer> listB = Arrays.asList(2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3);
	findCommons(listA, listB);
  }

  static void findCommons(List<Integer> listA, List<Integer> listB) {
	Map<Integer, Long> mapA = 
		listA.stream().collect(
			Collectors.groupingBy(Integer::intValue, Collectors.counting()));
	
	List<Integer> commons = new ArrayList<>();
	listB.stream()
		.filter(e -> mapA.get(e) != null)
		.filter(e -> mapA.get(e) > 0)
		.forEach(e -> {
			mapA.put(e, mapA.get(e) - 1);
			commons.add(e);
		});
	
	System.out.println(commons);
  }
}

Code above will give this output: [5, 3, 9, 9].

Solution 21 - Java

import java.util.Scanner;

public class arraycommon {

public static void main(String[] args) {
    Scanner sc=new Scanner(System.in);
	// display common element in two diffrent array
	int sizea,sizeb,i=0,j=0,k=0;
	int count=0;
	System.out.println("enter the size array A:"+'\n');
	sizea=sc.nextInt();
	System.out.println("enter the size array B"+'\n');
	sizeb=sc.nextInt();
	int a[]=new int[sizea];
	int b[]=new int[sizeb];
	int c[]=new int[sizea];
	
	
	System.out.println("enter the element in array A:"+'\n');
	for (i = 0; i < sizea; i++) {
		
		a[i]=sc.nextInt();
	}
	System.out.println("enter the element in array B:"+'\n');
	for (i = 0; i < sizeb; i++) {
		
		b[i]=sc.nextInt();
	}
	System.out.println("the element in array A:"+'\n');
	for (i = 0; i < sizea; i++) {
		
		System.out.print(a[i]+" ");
		
	}
	System.out.println('\n');
	System.out.println("the element in array B:"+'\n');
	for (i = 0; i < sizeb; i++) 
	{
		
		System.out.print(b[i]+" ");
	}
	
	for (i = 0; i <sizea; i++) 
	{
		for (j = 0; j < sizeb; j++) 
		{
		   if(a[i]==b[j])
		   {
			   count++;
			   c[k]=a[i];
			   k=k+1;
		   }
		}
	}
	System.out.println('\n');
	System.out.println("element common in array is");
	
	if(count==0)
	{
		System.out.println("sorry no common elements");
	}
	else
	{
		for (i = 0; i <count; i++) 
		{
		
		System.out.print(c[i]+" ");
		}
	}
	
}

}

Solution 22 - Java

    simply search each element of first array with each element of second array and stored matched result in third array
class Union
{
  public static void main(String[] args) {
  char a[] ={'f','g','d','v','a'};
  char b[] ={'a','b','c','d','e'};
  char temp[] = new char[5];
  int p=0;
  for(int i=0;i<a.length;i++)
  {
    for(int j=0;j<b.length;j++)
    {
      if(a[i]==b[j])     //searches if both array has common element
      {

        temp[p] = a[i];   //if match found store it in a new array
        p++;
      }

    }

  }
  for(int k=0;k<temp.length;k++)
  {
      System.out.println(temp[k]);
  }

  }
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionRanjan SarmaView Question on Stackoverflow
Solution 1 - JavacodaddictView Answer on Stackoverflow
Solution 2 - JavaJakub ZaverkaView Answer on Stackoverflow
Solution 3 - JavaMikeView Answer on Stackoverflow
Solution 4 - JavaMoataz ElmasryView Answer on Stackoverflow
Solution 5 - JavaMik378View Answer on Stackoverflow
Solution 6 - JavaSumit Kumar SahaView Answer on Stackoverflow
Solution 7 - JavahaylemView Answer on Stackoverflow
Solution 8 - JavaP.PView Answer on Stackoverflow
Solution 9 - JavaNicholasView Answer on Stackoverflow
Solution 10 - JavaRaedwaldView Answer on Stackoverflow
Solution 11 - JavaYolaView Answer on Stackoverflow
Solution 12 - JavakishoreView Answer on Stackoverflow
Solution 13 - JavaColin MacKenzie - IIIView Answer on Stackoverflow
Solution 14 - JavaAlan DongView Answer on Stackoverflow
Solution 15 - JavaVamshidhar BeharaView Answer on Stackoverflow
Solution 16 - JavaSanjView Answer on Stackoverflow
Solution 17 - JavanavguptaView Answer on Stackoverflow
Solution 18 - JavaDeepak SinghviView Answer on Stackoverflow
Solution 19 - JavaNickView Answer on Stackoverflow
Solution 20 - JavamohsenmadiView Answer on Stackoverflow
Solution 21 - Javauser6885473View Answer on Stackoverflow
Solution 22 - JavaAkash SalunkheView Answer on Stackoverflow