Two elements in array whose xor is maximum

ArraysAlgorithmBit ManipulationXor

Arrays Problem Overview


Given an array of integers ,You have to find two elements whose XOR is maximum.

There is naive approach --just by picking each element and xoring with other elements and then comparing the results to find the pair.

Other than this ,Is there any efficient algorithm?

Arrays Solutions


Solution 1 - Arrays

I think I have a O(n lg U) algorithm for this, where U is the largest number. The idea is similar to user949300's, but with a bit more detail.

The intuition is as follows. When you're XORing two numbers together, to get the maximum value, you want to have a 1 at the highest possible position, and then of the pairings that have a 1 at this position, you want a pairing with a 1 at the next possible highest position, etc.

So the algorithm is as follows. Begin by finding the highest 1 bit anywhere in the numbers (you can do this in time O(n lg U) by doing O(lg U) work per each of the n numbers). Now, split the array into two pieces - one of the numbers that have a 1 in that bit and the group with 0 in that bit. Any optimal solution must combine a number with a 1 in the first spot with a number with a 0 in that spot, since that would put a 1 bit as high as possible. Any other pairing has a 0 there.

Now, recursively, we want to find the pairing of numbers from the 1 and 0 group that has the highest 1 in them. To do this, of these two groups, split them into four groups:

  • Numbers starting with 11
  • Numbers starting with 10
  • Numbers starting with 01
  • Numbers starting with 00

If there are any numbers in the 11 and 00 group or in the 10 and 01 groups, their XOR would be ideal (starting with 11). Consequently, if either of those pairs of groups isn't empty, recursively compute the ideal solution from those groups, then return the maximum of those subproblem solutions. Otherwise, if both groups are empty, this means that all the numbers must have the same digit in their second position. Consequently, the optimal XOR of a number starting with 1 and a number starting with 0 will end up having the next second bit cancel out, so we should just look at the third bit.

This gives the following recursive algorithm that, starting with the two groups of numbers partitioned by their MSB, gives the answer:

  • Given group 1 and group 0 and a bit index i:
    • If the bit index is equal to the number of bits, return the XOR of the (unique) number in the 1 group and the (unique) number in the 0 group.
    • Construct groups 11, 10, 01, and 00 from those groups.
    • If group 11 and group 00 are nonempty, recursively find the maximum XOR of those two groups starting at bit i + 1.
    • If group 10 and group 01 are nonempty, recursively find the maximum XOR of those two groups, starting at bit i + 1.
    • If either of the above pairings was possible, then return the maximum pair found by the recursion.
    • Otherwise, all of the numbers must have the same bit in position i, so return the maximum pair found by looking at bit i + 1 on groups 1 and 0.

To start off the algorithm, you can actually just partition the numbers from the initial group into two groups - numbers with MSB 1 and numbers with MSB 0. You then fire off a recursive call to the above algorithm with the two groups of numbers.

As an example, consider the numbers 5 1 4 3 0 2. These have representations

101  001  100   011   000   010

We begin by splitting them into the 1 group and the 0 group:

101  100
001  011  000  010

Now, we apply the above algorithm. We split this into groups 11, 10, 01, and 00:

11:
10:  101  100
01:  011  010
00:  000  001

Now, we can't pair any 11 elements with 00 elements, so we just recurse on the 10 and 01 groups. This means we construct the 100, 101, 010, and 011 groups:

101: 101
100: 100
011: 011
010: 010

Now that we're down to buckets with just one element in them, we can just check the pairs 101 and 010 (which gives 111) and 100 and 011 (which gives 111). Either option works here, so we get that the optimal answer is 7.

Let's think about the running time of this algorithm. Notice that the maximum recursion depth is O(lg U), since there are only O(log U) bits in the numbers. At each level in the tree, each number appears in exactly one recursive call, and each of the recursive calls does work proportional to the total number of numbers in the 0 and 1 groups, because we need to distribute them by their bits. Consequently, there are O(log U) levels in the recursion tree, and each level does O(n) work, giving a total work of O(n log U).

Hope this helps! This was an awesome problem!

Solution 2 - Arrays

This can be solved in O(NlogN) time complexity using Trie.

  • Construct a trie. For each integer key, each node of the trie will hold every bit(0 or 1) starting from most significant bit.
  • Now for each arr[i] element of arr[0, 1, ..... N] - Perform query to retrieve the maximum xor value possible for arr[i]. We know xor of different type of bits(0 ^ 1 or 1 ^ 0) is always 1. So during query for each bit, try to traverse node holding opposite bit. This will make that particular bit 1 result in maximizing xor value. If there is no node with opposite bit, only then traverse the same bit node. - After query, insert arr[i] into trie. - For each element, keep track the maximum Xor value possible. - During walking through each node, build the other key for which the Xor is being maximized.

For N elements, we need one query(O(logN)) and one insertion(O(logN)) for each element. So the overall time complexity is O(NlogN).

You can find nice pictorial explanation on how it works in this thread.

Here is C++ implementation of the above algorithm:

const static int SIZE = 2;
const static int MSB = 30;
class trie {
private:
    struct trieNode {
        trieNode* children[SIZE];
        trieNode() {
            for(int i = 0; i < SIZE; ++i) {
                children[i] = nullptr;
            }
        }
        ~trieNode() {
            for(int i = 0; i < SIZE; ++i) {
                delete children[i];
                children[i] = nullptr;
            }
        }
    };
    trieNode* root;
public:
    trie(): root(new trieNode()) {
    }
    ~trie() {
        delete root;
        root = nullptr;
    }
    
    void insert(int key) {
        trieNode* pCrawl = root;
        for(int i = MSB; i >= 0; --i) {
            bool bit = (bool)(key & (1 << i));
            if(!pCrawl->children[bit]) {
                pCrawl->children[bit] = new trieNode();
            }
            pCrawl = pCrawl->children[bit];
        }
    }
    
    int query(int key, int& otherKey) {
        int Xor = 0;
        trieNode *pCrawl = root;
        for(int i = MSB; i >= 0; --i) {
            bool bit = (bool)(key & (1 << i));
            if(pCrawl->children[!bit]) {
                pCrawl = pCrawl->children[!bit];
                Xor |= (1 << i);
                if(!bit) {
                    otherKey |= (1 << i); 
                } else {
                    otherKey &= ~(1 << i);
                }
            } else {
                if(bit) {
                    otherKey |= (1 << i); 
                } else {
                    otherKey &= ~(1 << i);
                }
                pCrawl = pCrawl->children[bit];
            }
        }
        return Xor;
    }
};

pair<int, int> findMaximumXorElements(vector<int>& arr) {
    int n = arr.size();
    int maxXor = 0;
    pair<int, int> result; 
    if(n < 2) return result;
    trie* Trie = new trie();
    Trie->insert(0); // insert 0 initially otherwise first query won't find node to traverse
    for(int i = 0; i < n; i++) {
        int elem = 0;
        int curr = Trie->query(arr[i], elem);
        if(curr > maxXor) {
            maxXor = curr;
            result = {arr[i], elem};
        }
        Trie->insert(arr[i]);
    }
    delete Trie;
    return result;
}

Solution 3 - Arrays

Ignoring the sign bit, one of the values must be one of the values with the highest significant bit set. Unless all the values have that bit set, in which case you go to the next highest significant bit that isn't set in all the values. So you could pare down the possibilities for the 1st value by looking at the HSB. For example, if the possibilities are

0x100000
0x100ABC
0x001ABC
0x000ABC

The 1st value of the max pair must be either 0x100000 or 0x10ABCD.

@internal Server Error I don't think smallest is necessarily correct. I don't have a great idea for paring down the 2nd value. Just any value that isn't in the list of possible 1st values. In my example, 0x001ABC or 0x000ABC.

Solution 4 - Arrays

A very interesting problem! Here is my idea:

  • First build a binary tree from all the numbers by using the binary representation and sort them into the tree most significant bit first (add leading zeros to match the longest number). When done each path from the root to any leaf represents one number from the original set.
  • Let a and b be pointers to a tree node and initialize them at the root.
  • Now move a and b down the tree, trying to use opposite edges at each step, i.e. if a moves down a 0-edge, b moves down a 1-edge unless its not possible.

If a and b reach a leaf, the should point to two numbers with "very few" identical bits.

I just made this algorithm up and do not know if its correct or how to prove it. However it should be in O(n) running time.

Solution 5 - Arrays

Make a recursive function that takes two lists of integers, A and B, as its arguments. As its return value, it returns two integers, one from A and one from B, which maximize the XOR of the two. If all the integers are 0, return (0,0). Otherwise, the function does some processing and calls itself recursively twice, but with smaller integers. In one of the recursive calls, it considers taking an integer from list A to supply a 1 to bit k, and in the other call it considers taking an integer from list B to supply a 1 to bit k.

I don't have time now to fill in the details, but maybe this will be enough for to see the answer? Also, I'm not sure if the run time will be better than N^2, but it probably will be.

Solution 6 - Arrays

We can find the maximum number in O(n) time then loop through the array doing xor with each element. Assuming xor operation cost is O(1) we can find max xor of two numbers in O(n) time.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionAnil AryaView Question on Stackoverflow
Solution 1 - ArraystemplatetypedefView Answer on Stackoverflow
Solution 2 - ArraysKaidulView Answer on Stackoverflow
Solution 3 - Arraysuser949300View Answer on Stackoverflow
Solution 4 - ArraysNiklasMMView Answer on Stackoverflow
Solution 5 - ArraysDavid GraysonView Answer on Stackoverflow
Solution 6 - Arraysuser4131265View Answer on Stackoverflow