Using a byte array as Map key

JavaHashmapBytearray

Java Problem Overview


Do you see any problem with using a byte array as Map key? I could also do new String(byte[]) and hash by String but it is more straightforward to use byte[].

Java Solutions


Solution 1 - Java

It's okay so long as you only want reference equality for your key - arrays don't implement "value equality" in the way that you'd probably want. For example:

byte[] array1 = new byte[1];
byte[] array2 = new byte[1];

System.out.println(array1.equals(array2));
System.out.println(array1.hashCode());
System.out.println(array2.hashCode());

prints something like:

false
1671711
11394033

(The actual numbers are irrelevant; the fact that they're different is important.)

Assuming you actually want equality, I suggest you create your own wrapper which contains a byte[] and implements equality and hash code generation appropriately:

public final class ByteArrayWrapper
{
    private final byte[] data;
    
    public ByteArrayWrapper(byte[] data)
    {
        if (data == null)
        {
            throw new NullPointerException();
        }
        this.data = data;
    }
    
    @Override
    public boolean equals(Object other)
    {
        if (!(other instanceof ByteArrayWrapper))
        {
            return false;
        }
        return Arrays.equals(data, ((ByteArrayWrapper)other).data);
    }
    
    @Override
    public int hashCode()
    {
        return Arrays.hashCode(data);
    }
}

Note that if you change the values within the byte array after using the ByteArrayWrapper, as a key in a HashMap (etc) you'll have problems looking up the key again... you could take a copy of the data in the ByteArrayWrapper constructor if you want, but obviously that will be a waste of performance if you know you won't be changing the contents of the byte array.

EDIT: As mentioned in the comments, you could also use ByteBuffer for this (in particular, its ByteBuffer#wrap(byte[]) method). I don't know whether it's really the right thing, given all the extra abilities that ByteBuffers have which you don't need, but it's an option.

Solution 2 - Java

The problem is that byte[] uses object identity for equals and hashCode, so that

byte[] b1 = {1, 2, 3}
byte[] b2 = {1, 2, 3}

will not match in a HashMap. I see three options:

  1. Wrapping in a String, but then you have to be careful about encoding issues (you need to make certain that the byte -> String -> byte gives you the same bytes).
  2. Use List<Byte> (can be expensive in memory).
  3. Do your own wrapping class, writing hashCode and equals to use the contents of the byte array.

Solution 3 - Java

We can use ByteBuffer for this (This is basically the byte[] wrapper with a comparator)

HashMap<ByteBuffer, byte[]> kvs = new HashMap<ByteBuffer, byte[]>();
byte[] k1 = new byte[]{1,2 ,3};
byte[] k2 = new byte[]{1,2 ,3};
byte[] val = new byte[]{12,23,43,4};

kvs.put(ByteBuffer.wrap(k1), val);
System.out.println(kvs.containsKey(ByteBuffer.wrap(k2)));

will print

true

Solution 4 - Java

You could use java.math.BigInteger. It has a BigInteger(byte[] val) constructor. It's a reference type, so could be used as a key for hashtable. And .equals() and .hashCode() are defined as for respective integer numbers, which means BigInteger has consistent equals semantics as byte[] array.

Solution 5 - Java

I am very surprised that the answers are not pointing out the most simple alternative.

Yes, it is not possible to use a HashMap, but nobody prevents you from using a SortedMap as alternative. The only thing is to write a Comparator which needs to compare the arrays. It is not as performant as a HashMap, but if you want a simple alternative, here you go (you can replace SortedMap with Map if you want to hide the implementation):

 private SortedMap<int[], String>  testMap = new TreeMap<>(new ArrayComparator());

 private class ArrayComparator implements Comparator<int[]> {
    @Override
    public int compare(int[] o1, int[] o2) {
      int result = 0;
      int maxLength = Math.max(o1.length, o2.length);
      for (int index = 0; index < maxLength; index++) {
        int o1Value = index < o1.length ? o1[index] : 0;
        int o2Value = index < o2.length ? o2[index] : 0;
        int cmp     = Integer.compare(o1Value, o2Value);
        if (cmp != 0) {
          result = cmp;
          break;
        }
      }
      return result;
    }
  }

This implementation can be adjusted for other arrays, the only thing you must be aware of is that equal arrays (= equal length with equal members) must return 0 and that you have a determistic order

Solution 6 - Java

I believe that arrays in Java do not necessarily implement the hashCode() and equals(Object) methods intuitively. That is, two identical byte arrays will not necessarily share the same hash code and they will not necessarily claim to be equal. Without these two traits, your HashMap will behave unexpectedly.

Therefore, I recommend against using byte[] as keys in a HashMap.

Solution 7 - Java

You should use create a class somthing like ByteArrKey and overload hashcode and equal methods, remember the contract between them.

This will give you greater flexibility as you can skip 0 entries that are appended at the end of byte array, specially if you copy only some part form the other byte buffer.

This way you will decide how both objects SHOULD be equal.

Solution 8 - Java

Here is a solution using TreeMap, Comparator interface and java method java.util.Arrays.equals(byte[], byte[]);

NOTE: The ordering in the map is not relevant with this method

SortedMap<byte[], String> testMap = new TreeMap<>(new ArrayComparator());

static class ArrayComparator implements Comparator<byte[]> {
    @Override
    public int compare(byte[] byteArray1, byte[] byteArray2) {

        int result = 0;

        boolean areEquals = Arrays.equals(byteArray1, byteArray2);

        if (!areEquals) {
            result = -1;
        }

        return result;
    }
}

Solution 9 - Java

I see problems since you should use Arrays.equals and Array.hashCode, in place of default array implementations

Solution 10 - Java

Arrays.toString(bytes)

Solution 11 - Java

You could also convert the byte[] to a 'safe' string using Base32 or Base64, for example:

byte[] keyValue = new byte[] {…};
String key = javax.xml.bind.DatatypeConverter.printBase64Binary(keyValue);

of course there are many variants of the above, like:

String key = org.apache.commons.codec.binary.Base64.encodeBase64(keyValue);

Solution 12 - Java

Also, We can create own custom ByteHashMap like this,

ByteHashMap byteMap = new ByteHashMap();
byteMap.put(keybyteArray,valueByteArray);

Here is the complete implementation

public class ByteHashMap implements Map<byte[], byte[]>, Cloneable,
        Serializable {
    
    private Map<ByteArrayWrapper, byte[]> internalMap = new HashMap<ByteArrayWrapper, byte[]>();

    public void clear() {
        internalMap.clear();
    }

    public boolean containsKey(Object key) {
        if (key instanceof byte[])
            return internalMap.containsKey(new ByteArrayWrapper((byte[]) key));
        return internalMap.containsKey(key);
    }

    public boolean containsValue(Object value) {
        return internalMap.containsValue(value);
    }

    public Set<java.util.Map.Entry<byte[], byte[]>> entrySet() {
        Iterator<java.util.Map.Entry<ByteArrayWrapper, byte[]>> iterator = internalMap
                .entrySet().iterator();
        HashSet<Entry<byte[], byte[]>> hashSet = new HashSet<java.util.Map.Entry<byte[], byte[]>>();
        while (iterator.hasNext()) {
            Entry<ByteArrayWrapper, byte[]> entry = iterator.next();
            hashSet.add(new ByteEntry(entry.getKey().data, entry
                    .getValue()));
        }
        return hashSet;
    }

    public byte[] get(Object key) {
        if (key instanceof byte[])
            return internalMap.get(new ByteArrayWrapper((byte[]) key));
        return internalMap.get(key);
    }

    public boolean isEmpty() {
        return internalMap.isEmpty();
    }

    public Set<byte[]> keySet() {
        Set<byte[]> keySet = new HashSet<byte[]>();
        Iterator<ByteArrayWrapper> iterator = internalMap.keySet().iterator();
        while (iterator.hasNext()) {
            keySet.add(iterator.next().data);
        }
        return keySet;
    }

    public byte[] put(byte[] key, byte[] value) {
        return internalMap.put(new ByteArrayWrapper(key), value);
    }

    @SuppressWarnings("unchecked")
    public void putAll(Map<? extends byte[], ? extends byte[]> m) {
        Iterator<?> iterator = m.entrySet().iterator();
        while (iterator.hasNext()) {
            Entry<? extends byte[], ? extends byte[]> next = (Entry<? extends byte[], ? extends byte[]>) iterator
                    .next();
            internalMap.put(new ByteArrayWrapper(next.getKey()), next
                    .getValue());
        }
    }

    public byte[] remove(Object key) {
        if (key instanceof byte[])
            return internalMap.remove(new ByteArrayWrapper((byte[]) key));
        return internalMap.remove(key);
    }

    public int size() {
        return internalMap.size();
    }

    public Collection<byte[]> values() {
        return internalMap.values();
    }

    private final class ByteArrayWrapper {
        private final byte[] data;

        public ByteArrayWrapper(byte[] data) {
            if (data == null) {
                throw new NullPointerException();
            }
            this.data = data;
        }

        public boolean equals(Object other) {
            if (!(other instanceof ByteArrayWrapper)) {
                return false;
            }
            return Arrays.equals(data, ((ByteArrayWrapper) other).data);
        }

        public int hashCode() {
            return Arrays.hashCode(data);
        }
    }

    private final class ByteEntry implements Entry<byte[], byte[]> {
        private byte[] value;
        private byte[] key;

        public ByteEntry(byte[] key, byte[] value) {
            this.key = key;
            this.value = value;
        }

        public byte[] getKey() {
            return this.key;
        }

        public byte[] getValue() {
            return this.value;
        }

        public byte[] setValue(byte[] value) {
            this.value = value;
            return value;
        }

    }
}

Solution 13 - Java

Other answers have not pointed out that not all byte[] covert into unique String. I fell into this trap doing new String(byteArray) as keys to a map only to find that many negative bytes are mapped to the same string. Here is a test that demonstrates that problem:

    @Test
    public void testByteAsStringMap() throws Exception {
        HashMap<String, byte[]> kvs = new HashMap<>();
        IntStream.range(Byte.MIN_VALUE, Byte.MAX_VALUE).forEach(b->{
            byte[] key = {(byte)b};
            byte[] value = {(byte)b};
            kvs.put(new String(key), value);
        });
        Assert.assertEquals(255, kvs.size());
    }

It will throw:

> java.lang.AssertionError: Expected :255 Actual :128

It does that because a String is a sequence of character code points and any conversion from a byte[] is based on some byte encoding. In the above case, the platform default encoding happens to map many negative bytes to the same character. Another fact about String is that it always takes and gives a copy of its internal state. If the original bytes came from a String that was a copy, then wrapping it as a String to use it as the key to a map takes a second copy. That may generate a lot of garbage that might be avoidable.

There is a good answer here that suggests using java.nio.ByteBuffer with ByteBuffer.wrap(b). The problem with that is that byte[] is mutable and it doesn't take a copy so you must be careful to take a defensive copy of any the arrays passed to you with ByteBuffer.wrap(b.clone()) else the keys of your map will be corrupted. If you look at the result of a map with ByteBuffer keys in a debugger you will see that the buffers have a lot of internal references designed to track reading to and writing from each buffer. So the objects are much more heavyweight than wrapping in a simple String. Finally, even a string holds more state than needed. Looking at it in my debugger it stores characters as a two-byte UTF16 array and also stores a four-byte hashcode.

My preferred approach is to have Lombok generate at compile time the boilerplate to make a lightweight byte array wrapper that doesn't store additional state:

import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.ToString;

@ToString
@EqualsAndHashCode
@Data(staticConstructor="of")
class ByteSequence {
    final byte[] bytes;
}

This then passes the test that checks that all possible bytes map to a unique string:

    byte[] bytes(int b){
        return new byte[]{(byte)b};
    }

    @Test
    public void testByteSequenceAsMapKey() {
        HashMap<ByteSequence, byte[]> kvs = new HashMap<>();
        IntStream.range(Byte.MIN_VALUE, Byte.MAX_VALUE).forEach(b->{
            byte[] key = {(byte)b};
            byte[] value = {(byte)b};
            kvs.put(ByteSequence.of(key), value);
        });
        Assert.assertEquals(255, kvs.size());
        byte[] empty = {};
        kvs.put(ByteSequence.of(empty), bytes(1));
        Assert.assertArrayEquals(bytes(1), kvs.get(ByteSequence.of(empty)));
    }

You then don't have to worry about getting the equals and hashcode logic correct as it is supplied by Lombok where it does Arrays.deepEquals which is documented at https://projectlombok.org/features/EqualsAndHashCode Note that lombok isn't a runtime dependency only a compile-time dependency and you can install an opensource plugin to your IDE so that your IDE "sees" all the generated boilerplate methods.

With this implementation, you still have to worry about above the mutability of the byte. If someone passes you a byte[] that might be mutated, you should take a defensive copy using clone():

kvs.put(ByteSequence.of(key.clone()), value);

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionshikharView Question on Stackoverflow
Solution 1 - JavaJon SkeetView Answer on Stackoverflow
Solution 2 - JavaKathy Van StoneView Answer on Stackoverflow
Solution 3 - Javabyte_arrayView Answer on Stackoverflow
Solution 4 - JavaArtem OboturovView Answer on Stackoverflow
Solution 5 - JavaThorsten S.View Answer on Stackoverflow
Solution 6 - JavaAdam PaynterView Answer on Stackoverflow
Solution 7 - JavaMilind PatilView Answer on Stackoverflow
Solution 8 - JavamatdevView Answer on Stackoverflow
Solution 9 - JavadfaView Answer on Stackoverflow
Solution 10 - Javadf.View Answer on Stackoverflow
Solution 11 - JavaChristof RView Answer on Stackoverflow
Solution 12 - JavaRakesh ChaudhariView Answer on Stackoverflow
Solution 13 - Javasimbo1905View Answer on Stackoverflow