Finding cartesian product with PHP associative arrays

PhpAlgorithmAssociative ArrayCartesian Product

Php Problem Overview


Say that I have an array like the following:

Array
(
    [arm] => Array
        (
            [0] => A
            [1] => B
            [2] => C
        )
    [gender] => Array
        (
            [0] => Female
            [1] => Male
        )
    [location] => Array
        (
            [0] => Vancouver
            [1] => Calgary
        )
)

How can I find the cartesian product while preserving the keys of the outer associative array and using them in the inner ones? The result of the algorithm should be this:

Array
(
    [0] => Array
        (
            [arm] => A
            [gender] => Female
            [location] => Vancouver
        )

    [1] => Array
        (
            [arm] => A
            [gender] => Female
            [location] => Calgary
        )

    [2] => Array
        (
            [arm] => A
            [gender] => Male
            [location] => Vancouver
        )

...etc.

I've looked up quite a number of cartesian product algorithms but I'm getting stuck on the specifics of how to preserve the associative keys. The current algorithm I am using gives numerical indices only:

    $result = array();
    foreach ($map as $a) {
        if (empty($result)) {
            $result = $a;
            continue;
        }
        $res = array();
        foreach ($result as $r) {
            foreach ($a as $v) {
                $res[] = array_merge((array)$r, (array)$v);
            }
        }
        $result = $res;
    }
    
    print_r($result);

Any help would be appreciated.

Php Solutions


Solution 1 - Php

Here's a solution I wouldn't be ashamed to show.

Rationale

Assume that we have an input array $input with N sub-arrays, as in your example. Each sub-array has Cn items, where n is its index inside $input, and its key is Kn. I will refer to the ith item of the nth sub-array as Vn,i.

The algorithm below can be proved to work (barring bugs) by induction:

  1. For N = 1, the cartesian product is simply array(0 => array(K1 => V1,1), 1 => array(K1 => V1,2), ... ) -- C1 items in total. This can be done with a simple foreach.

  2. Assume that $result already holds the cartesian product of the first N-1 sub-arrays. The cartesian product of $result and the Nth sub-array can be produced this way:

  3. In each item (array) inside $product, add the value KN => VN,1. Remember the resulting item (with the added value); I 'll refer to it as $item.

4a) For each array inside $product:

4b) For each value in the set VN,2 ... VN,CN, add to $product a copy of $item, but change the value with the key KN to VN,m (for all 2 <= m <= CN).

The two iterations 4a (over $product) and 4b (over the Nth input sub-array) ends up with $result having CN items for every item it had before the iterations, so in the end $result indeed contains the cartesian product of the first N sub arrays.

Therefore the algorithm will work for any N.

This was harder to write than it should have been. My formal proofs are definitely getting rusty...

Code

function cartesian($input) {
    $result = array();

    while (list($key, $values) = each($input)) {
        // If a sub-array is empty, it doesn't affect the cartesian product
        if (empty($values)) {
            continue;
        }

        // Seeding the product array with the values from the first sub-array
        if (empty($result)) {
            foreach($values as $value) {
                $result[] = array($key => $value);
            }
        }
        else {
            // Second and subsequent input sub-arrays work like this:
            //   1. In each existing array inside $product, add an item with
            //      key == $key and value == first item in input sub-array
            //   2. Then, for each remaining item in current input sub-array,
            //      add a copy of each existing array inside $product with
            //      key == $key and value == first item of input sub-array

            // Store all items to be added to $product here; adding them
            // inside the foreach will result in an infinite loop
            $append = array();

            foreach($result as &$product) {
                // Do step 1 above. array_shift is not the most efficient, but
                // it allows us to iterate over the rest of the items with a
                // simple foreach, making the code short and easy to read.
                $product[$key] = array_shift($values);
                
                // $product is by reference (that's why the key we added above
                // will appear in the end result), so make a copy of it here
                $copy = $product;

                // Do step 2 above.
                foreach($values as $item) {
                    $copy[$key] = $item;
                    $append[] = $copy;
                }

                // Undo the side effecst of array_shift
                array_unshift($values, $product[$key]);
            }

            // Out of the foreach, we can add to $results now
            $result = array_merge($result, $append);
        }
    }

    return $result;
}

Usage

$input = array(
    'arm' => array('A', 'B', 'C'),
    'gender' => array('Female', 'Male'),
    'location' => array('Vancouver', 'Calgary'),
);

print_r(cartesian($input));

Solution 2 - Php

Here is optimized version of @Jon's cartesian function:

function cartesian($input) {
	$result = array(array());

	foreach ($input as $key => $values) {
		$append = array();

		foreach($result as $product) {
			foreach($values as $item) {
				$product[$key] = $item;
				$append[] = $product;
			}
		}

		$result = $append;
	}

	return $result;
}

Read more about the math behind this algorithm: http://en.wikipedia.org/wiki/Cartesian_product

See more examples of this algorithm in different languages: https://rosettacode.org/wiki/Cartesian_product_of_two_or_more_lists

Solution 3 - Php

In PHP 7 @Serg's answer can be shortened to:

function cartesian(array $input)
{
    $result = [[]];
    foreach ($input as $key => $values) {
        $append = [];
        foreach ($values as $value) {
            foreach ($result as $data) {
                $append[] = $data + [$key => $value];
            }
        }
        $result = $append;
    }

    return $result;
}

Solution 4 - Php

Here's what I could come up with:

function inject($elem, $array) {
	return array_map(function ($n) use ($elem) { return array_merge((array)$elem, (array)$n); }, $array);
}

function zip($array1, $array2) {
	return array_reduce($array1, function ($v, $n) use ($array2) { return array_merge($v, inject($n, $array2));  }, array());
}

function cartesian_product($array) {
	$keys = array_keys($array);
	$prod = array_shift($array);
	$prod = array_reduce($array, 'zip', $prod);
	return array_map(function ($n) use ($keys) { return array_combine($keys, $n); }, $prod);
}

(Using pseudo array/list/dictionary notation below since PHP is simply too verbose for such things.)

The inject function transforms a, [b] into [(a,b)], i.e. it injects a single value into each value of an array, returning an array of arrays. It doesn't matter whether a or b already is an array, it'll always return a two dimensional array.

inject('a', ['foo', 'bar'])
    =>  [('a', 'foo'), ('b', 'bar')]

The zip function applies the inject function to each element in an array.

zip(['a', 'b'], ['foo', 'bar'])
    =>  [('a', 'foo'), ('a', 'bar'), ('b', 'foo'), ('b', 'bar')]

Note that this actually produces a cartesian product, so zip is a slight misnomer. Simply applying this function to all elements in a data set in succession gives you the cartesian product for an array of any length.

zip(zip(['a', 'b'], ['foo', 'bar']), ['42', '76'])
    =>  [('a', 'foo', '42'), ('a', 'foo', '76'), ('a', 'bar', '42'), …]

This does not contain the keys, but since the elements are all in order within the result set, you can simply re-inject the keys into the result.

array_combine(['key1', 'key2', 'key3'], ['a', 'foo', '42'])
    =>  [ key1 : 'a', key2 : 'foo', key3 : '42' ]

Applying this to all elements in the product gives the desired result.

You can collapse the above three functions into a single long statement if you wish (which would also clear up the misnomers).


An "unrolled" version without anonymous functions for PHP <= 5.2 would look like this:

function inject($elem, $array) {
	$elem = (array)$elem;
	foreach ($array as &$a) {
		$a = array_merge($elem, (array)$a);
	}
	return $array;
}

function zip($array1, $array2) {
	$prod = array();
	foreach ($array1 as $a) {
		$prod = array_merge($prod, inject($a, $array2));
	}
	return $prod;
}

function cartesian_product($array) {
	$keys = array_keys($array);
	$prod = array_shift($array);
	$prod = array_reduce($array, 'zip', $prod);
	
	foreach ($prod as &$a) {
		$a = array_combine($keys, $a);
	}
	return $prod;
}

Solution 5 - Php

Why not use a recursive generator ... memory issues: close to none
(and it´s beautiful)

function cartesian($a)
{
	if ($a)
	{
		if($u=array_pop($a))
			foreach(cartesian($a)as$p)
				foreach($u as$v)
					yield $p+[count($p)=>$v];
	}
	else
		yield[];
}

note: this does not preserve keys; but it´s a start.

This should do (not tested):

function acartesian($a)
{
	if ($a)
	{
		$k=end(array_keys($a));
		if($u=array_pop($a))
			foreach(acartesian($a)as$p)
				foreach($u as$v)
					yield $p+[$k=>$v];
	}
	else
		yield[];
}

Solution 6 - Php

Another solution:

function getAllVariations($input) {
    $result = array();
    $cnt = array_product(array_map('count', $input));
    $step = 1;
    foreach ($input as $key=>$array) {
        for ($i=0; $i<$cnt; $i++) {
            foreach ($array as $value) {
                for ($k=0; $k<$step; $k++) {
                    $result[$i+$k][$key] = $value;
                }
                $i += $step;
            }
            $i--;
        }
        $step = $step * count($array);
    }
    return $result;
}

Usage:

$input = array(
    'arm' => array('A', 'B', 'C'),
    'gender' => array('Female', 'Male'),
    'location' => array('Vancouver', 'Calgary'),
    'name' => array('Rio', 'Mark')
);

echo "<pre>";
var_dump(getAllVariations($input));

Solution 7 - Php

I quickly adjusted your code a bit , my attempt is crude i think but see if this works as you want:

$result = array();
$nm = '';
foreach ($map as $name => $a) {
    if (empty($result)) {
        $result = $a;
        $nm = $name;
        continue;
    }
	
    $res = array();
    foreach ($result as $r) {
		foreach ($a as $v) {
			$myr = $r;
			$myv = $v;
			if(!is_array($r)) $myr = array($nm => $r);
			if(!is_array($v)) $myv = array($name => $v);
			
			$res[] = array_merge($myr, $myv);
        }
    }
    $result = $res;
}
echo "<pre>";
print_r($result);

Solution 8 - Php

If memory consumption is important or you don't need all the combinations in the end you could use an iterator to generate one combination at a time. If you need all the combinations you can use iterator_to_array.

function cartezianIterator($inputArray)
{
    $maximumPosition = array_map('count', $inputArray);
    $position = array_pad([], count($inputArray), 0);

    while (false !== ($item = buildItemAtPosition($inputArray, $position))) {

        yield $item;

        $position = incrementPosition($position, $maximumPosition);
    }
}

function buildItemAtPosition($inputArray, $positions)
{
    if ($positions[0] >= count($inputArray[0])) {
        return false;
    }

    $item = [];
    foreach ($inputArray as $rowIndex => $row) {
        $position = $positions[$rowIndex];

        $item[] = $row[$position];
    }

    return $item;
}

function incrementPosition($position, $maximumPosition)
{
    $digitToIncrement = count($position) - 1;

    do {
        $position[$digitToIncrement]++;

        if ($position[$digitToIncrement] < $maximumPosition[$digitToIncrement] || 0 === $digitToIncrement) {
            //no overflow
            break;
        }

        //overflow, reset to zero and increment parent digit
        $position[$digitToIncrement] = 0;

        $digitToIncrement--;
    } while ($digitToIncrement >= 0);

    return $position;
}

Then, to get one solution at a time you could use a foreach or next, like this:

$iterator = cartezianIterator($inputArray);

//of course, you need to do something with the result...
$combination = next($iterator);
$combination = next($iterator);
$combination = next($iterator);
$combination = next($iterator);
$combination = next($iterator);
$combination = next($iterator);

This solution is very very fast if you need only a few combinations. Also, the memory consumption is very low (it uses a flat array to store some integers).

Note: recursive functions are not used.

Solution 9 - Php

Why not use a database to do this?

It's easy in MySQL..

table arm
   id integer primary key
   label char

table gender
   id integer primary key
   gender enum('male','female')

table location
   id integer primary key
   city varchar(255)

Then do a query

$query = mysql_query(" 
  SELECT a.label, g.gender, l.city
  FROM arm a
  CROSS JOIN gender g
  CROSS JOIN location l
  ORDER BY a.id
") or die("Could not execute query");

while($row = mysql_fetch_array($query) )
{
   ....
}

And read that out:

Solution 10 - Php

One algorithm is to expand at each step the previous results with the current step items:

function cartezian1($inputArray)
{
    $results = [];

    foreach ($inputArray as $group) {
        $results = expandItems($results, $group);
    }

    return $results;
}

function expandItems($sourceItems, $tails)
{
    $result = [];

    if (empty($sourceItems)) {
        foreach ($tails as $tail) {
            $result[] = [$tail];
        }
        return $result;
    }

    foreach ($sourceItems as $sourceItem) {
        foreach ($tails as $tail) {
            $result[] = array_merge($sourceItem, [$tail]);
        }
    }

    return $result;
}

This solution uses memory to store the all combinations then returns them all at once. So, it's fast but it needs a lot of memory. Also, recursive functions are not used.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionLotus NotesView Question on Stackoverflow
Solution 1 - PhpJonView Answer on Stackoverflow
Solution 2 - PhpSergiy SokolenkoView Answer on Stackoverflow
Solution 3 - PhpfreytagView Answer on Stackoverflow
Solution 4 - PhpdecezeView Answer on Stackoverflow
Solution 5 - PhpTitusView Answer on Stackoverflow
Solution 6 - PhpRespantView Answer on Stackoverflow
Solution 7 - PhpSabeen MalikView Answer on Stackoverflow
Solution 8 - PhpConstantin GalbenuView Answer on Stackoverflow
Solution 9 - PhpJohanView Answer on Stackoverflow
Solution 10 - PhpConstantin GalbenuView Answer on Stackoverflow