append multiple values for one key in a dictionary

Python Problem Overview

I am new to Python and I have a list of years and values for each year. What I want to do is check if the year already exists in a dictionary and, if it does, append the value to the list of values for that specific key.

So for instance, I have a list of years and have one value for each year:

2010 2  
2009 4  
1989 8  
2009 7

What I want to do is populate a dictionary with the years as keys and those single-digit numbers as values. However, if 2009 is listed twice, I want to append the second value to the list of values for that key in the dictionary, so I want:

2010: 2  
2009: 4, 7  
1989: 8

Right now I have the following:

d = dict()  
years = []  

(get 2 column list of years and values)

for line in list:    
    year = line[0]   
    value = line[1]  

for line in list:  
    if year in d.keys():  
        d[value].append(value)  
    else:  
        d[value] = value  
        d[year] = year

Python Solutions

Solution 1 - Python

If I can rephrase your question, what you want is a dictionary with the years as keys and an array for each year containing a list of values associated with that year, right? Here's how I'd do it:

years_dict = dict()

for line in list:
    if line[0] in years_dict:
        # append the new number to the existing array at this slot
        years_dict[line[0]].append(line[1])
    else:
        # create a new array in this slot
        years_dict[line[0]] = [line[1]]

What you should end up with in years_dict is a dictionary that looks like the following:

{
    "2010": [2],
    "2009": [4,7],
    "1989": [8]
}

In general, it's poor programming practice to create "parallel arrays", where items are implicitly associated with each other by having the same index rather than being proper children of a container that encompasses them both.
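To illustrate the point about parallel arrays, here is a minimal sketch. The sample lists `years` and `values` are assumptions made up for the illustration (they mirror the numbers in the question); the grouping loop is the same if/else pattern shown above.

```python
# Parallel lists: the pairing exists only through matching indexes,
# so reordering one list silently breaks the association.
years = [2010, 2009, 1989, 2009]
values = [2, 4, 8, 7]

# Grouping into one dictionary makes the association explicit.
years_dict = {}
for year, value in zip(years, values):
    if year in years_dict:
        # append the new number to the existing list for this year
        years_dict[year].append(value)
    else:
        # first time this year is seen: start a new list
        years_dict[year] = [value]

print(years_dict)  # {2010: [2], 2009: [4, 7], 1989: [8]}
```

Note that the keys here are integers because the sample data uses integers; if the years are read from a text file they will be strings unless converted.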

Solution 2 - Python

You would be best off using collections.defaultdict (added in Python 2.5). This allows you to specify the default object type of a missing key (such as a list).

So instead of creating a key if it doesn't exist first and then appending to the value of the key, you cut out the middle-man and just directly append to non-existing keys to get the desired result.

A quick example using your data:

>>> from collections import defaultdict
>>> data = [(2010, 2), (2009, 4), (1989, 8), (2009, 7)]
>>> d = defaultdict(list)
>>> d
defaultdict(<class 'list'>, {})
>>> for year, value in data:
...     d[year].append(value)
... 
>>> d
defaultdict(<class 'list'>, {2010: [2], 2009: [4, 7], 1989: [8]})

This way you don't have to worry about whether you've seen a digit associated with a year or not. You just append and forget, knowing that a missing key will always be a list. If a key already exists, then it will just be appended to.
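One side effect of "a missing key will always be a list" is worth seeing in code: merely reading a missing key with square brackets also creates it. The sketch below is only an illustration; the years 1999 and 2050 are arbitrary values chosen for the example.

```python
from collections import defaultdict

d = defaultdict(list)
d[2009].append(4)

# Reading a missing key with [] creates it with an empty list.
missing = d[1999]
print(missing)       # []
print(1999 in d)     # True

# Membership tests and .get() do not create keys.
print(2050 in d)     # False
print(d.get(2050))   # None

# Converting to a plain dict turns the auto-creation off again.
plain = dict(d)
print(plain)         # {2009: [4], 1999: []}
```

If accidental key creation would be a problem later on, converting to a regular dict once the data is loaded is one way to avoid it.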

Solution 3 - Python

You can use setdefault.

for line in list:
    year, value = line[0], line[1]
    d.setdefault(year, []).append(value)

This works because setdefault returns the list as well as storing it in the dictionary. Since a list is mutable, appending to the list returned by setdefault is the same as appending to the one inside the dictionary: they are the same object.
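A small sketch can make the explanation above concrete. The name `returned` is just an illustrative variable, not part of the answer's code.

```python
d = {}

# Key is missing: setdefault stores the new list and returns it.
returned = d.setdefault(2009, [])
print(returned is d[2009])  # True: one list object, two references

# Appending through the returned reference changes the dict's list.
returned.append(4)

# Key now exists: the default [] is ignored, the stored list is returned.
d.setdefault(2009, []).append(7)
print(d)  # {2009: [4, 7]}
```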

Solution 4 - Python

d = {} 

# import list of year,value pairs

for year,value in mylist:
    try:
        d[year].append(value)
    except KeyError:
        d[year] = [value]

The Python way: it is easier to ask forgiveness than permission!

Solution 5 - Python

Here is an alternative way of doing this using the not in operator:

# define an empty dict
years_dict = dict()

for line in list:
    # here define what key is, for example,
    key = line[0]
    # check if key is already present in dict
    if key not in years_dict:
        years_dict[key] = []
    # append the value for this key
    years_dict[key].append(line[1])

Solution 6 - Python

It's easier if you get these values into a list of tuples. To do this, you can use list slicing and the zip function.

data_in = [2010,2,2009,4,1989,8,2009,7]
data_pairs = zip(data_in[::2],data_in[1::2])

zip takes an arbitrary number of iterables, in this case the even- and odd-indexed entries of data_in, and pairs their elements up into tuples.
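To see what the slicing and zip step produces, here is a short sketch with the same data_in. The intermediate names `years` and `values` are added only for illustration. In Python 3, zip returns a lazy iterator, so it is wrapped in list() here to print it and to allow iterating over it more than once.

```python
data_in = [2010, 2, 2009, 4, 1989, 8, 2009, 7]

years = data_in[::2]    # every second item starting at index 0
values = data_in[1::2]  # every second item starting at index 1
print(years)   # [2010, 2009, 1989, 2009]
print(values)  # [2, 4, 8, 7]

# Pair the two slices element by element.
data_pairs = list(zip(years, values))
print(data_pairs)  # [(2010, 2), (2009, 4), (1989, 8), (2009, 7)]
```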

Now we can use the setdefault method.

data_dict = {}
for x in data_pairs:
    data_dict.setdefault(x[0],[]).append(x[1])

setdefault takes a key and a default value. It returns the value associated with the key or, if the key is not yet present, stores the default under that key and returns it. In this case, we get either an empty or an already populated list, to which we then append the current value.

Solution 7 - Python

If you want an (almost) one-liner:

from collections import deque
d = {}
deque((d.setdefault(year, []).append(value) for year, value in source_of_data), maxlen=0)

Using dict.setdefault, you can encapsulate the idea of "check if the key already exists and make a new list if not" into a single call. This allows you to write a generator expression which is consumed by deque as efficiently as possible since the queue length is set to zero. The deque will be discarded immediately and the result will be in d.

This is something I just did for fun. I don't recommend using it. There is a time and a place to consume arbitrary iterables through a deque, and this is definitely not it.

Content Type	Original Author	Original Content on Stackoverflow
Question	anon	View Question on Stackoverflow
Solution 1 - Python	Faisal	View Answer on Stackoverflow
Solution 2 - Python	jathanism	View Answer on Stackoverflow
Solution 3 - Python	Daniel Roseman	View Answer on Stackoverflow
Solution 4 - Python	Hugh Bothwell	View Answer on Stackoverflow
Solution 5 - Python	USER_1	View Answer on Stackoverflow
Solution 6 - Python	erik	View Answer on Stackoverflow
Solution 7 - Python	Mad Physicist	View Answer on Stackoverflow