Creating a dictionary from a csv file?
Tags: Python, Csv, Dictionary, List Comprehension
Python Problem Overview
I am trying to create a dictionary from a csv file. The first column of the csv file contains unique keys and the second column contains values. Each row of the csv file represents a unique key, value pair within the dictionary. I tried to use the csv.DictReader and csv.DictWriter classes, but I could only figure out how to generate a new dictionary for each row. I want one dictionary. Here is the code I am trying to use:
import csv

with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        for rows in reader:
            k = rows[0]
            v = rows[1]
            mydict = {k:v for k, v in rows}
    print(mydict)
When I run the above code I get a ValueError: too many values to unpack (expected 2). How do I create one dictionary from a csv file? Thanks.
Python Solutions
Solution 1 - Python
I believe the syntax you were looking for is as follows:
import csv

with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = {rows[0]:rows[1] for rows in reader}
Alternatively, for Python 2.6 and earlier, which lack dict comprehensions, you want:
mydict = dict((rows[0],rows[1]) for rows in reader)
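As for why the code in the question fails: its comprehension iterates over the fields of a single row rather than over the rows themselves, so Python tries to unpack each field string into k and v. A minimal sketch of the difference, assuming a two-column file whose contents are simulated here with io.StringIO (the sample data is hypothetical):

```python
import csv
import io

# Hypothetical two-column data standing in for coors.csv.
sample = io.StringIO("alpha,1\nbeta,2\n")
rows = list(csv.reader(sample))

# Iterating over one row yields its fields (strings). Unpacking the
# five-character string 'alpha' into k, v raises the ValueError.
first_row = rows[0]
error_message = None
try:
    broken = {k: v for k, v in first_row}
except ValueError as exc:
    error_message = str(exc)

# Iterating over the rows unpacks each two-field list instead.
mydict = {k: v for k, v in rows}
print(mydict)
```

Unpacking with k, v only works when every row has exactly two fields; indexing with rows[0] and rows[1], as above, tolerates extra columns.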
Solution 2 - Python
Open the file by calling open and then using csv.DictReader.
input_file = csv.DictReader(open("coors.csv"))
You may iterate over the rows of the csv file dict reader object by iterating over input_file.
for row in input_file:
    print(row)
Or, to access only the first row (Python 2):
dictobj = csv.DictReader(open('coors.csv')).next()
Update: in Python 3, use the built-in next() instead:
reader = csv.DictReader(open('coors.csv'))
dictobj = next(reader)
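Since DictReader yields one dictionary per row, getting the single dictionary the question asks for takes one more step. A possible sketch, assuming the file has no header row; the field names "key" and "value" are made up for this example, and the file contents are simulated with io.StringIO:

```python
import csv
import io

# Hypothetical headerless key,value data standing in for coors.csv.
sample = io.StringIO("alpha,1\nbeta,2\n")

# Supplying fieldnames stops DictReader from consuming the first row
# as a header; "key" and "value" are names chosen for this sketch.
reader = csv.DictReader(sample, fieldnames=["key", "value"])
mydict = {row["key"]: row["value"] for row in reader}
print(mydict)
```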
Solution 3 - Python
import csv

reader = csv.reader(open('filename.csv', 'r'))
d = {}
for row in reader:
    k, v = row
    d[k] = v
Solution 4 - Python
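This isn't elegant, but it is a one-line solution using pandas. In pandas 2.0 and later, read_csv no longer accepts squeeze=True, so call squeeze on the result:
import pandas as pd
pd.read_csv('coors.csv', header=None, index_col=0).squeeze("columns").to_dict()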
If you want to specify dtype for your index (it can't be specified in read_csv if you use the index_col argument because of a bug):
import pandas as pd
pd.read_csv('coors.csv', header=None, dtype={0: str}).set_index(0).squeeze().to_dict()
Solution 5 - Python
You just have to convert the csv.reader to a dict:
~ >> cat > 1.csv
key1, value1
key2, value2
key2, value22
key3, value3
~ >> cat > d.py
import csv
with open('1.csv') as f:
    d = dict(filter(None, csv.reader(f)))
print(d)
~ >> python d.py
{'key3': ' value3', 'key2': ' value22', 'key1': ' value1'}
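Note the leading space in each value above: the file separates fields with a comma followed by a space. One way to drop it, sketched here with the same contents simulated through io.StringIO, is the csv module's skipinitialspace option:

```python
import csv
import io

# Same contents as 1.csv above, plus a trailing blank line.
sample = io.StringIO(
    "key1, value1\nkey2, value2\nkey2, value22\nkey3, value3\n\n"
)

# skipinitialspace=True drops whitespace right after each delimiter.
# filter(None, ...) removes the empty list produced by the blank line.
reader = csv.reader(sample, skipinitialspace=True)
d = dict(filter(None, reader))
print(d)
```

As in the original run, the later key2 row overwrites the earlier one.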
Solution 6 - Python
You can also use numpy for this, provided both columns are numeric (loadtxt parses values as floats by default).
from numpy import loadtxt
key_value = loadtxt("filename.csv", delimiter=",")
mydict = { k:v for k,v in key_value }
Solution 7 - Python
Assuming you have a CSV of this structure:
"a","b"
1,2
3,4
5,6
And you want the output to be:
[{'a': '1', 'b': '2'}, {'a': '3', 'b': '4'}, {'a': '5', 'b': '6'}]
A zip function (not yet mentioned) is simple and quite helpful.
import csv

def read_csv(filename):
    with open(filename) as f:
        file_data = csv.reader(f)
        headers = next(file_data)
        return [dict(zip(headers, i)) for i in file_data]
If you prefer pandas, it can also do this quite nicely:
import pandas as pd
def read_csv(filename):
    return pd.read_csv(filename).to_dict('records')
Solution 8 - Python
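One-liner solution (read_csv treats the first line as a header by default; pass header=None if the file has none):
import pandas as pd
mydict = {row.iloc[0]: row.iloc[1] for _, row in pd.read_csv("file.csv").iterrows()}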
Solution 9 - Python
For simple csv files, such as the following
id,col1,col2,col3
row1,r1c1,r1c2,r1c3
row2,r2c1,r2c2,r2c3
row3,r3c1,r3c2,r3c3
row4,r4c1,r4c2,r4c3
You can convert it to a Python dictionary using only built-ins
with open(csv_file) as f:
    csv_list = [[val.strip() for val in r.split(",")] for r in f.readlines()]

(_, *header), *data = csv_list
csv_dict = {}
for row in data:
    key, *values = row
    csv_dict[key] = {col: value for col, value in zip(header, values)}
This should yield the following dictionary
{'row1': {'col1': 'r1c1', 'col2': 'r1c2', 'col3': 'r1c3'},
'row2': {'col1': 'r2c1', 'col2': 'r2c2', 'col3': 'r2c3'},
'row3': {'col1': 'r3c1', 'col2': 'r3c2', 'col3': 'r3c3'},
'row4': {'col1': 'r4c1', 'col2': 'r4c2', 'col3': 'r4c3'}}
Note: Python dictionaries have unique keys, so if your csv file has duplicate ids you should append each row to a list.
for row in data:
    key, *values = row
    if key not in csv_dict:
        csv_dict[key] = []
    csv_dict[key].append({col: value for col, value in zip(header, values)})
Solution 10 - Python
I'd suggest adding if row to the comprehension in case there is an empty line at the end of the file:
import csv

with open('coors.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('coors_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = dict(row[:2] for row in reader if row)
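To see what the guard is protecting against, here is a small sketch with hypothetical data (simulated through io.StringIO) containing blank lines and a row with an extra column:

```python
import csv
import io

# Hypothetical data: a third column on one row and two blank lines.
sample = io.StringIO("alpha,1,extra\n\nbeta,2\n\n")

rows = list(csv.reader(sample))
# Blank lines come back as empty lists, which are falsy.
blank_count = sum(1 for row in rows if not row)

# row[:2] keeps only the first two fields, so extra columns are ignored.
mydict = dict(row[:2] for row in rows if row)
print(mydict)
```

Without the if row filter, dict() would raise on the empty lists, because each item must supply exactly a key and a value.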
Solution 11 - Python
If you are OK with using the numpy package, then you can do something like the following:
import numpy as np
lines = np.genfromtxt("coors.csv", delimiter=",", dtype=None)
my_dict = dict()
for i in range(len(lines)):
    my_dict[lines[i][0]] = lines[i][1]
Solution 12 - Python
With pandas, it is much easier. For example, assume you have the following data as CSV in a file called test.txt or test.csv (a CSV is just a kind of text file):
a,b,c,d
1,2,3,4
5,6,7,8
Now, using pandas:
import pandas as pd
df = pd.read_csv("./test.txt")
df_to_dict = df.to_dict()
To get one dictionary per row instead:
df.to_dict(orient='records')
And that's it.
Solution 13 - Python
You can use this, it is pretty cool:
import dataconverters.commas as commas

filename = 'test.csv'
with open(filename) as f:
    records, metadata = commas.parse(f)
    for row in records:
        print('this is row in dictionary: ' + str(row))
Solution 14 - Python
Try to use a defaultdict and DictReader.
import csv
from collections import defaultdict

my_dict = defaultdict(list)
with open('filename.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for line in csv_reader:
        for key, value in line.items():
            my_dict[key].append(value)
It returns:
{'key1':[value_1, value_2, value_3], 'key2': [value_a, value_b, value_c], 'Key3':[value_x, Value_y, Value_z]}
Solution 15 - Python
Many solutions have been posted and I'd like to contribute mine, which works for any number of columns in the CSV file. It creates a dictionary with one key per column, and the value for each key is a list of the elements in that column.
import csv

input_file = csv.DictReader(open(path_to_csv_file))
csv_dict = {elem: [] for elem in input_file.fieldnames}
for row in input_file:
    for key in csv_dict.keys():
        csv_dict[key].append(row[key])
Solution 16 - Python
If you:
- Have only one key and one value per line (key,value) in your csv
- Do not want to import other packages
- Want to create a dict in one shot
Do this:
mydict = {y[0]: y[1] for y in [x.split(",") for x in open('file.csv').read().split('\n') if x]}
What does it do?
It uses a list comprehension to split the file into lines and each line into fields; the trailing "if x" skips blank lines (usually one at the end of the file). A dictionary comprehension then turns each [key, value] pair into a dictionary entry.
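One caveat with splitting on commas by hand is that it does not understand CSV quoting. A short sketch with hypothetical data in which one value contains a quoted comma, comparing the one-liner's approach with csv.reader:

```python
import csv
import io

# Hypothetical data where one value contains a quoted comma.
text = 'alpha,"1,5"\nbeta,2\n'

# Plain str.split cuts the quoted value at its embedded comma.
naive = {y[0]: y[1] for y in [x.split(",") for x in text.split("\n") if x]}

# csv.reader honours the quotes and keeps the value whole.
parsed = {row[0]: row[1] for row in csv.reader(io.StringIO(text)) if row}

print(naive)
print(parsed)
```

For files that are known to contain no quoted fields, the one-liner is fine.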
Solution 17 - Python
The question derailed us from the correct solution, which requires taking a step back and asking whether CSV is the right format for storing dictionary data. For a dictionary, CSV is a lossy format: reading it back yields every numeric value as a string. So the correct answer, IMO, is to save the data as JSON in the first place.
And then simply:
import json
my_dict = json.load(open('my_file.json', 'r'))
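A rough sketch of the full round trip, using a hypothetical dictionary and a temporary file so nothing real is overwritten. It also shows one limit of JSON worth knowing about: values keep their types, but keys always come back as strings.

```python
import json
import os
import tempfile

# Hypothetical dictionary with numeric values and one integer key.
original = {"alpha": 1, "beta": 2.5, 3: "gamma"}

# Write to a temporary file so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "my_file.json")
with open(path, "w") as f:
    json.dump(original, f)

with open(path) as f:
    my_dict = json.load(f)

# Values keep their types; the integer key 3 comes back as "3".
print(my_dict)
```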