How do I read and write CSV files with Python?

PythonCsv

Python Problem Overview


I have a file example.csv with the contents

1,"A towel,",1.0
42," it says, ",2.0
1337,is about the most ,-1
0,massively useful thing ,123
-2,an interstellar hitchhiker can have.,3

How do I read this example.csv with Python?

Similarly, if I have

data = [(1, "A towel,", 1.0),
        (42, " it says, ", 2.0),
        (1337, "is about the most ", -1),
        (0, "massively useful thing ", 123),
        (-2, "an interstellar hitchhiker can have.", 3)]

How do I write data to a CSV file with Python?

Python Solutions


Solution 1 - Python

Here are some minimal complete examples how to read CSV files and how to write CSV files with Python.

Python 3: Reading a CSV file

Pure Python

import csv

# Define data
data = [
    (1, "A towel,", 1.0),
    (42, " it says, ", 2.0),
    (1337, "is about the most ", -1),
    (0, "massively useful thing ", 123),
    (-2, "an interstellar hitchhiker can have.", 3),
]

# Write CSV file
with open("test.csv", "wt") as fp:
    writer = csv.writer(fp, delimiter=",")
    # writer.writerow(["your", "header", "foo"])  # write header
    writer.writerows(data)

# Read CSV file
with open("test.csv") as fp:
    reader = csv.reader(fp, delimiter=",", quotechar='"')
    # next(reader, None)  # skip the headers
    data_read = [row for row in reader]

print(data_read)

After that, the contents of data_read are

[['1', 'A towel,', '1.0'],
 ['42', ' it says, ', '2.0'],
 ['1337', 'is about the most ', '-1'],
 ['0', 'massively useful thing ', '123'],
 ['-2', 'an interstellar hitchhiker can have.', '3']]

Please note that CSV reads only strings. You need to convert to the column types manually.

A Python 2+3 version was here before (link), but Python 2 support is dropped. Removing the Python 2 stuff massively simplified this answer.

mpu

Have a look at my utility package mpu for a super simple and easy to remember one:

import mpu.io
data = mpu.io.read('example.csv', delimiter=',', quotechar='"', skiprows=None)
mpu.io.write('example.csv', data)

Pandas

import pandas as pd

# Read the CSV into a pandas data frame (df)
#   With a df you can do many things
#   most important: visualize data with Seaborn
df = pd.read_csv('myfile.csv', sep=',')
print(df)

# Or export it in many ways, e.g. a list of tuples
tuples = [tuple(x) for x in df.values]

# or export it as a list of dicts
dicts = df.to_dict().values()

See read_csv docs for more information. Please note that pandas automatically infers if there is a header line, but you can set it manually, too.

If you haven't heard of Seaborn, I recommend having a look at it.

Other

Reading CSV files is supported by a bunch of other libraries, for example:

Created CSV file

1,"A towel,",1.0
42," it says, ",2.0
1337,is about the most ,-1
0,massively useful thing ,123
-2,an interstellar hitchhiker can have.,3

Common file endings

.csv

Working with the data

After reading the CSV file to a list of tuples / dicts or a Pandas dataframe, it is simply working with this kind of data. Nothing CSV specific.

Alternatives

For your application, the following might be important:

  • Support by other programming languages
  • Reading / writing performance
  • Compactness (file size)

See also: Comparison of data serialization formats

In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python

Solution 2 - Python

Writing a CSV file

First you need to import csv

For eg:

import csv

with open('eggs.csv', 'wb') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=' ',
                        quotechar='|', quoting=csv.QUOTE_MINIMAL)
    spamwriter.writerow(['Spam'] * 5 + ['Baked Beans'])
    spamwriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])

Solution 3 - Python

If needed- read a csv file without using the csv module:

rows = []
with open('test.csv') as f:
    for line in f:
        # strip whitespace
        line = line.strip()
        # separate the columns
        line = line.split(',')
        # save the line for use later
        rows.append(line)

Solution 4 - Python

If you are working with CSV data and want a solution with a smaller footprint than pandas, you can try my package, littletable. Can be pip-installed, or just dropped in as a single .py file with your own code, so very portable and suitable for serverless apps.

Reading CSV data is as simple as calling csv_import:

data = """\
1,"A towel,",1.0
42," it says, ",2.0
1337,is about the most ,-1
0,massively useful thing ,123
-2,an interstellar hitchhiker can have.,3"""

import littletable as lt
tbl = lt.Table().csv_import(data, fieldnames="number1,words,number2".split(','))
tbl.present()

Prints:

  Number1   Words                                  Number2  
 ────────────────────────────────────────────────────────── 
  1         A towel,                               1.0      
  42         it says,                              2.0      
  1337      is about the most                      -1       
  0         massively useful thing                 123      
  -2        an interstellar hitchhiker can have.   3    

(littletable uses the rich module for presenting Tables.)

littletable doesn't automatically try to convert numeric data, so a numeric transform function is needed for the numeric columns.

def get_numeric(s):
    try:
        return int(s)
    except ValueError:
        try:
            return float(s)
        except ValueError:
            return s

tbl = lt.Table().csv_import(
    data,
    fieldnames="number1,words,number2".split(','),
    transforms={}.fromkeys("number1 number2".split(), get_numeric)
)
tbl.present()

This gives:

  Number1   Words                                  Number2  
 ────────────────────────────────────────────────────────── 
        1   A towel,                                   1.0  
       42    it says,                                  2.0  
     1337   is about the most                           -1  
        0   massively useful thing                     123  
       -2   an interstellar hitchhiker can have.         3  

The numeric columns are right-justified instead of left-justified.

littletable also has other ORM-ish features, such as indexing, joining, pivoting, and full-text search. Here is a table of statistics on the numeric columns:

tbl.stats("number1 number2".split()).present()

  Name       Mean   Min    Max   Variance              Std_Dev   Count   Missing  
 ──────────────────────────────────────────────────────────────────────────────── 
  number1   275.6    -2   1337   352390.3    593.6247130974249       5         0  
  number2    25.6    -1    123     2966.8   54.468339427597755       5         0  

or transposed:

tbl.stats("number1 number2".split(), by_field=False).present()

  Stat                 Number1              Number2 
 ─────────────────────────────────────────────────── 
  mean                   275.6                 25.6
  min                       -2                   -1
  max                     1337                  123
  variance            352390.3               2966.8
  std_dev    593.6247130974249   54.468339427597755
  count                      5                    5
  missing                    0                    0

Other formats can be output too, such as Markdown:

print(tbl.stats("number1 number2".split(), by_field=False).as_markdown())

| stat | number1 | number2 |
|---|---:|---:|
| mean | 275.6 | 25.6 |
| min | -2 | -1 |
| max | 1337 | 123 |
| variance | 352390.3 | 2966.8 |
| std_dev | 593.6247130974249 | 54.468339427597755 |
| count | 5 | 5 |
| missing | 0 | 0 |

Which would render from Markdown as

stat number1 number2
mean 275.6 25.6
min -2 -1
max 1337 123
variance 352390.3 2966.8
std_dev 593.6247130974249 54.468339427597755
count 5 5
missing 0 0

Lastly, here is a text search on the words for any entry with the word "hitchhiker":

tbl.create_search_index("words")
for match, score in tbl.search.words("hitchhiker"):
    print(match)

Prints:

namespace(number1=-2, words='an interstellar hitchhiker can have.', number2=3)

Solution 5 - Python

import csv
with open(fileLocation+'example.csv',newline='') as File: #the csv file is stored in a File object
                                                       
    reader=csv.reader(File)       #csv.reader is used to read a file
	for row in reader:
    	print(row)

Solution 6 - Python

To read a csv file using Pandas

use pd.read_csv("D:\\sample.csv")

using only python :

fopen=open("D:\\sample.csv","r") 

print(fopen.read())

To create and write into a csv file

The below example demonstrate creating and writing a csv file. to make a dynamic file writer we need to import a package import csv, then need to create an instance of the file with file reference Ex:

with open("D:\sample.csv","w",newline="") as file_writer

Here if the file does not exist with the mentioned file directory then python will create a same file in the specified directory, and w represents write, if you want to read a file then replace w with r or to append to existing file then a.

newline="" specifies that it removes an extra empty row for every time you create row so to eliminate empty row we use newline="", create some field names(column names) using list like:

fields=["Names","Age","Class"]

Then apply to writer instance like:

writer=csv.DictWriter(file_writer,fieldnames=fields)

Here using Dictionary writer and assigning column names, to write column names to csv we use writer.writeheader() and to write values we use writer.writerow({"Names":"John","Age":20,"Class":"12A"}) ,while writing file values must be passed using dictionary method , here the key is column name and value is your respective key value.

Import csv:

with open("D:\sample.csv","w",newline="") as file_writer:

fields=["Names","Age","Class"]

writer=csv.DictWriter(file_writer,fieldnames=fields)

writer.writeheader()

writer.writerow({"Names":"John","Age":21,"Class":"12A"})

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMartin ThomaView Question on Stackoverflow
Solution 1 - PythonMartin ThomaView Answer on Stackoverflow
Solution 2 - PythonSyed Abdul RehmanView Answer on Stackoverflow
Solution 3 - PythonwwiiView Answer on Stackoverflow
Solution 4 - PythonPaulMcGView Answer on Stackoverflow
Solution 5 - PythonSiddharth Kumar ShuklaView Answer on Stackoverflow
Solution 6 - Pythonprasanna kumarView Answer on Stackoverflow