How to add header row to a pandas DataFrame


Python Problem Overview


I am reading a CSV file into pandas. This file consists of four columns and some rows, but it does not have a header row, which I want to add. I have been trying the following:

Cov = pd.read_csv("path/to/file.txt", sep='\t')
Frame=pd.DataFrame([Cov], columns = ["Sequence", "Start", "End", "Coverage"])
Frame.to_csv("path/to/file.txt", sep='\t')

But when I apply the code, I get the following Error:

ValueError: Shape of passed values is (1, 1), indices imply (4, 1)

What exactly does the error mean? And what would be a clean way in python to add a header row to my csv file/pandas df?

Python Solutions


Solution 1 - Python

You can pass names directly to read_csv:

> names : array-like, default None. List of column names to use. If file contains no header row, then you should explicitly pass header=None.

Cov = pd.read_csv("path/to/file.txt", 
                  sep='\t', 
                  names=["Sequence", "Start", "End", "Coverage"])
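
A minimal, self-contained sketch of this approach follows. The sample rows and the use of io.StringIO in place of a real file are assumptions made for illustration only:

```python
import io
import pandas as pd

# Hypothetical tab-separated data with no header row (stands in for the real file).
raw = "chr1\t0\t100\t5\nchr1\t100\t200\t7\n"

# names= supplies the column labels; with names given and header left at its
# default, the first line of the input is kept as data.
cov = pd.read_csv(io.StringIO(raw), sep="\t",
                  names=["Sequence", "Start", "End", "Coverage"])

print(cov.columns.tolist())
print(len(cov))
```

Both input lines should end up as data rows, with the four names as column labels.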

Solution 2 - Python

Alternatively, you could read your CSV with header=None and then add the column names with df.columns:

Cov = pd.read_csv("path/to/file.txt", sep='\t', header=None)
Cov.columns = ["Sequence", "Start", "End", "Coverage"]
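
To see why header=None matters here, the sketch below compares it with the default behaviour, where read_csv treats the first line as the header. The sample rows and the io.StringIO stand-in for the file are hypothetical:

```python
import io
import pandas as pd

# Hypothetical tab-separated data with no header row.
raw = "chr1\t0\t100\t5\nchr1\t100\t200\t7\n"

# Default behaviour: the first line is consumed as the header row.
inferred = pd.read_csv(io.StringIO(raw), sep="\t")

# header=None: every line is data, and the columns are numbered 0..3 until renamed.
cov = pd.read_csv(io.StringIO(raw), sep="\t", header=None)
default_labels = cov.columns.tolist()
cov.columns = ["Sequence", "Start", "End", "Coverage"]

print(len(inferred), len(cov))
```

Without header=None the first record becomes the column labels, so it is lost once the labels are overwritten.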

Solution 3 - Python

col_Names=["Sequence", "Start", "End", "Coverage"]
my_CSV_File= pd.read_csv("yourCSVFile.csv",names=col_Names)

Having done this, check the result with:

my_CSV_File.head()

Solution 4 - Python

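To fix your code, you can simply change [Cov] to Cov.values, so that the first parameter of pd.DataFrame becomes a two-dimensional NumPy array. Wrapping the DataFrame in a list ([Cov]) passes a one-element list instead of the table of values, which is why the shapes do not match: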

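# header=None keeps the first line of the file as data instead of treating it as a header
Cov = pd.read_csv("path/to/file.txt", sep='\t', header=None)
Frame = pd.DataFrame(Cov.values, columns=["Sequence", "Start", "End", "Coverage"])
# index=False avoids writing the row index as an extra column
Frame.to_csv("path/to/file.txt", sep='\t', index=False)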

But the smartest solution is still to use pd.read_csv with header=None and names=columns_list.
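
Since the goal is a file that has a header row, here is a hedged sketch of the full round trip: read with names, then write back. The sample data is hypothetical, and io.StringIO buffers stand in for the real input and output files:

```python
import io
import pandas as pd

# Hypothetical tab-separated data with no header row.
raw = "chr1\t0\t100\t5\nchr1\t100\t200\t7\n"
cov = pd.read_csv(io.StringIO(raw), sep="\t",
                  names=["Sequence", "Start", "End", "Coverage"])

# index=False keeps the row index (0, 1, ...) out of the output; without it,
# to_csv writes the index as an extra unnamed leading column.
out = io.StringIO()
cov.to_csv(out, sep="\t", index=False)
lines = out.getvalue().splitlines()

print(lines[0])
```

The first line of the output should be the new header, followed by the original rows unchanged.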

Solution 5 - Python

A simple and easy solution:

import pandas as pd

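# header=None keeps the first line of the file as data rather than using it as the header
df = pd.read_csv("path/to/file.txt", sep='\t', header=None)
headers = ["Sequence", "Start", "End", "Coverage"]
df.columns = headers

> NOTE: Make sure the number of headers matches the number of columns in the CSV file.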

Solution 6 - Python

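Since the question says we are reading a CSV, the delimiter should be ',' (the default, so there is no need to specify it), and because the file has no header row, pass header=None. If the file is tab-separated, as the sep='\t' in the question suggests, also pass sep='\t'.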

Sample Code :

import pandas as pd
data = pd.read_csv('path/to/file.txt',header=None)
data.columns = ["Sequence", "Start", "End", "Coverage"]
print(data.head())  # print the first rows
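
The delimiter has to match the file. As a rough illustration, the sketch below parses tab-separated text with the default comma delimiter and then with sep="\t". The sample rows and the io.StringIO stand-in are assumptions, not the asker's actual file:

```python
import io
import pandas as pd

# Hypothetical tab-separated sample, matching the sep='\t' used in the question.
raw = "chr1\t0\t100\t5\nchr1\t100\t200\t7\n"
names = ["Sequence", "Start", "End", "Coverage"]

# Default sep=",": no commas are found, so each line becomes a single field.
comma_parsed = pd.read_csv(io.StringIO(raw), header=None)

# Assigning four names to a one-column frame raises a ValueError.
try:
    comma_parsed.columns = names
    error = None
except ValueError as exc:
    error = exc

# sep="\t" splits each line into four columns, so the assignment works.
data = pd.read_csv(io.StringIO(raw), sep="\t", header=None)
data.columns = names

print(comma_parsed.shape, data.shape)
```

If assigning the column names fails with a length mismatch, checking the delimiter is a reasonable first step.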

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | sequence_hard | View Question on Stackoverflow
Solution 1 - Python | Leb | View Answer on Stackoverflow
Solution 2 - Python | Anton Protopopov | View Answer on Stackoverflow
Solution 3 - Python | Bhardwaj Joshi | View Answer on Stackoverflow
Solution 4 - Python | romulomadu | View Answer on Stackoverflow
Solution 5 - Python | Shoaib Muhammad Arif | View Answer on Stackoverflow
Solution 6 - Python | user3636989 | View Answer on Stackoverflow