Start index at 1 for Pandas DataFrame

Python, Pandas, CSV, DataFrame, Indexing

Python Problem Overview


I need the index to start at 1 rather than 0 when writing a Pandas DataFrame to CSV.

Here's an example:

In [1]: import pandas as pd

In [2]: result = pd.DataFrame({'Count': [83, 19, 20]})

In [3]: result.to_csv('result.csv', index_label='Event_id')                               

Which produces the following output:

In [4]: !cat result.csv
Event_id,Count
0,83
1,19
2,20

But my desired output is this:

In [5]: !cat result2.csv
Event_id,Count
1,83
2,19
3,20

I realize that this could be done by adding a sequence of integers shifted by 1 as a column to my data frame, but I'm new to Pandas and I'm wondering if a cleaner way exists.

Python Solutions


Solution 1 - Python

The index is an object, and the default index starts at 0:

>>> result.index
Int64Index([0, 1, 2], dtype=int64)

You can shift this index by 1 with:

>>> result.index += 1 
>>> result.index
Int64Index([1, 2, 3], dtype=int64)
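
To tie this back to the CSV in the question, here is a minimal sketch that shifts the index and then writes the frame out. It writes to an in-memory io.StringIO buffer instead of result.csv, which is an assumption made only so the output can be inspected directly.

```python
import io

import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})
result.index += 1  # shift the default 0-based index so it starts at 1

# Write to an in-memory buffer (stand-in for 'result.csv')
buffer = io.StringIO()
result.to_csv(buffer, index_label='Event_id')

csv_text = buffer.getvalue()
print(csv_text)
```

The printed text should have the header Event_id,Count followed by rows numbered from 1.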

Solution 2 - Python

Just set the index before writing to CSV.

import numpy as np
df.index = np.arange(1, len(df) + 1)  # stop is exclusive, so add 1

And then write it normally.

Solution 3 - Python

source: https://stackoverflow.com/questions/32249960/in-python-pandas-start-row-index-from-1-instead-of-zero-without-creating-additi/32249984#32249984

Working example:

import pandas as pdas
dframe = pdas.read_csv(input_file)  # input_file: path to your CSV file
dframe.index = dframe.index + 1     # shift the 0-based index to start at 1

Solution 4 - Python

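Another way in one line. Note that shift() moves every value down one row, so this drops the last row of data (and integer columns become float) rather than just relabelling the index: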

df.shift()[1:]
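
It may be worth checking what this one-liner returns before relying on it. The sketch below applies it to the three-row frame from the question (the variable name shifted is just illustrative): the index does start at 1, but one row is gone.

```python
import pandas as pd

df = pd.DataFrame({'Count': [83, 19, 20]})

# shift() moves values down one row (row 0 becomes NaN);
# [1:] then drops that first NaN row.
shifted = df.shift()[1:]
print(shifted)

# The frame is one row shorter: the original last value (20) is lost,
# and the column is now float because of the intermediate NaN.
print(len(df), len(shifted))
print(shifted['Count'].dtype)
```

If every row has to be kept, reassigning the index (for example with df.index += 1) avoids this.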

Solution 5 - Python

This worked for me:

df.index = np.arange(1, len(df)+1)

Solution 6 - Python

You can use this one:

import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})
result.index += 1
print(result)

Or this one, using the NumPy library:

import pandas as pd
import numpy as np

result = pd.DataFrame({'Count': [83, 19, 20]})
result.index = np.arange(1, len(result)+1)
print(result)

np.arange creates a NumPy array of evenly spaced values in the half-open interval [1, len(result)+1), that is, the integers 1 through len(result). That array is then assigned to result.index.
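
The "+1" matters because the stop value of np.arange is excluded. As a small sketch (the names labels and mismatch_error are only illustrative), the following shows the labels that are generated and what pandas does when the array is one element short:

```python
import numpy as np
import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})

# The stop value is excluded, so this yields one label per row.
labels = np.arange(1, len(result) + 1)
print(labels)  # [1 2 3]
result.index = labels

# Without the +1 there are only two labels for three rows,
# and pandas raises a ValueError instead of assigning them.
mismatch_error = None
try:
    result.index = np.arange(1, len(result))
except ValueError as exc:
    mismatch_error = str(exc)
    print(mismatch_error)
```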

Solution 7 - Python

Use this:

df.index = np.arange(1, len(df)+1)

Solution 8 - Python

In my opinion, best practice is to set the index with a RangeIndex:

import pandas as pd

result = pd.DataFrame(
    {'Count': [83, 19, 20]}, 
    index=pd.RangeIndex(start=1, stop=4, name='index')
)
>>> result
       Count
index       
1         83
2         19
3         20

I prefer this because you can define the range, an optional step, and a name for the index in one line.
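
The same idea can be applied to a frame that already exists, without hard-coding stop=4. The sketch below derives the stop value from len(result) and names the index Event_id to match the question; it writes to an in-memory buffer only so the CSV text can be inspected.

```python
import io

import pandas as pd

result = pd.DataFrame({'Count': [83, 19, 20]})

# Derive the range from the frame's length; the name becomes the CSV header.
result.index = pd.RangeIndex(start=1, stop=len(result) + 1, name='Event_id')

buffer = io.StringIO()
result.to_csv(buffer)  # no index_label needed: the index name is used
csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```

Because the index carries its own name, to_csv does not need the index_label argument here.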

Solution 9 - Python

Add ".shift()[1:]" when creating the data frame. Be aware that shift() moves every value down one row, so the last row of the file is dropped:

data = pd.read_csv(r"C:\Users\user\path\data.csv").shift()[1:]

Solution 10 - Python

Building on the original answer, a couple of additional notes:

  • If I'm not mistaken, the default index has been a RangeIndex since pandas 0.18.0

From the official doc:

> RangeIndex is a memory-saving special case of Int64Index limited to representing monotonic ranges. Using RangeIndex may in some instances improve computing speed.

For a very large index this makes sense: the index is stored as its start, stop and step rather than as every label materialised up front, which saves memory.

Here is an example (using a Series, but the same applies to a DataFrame):

>>> import pandas as pd
>>> 
>>> countries = ['China', 'India', 'USA']
>>> ds = pd.Series(countries)
>>> 
>>>
>>> type(ds.index)
<class 'pandas.core.indexes.range.RangeIndex'>
>>> ds.index
RangeIndex(start=0, stop=3, step=1)
>>> 
>>> ds.index += 1
>>> 
>>> ds.index
RangeIndex(start=1, stop=4, step=1)
>>> 
>>> ds
1    China
2    India
3      USA
dtype: object
>>> 

As you can see, incrementing the index object changes its start and stop parameters.
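
To get a rough feel for the memory saving mentioned in the documentation quote, one can compare a RangeIndex with an index built from a fully materialised array. This is only a sketch: the size n and the variable names are arbitrary choices, and the exact byte counts depend on the platform and pandas version.

```python
import numpy as np
import pandas as pd

n = 1_000_000

# Stored as start/stop/step only
lazy_index = pd.RangeIndex(start=1, stop=n + 1)

# Stored as an array holding every label
dense_index = pd.Index(np.arange(1, n + 1))

lazy_bytes = lazy_index.memory_usage()
dense_bytes = dense_index.memory_usage()
print(lazy_bytes, dense_bytes)

# Adding an integer keeps the compact representation.
shifted_index = lazy_index + 1
print(type(shifted_index).__name__)
```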

Solution 11 - Python

This adds a column that accomplishes what you want:

df.insert(0,"Column Name", np.arange(1,len(df)+1))
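
If this column approach is used for the CSV in the question, the original 0-based index probably should not be written as well. A minimal sketch, assuming the column is called Event_id and using an in-memory buffer in place of a real file:

```python
import io

import numpy as np
import pandas as pd

df = pd.DataFrame({'Count': [83, 19, 20]})

# Insert a 1-based counter as the first column.
df.insert(0, 'Event_id', np.arange(1, len(df) + 1))

buffer = io.StringIO()
df.to_csv(buffer, index=False)  # skip the default 0-based index
csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```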

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Clark Fitzgerald | View Question on Stack Overflow
Solution 1 - Python | alko | View Answer on Stack Overflow
Solution 2 - Python | TomAugspurger | View Answer on Stack Overflow
Solution 3 - Python | Dung | View Answer on Stack Overflow
Solution 4 - Python | Imran | View Answer on Stack Overflow
Solution 5 - Python | Liu Yu | View Answer on Stack Overflow
Solution 6 - Python | Utku | View Answer on Stack Overflow
Solution 7 - Python | Amit Bahadur | View Answer on Stack Overflow
Solution 8 - Python | mosc9575 | View Answer on Stack Overflow
Solution 9 - Python | Prashant | View Answer on Stack Overflow
Solution 10 - Python | ivanleoncz | View Answer on Stack Overflow
Solution 11 - Python | Jen | View Answer on Stack Overflow