Access index of last element in data frame

Python, Pandas

Python Problem Overview


I've been looking around for this but I can't seem to find it (though it must be extremely trivial).

The problem that I have is that I would like to retrieve the value of a column for the first and last entries of a data frame. But if I do:

df.ix[0]['date']

I get:

datetime.datetime(2011, 1, 10, 16, 0)

but if I do:

df[-1:]['date']

I get:

myIndex
13         2011-12-20 16:00:00
Name: mydate

with a different format. Ideally, I would like to be able to access the value of the last index of the data frame, but I can't find how.

I even tried to create a column (IndexCopy) with the values of the index and try:

df.ix[df.tail(1)['IndexCopy']]['mydate']

but this also yields a different format (since df.tail(1)['IndexCopy'] does not output a simple integer).

Any ideas?

Python Solutions


Solution 1 - Python

The original answer (kept further down) is now superseded by .iloc:

>>> df = pd.DataFrame({"date": range(10, 64, 8)})
>>> df.index += 17
>>> df
    date
17    10
18    18
19    26
20    34
21    42
22    50
23    58
>>> df["date"].iloc[0]
10
>>> df["date"].iloc[-1]
58

The shortest way I can think of uses .iget() (since deprecated and removed from pandas; use .iloc as above):

>>> df = pd.DataFrame({"date": range(10, 64, 8)})
>>> df.index += 17
>>> df
    date
17    10
18    18
19    26
20    34
21    42
22    50
23    58
>>> df['date'].iget(0)
10
>>> df['date'].iget(-1)
58

Alternatively:

>>> df['date'][df.index[0]]
10
>>> df['date'][df.index[-1]]
58

There's also .first_valid_index() and .last_valid_index(), but depending on whether or not you want to rule out NaNs they might not be what you want.
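
A small sketch of how these differ from plain positional access when the data contains NaNs. The series below is invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical series whose first and last entries are NaN.
s = pd.Series([np.nan, 18.0, 26.0, np.nan], index=[17, 18, 19, 20])

first_label = s.index[0]               # label of the first row, NaN or not
first_valid = s.first_valid_index()    # label of the first non-NaN row
last_label = s.index[-1]               # label of the last row, NaN or not
last_valid = s.last_valid_index()      # label of the last non-NaN row

print(first_label, first_valid)
print(last_label, last_valid)
```

If the first or last row may legitimately hold NaN and you still want that row, s.index[0] and s.index[-1] are the safer choice.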

Remember that df.ix[0] doesn't give you the first row, but the row whose index label is 0. For example, in the above case, df.ix[0] would produce

>>> df.ix[0]
Traceback (most recent call last):
  File "<ipython-input-489-494245247e87>", line 1, in <module>
    df.ix[0]
[...]
KeyError: 0
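
Since .ix has been removed from recent pandas releases, the same distinction can be sketched with .loc (label-based) and .iloc (position-based). This assumes the same frame as above, built here with an explicit index instead of df.index += 17:

```python
import pandas as pd

# Same data as above, with index labels 17..23.
df = pd.DataFrame({"date": list(range(10, 64, 8))}, index=range(17, 24))

# Label-based lookup: there is no row labeled 0, so this raises KeyError.
try:
    df.loc[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Position-based lookup: the first row, whatever its label is.
first_row_value = df.iloc[0]["date"]

print(label_lookup_failed)
print(first_row_value)
```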

Solution 2 - Python

Combining @comte's answer and dmdip's answer in https://stackoverflow.com/questions/41217310/get-index-of-a-row-of-a-pandas-dataframe-as-an-integer/42853445#42853445

df.tail(1).index.item()

gives you the value of the index.


Note that index labels are not guaranteed to be unique, whether the index is multi-level or single-level, so modifying a dataframe through its index labels might result in unexpected behavior. The example below uses a multi-indexed case, but the same is true in a single-indexed case.

Say we have

df = pd.DataFrame({'x':[1,1,3,3], 'y':[3,3,5,5]}, index=[11,11,12,12]).stack()

11  x    1
    y    3
    x    1
    y    3
12  x    3
    y    5              # the index is (12, 'y')
    x    3
    y    5              # the index is also (12, 'y')

df.tail(1).index.item() # gives (12, 'y')

Trying to access the last element with the index df[12, "y"] yields

(12, y)    5
(12, y)    5
dtype: int64

If you attempt to modify the dataframe based on the index (12, 'y'), you will modify two rows rather than one. So although we now know how to get the value of the last row's index, using that label to change the last row's values may not be a good idea, since several rows can share the same index. In that case, use df.iloc[-1] to access the last row instead.
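
The same pitfall can be sketched in the single-indexed case. The series below is a made-up example with a duplicated label, not the stacked frame from above:

```python
import pandas as pd

# Hypothetical series in which the label 12 appears twice.
s = pd.Series([1, 3, 5, 5], index=[11, 11, 12, 12])

by_label = s.copy()
by_label.loc[s.index[-1]] = 0    # label-based: every row labeled 12 changes

by_position = s.copy()
by_position.iloc[-1] = 0         # position-based: only the last row changes

print(by_label.tolist())
print(by_position.tolist())
```

Assigning through the last label touches two rows, while .iloc[-1] touches exactly one.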

Reference

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.item.html

Solution 3 - Python

df.tail(1).index 

seems the most readable
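
One thing to keep in mind is that df.tail(1).index returns an Index object of length 1 rather than a plain value. A short sketch, using the example frame from Solution 1 as an assumed input:

```python
import pandas as pd

df = pd.DataFrame({"date": list(range(10, 64, 8))}, index=range(17, 24))

last_index_obj = df.tail(1).index      # an Index holding one label
last_label = df.tail(1).index.item()   # the label itself, as a scalar
also_last_label = df.index[-1]         # same label, without building a sub-frame

print(last_index_obj)
print(last_label, also_last_label)
```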

Solution 4 - Python

It may be too late now, but I use the index attribute to retrieve the index of a DataFrame, then use [-1] to get its last value:

For example,

df = pd.DataFrame(np.zeros((4, 1)), columns=['A'])
print(f'df:\n{df}\n')

print(f'Index = {df.index}\n')
print(f'Last index = {df.index[-1]}')

The output is

df:
     A
0  0.0
1  0.0
2  0.0
3  0.0

Index = RangeIndex(start=0, stop=4, step=1)

Last index = 3

Solution 5 - Python

You want .iloc with double brackets.

import pandas as pd
df = pd.DataFrame({"date": range(10, 64, 8), "not_date": "fools"})
df.index += 17
df.iloc[[0,-1]][['date']]

You give .iloc a list of indexes - specifically the first and last, [0, -1]. That returns a dataframe from which you ask for the 'date' column. ['date'] will give you a series (yuck), and [['date']] will give you a dataframe.
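
To make the difference between the two bracket styles concrete, here is a small sketch that checks the returned types. It builds the same frame as above with an explicit index; the variable names are only illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    {"date": list(range(10, 64, 8)), "not_date": "fools"},
    index=range(17, 24),
)

as_series = df.iloc[[0, -1]]["date"]     # single brackets: a Series
as_frame = df.iloc[[0, -1]][["date"]]    # double brackets: a DataFrame

print(type(as_series).__name__, as_series.tolist())
print(type(as_frame).__name__, list(as_frame.index))
```

Either way, the result keeps the original index labels of the first and last rows.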

Solution 6 - Python

Pandas supports NumPy-style slicing, which allows selecting the last row by position and then reading its index label:

df[len(df) -1:].index[0]

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | elelias | View Question on Stackoverflow
Solution 1 - Python | DSM | View Answer on Stackoverflow
Solution 2 - Python | Tai | View Answer on Stackoverflow
Solution 3 - Python | comte | View Answer on Stackoverflow
Solution 4 - Python | yoonghm | View Answer on Stackoverflow
Solution 5 - Python | grofte | View Answer on Stackoverflow
Solution 6 - Python | Quantum | View Answer on Stackoverflow