substring of an entire column in pandas dataframe

Python, Pandas, Dataframe

Python Problem Overview


I have a pandas dataframe "df". In this dataframe I have multiple columns, one of which I have to substring. Let's say the column name is "col". I can run a "for" loop like below and substring the column:

col_pos = df.columns.get_loc("col")
for i in range(len(df)):
  df.iloc[i, col_pos] = df.iloc[i, col_pos][:9]

But I wanted to know if there is an option where I don't have to use a "for" loop and can do it directly using an attribute. I have a huge amount of data, and if I do this, the data will take a very long time to process.

Python Solutions


Solution 1 - Python

Use the str accessor with square brackets:

df['col'] = df['col'].str[:9]

Or str.slice:

df['col'] = df['col'].str.slice(0, 9)
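
As a quick check, here is a minimal sketch on a made-up frame. The sample values are assumptions, since the question does not show the real data. Both forms should give the same result, and a missing value stays missing rather than raising an error:

```python
import pandas as pd

# Hypothetical sample data; the real df is not shown in the question.
df = pd.DataFrame({"col": ["abcdefghijkl", "short", None]})

by_index = df["col"].str[:9]          # slice through the str accessor
by_slice = df["col"].str.slice(0, 9)  # the same slice as an explicit method call

# Strings shorter than 9 characters are left unchanged.
print(by_index.tolist())
print(by_slice.tolist())
```

Both calls work on the whole column at once, so no Python-level loop over the rows is needed.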

Solution 2 - Python

In case the column isn't a string, use astype to convert it:

df['col'] = df['col'].astype(str).str[:9]
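
A small sketch of that conversion, assuming a numeric column (the values below are invented for illustration):

```python
import pandas as pd

# Hypothetical integer column; the values are made up.
df = pd.DataFrame({"col": [1234567890123, 42]})

# The .str accessor only works on string-like data, so convert first.
df["col"] = df["col"].astype(str).str[:9]
print(df["col"].tolist())  # ['123456789', '42']
```

Depending on the pandas version, astype(str) may turn missing values into the literal text 'nan', so a column with gaps may need those handled before converting.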

Solution 3 - Python

I needed to convert a single column of strings of the form nn.n% to float, which meant removing the trailing % from the value in each row. The attend data frame has two columns.

attend.iloc[:,1:2]=attend.iloc[:,1:2].applymap(lambda x: float(x[:-1]))

It's an extension of the original answer. In my case it takes a dataframe and applies a function to each value in a specific column. The function removes the last character and converts the remaining string to float.
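
The same conversion can probably also be written with the str accessor from Solution 1 instead of a per-element lambda. This is an alternative sketch, not the answer author's code, and the column names and values in this stand-in for attend are assumptions:

```python
import pandas as pd

# Hypothetical stand-in for the two-column attend frame; names and values are made up.
attend = pd.DataFrame({"name": ["a", "b"], "rate": ["95.5%", "80.0%"]})

# Drop the last character (the '%') for the whole column, then convert to float.
attend["rate"] = attend["rate"].str[:-1].astype(float)
print(attend["rate"].tolist())  # [95.5, 80.0]
```

Assigning the result back by column name replaces the column outright, so it ends up with a float dtype.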

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | thenakulchawla | View Question on Stackoverflow
Solution 1 - Python | ayhan | View Answer on Stackoverflow
Solution 2 - Python | Elton da Mata | View Answer on Stackoverflow
Solution 3 - Python | Radiumcola | View Answer on Stackoverflow