Pandas replace a character in all column names

Python, Pandas

Python Problem Overview


I have data frames with column names (coming from .csv files) containing ( and ) and I'd like to replace them with _.

How can I do that in place for all columns?

Python Solutions


Solution 1 - Python

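Use str.replace with a regular expression. Pass regex=True explicitly, because pandas 2.0 and later treat the pattern as a literal string by default:

df.columns = df.columns.str.replace(r"[()]", "_", regex=True)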

Sample:

import pandas as pd

df = pd.DataFrame({'(A)':[1,2,3],
                   '(B)':[4,5,6],
                   'C)':[7,8,9]})

print(df)
   (A)  (B)  C)
0    1    4   7
1    2    5   8
2    3    6   9

# regex=True makes [()] a character class matching "(" or ")"
df.columns = df.columns.str.replace(r"[()]", "_", regex=True)
print(df)
   _A_  _B_  C_
0    1    4   7
1    2    5   8
2    3    6   9
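
The effect of the regex flag can be checked directly. The following is a minimal sketch, assuming a pandas version that accepts the regex keyword on str.replace; the variable names literal and pattern are only illustrative. It also shows rename with inplace=True as one way to modify the frame in place, as the question asks.

```python
import pandas as pd

df = pd.DataFrame({'(A)': [1, 2, 3],
                   '(B)': [4, 5, 6],
                   'C)': [7, 8, 9]})

# Literal matching: the text "[()]" does not occur verbatim in any
# column name, so the names come back unchanged.
literal = df.columns.str.replace("[()]", "_", regex=False)

# Regex matching: the character class [()] matches "(" or ")".
pattern = df.columns.str.replace(r"[()]", "_", regex=True)

# rename() accepts a callable and can modify the frame in place.
df.rename(columns=lambda c: c.replace("(", "_").replace(")", "_"),
          inplace=True)

print(list(literal))
print(list(pattern))
print(list(df.columns))
```

Assigning to df.columns and calling rename with inplace=True both change the existing DataFrame rather than returning a new one.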

Solution 2 - Python

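On older pandas versions, the str.replace approach from Solution 1 may not work. A list comprehension over the column names can be used instead. Note that Python's built-in str.replace matches its argument literally, so the pattern has to go through re.sub:

import re

df.columns = [re.sub(r"[()]", "_", c) for c in df.columns]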
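
For a fixed set of single characters, a translation table is another option that needs no regular expression at all. This is a minimal sketch, assuming every column name is (or can be converted to) a string; clean_names and PAREN_TO_UNDERSCORE are hypothetical names chosen for illustration.

```python
# Translation table mapping "(" and ")" to "_".
PAREN_TO_UNDERSCORE = str.maketrans("()", "__")


def clean_names(names):
    """Return a list of names with every parenthesis replaced by "_"."""
    return [str(name).translate(PAREN_TO_UNDERSCORE) for name in names]


cleaned = clean_names(["(A)", "(B)", "C)"])
print(cleaned)  # ['_A_', '_B_', 'C_']
```

With a DataFrame, the result can be assigned back as df.columns = clean_names(df.columns).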

Solution 3 - Python

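In a regular expression, square brackets define a character class: a set of characters, any one of which may match at that position. For example:

r"[Nn]ational"

matches both "National" and "national", because the first character may be either N or n. In the same way, [()] matches either an opening or a closing parenthesis.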
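
The behaviour of such a pattern can be tried with Python's standard re module. This is a small sketch; the sample sentence is made up for illustration.

```python
import re

# [Nn] matches exactly one character: either "N" or "n".
text = "The National Gallery is a national museum."
matches = re.findall(r"[Nn]ational", text)
print(matches)  # ['National', 'national']

# The same idea applied to parentheses in a column name.
renamed = re.sub(r"[()]", "_", "(A)")
print(renamed)  # _A_
```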

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type        | Original Author | Original Content on Stack Overflow
Question            | Cedric H.       | View Question on Stack Overflow
Solution 1 - Python | jezrael         | View Answer on Stack Overflow
Solution 2 - Python | JamesR          | View Answer on Stack Overflow
Solution 3 - Python | agbalutemi      | View Answer on Stack Overflow