How do I retrieve the number of columns in a Pandas data frame?

Python, Pandas, Dataframe

Python Problem Overview


How do you programmatically retrieve the number of columns in a pandas dataframe? I was hoping for something like:

df.num_columns

Python Solutions


Solution 1 - Python

Like so:

import pandas as pd
df = pd.DataFrame({"pear": [1,2,3], "apple": [2,3,4], "orange": [3,4,5]})
    
len(df.columns)
3

Solution 2 - Python

Alternative:

df.shape[1]

(df.shape[0] is the number of rows)
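Because df.shape is an ordinary tuple, both counts can also be unpacked in one step. A minimal sketch, assuming a small made-up frame with two rows and three columns (the names n_rows and n_cols are only illustrative):

```python
import pandas as pd

# Hypothetical example frame: 2 rows, 3 columns
df = pd.DataFrame({"pear": [1, 2], "apple": [2, 3], "orange": [3, 4]})

# shape is a (rows, columns) tuple, so it unpacks directly
n_rows, n_cols = df.shape
print(n_rows, n_cols)  # 2 3
```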

Solution 3 - Python

If the variable holding the dataframe is called df, then:

len(df.columns)

gives the number of columns.

And for those who want the number of rows:

len(df.index)

For a tuple containing the number of both rows and columns:

df.shape

Solution 4 - Python

Surprised I haven't seen this yet, so without further ado, here is:

df.columns.size
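One thing to watch for: df.columns.size and df.size are different attributes. The sketch below, on a made-up two-row frame, shows that the first counts column labels while the second counts every cell:

```python
import pandas as pd

# Hypothetical example frame: 2 rows, 3 columns
df = pd.DataFrame({"pear": [1, 2], "apple": [2, 3], "orange": [3, 4]})

n_cols = df.columns.size  # number of column labels
n_cells = df.size         # rows * columns, not the column count
print(n_cols, n_cells)  # 3 6
```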

Solution 5 - Python

The df.info() method prints a summary that includes the column count, as shown below. This example assumes the file was loaded with the read_csv method of Pandas using the default sep parameter (or sep=",").

import pandas as pd
raw_data = pd.read_csv(r"a1:\aa2/aaa3/data.csv")  # raw string, so "\a" is not read as an escape
raw_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5144 entries, 0 to 5143
Columns: 145 entries, R_fighter to R_age

Solution 6 - Python

There are multiple options for getting the column count and other column information. Let's check them.

import numpy as np
import pandas as pd
local_df = pd.DataFrame(np.random.randint(1, 12, size=(2, 6)), columns=['a', 'b', 'c', 'd', 'e', 'f'])

  1. local_df.shape[1] --> The shape attribute returns a tuple (rows, columns), so index 0 is the row count and index 1 is the column count.

  2. local_df.info() --> The info method prints detailed information about the data frame and its columns, such as the column count, the data type of each column, the non-null value count, and the memory usage of the data frame.

  3. len(local_df.columns) --> The columns attribute returns the Index object holding the data frame's column labels, and the len function returns how many there are.

  4. local_df.head(0) --> The head method with parameter 0 returns an empty data frame with no rows, which is nothing but the header.

For-loop fun, assuming the number of columns is not more than 10:

li_count = 0
for x in local_df:
    li_count = li_count + 1
print(li_count)

Solution 7 - Python

To include the row index "columns" in the total, I would personally add the number of columns, df.columns.size, to the number of index levels given by the attribute pd.Index.nlevels/pd.MultiIndex.nlevels:

Set up dummy data

import pandas as pd

flat_index = pd.Index([0, 1, 2], name="id")
multi_index = pd.MultiIndex.from_tuples([("a", 1), ("a", 2), ("b", 1)], names=["letter", "id"])

columns = ["cat", "dog", "fish"]

data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat_df = pd.DataFrame(data, index=flat_index, columns=columns)
multi_df = pd.DataFrame(data, index=multi_index, columns=columns)

# Show data
# -----------------
# 3 columns, 4 including the index
print(flat_df)
    cat  dog  fish
id                
0     1    2     3
1     4    5     6
2     7    8     9

# -----------------
# 3 columns, 5 including the index
print(multi_df)
           cat  dog  fish
letter id                
a      1     1    2     3
       2     4    5     6
b      1     7    8     9

Writing our process as a function:

def total_ncols(df, include_index=False):
    ncols = df.columns.size
    if include_index is True:
        ncols += df.index.nlevels
    return ncols

print("Ignore the index:")
print(total_ncols(flat_df), total_ncols(multi_df))

print("Include the index:")
print(total_ncols(flat_df, include_index=True), total_ncols(multi_df, include_index=True))

This prints:

Ignore the index:
3 3

Include the index:
4 5

If you want to only include the number of indices if the index is a pd.MultiIndex, then you can throw in an isinstance check in the defined function.
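A minimal sketch of that isinstance variant follows. The function name total_ncols_multi_only is made up for illustration, and the dummy data is rebuilt so the block runs on its own:

```python
import pandas as pd

def total_ncols_multi_only(df, include_index=False):
    """Count columns; add index levels only when the index is a MultiIndex."""
    ncols = df.columns.size
    if include_index and isinstance(df.index, pd.MultiIndex):
        ncols += df.index.nlevels
    return ncols

# Same shape of dummy data as in the setup above
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
columns = ["cat", "dog", "fish"]
flat_df = pd.DataFrame(data, index=pd.Index([0, 1, 2], name="id"), columns=columns)
multi_index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["letter", "id"]
)
multi_df = pd.DataFrame(data, index=multi_index, columns=columns)

flat_total = total_ncols_multi_only(flat_df, include_index=True)
multi_total = total_ncols_multi_only(multi_df, include_index=True)
print(flat_total, multi_total)  # 3 5
```

With this version the flat index is never counted, so only the MultiIndex frame gains extra "columns".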

As an alternative, you could use df.reset_index().columns.size to achieve the same result, but this won't be as performant, since reset_index builds a new data frame with the index levels inserted as columns and a fresh index before the number of columns is read.
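As a quick check that the two approaches agree, here is a small sketch on a made-up MultiIndex frame (the names via_nlevels and via_reset are only illustrative):

```python
import pandas as pd

multi_index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["letter", "id"]
)
multi_df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=multi_index,
    columns=["cat", "dog", "fish"],
)

# Arithmetic on existing metadata, no new frame is created
via_nlevels = multi_df.columns.size + multi_df.index.nlevels
# Builds a new data frame with the index levels as columns
via_reset = multi_df.reset_index().columns.size
print(via_nlevels, via_reset)  # 5 5
```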

Solution 8 - Python

# Use a regular expression to parse the column count out of df.info()
# https://docs.python.org/3/library/re.html
import io
import re

buffer = io.StringIO()
df.info(buf=buffer)
s = buffer.getvalue()
print(s)
# [0-9]+ also matches counts with more than one digit
pat = re.search(r"total\s[0-9]+\scolumn", s)
phrase = pat.group(0)
value = re.findall(r"[0-9]+", phrase)[0]
print(int(value))

Solution 9 - Python

import pandas as pd
df = pd.DataFrame({"pear": [1,2,3], "apple": [2,3,4], "orange": [3,4,5]})


print(len(list(df.iterrows())))

This gives the number of rows (not the number of columns):

3

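The snippet above counts rows. A column counterpart in the same style might use df.items(), which yields one (label, Series) pair per column. This is a sketch on a made-up two-row frame, not part of the original answer:

```python
import pandas as pd

# Hypothetical example frame: 2 rows, 3 columns
df = pd.DataFrame({"pear": [1, 2], "apple": [2, 3], "orange": [3, 4]})

n_rows = len(list(df.iterrows()))  # one (index, Series) pair per row
n_cols = len(list(df.items()))     # one (label, Series) pair per column
print(n_rows, n_cols)  # 2 3
```

Both calls materialise every row or column just to count them, so len(df.columns) is the cheaper choice on large frames.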

Solution 10 - Python

My environment:

  • pandas
    • excel engine: xlsxwriter

Several methods to get the column count:

  • len(df.columns) -> 28

  • df.shape[1] -> 28

    • here: df.shape = (592, 28)
    • related
      • rows count: df.shape[0] -> 592
  • df.columns.shape[0] -> 28

    • here: df.columns.shape = (28,)
  • df.columns.size -> 28

Solution 11 - Python

This worked for me: len(list(df))
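This works because iterating over a data frame yields its column labels, so list(df) is a list of the column names. A minimal sketch on a made-up frame:

```python
import pandas as pd

# Hypothetical example frame: 2 rows, 3 columns
df = pd.DataFrame({"pear": [1, 2], "apple": [2, 3], "orange": [3, 4]})

labels = list(df)   # iterating a data frame yields its column labels
print(labels)       # ['pear', 'apple', 'orange']
print(len(labels))  # 3

n_rows = len(df)    # by contrast, len(df) alone counts rows
print(n_rows)       # 2
```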

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type          Original Author    Original Content on Stackoverflow
Question              user1802143        View Question on Stackoverflow
Solution 1 - Python   John               View Answer on Stackoverflow
Solution 2 - Python   mkln               View Answer on Stackoverflow
Solution 3 - Python   multigoodverse     View Answer on Stackoverflow
Solution 4 - Python   mouwsy             View Answer on Stackoverflow
Solution 5 - Python   AshishSingh007     View Answer on Stackoverflow
Solution 6 - Python   AshishSingh007     View Answer on Stackoverflow
Solution 7 - Python   Cameron Riddell    View Answer on Stackoverflow
Solution 8 - Python   Golden Lion        View Answer on Stackoverflow
Solution 9 - Python   Subham             View Answer on Stackoverflow
Solution 10 - Python  crifan             View Answer on Stackoverflow
Solution 11 - Python  Tanmay Ghanekar    View Answer on Stackoverflow