how to read certain columns from Excel using Pandas - Python

Python Problem Overview

I am reading from an Excel sheet and I want to read certain columns: column 0 because it is the row-index, and columns 22:37. Now here is what I do:

import pandas as pd
import numpy as np
file_loc = "path.xlsx"
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], parse_cols = 37)
df= pd.concat([df[df.columns[0]], df[df.columns[22:]]], axis=1)

But I would hope there is better way to do that! I know if I do parse_cols=[0, 22,..,37] I can do it, but for large datasets this doesn't make sense.

I also did this:

s = pd.Series(0)
s[1]=22
for i in range(2,14):
    s[i]=s[i-1]+1
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], parse_cols = s)

But it reads the first 15 columns which is the length of s.

Python Solutions

Solution 1 - Python

You can use column indices (letters) like this:

import pandas as pd
import numpy as np
file_loc = "path.xlsx"
df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], usecols="A,C:AA")
print(df)

Corresponding documentation:

> usecols : int, str, list-like, or callable default None > > - If None, then parse all columns. > - If str, then indicates comma separated list of Excel column letters and column ranges (e.g. “A:E” or “A,C,E:F”). Ranges are inclusive of both sides. > - If list of int, then indicates list of column numbers to be parsed. > - If list of string, then indicates list of column names to be parsed. > > New in version 0.24.0. > > - If callable, then evaluate each column name against it and parse the column if the callable returns True. > > Returns a subset of the columns according to behavior above. > > New in version 0.24.0.

Solution 2 - Python

parse_cols is deprecated, use usecols instead

that is:

df = pd.read_excel(file_loc, index_col=None, na_values=['NA'], usecols = "A,C:AA")

Solution 3 - Python

"usecols" should help, use range of columns (as per excel worksheet, A,B...etc.) below are the examples

1. Selected Columns

df = pd.read_excel(file_location,sheet_name='Sheet1', usecols="A,C,F")

2. Range of Columns and selected column

df = pd.read_excel(file_location,sheet_name='Sheet1', usecols="A:F,H")

3. Multiple Ranges

df = pd.read_excel(file_location,sheet_name='Sheet1', usecols="A:F,H,J:N")

4. Range of columns

df = pd.read_excel(file_location,sheet_name='Sheet1', usecols="A:N")

Content Type	Original Author	Original Content on Stackoverflow
Question	Ana	View Question on Stackoverflow
Solution 1 - Python	MartyIX	View Answer on Stackoverflow
Solution 2 - Python	Leoli	View Answer on Stackoverflow
Solution 3 - Python	Uday Kiran	View Answer on Stackoverflow

how to read certain columns from Excel using Pandas - Python

Python Problem Overview

Python Solutions

Solution 1 - Python

Solution 2 - Python

Solution 3 - Python

Why my $.ajax showing "preflight is invalid redirect error"?

React-Redux: Combining reducers: Unexpected Keys

Attributions