iterate through all rows in specific column openpyxl
ExcelPython 2.7OpenpyxlExcel Problem Overview
I cannot figure out how to iterate through all rows in a specified column with openpyxl.
I want to print all of the cell values for all rows in column "C"
Right now I have:
from openpyxl import workbook
path = 'C:/workbook.xlsx'
wb = load_workbook(filename = path)
ws=wb.get_sheet_by_name('Sheet3')
for row in ws.iter_rows():
for cell in row:
if column == 'C':
print cell.value
Excel Solutions
Solution 1 - Excel
Why can't you just iterate over column 'C' (version 2.4.7):
for cell in ws['C']:
print cell.value
Solution 2 - Excel
You can specify a range to iterate over with ws.iter_rows()
:
import openpyxl
wb = openpyxl.load_workbook('C:/workbook.xlsx')
ws = wb['Sheet3']
for row in ws.iter_rows('C{}:C{}'.format(ws.min_row,ws.max_row)):
for cell in row:
print cell.value
Edit: per Charlie Clark you can alternately use ws.get_squared_range()
:
# ...
ws.get_squared_range(min_col=1, min_row=1, max_col=1, max_row=10)
# ...
Edit 2: per your comment you want the cell values in a list:
import openpyxl
wb = openpyxl.load_workbook('c:/_twd/2016-06-23_xlrd_xlwt/input.xlsx')
ws = wb.get_sheet_by_name('Sheet1')
mylist = []
for row in ws.iter_rows('A{}:A{}'.format(ws.min_row,ws.max_row)):
for cell in row:
mylist.append(cell.value)
print mylist
Solution 3 - Excel
You can also do this.
for row in ws.iter_rows():
print(row[2].value)
With this you are still iterating through the rows (but not the cells) and only pulling the values from column C in the row to print.
Solution 4 - Excel
Some of the solutions above don't quite work very well (maybe because of latest version of 'openpyxl'). After trying out different things, I used this:
Printing all rows with all columns:
import openpyxl
sheet = openpyxl.load_workbook('myworkbook.xlsx')['Sheet1']
# Iterating through All rows with all columns...
for i in range(1, sheet.max_row+1):
row = [cell.value for cell in sheet[i]] # sheet[n] gives nth row (list of cells)
print(row) # list of cell values of this row
Printing all rows with specific columns (e.g. 'E' to 'L'):
# For example we need column 'E' to column 'L'
start_col = 4 # 'E' column index
end_col = 11 # 'L' column index
for i in range(1, sheet.max_row+1):
row = [cell.value for cell in sheet[i][start_col:end_col+1]]
print(row) # list of cell values of this row
Please keep these points in mind:
- sheet[N] gives the list of 'Cell' objects of Nth row. (N is a number starting from 1)
- To get the first column cell of a row, use sheet[N][0]. (Because sheet[N] is a 'tuple' which can be indexed starting from zero 0).
Solution 5 - Excel
I do it like this. I'm not sure what I'm doing but it does avoid the cells with no values.
from openpyxl import load_workbook
wb = load_workbook(filename = 'exelfile.xlsx')
ws = wb['sheet1']
for col in ws['A']:
print (col.value)
Solution 6 - Excel
it can go as:
import openpyxl
path = 'C:/workbook.xlsx'
# since is a print, read_only is useful for making it faster.
wb = openpyxl.load_workbook(filename = path, read_only=True)
# by sheet name
ws=wb['Sheet3']
# non-Excel notation is col 'A' = 1, col 'B' = 2, col 'C' = 3.
# from row = 1 (openpyxl sheets starts at 1, not 0) to no max
for row in ws.iter_cols(min_row=1, min_col=3, max_col=3):
# for each row there is one cell object (since min_col = max_col)
for cell in row:
# so we print the value
print(f'C{row}: ', cell.value)
Solution 7 - Excel
You can use coordinate property in cell object.
coordinate property contains cell address in string format.
For example,
from openpyxl import workbook
path = 'C:/workbook.xlsx'
wb = load_workbook(filename = path)
ws=wb.get_sheet_by_name('Sheet3')
for row in ws.iter_rows():
for cell in row:
if 'C' in cell.coordinate:
print cell.value
Solution 8 - Excel
listaClientes =[]
for row in datos.iter_rows(min_row=2, min_col=3, max_col=3):
for cell in row:
listaClientes.append(cell.value)