How to use OpenCV's connectedComponentsWithStats in Python?

Python Problem Overview

I am looking for an example of how to use OpenCV's connectedComponentsWithStats() function in Python. Note that it is only available in OpenCV 3 or newer. The official documentation only shows the C++ API, even though the function exists when OpenCV is compiled for Python. I could not find a Python example anywhere online.

Python Solutions

Solution 1 - Python

The function works as follows:

# Import the cv2 library
import cv2
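# Read the image you want connected components of as single-channel grayscale
# (Otsu thresholding requires an 8-bit single-channel input)
src = cv2.imread('/directorypath/image.bmp', cv2.IMREAD_GRAYSCALE)
# Threshold it so it becomes binary
ret, thresh = cv2.threshold(src, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)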
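# You need to choose 4 or 8 for connectivity type
connectivity = 4
# Perform the operation. Pass connectivity and ltype by keyword: in the Python
# signature the optional labels, stats and centroids arrays come before them
output = cv2.connectedComponentsWithStats(thresh, connectivity=connectivity, ltype=cv2.CV_32S)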
# Get the results
# The first cell is the number of labels
num_labels = output[0]
# The second cell is the label matrix
labels = output[1]
# The third cell is the stat matrix
stats = output[2]
# The fourth cell is the centroid matrix
centroids = output[3]

labels is a matrix the size of the input image in which each element holds the label of the corresponding pixel. Label 0 is the background.

stats is a matrix of the statistics that the function calculates, with one row per label and one column per statistic. The columns are described in the OpenCV documentation:

> Statistics output for each label, including the background label, see below for available statistics. Statistics are accessed via stats[label, COLUMN] where available columns are defined below.
>
> - cv2.CC_STAT_LEFT The leftmost (x) coordinate which is the inclusive start of the bounding box in the horizontal direction.
> - cv2.CC_STAT_TOP The topmost (y) coordinate which is the inclusive start of the bounding box in the vertical direction.
> - cv2.CC_STAT_WIDTH The horizontal size of the bounding box
> - cv2.CC_STAT_HEIGHT The vertical size of the bounding box
> - cv2.CC_STAT_AREA The total area (in pixels) of the connected component

Centroids is a matrix with the x and y locations of each centroid. The row in this matrix corresponds to the label number.

Solution 2 - Python

I have come back here a few times to remember how it works, and each time I end up reducing the above code to:

_, thresh = cv2.threshold(src, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
connectivity = 4  # You need to choose 4 or 8 for connectivity type
# Keyword arguments keep connectivity and ltype from binding to the optional output arrays
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(thresh, connectivity=connectivity, ltype=cv2.CV_32S)

Hopefully, it's useful for everyone :)

Solution 3 - Python

Adding to Zack Knopp's answer: if you are using a grayscale image, you can simply use:

import cv2
import numpy as np

src = cv2.imread("path\\to\\image.png", 0)
binary_map = (src > 0).astype(np.uint8)
connectivity = 4  # or whatever you prefer

# Pass connectivity and ltype by keyword so they are not taken as output arrays
output = cv2.connectedComponentsWithStats(binary_map, connectivity=connectivity, ltype=cv2.CV_32S)

When I tried Zack Knopp's answer on a grayscale image it did not work, and this was my solution.

Content Type	Original Author	Original Content on Stackoverflow
Question	Zack Knopp	View Question on Stackoverflow
Solution 1 - Python	Zack Knopp	View Answer on Stackoverflow
Solution 2 - Python	Dan Erez	View Answer on Stackoverflow
Solution 3 - Python	Barel Levy	View Answer on Stackoverflow
