How to use OpenCV's connectedComponentsWithStats in Python?
Python Problem Overview
I am looking for an example of how to use OpenCV's connectedComponentsWithStats() function in Python. Note that it is only available in OpenCV 3 or newer. The official documentation only shows the C++ API, even though the function exists when OpenCV is compiled for Python. I could not find a Python example anywhere online.
Python Solutions
Solution 1 - Python
The function works as follows:
# Import the cv2 library
import cv2
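# Read the image you want connected components of as single-channel grayscale
# (Otsu thresholding requires an 8-bit single-channel image)
src = cv2.imread('/directorypath/image.bmp', cv2.IMREAD_GRAYSCALE)
# Threshold it so it becomes binary
ret, thresh = cv2.threshold(src, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)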
# You need to choose 4 or 8 for connectivity type
connectivity = 4
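# Perform the operation
# (connectivity and ltype are passed by keyword; positionally, the slots after
# the image are the optional labels and stats output arrays)
output = cv2.connectedComponentsWithStats(thresh, connectivity=connectivity, ltype=cv2.CV_32S)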
# Get the results
# The first cell is the number of labels
num_labels = output[0]
# The second cell is the label matrix
labels = output[1]
# The third cell is the stat matrix
stats = output[2]
# The fourth cell is the centroid matrix
centroids = output[3]
Labels is a matrix the size of the input image where each element has a value equal to its label.
Stats is a matrix of the statistics that the function calculates. It has one row per label and one column per statistic. The columns are described in the OpenCV documentation:
> Statistics output for each label, including the background label, see below for available statistics. Statistics are accessed via stats[label, COLUMN] where available columns are defined below.
>
> - cv2.CC_STAT_LEFT The leftmost (x) coordinate which is the inclusive start of the bounding box in the horizontal direction.
> - cv2.CC_STAT_TOP The topmost (y) coordinate which is the inclusive start of the bounding box in the vertical direction.
> - cv2.CC_STAT_WIDTH The horizontal size of the bounding box
> - cv2.CC_STAT_HEIGHT The vertical size of the bounding box
> - cv2.CC_STAT_AREA The total area (in pixels) of the connected component
Centroids is a matrix with the x and y locations of each centroid. The row in this matrix corresponds to the label number.
Solution 2 - Python
I have come here a few times to remember how it works, and each time I have to reduce the above code to:
_, thresh = cv2.threshold(src,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
connectivity = 4 # You need to choose 4 or 8 for connectivity type
num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(thresh, connectivity=connectivity, ltype=cv2.CV_32S)
Hopefully, it's useful for everyone :)
Solution 3 - Python
Adding to Zack Knopp's answer: if you are using a grayscale image, you can simply use:
import cv2
import numpy as np
src = cv2.imread("path\\to\\image.png", 0)
binary_map = (src > 0).astype(np.uint8)
connectivity = 4 # or whatever you prefer
output = cv2.connectedComponentsWithStats(binary_map, connectivity=connectivity, ltype=cv2.CV_32S)
When I tried Zack Knopp's answer on a grayscale image it didn't work, and this was my solution.