Turnover Box Project: How to use image thresholding to extract masks for individual turnover boxes?

This project handles two types of incoming turnover boxes: open-top turnover boxes and lidded turnover boxes. The point cloud quality is consistently good.

Visual recognition requirements

Identify the turnover box model and calculate the pick point.

Previous approach

There are usually significant gaps between turnover boxes, so we used multiple model edge matching to directly recognize turnover boxes of different specifications and calculate the pick points. However, because the turnover boxes have similar dimensions, the multiple model matching could produce incorrect specification labels, which made the robot use the wrong gripper.

Currently planned approach

We plan to use an object detection model to identify the specification of each individual turnover box. However, the prerequisite is to first obtain a mask for each individual turnover box.

Current challenges

After matching, using the Step “Extract Point Cloud in 3D Box” provides a complete point cloud of each turnover box.

When projecting the 3D point cloud into a 2D image and applying only morphological transformations, the result depends heavily on the dilation kernel size. If the kernel is too small, the mask obtained for an open-top turnover box is incomplete, because the empty interior is not filled. If the kernel is too large, the mask extends well beyond the turnover box.
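The trade-off can be illustrated with a small sketch. It is not taken from the project: the 21x21 toy image, the one-pixel-wide box rim, and the helper names `dilate` and `dilation_tradeoff` are all assumptions made for illustration, and NumPy stands in for whatever morphology step is actually used.

```python
import numpy as np

def dilate(mask, radius):
    """Binary dilation with a square (2*radius+1) kernel (illustrative helper)."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    # OR together every shifted copy of the mask inside the kernel window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def dilation_tradeoff(ring, footprint, radius):
    """Count box pixels still missing and pixels added outside the box."""
    dilated = dilate(ring, radius)
    missing = int(np.count_nonzero(footprint & ~dilated))
    extra = int(np.count_nonzero(dilated & ~footprint))
    return missing, extra

# Hypothetical projection of an open-top box: only the rim has points.
footprint = np.zeros((21, 21), dtype=bool)
footprint[5:16, 5:16] = True          # true extent of the box (11x11)
ring = footprint.copy()
ring[6:15, 6:15] = False              # empty interior (9x9), rim is 1 px wide

for radius in (1, 5):
    missing, extra = dilation_tradeoff(ring, footprint, radius)
    print(f"radius={radius}: missing={missing}, extra={extra}")
```

In this toy case a radius of 1 leaves 49 interior pixels uncovered, while a radius of 5 closes the interior but adds 320 pixels outside the box, which mirrors the too-small and too-large cases described above.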

How can we obtain an appropriate complete mask for open-top/lidded turnover boxes without adding a separate instance segmentation model, thus allowing for the extraction of a color image for object detection?

Left: Point clouds of individual turnover boxes
Right: Color image of turnover boxes

After projecting the 3D point cloud into a 2D image, perform two different image thresholding processes on the masks.

  1. Image Thresholding (1): Choose the fixed threshold mode and generate a mask image for intensities lower than the threshold, with the threshold set at 128. Then, perform clustering to obtain a mask of everything except the turnover box.

  2. Image Thresholding (2): Choose a fixed threshold mode and generate a mask image for intensities lower than the threshold, with the threshold set at 255. This will produce a complete mask.

  3. Perform a logical XOR operation on the two masks above to obtain the mask for an individual turnover box.

  4. Dilate the mask for the individual turnover box by a suitable kernel size and use the dilated mask to extract the corresponding region from the color image. This yields the color image for an individual turnover box.
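The four steps above can be sketched outside the vision software as follows. This is only one reading of the steps, not the actual Step implementation. It assumes the projected image is 8-bit with bright pixels where points landed, that the clustering keeps the largest connected region (taken here to be the area outside the box), and that the second threshold yields a mask covering every pixel. All function names are hypothetical.

```python
from collections import deque
import numpy as np

def largest_component(mask):
    """Keep only the largest 4-connected True region (stand-in for clustering)."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    best = np.zeros((h, w), dtype=bool)
    best_size = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if seen[sy, sx]:
            continue
        seen[sy, sx] = True
        comp = [(sy, sx)]
        queue = deque(comp)
        while queue:  # breadth-first flood fill of one region
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    queue.append((ny, nx))
                    comp.append((ny, nx))
        if len(comp) > best_size:
            best_size = len(comp)
            best = np.zeros((h, w), dtype=bool)
            ys, xs = zip(*comp)
            best[list(ys), list(xs)] = True
    return best

def dilate(mask, radius):
    """Binary dilation with a square (2*radius+1) kernel."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def box_mask(projected, dilate_radius=1):
    outside = largest_component(projected < 128)  # step 1: dark pixels, outer region only
    full = projected <= 255                       # step 2: assumed to cover every pixel
    box = np.logical_xor(outside, full)           # step 3: what remains is the filled box
    return dilate(box, dilate_radius)             # step 4: small safety margin

def crop_color(color, mask):
    """Black out every color pixel that lies outside the mask."""
    out = color.copy()
    out[~mask] = 0
    return out

# Toy data: a 9x9 projection with a 5x5 open-top box rim, plus a flat gray color image.
projected = np.zeros((9, 9), dtype=np.uint8)
projected[2:7, 2:7] = 255
projected[3:6, 3:6] = 0
color = np.full((9, 9, 3), 200, dtype=np.uint8)
mask = box_mask(projected, dilate_radius=1)
print(mask.sum(), crop_color(color, mask)[0, 0], crop_color(color, mask)[4, 4])
```

Because the empty interior of an open-top box is not connected to the outer region, it is dropped by the clustering and ends up inside the XOR result, so only a small dilation is needed afterwards. Keeping the largest region only works while the area outside the box is larger than the box interior; otherwise a different selection rule would be needed.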