How is 2D deep learning combined with 3D matching?

Why do we need deep learning?

Normally, when the workpieces are placed in an orderly manner in the bins, the Mech-Mind Vision System uses 3D Matching to locate these workpieces.

However, when the workpieces are relatively small and are placed randomly in the bins with overlap, the vision system may not be able to recognize them accurately with 3D Matching alone. Moreover, 3D models with a relatively large surface may lead to missed recognitions and misrecognitions. This issue is common for workpieces with flat surfaces.

How is deep learning (2D) combined with 3D matching?

Deep learning can greatly enhance recognition accuracy. First, deep learning is used to locate the bounding box of each workpiece (as shown in the left figure below) or to segment the workpieces (as shown in the right figure below). Using the bounding boxes or segmentation results as references, 3D matching is then applied to recognize the workpieces.
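The idea of using a bounding box as a reference for 3D matching can be sketched as follows. This is only an illustration, not the Mech-Mind implementation: it assumes an organized point cloud stored as an (H, W, 3) NumPy array that is aligned pixel for pixel with the 2D image, and the function name crop_cloud_by_box is hypothetical. The points inside the box are the ones that would be passed on to a 3D matching step.

```python
import numpy as np

def crop_cloud_by_box(cloud, box):
    """Keep only the 3D points that fall inside a 2D bounding box.

    cloud: (H, W, 3) organized point cloud, assumed to be aligned
           pixel for pixel with the 2D image (an assumption of this sketch).
    box:   (x_min, y_min, x_max, y_max) in pixels, max values exclusive.
    Returns an (N, 3) array of valid points inside the box.
    """
    x_min, y_min, x_max, y_max = box
    h, w, _ = cloud.shape
    # Clamp the box to the image so slicing never wraps around.
    x_min, x_max = max(0, x_min), min(w, x_max)
    y_min, y_max = max(0, y_min), min(h, y_max)
    region = cloud[y_min:y_max, x_min:x_max].reshape(-1, 3)
    # Drop pixels without a valid depth measurement (NaN or inf).
    valid = np.isfinite(region).all(axis=1)
    return region[valid]
```

Restricting the search to one box at a time means the matching step only sees points that belong (mostly) to a single workpiece, which is why the combination helps with small or overlapping parts.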

In summary, we recommend combining deep learning and 3D matching when any of the following conditions are met:
(1) The size of the workpieces is small.
(2) The workpieces are randomly placed in the bin.
(3) The workpieces are thin and have flat surfaces.

What are good images for deep learning model training?

Before a deep learning model can be used, 20-50 images are needed to train it. These images should meet the following requirements:

(1) Each image should cover the entire bin, with a gap of at least 50 mm between the edges of the bin and the edges of the camera’s field of view (FOV). The following image is a bad example, because the edges of the bin are too close to the edges of the camera’s FOV.


(2) In actual production, the number of workpieces inside a bin varies as the bin goes from full to empty. Therefore, to reflect real project conditions, the camera should capture images of the bin with varying quantities of workpieces in different positions. This variability helps the deep learning models learn to recognize workpieces under different scenarios and conditions.
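Requirement (1) can be checked roughly before sending images for training. The sketch below is an illustration under stated assumptions: the bin's position in the image (bin_box, in pixels) and the scale mm_per_pixel at the height of the bin's top edge are values you measure yourself, and both function names are hypothetical rather than part of any Mech-Mind software.

```python
def fov_margins_mm(image_size, bin_box, mm_per_pixel):
    """Distance in mm from each bin edge to the matching image edge.

    image_size:   (width, height) of the image in pixels.
    bin_box:      (x_min, y_min, x_max, y_max) of the bin in pixels.
    mm_per_pixel: assumed scale at the bin's top edge (user-measured).
    """
    width, height = image_size
    x_min, y_min, x_max, y_max = bin_box
    return {
        "left": x_min * mm_per_pixel,
        "top": y_min * mm_per_pixel,
        "right": (width - x_max) * mm_per_pixel,
        "bottom": (height - y_max) * mm_per_pixel,
    }

def meets_margin(image_size, bin_box, mm_per_pixel, min_margin_mm=50.0):
    """True if every side keeps at least min_margin_mm of clearance."""
    margins = fov_margins_mm(image_size, bin_box, mm_per_pixel)
    return all(m >= min_margin_mm for m in margins.values())
```

For example, meets_margin((1000, 800), (100, 100, 900, 700), 0.5) checks a bin that sits 100 pixels from every image edge at an assumed scale of 0.5 mm per pixel.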

How does the Mech-Mind Vision System train the deep learning models?

The Mech-Mind Vision System currently offers two methods to train the deep learning models:
(1) Save the images and send them to Mech-Mind engineers.
In this approach, customers provide the images, and Mech-Mind engineers handle the entire process of labeling, training, and validating the deep learning models.

(2) Mech-DLK (Deep Learning Kit)
Mech-Mind provides an offline training platform called Mech-DLK for customers whose companies enforce strict information security policies. Mech-DLK allows customers to perform training on their own premises, ensuring the confidentiality of workpiece images.

Customers can choose the training method that best fits their specific requirements and privacy policies.

Mech-DLK (Deep Learning Kit by Mech-Mind)

After customers complete the labeling, training, and validation of the deep learning models on their own computers, the models can be deployed directly in the Mech-Mind Vision System.