Currently, Mech-Vision does not have a built-in Step specifically for merging 2D color images the way Merge Point Cloud works for point clouds.
However, here are a few approaches for your depalletizing deep learning use case:
Single camera, single position: If the camera's field of view covers the entire bin at your working distance, a single capture is the simplest approach.
Use the merged point cloud for DL: If your Mech-DLK model can take point cloud data as input, you can merge the point clouds from the two captures and run the DL model on the merged 3D data.
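Custom Script Step: You could use OpenCV in a Custom Script Step to stitch the two images together. This requires knowing how the two camera positions relate to each other (for example, from calibration), so that pixels from one image can be mapped into the other.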
Could you share more details about your setup (working distance, bin size, why a single capture is insufficient)? This will help us suggest the best approach.