Neatly stacked turnover boxes: 3D matching or deep learning?

When turnover boxes are neatly stacked, the surface point cloud is of good quality. However, because the boxes sit so close together, their edges are hard to extract, and the matching result is occasionally misaligned. How can this be resolved?
In such scenarios, would it be better to segment the boxes with a deep learning model first and then perform matching?

There are two approaches to try:

Approach 1: First, try the new version of the 3D edge extraction algorithm. If the extracted edge features are enough to constrain the degrees of freedom of the turnover box, full-scene edge matching can be used.

Approach 2: If stable 3D edge features still cannot be obtained, deep learning is required to segment the boxes before matching.
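
The choice between the two options above can be sketched as a simple check on how many edge points are actually available. The snippet below is only an illustration, not the vendor's edge extraction algorithm: it works on a depth map (metres) rather than a full point cloud, and the names `edge_ratios` and `choose_pipeline` as well as the thresholds `jump` and `min_ratio` are assumptions made up for this example. It requires depth jumps in both image directions, because edges running in only one direction leave the box free to slide along them.

```python
import numpy as np

def edge_ratios(depth, jump=0.005):
    """Fraction of pixels with a depth jump larger than `jump` along x and along y.

    `jump` is a hypothetical threshold in metres; tune it for the real sensor.
    """
    dz_x = np.abs(np.diff(depth, axis=1, prepend=depth[:, :1]))
    dz_y = np.abs(np.diff(depth, axis=0, prepend=depth[:1]))
    return float((dz_x > jump).mean()), float((dz_y > jump).mean())

def choose_pipeline(depth, min_ratio=0.01, jump=0.005):
    """Pick edge matching only if edges exist in both directions.

    `min_ratio` is an assumed, crude proxy for "enough edge features".
    """
    rx, ry = edge_ratios(depth, jump)
    if rx >= min_ratio and ry >= min_ratio:
        return "edge_matching"
    return "segment_then_match"

# Synthetic top view: a flat surface at 1.0 m, optionally with 5 cm deep gaps.
flat = np.full((100, 100), 1.0)          # boxes touching, no visible gaps

gaps_both = flat.copy()
gaps_both[:, 50] = 1.05                  # gap running along y
gaps_both[50, :] = 1.05                  # gap running along x

gap_one_way = flat.copy()
gap_one_way[:, 50] = 1.05                # only one gap direction is visible

for name, depth in [("flat", flat), ("gaps_both", gaps_both), ("gap_one_way", gap_one_way)]:
    print(name, choose_pipeline(depth))
```

In practice the check would run on the output of the real edge extraction step, and the fallback branch would hand the scene to the segmentation model before matching each box.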