For large single-layer picking projects, such as picking car windows and doors, is it possible to skip Mech-Viz path planning?

Can Mech-Viz be avoided when using the Standard Interface and taught points on the robot?

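Yes. With the Standard Interface, the robot can request the vision targets directly and move to them using its own taught points, so Mech-Viz is not required. The command to use is Get Vision Targets: https://docs.mech-mind.net/en/robot-integration/latest/standard-interface-development-manual/tcp-ip-socket.html#tcpip-get-vision-targets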
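
To try the command from a PC before writing the robot program, something like the rough Python sketch below may help. It is not taken from the manual. It assumes the interface exchanges comma-separated ASCII text, that a reply starts with the echoed command code followed by a status code, and that Get Vision Targets is sent as command code 102 with a project number. The helper names (`build_command`, `parse_reply`, `send_command`) are made up for illustration. Please check the command code, parameters, and reply layout against the linked page.

```python
import socket


def build_command(code, *params):
    """Join a command code and its parameters into one comma-separated ASCII string."""
    return ",".join(str(p) for p in (code, *params))


def parse_reply(text):
    """Split a comma-separated reply into (command_code, status_code, payload).

    Assumption: field 0 is the echoed command code and field 1 is a status
    code. Everything after that is returned as a flat list of floats; how
    those numbers map to poses and labels must be taken from the manual.
    """
    fields = [f.strip() for f in text.strip().strip(",").split(",")]
    fields = [f for f in fields if f]
    command_code = int(fields[0])
    status_code = int(fields[1])
    payload = [float(f) for f in fields[2:]]
    return command_code, status_code, payload


def send_command(host, port, command, timeout=5.0):
    """Open a TCP connection, send one command, and return the raw reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        return sock.recv(4096).decode("ascii")


# Assumed command code for Get Vision Targets; verify it in the linked manual.
GET_VISION_TARGETS = 102
```

For example, `send_command(host, port, build_command(GET_VISION_TARGETS, 1))` would request the targets of project 1, where `host` and `port` are whatever is configured for the interface on your vision PC, and `parse_reply` would then split the answer. The robot program then moves to the returned targets with its own taught approach and departure points.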