A training-free framework that converts SAM3 into a real-time multi-class open-vocabulary detector. It achieves 55.8 AP on COCO val2017 (80 classes) and runs at 15.8 FPS (measured with 4 classes at 1008px) on a single RTX 4080.
Abstract: Recent real-time detection transformers (DETRs) have gained popularity due to their simplicity and efficiency. However, these detectors do not explicitly model object rotation, especially in ...