Abstract: Object detection is a fundamental task in computer vision that involves accurately locating and classifying objects within images or video frames. In remote sensing, this task is ...
Rex-Omni is a 3B-parameter Multimodal Large Language Model (MLLM) that redefines object detection and a wide range of other visual perception tasks as a simple next-token prediction problem.
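To make "detection as next-token prediction" concrete, here is a minimal sketch of one common way to express a bounding box as discrete tokens that a language model could emit one at a time. The bin count (`NUM_BINS = 1000`), the `<123>` token format, and both function names are illustrative assumptions, not details confirmed by the description above.

```python
from typing import List, Sequence, Tuple

NUM_BINS = 1000  # assumed number of quantization bins per axis (hypothetical)


def box_to_tokens(box: Sequence[float], width: float, height: float,
                  num_bins: int = NUM_BINS) -> List[str]:
    """Quantize a pixel-space box (x0, y0, x1, y1) into four coordinate tokens."""
    extents = (width, height, width, height)
    tokens = []
    for value, extent in zip(box, extents):
        # Map the coordinate to a bin index, then clamp to the valid range
        # so that a coordinate on the image border still gets a legal token.
        idx = int(value * num_bins / extent)
        idx = min(max(idx, 0), num_bins - 1)
        tokens.append(f"<{idx}>")  # hypothetical token spelling
    return tokens


def tokens_to_box(tokens: Sequence[str], width: float, height: float,
                  num_bins: int = NUM_BINS) -> Tuple[float, ...]:
    """Decode four coordinate tokens back to pixel space (bin centres)."""
    extents = (width, height, width, height)
    bins = [int(t.strip("<>")) for t in tokens]
    return tuple((b + 0.5) / num_bins * e for b, e in zip(bins, extents))


if __name__ == "__main__":
    toks = box_to_tokens((100, 50, 300, 200), width=1000, height=500)
    print(toks)
    print(tokens_to_box(toks, width=1000, height=500))
```

Under this scheme a detection is just a short token sequence (a class phrase followed by four coordinate tokens), so the usual next-token training objective applies. The cost is quantization error of at most one bin width per coordinate.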
Abstract: Source-Free Object Detection (SFOD) enables knowledge transfer from a source domain to an unsupervised target domain for object detection without access to source data. Most existing SFOD ...