Questionary Module Python Demo

A Robust Feature Downsampling Module for Remote-Sensing Visual Tasks

Abstract: Remote-sensing (RS) images present unique challenges for computer vision (CV) due to lower resolution, smaller objects, and fewer features. Mainstream backbone networks show promising ...

GitHub

Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

This is the repo for the Video-LLaMA project, which aims to give large language models video and audio understanding capabilities. Video-LLaMA is built on top of BLIP-2 and MiniGPT-4.
