Vim Visual Model Example

AI Decodes Visual Brain Activity—and Writes Captions for It

A non-invasive imaging technique can translate scenes in your head into sentences. It could help to reveal how the brain ...

IEEE

Task-aware Cross-modal Feature Refinement Transformer with Large Language Models for Visual Grounding

Abstract: The goal of visual grounding is to establish connections between target objects and textual descriptions. Large Language Models (LLMs) have demonstrated strong comprehension abilities across ...

IEEE

GroupMamba: Efficient Group-Based Visual State Space Model

Abstract: State-Space models (SSMs) have recently shown promise in capturing long-range dependencies with subquadratic computational complexity, making them attractive for various applications.
