Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling

Jan 1, 2024
Georgios Pantazopoulos, Malvina Nikandrou, Alessandro Suglia, Oliver Lemon, Arash Eshghi
Type
Publication
CoRR