Shaking Up VLMs: Comparing Transformers and Structured State Space Models for Vision & Language Modeling

Jan 1, 2024 · Georgios Pantazopoulos, Malvina Nikandrou, Alessandro Suglia, Oliver Lemon, Arash Eshghi
Type
Publication: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024, Miami, FL, USA, November 12-16, 2024