|
MaskInversion: Localized Embeddings via Optimization of Explainability Maps
Walid Bousselham, Sofian Chaybouti, Christian Rupprecht, Vittorio Ferrari, Hilde Kuehne
arXiv, 2024
Project Page / Code / arXiv
|
|
LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity
Walid Bousselham, Angie Boggust, Sofian Chaybouti, Hendrik Strobelt, Hilde Kuehne
arXiv, 2024
Project Page / Code / arXiv / Demo
|
|
Grounding Everything: Emerging Localization Properties in Vision-Language Transformers
Walid Bousselham, Felix Petersen, Vittorio Ferrari, Hilde Kuehne
CVPR, 2024
Code / arXiv / Demo
|
|
Learning Situation Hyper-Graphs for Video Question Answering
Aisha Urooj, Hilde Kuehne, Bo Wu, Kim Chheu, Walid Bousselham, Chuang Gan, Niels Lobo, Mubarak Shah
CVPR, 2023
Code / arXiv
|
|
Efficient Self-Ensemble for Semantic Segmentation
Walid Bousselham, Guillaume Thibault, Lucas Pagano, Archana Machireddy, Joe Gray, Young Hwan Chang, Xubo Song
BMVC, 2022
Code / arXiv / Video
|
|
MaskInversion
A library for generating localized embeddings from CLIP-like models by optimizing explainability maps.
pip install maskinversion_torch
GitHub / PyPI
|
|
LeGrad
An explainability method for Vision Transformers that, given a text prompt, generates a heatmap highlighting the image regions most important for the model's recognition of that prompt.
pip install legrad_torch
GitHub / PyPI
|
|
GEM (Grounding Everything Method)
A library for exploring emerging localization properties in Vision-Language Transformers.
pip install gem_torch
GitHub / PyPI
|
|