Export Publication

The publication can be exported in the following formats: APA (American Psychological Association) reference, IEEE (Institute of Electrical and Electronics Engineers) reference, BibTeX, and RIS.

Export Reference (APA)
Sarwar, F., Garrido, N. & Silveira, M. (2026). Enhancing Mammogram-Based Breast Cancer Prediction From Pretrained Vision-Language Models: the Role of Soft Prompts and Bidirectional Fusion. In 2026 IEEE 23rd International Symposium on Biomedical Imaging (ISBI) (pp. 1-5). London, United Kingdom: IEEE.
Export Reference (IEEE)
F. Sarwar et al., "Enhancing Mammogram-Based Breast Cancer Prediction From Pretrained Vision-Language Models: the Role of Soft Prompts and Bidirectional Fusion," in 2026 IEEE 23rd Int. Symp. on Biomedical Imaging (ISBI), London, United Kingdom, IEEE, 2026, pp. 1-5.
Export BibTeX
@inproceedings{sarwar2026_1781332698095,
	author = "Sarwar, F. and Garrido, N. and Silveira, Margarida",
	title = "Enhancing Mammogram-Based Breast Cancer Prediction From Pretrained Vision-Language Models: the Role of Soft Prompts and Bidirectional Fusion",
	booktitle = "2026 IEEE 23rd International Symposium on Biomedical Imaging (ISBI)",
	year = "2026",
	editor = "",
	volume = "",
	number = "",
	series = "",
	doi = "10.1109/ISBI61048.2026.11515820",
	pages = "1--5",
	publisher = "IEEE",
	address = "London, United Kingdom",
	organization = "",
	url = "https://ieeexplore.ieee.org/document/11515820"
}
Export RIS
TY  - CPAPER
TI  - Enhancing Mammogram-Based Breast Cancer Prediction From Pretrained Vision-Language Models: the Role of Soft Prompts and Bidirectional Fusion
T2  - 2026 IEEE 23rd International Symposium on Biomedical Imaging (ISBI)
AU  - Sarwar, F.
AU  - Garrido, N.
AU  - Silveira, Margarida
PY  - 2026
SP  - 1
EP  - 5
SN  - 1945-7928
DO  - 10.1109/ISBI61048.2026.11515820
CY  - London, United Kingdom
UR  - https://ieeexplore.ieee.org/document/11515820
AB  - Recent advances in vision-language models (VLMs) such as CLIP and BLIP have demonstrated strong generalization in visual reasoning tasks. However, their potential for medical image analysis, especially breast cancer prediction from mammograms, remains underexplored. This study investigates how a pretrained VLM can be adapted for full mammographic classification. Unlike prior approaches that rely on costly region-of-interest (ROI) annotations, we process entire mammograms and adapt a general-purpose VLM (EVA-CLIP) using soft prompts, selective fine-tuning, and bidirectional fusion strategies. We compare different fusion methods, including Concatenation, Gated-Residual, Cross-Modal, Co-Weighted and Bi-Attention. Experiments on the CBIS-DDSM dataset show that bidirectional fusion methods consistently outperform other fusion approaches, while providing enhanced explainability through improved attention localization. Results also demonstrate that our adapted general-purpose VLM significantly outperforms a mammography-specific model (Mammo-CLIP), under domain-shift, in both zero-shot and linear-probe settings. This suggests that large-scale general-purpose VLMs, when properly adapted, can outperform domain-specific models, reducing the need for extensive annotation and paired image-text training.
ER  -