Enhancing Mammogram-Based Breast Cancer Prediction From Pretrained Vision-Language Models: the Role of Soft Prompts and Bidirectional Fusion

Fareeha Sarwar; Nuno Miguel de Figueiredo Garrido; Margarida Silveira

Publication in conference proceedings

2026 IEEE 23rd International Symposium on Biomedical Imaging (ISBI)

Year (definitive publication)

2026

Language

English

Country

United Kingdom

Web of Science®

This publication is not indexed in Web of Science®

Scopus

This publication is not indexed in Scopus

Google Scholar

Times Cited: 0

(Last checked: 2026-06-09 03:54)

Overton

This publication is not indexed in Overton

Abstract

Recent advances in vision-language models (VLMs) such as CLIP and BLIP have demonstrated strong generalization in visual reasoning tasks. However, their potential for medical image analysis, especially breast cancer prediction from mammograms, remains underexplored. This study investigates how a pretrained VLM can be adapted for whole-mammogram classification. Unlike prior approaches that rely on costly region-of-interest (ROI) annotations, we process entire mammograms and adapt a general-purpose VLM (EVA-CLIP) using soft prompts, selective fine-tuning, and bidirectional fusion strategies. We compare different fusion methods, including Concatenation, Gated-Residual, Cross-Modal, Co-Weighted, and Bi-Attention. Experiments on the CBIS-DDSM dataset show that bidirectional fusion methods consistently outperform the other fusion approaches while providing enhanced explainability through improved attention localization. Results also demonstrate that our adapted general-purpose VLM significantly outperforms a mammography-specific model (Mammo-CLIP) under domain shift, in both zero-shot and linear-probe settings. This suggests that large-scale general-purpose VLMs, when properly adapted, can outperform domain-specific models, reducing the need for extensive annotation and paired image-text training.
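
As an illustration of what bidirectional fusion between image tokens and soft prompts can look like, here is a minimal NumPy sketch. It is not the authors' Bi-Attention module: the function names (`cross_attend`, `bidirectional_fusion`), the token counts, the embedding width, and the use of random arrays in place of encoder outputs and learned prompt vectors are all assumptions made for this example, and the learned projection matrices a real model would use are omitted for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attend(queries, context):
    """Scaled dot-product attention: each query row becomes a weighted mix of context rows."""
    dim = queries.shape[-1]
    weights = softmax(queries @ context.T / np.sqrt(dim), axis=-1)
    return weights @ context, weights


def bidirectional_fusion(image_tokens, prompt_tokens):
    """Hypothetical fusion: image attends to prompts AND prompts attend to image."""
    img_ctx, _ = cross_attend(image_tokens, prompt_tokens)
    txt_ctx, txt_to_img = cross_attend(prompt_tokens, image_tokens)
    # Residual connections keep the original unimodal information.
    img_fused = image_tokens + img_ctx
    txt_fused = prompt_tokens + txt_ctx
    # Mean-pool each stream and concatenate into one feature vector for a classifier head.
    fused = np.concatenate([img_fused.mean(axis=0), txt_fused.mean(axis=0)])
    return fused, txt_to_img


# Stand-ins for real encoder outputs (assumed shapes, random values).
rng = np.random.default_rng(0)
image_tokens = rng.normal(size=(49, 64))   # e.g. a 7x7 grid of patch embeddings
prompt_tokens = rng.normal(size=(8, 64))   # 8 soft prompt vectors (trainable in practice)

fused, prompt_to_patch_attention = bidirectional_fusion(image_tokens, prompt_tokens)
print(fused.shape, prompt_to_patch_attention.shape)
```

In this sketch the prompt-to-patch attention matrix has one row per prompt vector and one column per image patch, so reshaping a row to the patch grid gives a coarse map of where that prompt attends. That is the general kind of signal an attention-localization analysis could inspect.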

Acknowledgements

This work was supported by LARSyS FCT funding (DOI: 10.54499/LA/P/0083/2020, 10.54499/UIDP/50009/2020, 10.54499/UIDB/50009/2020). F. Sarwar gratefully acknowledges the invaluable support of ISCTE-IUL and Instituto de Telecomunicações.

Keywords

Vision-Language Models, Fusion Techniques, Breast Cancer Prediction, Multimodal Learning

Fields of Science and Technology Classification

Computer and Information Sciences - Natural Sciences
Electrical Engineering, Electronic Engineering, Information Engineering - Engineering and Technology

Publication Identifiers

DOI (source: author)	10.1109/ISBI61048.2026.11515820
Ciência_Iscte ID	ci-pub-118478

Other Publication Details

Online Publication Year	2026
Publisher	IEEE
Indexes	--
ISSN	1945-7928 (print) 1945-8452 (online)
ISBN	979-8-3315-7764-3 (print) 979-8-3315-7763-6 (online)
Pages	1 - 5	Total Pages	--
Peer Reviewed	Yes
Dissemination Mean	Both (printed and digital)
Editors
Event Title	2026 IEEE 23rd International Symposium on Biomedical Imaging (ISBI)
Event Organizer
City	London, United Kingdom
Event Type	Conference
Event Classification	International
Event Year	2026
Event Publication Type	Poster