Export Publication

The publication can be exported in the following formats: APA (American Psychological Association) reference, IEEE (Institute of Electrical and Electronics Engineers) reference, BibTeX, and RIS.

Export Reference (APA)

Maqsood, R., Nunes, P., Conti, C., & Soares, L. D. (2026). Deep spatio-temporal and frequency guided fusion network for event-to-video reconstruction. IEEE Open Journal of Signal Processing, 7, 541-550.

Export Reference (IEEE)

R. Maqsood et al., "Deep spatio-temporal and frequency guided fusion network for event-to-video reconstruction," IEEE Open Journal of Signal Processing, vol. 7, pp. 541-550, 2026.

Export BibTeX

@article{maqsood2026_1784820834189,
	author = "Maqsood, R. and Nunes, P. and Conti, C. and Soares, L. D.",
	title = "Deep spatio-temporal and frequency guided fusion network for event-to-video reconstruction",
	journal = "IEEE Open Journal of Signal Processing",
	year = "2026",
	volume = "7",
	number = "",
	doi = "10.1109/OJSP.2026.3693230",
	pages = "541--550",
	url = "https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=8782710"
}

Export RIS

TY - JOUR
TI - Deep spatio-temporal and frequency guided fusion network for event-to-video reconstruction
T2 - IEEE Open Journal of Signal Processing
VL - 7
AU - Maqsood, R.
AU - Nunes, P.
AU - Conti, C.
AU - Soares, L. D.
PY - 2026
SP - 541
EP - 550
SN - 2644-1322
DO - 10.1109/OJSP.2026.3693230
UR - https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=8782710
AB - Event-to-video (E2V) reconstruction has gained significant attention recently for its advantages in enabling high dynamic range and fast motion capture capabilities. However, event data encodes only relative brightness changes, lacking the absolute intensity information necessary for accurate reconstruction. Recent methods incorporate previously reconstructed images to provide intensity references but process them in the spatial domain where low- and high-frequency components are highly coupled. This spatial processing typically leads to the degradation of fine details and introduces artifacts such as over-smoothing, blurring and low contrast reconstruction. To address this, we propose a deep spatio-temporal and frequency guided fusion network for E2V reconstruction (DSTFN-E2V), featuring a dual-path architecture with two key components: i) a prior frequency decomposition module (PFDM), and ii) a spatio-temporal event-driven feature extraction module (STEM). The PFDM decouples low- and high-frequency information from previously reconstructed images and current event voxel grid via a 2D discrete wavelet transform, processing the low-frequency subband through residual blocks to preserve structural coherence and intensity references, while an edge-detail refinement module (ERM) enhances edge and texture details from high-frequency subbands. The frequency-specific features from PFDM and the spatio-temporal features from STEM are then integrated through the proposed event-image fusion blocks (EIFBs) that apply cross-attention across three encoder stages, enabling simultaneous structural preservation and detail recovery. Experiments on four real-world datasets demonstrate that DSTFN-E2V achieves state-of-the-art results with 12% SSIM improvements while being 50% faster than recent attention-based methods, with superior edge fidelity and reduced artifacts.
ER -