I am currently working as an Assistant Professor at Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar. Previously, I was a Post-Doctoral Fellow with the Image Processing and Computer Vision (IPCV) Lab, Department of Electrical Engineering, IIT Madras, under the supervision of Prof. A. N. Rajagopalan. I received my PhD from the School of Computing and Electrical Engineering, IIT Mandi, under the supervision of Dr. Anil K. Sao. The title of my PhD thesis is "Novel Approaches for Super Resolution of Intensity/Range Image using Sparse Representation".
UWOT-Net: Underwater Object Tracking by Attention Driven Network in Unconstrained Marine Environments Prerana Mukherjee, Srimanta Mandal, Ajay Pediredla Proceedings of the National Conference on Communications (NCC), 2025 Understanding and monitoring underwater ecosystems is crucial for ecological conservation, biodiversity studies, and fisheries management. However, tracking marine life, particularly fish, in uncontrolled underwater environments presents significant challenges due to appearance variations, motion complexities, and spatial distortions. In this work, we propose an attention-driven network, namely UWOT-Net, for robust single and multiple underwater object tracking. UWOT-Net integrates an image enhancement module with a transformer-based tracking architecture. The image enhancement module helps in addressing underwater degradation. Unlike conventional methods, UWOT-Net utilizes an encoder-decoder transformer framework to enable accurate data association and trajectory prediction across video sequences through a frame-to-frame set prediction mechanism. The transformer framework enables global context understanding and identity preservation without explicit modeling of motion and appearance cues. Experimental evaluations across the BrackishMOT, UTB180, and GMOT40 datasets showcase the efficacy of UWOT-Net, which attains state-of-the-art performance in Multiple Object Tracking Accuracy (MOTA) while minimizing ID switches. Notably, our model achieves a 100% MOTA score for 8 sequences and surpasses 80% MOTA for 13 sequences, with a remarkable increase of over 100% MOTA for the Brackish dataset. Moreover, UWOT-Net exhibits adaptability by successfully tracking categories beyond fish, indicating its potential for diverse marine and generic underwater object tracking applications.
Contrastive Attention-Based Network for Self-Supervised Point Cloud Completion Seema Kumari, Preyum Kumar, Srimanta Mandal, Shanmuganathan Raman IEEE Signal Processing Letters, 2025 Point cloud completion aims to reconstruct complete 3D shapes from partial observations, often requiring multiple views or complete data for training. In this paper, we propose an attention-driven, self-supervised autoencoder network that completes 3D point clouds from a single partial observation. Multi-head self-attention captures robust contextual relationships, while residual connections in the autoencoder enhance geometric feature learning. In addition to this, we incorporate a contrastive learning-based loss, which encourages the network to better distinguish structural patterns even in highly incomplete observations. Experimental results on benchmark datasets demonstrate that the proposed approach achieves state-of-the-art performance in single-view point cloud completion.
Transformer Augmented Multi-Resolution Hash Encoding in Diffusion Model for 3D Point Cloud Denoising Seema Kumari, Utkarsh Mishra, Srimanta Mandal, Shanmuganathan Raman Proceedings of the International Conference on Image Processing (ICIP), 2025 3D point cloud denoising aims to remove noise from noisy data. Existing methods address the problem by estimating point-wise displacement from the point feature or by learning the distribution of noise. In this paper, we propose to embed the point cloud through a novel multi-resolution hash encoding, and utilize the embedding to learn an optimal transport plan between a noisy point cloud and its corresponding clean point cloud via a transformer encoder and a shared-MLP based decoder. The multi-resolution hash encoding uses hierarchical hash-based representations to efficiently capture geometric details at multiple resolutions. Hence, it enables removal of noise while encoding global-to-local structural details. The transformer encoder further improves the model’s ability to learn long-range dependencies and contextual relationships, facilitating improved denoising performance. The optimal transport plan is devised by simulating a denoising diffusion probabilistic model through the Schrödinger bridge problem. Extensive experiments show that the proposed method improves on state-of-the-art methods and offers new insights into the synergy between hash encoding and transformer architectures in the diffusion framework.
Attentions in Deep Framework to Enhance Images Degraded by Non-Homogeneous Haze Akash Dhedhi, Srimanta Mandal, Rajib Lochan Das 2023 IEEE 20th India Council International Conference (INDICON), 2023 The availability of dehazing datasets has enabled various deep learning techniques to perform effectively on hazy images. Most of the developed frameworks focus on removing homogeneous haze. However, homogeneous-centric methods produce sub-optimal results on non-homogeneous haze. The primary reason is that the architectures devised to handle homogeneous haze fail to address the non-uniformity of haze in the non-homogeneous case. The secondary reason is the unavailability of enough data for the non-homogeneous scenario. Although many works cite the lack of data as a primary concern for poor performance, we find that the results are sub-standard even if the homogeneous-centric networks are trained with non-homogeneous data. Hence, there is a need for a network architecture that can handle non-homogeneous haze better. In this work, we propose to use multiple attention mechanisms in parallel along with pre-trained ConvNeXt blocks. Specifically, we use pixel, channel, and residual channel attention mechanisms. Pixel attention can complement channel attention in dealing with space-variant haze when connected in parallel. On the other hand, residual channel attention fetches hazy image-related features and caters to better information flow toward the output. By concatenating the attention-based features, the proposed method yields better results than the existing approaches.
Feature Selection Empowered BERT for Detection of Hate Speech with Vocabulary Augmentation PN Desai, T Kewalramani, S Mandal arXiv preprint arXiv:2512.02141, 2025
Contrastive Attention-Based Network for Self-Supervised Point Cloud Completion S Kumari, P Kumar, S Mandal, S Raman IEEE Signal Processing Letters 32, 4444-4448, 2025 Citations: 1
Multi-fish tracking with underwater image enhancement by deep network in marine ecosystems P Mukherjee, S Mandal, KR Jerripothula, V Maharshi, K Katara Signal Processing: Image Communication 138, 117321, 2025 Citations: 3
Structure preserving point cloud completion and classification with coarse-to-fine information S Kumari, S Mandal, S Raman Journal of Visual Communication and Image Representation, 104591, 2025 Citations: 2
Transformer augmented multi-resolution hash encoding in diffusion model for 3D point cloud denoising S Kumari, U Mishra, S Mandal, S Raman 2025 IEEE International Conference on Image Processing (ICIP), 2790-2795, 2025 Citations: 1
Deep Liveness: Face Liveness Detection Using a Lightweight U-Net-Based Deep Architecture Y Bhadoriya, Y Sorathiya, S Bhilare, S Mandal National Conference on Computer Vision, Pattern Recognition, Image …, 2025
Attention-Based Multi-patch Hierarchical Network with Non-local Information for Smartphone Image Denoising K Savaliya, S Mandal, S Kumari, S Raman National Conference on Computer Vision, Pattern Recognition, Image …, 2025
UWOT-Net: Underwater Object Tracking by Attention Driven Network in Unconstrained Marine Environments P Mukherjee, S Mandal, A Pediredla 2025 National Conference on Communications (NCC), 1-6, 2025 Citations: 1
PolSAR Image Classification Using Complex-Valued Squeeze and Excitation Network S Makhija, S Mandal, U Pandya, S Chirakkal, D Putrevu International Conference on Pattern Recognition, 270-286, 2024
Attentions in deep framework to enhance images degraded by non-homogeneous haze A Dhedhi, S Mandal, RL Das 2023 IEEE 20th India Council International Conference (INDICON), 515-520, 2023 Citations: 2
Low-resolution face recognition using multi-stream CNN in Siamese framework R Vachhani, S Mandal, B Gohel 2023 Seventh International Conference on Image Information Processing (ICIIP …, 2023 Citations: 4
Combining Non-local Sparse and Residual Channel Attentions for Single Image Super-resolution Across Modalities M Bhavsar, S Mandal International Conference on Computer Vision and Image Processing, 623-637, 2022
Increasing Transferability by Imposing Linearity and Perturbation in Intermediate Layer with Diverse Input Patterns M Shah, S Mandal, S Bhilare, A Hati 2022 IEEE International Conference on Signal Processing and Communications …, 2022 Citations: 3
Generating targeted adversarial attacks and assessing their effectiveness in fooling deep neural networks S Gajjar, A Hati, S Bhilare, S Mandal 2022 IEEE International Conference on Signal Processing and Communications …, 2022 Citations: 4
Automatic image colorization using ensemble of deep convolutional neural networks U Oza, A Pipara, S Mandal, P Kumar 2022 IEEE Region 10 Symposium (TENSYMP), 1-6, 2022 Citations: 8
Temporally Consistent Video Manipulation for Facial Expression Transfer K Rajyaguru, S Mandal, SK Mitra 2022 IEEE IAS Global Conference on Emerging Technologies (GlobConET), 109-114, 2022
Edge-Preserving classification of polarimetric SAR images using Wishart distribution and conditional random field N Chaudhari, SK Mitra, S Mandal, S Chirakkal, D Putrevu, A Misra International Journal of Remote Sensing 43 (6), 2134-2155, 2022 Citations: 6
Deep Networks for Image and Video Super-Resolution K Purohit, S Mandal, AN Rajagopalan arXiv preprint arXiv:2201.11996, 2022 Citations: 1
Image Superresolution using Scale-Recurrent Dense Network K Purohit, S Mandal, AN Rajagopalan arXiv preprint arXiv:2201.11998, 2022
Mitigating Channel-wise Noise for Single Image Super Resolution S Mandal, K Purohit, AN Rajagopalan arXiv preprint arXiv:2112.07589, 2021
MOST CITED SCHOLAR PUBLICATIONS
Noise adaptive super-resolution from single image via non-local mean and sparse representation S Mandal, A Bhavsar, AK Sao Signal Processing 132, 134-149, 2017 Citations: 42
Depth map restoration from undersampled data S Mandal, A Bhavsar, AK Sao IEEE Transactions on Image Processing 26 (1), 119-134, 2016 Citations: 35
Edge preserving single image super resolution in sparse environment S Mandal, AK Sao 2013 IEEE International Conference on Image Processing, 967-971, 2013 Citations: 27
Mixed-Dense Connection Networks for Image and Video Super-Resolution K Purohit, S Mandal, AN Rajagopalan Neurocomputing, 2019 Citations: 24
Multi-level Weighted Enhancement for Underwater Image Dehazing K Purohit, S Mandal, AN Rajagopalan Journal of the Optical Society of America A 36 (6), 1098-1108, 2019 Citations: 22
Local proximity for enhanced visibility in haze S Mandal, AN Rajagopalan IEEE Transactions on Image Processing 29, 2478-2491, 2019 Citations: 17
Underwater image color correction using ensemble colorization network A Pipara, U Oza, S Mandal Proceedings of the IEEE/CVF international conference on computer vision …, 2021 Citations: 14
Employing structural and statistical information to learn dictionary(s) for single image super-resolution in sparse domain S Mandal, AK Sao Signal Processing: Image Communication 48, 63-80, 2016 Citations: 14
Scale-recurrent multi-residual dense network for image super-resolution K Purohit, S Mandal, AN Rajagopalan Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 0-0, 2018 Citations: 12
Explicit and implicit employment of edge-related information in super-resolving distant faces for recognition S Mandal, S Thavalengal, AK Sao Pattern Analysis and Applications 19 (3), 867-884, 2016 Citations: 11
Hierarchical example-based range-image super-resolution with edge-preservation S Mandal, A Bhavsar, AK Sao 2014 IEEE International Conference on Image Processing (ICIP), 3867-3871, 2014 Citations: 11
Handwritten digit recognition using Bayesian ResNet P Mhasakar, P Trivedi, S Mandal, SK Mitra SN Computer Science 2 (5), 399, 2021 Citations: 9
Automatic image colorization using ensemble of deep convolutional neural networks U Oza, A Pipara, S Mandal, P Kumar 2022 IEEE Region 10 Symposium (TENSYMP), 1-6, 2022 Citations: 8
Super-resolving a single intensity/range image via non-local means and sparse representation S Mandal, A Bhavsar, AK Sao Proceedings of the 2014 Indian Conference on Computer Vision Graphics and …, 2014 Citations: 8
Edge-Preserving classification of polarimetric SAR images using Wishart distribution and conditional random field N Chaudhari, SK Mitra, S Mandal, S Chirakkal, D Putrevu, A Misra International Journal of Remote Sensing 43 (6), 2134-2155, 2022 Citations: 6
Multi-scale image denoising while preserving edges in sparse domain S Mandal, S Kumari, A Bhavsar, AK Sao 2016 6th European Workshop on Visual Information Processing (EUVIP), 1-6, 2016 Citations: 6
Fusion-UWnet: multi-channel fusion-based deep CNN for underwater image enhancement P Pradhan, A Mazumder, S Mandal, BN Subudhi OCEANS 2021: San Diego–Porto, 1-5, 2021 Citations: 5
Low-resolution face recognition using multi-stream CNN in Siamese framework R Vachhani, S Mandal, B Gohel 2023 Seventh International Conference on Image Information Processing (ICIIP …, 2023 Citations: 4
Generating targeted adversarial attacks and assessing their effectiveness in fooling deep neural networks S Gajjar, A Hati, S Bhilare, S Mandal 2022 IEEE International Conference on Signal Processing and Communications …, 2022 Citations: 4
EMOTIONCAPS-facial emotion recognition using capsules B Shah, K Bhatt, S Mandal, SK Mitra International Conference on Neural Information Processing, 394-401, 2020 Citations: 4