Computer Engineering, Artificial Intelligence, Computer Vision and Pattern Recognition, Signal Processing
22
Scopus Publications
A multi scale spatial attention based zero shot learning framework for low light image enhancement Muhammad Azeem Aslam, Hassan Khalid, Nisar Ahmed Scientific Reports, 2026 Low-light image enhancement remains a challenging task, particularly in the absence of paired training data. In this study, we present LucentVisionNet, a novel zero-shot learning framework that addresses the limitations of traditional and deep learning-based enhancement methods. The proposed approach integrates multi-scale spatial attention with a deep curve estimation network, enabling fine-grained enhancement while preserving semantic and perceptual fidelity. To further improve generalization, we adopt a recurrent enhancement strategy and optimize the model using a composite loss function comprising six tailored components, including a novel no-reference image quality loss inspired by human visual perception. Extensive experiments on both paired and unpaired benchmark datasets demonstrate that LucentVisionNet consistently outperforms state-of-the-art supervised, unsupervised, and zero-shot methods across multiple full-reference and no-reference image quality metrics. Our framework achieves high visual quality, structural consistency, and computational efficiency, making it well-suited for deployment in real-world applications such as mobile photography, surveillance, and autonomous navigation.
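The abstract above mentions a deep curve estimation network with a recurrent enhancement strategy but does not give the curve itself. A minimal sketch, assuming the quadratic light-enhancement curve LE(x) = x + alpha * x * (1 - x) commonly used in curve-estimation methods, applied repeatedly to pixels normalized to [0, 1]. The function names, the single shared alpha per pixel, and the iteration count are illustrative assumptions, not details taken from the paper.

```python
def apply_curve(pixel, alpha, iterations=8):
    """Repeatedly apply LE(x) = x + alpha * x * (1 - x) to one pixel in [0, 1].

    Assumption: alpha lies in [-1, 1], which keeps the output inside [0, 1].
    """
    x = pixel
    for _ in range(iterations):
        x = x + alpha * x * (1.0 - x)
    return x


def enhance(image, alpha_map, iterations=8):
    """Apply the curve pixel-wise; image and alpha_map are equally sized 2D lists."""
    return [
        [apply_curve(p, a, iterations) for p, a in zip(row, alpha_row)]
        for row, alpha_row in zip(image, alpha_map)
    ]
```

With a positive alpha the curve brightens dark pixels while leaving 0 and 1 fixed, which is why repeated application can lift shadows without clipping highlights.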
A hybrid attention network for accurate breast tumor segmentation in ultrasound images Muhammad Azeem Aslam, Asim Naveed, Nisar Ahmed, Zhang Ke Scientific Reports, 2025 Breast ultrasound (BUS) imaging is widely recognized as a non-invasive and cost-effective modality for the timely diagnosis of breast cancer. Despite its clinical importance, automatic tumor segmentation remains a highly challenging task because of speckle noise, varying lesion scale, and inherently indistinct boundaries between malignant and healthy tissue. To address these challenges, we introduce a novel hybrid attention-based segmentation framework, named HA-Net, tailored for BUS images. The proposed HA-Net uses a pre-trained DenseNet-121 backbone in the encoder to extract discriminative features, ensuring robustness against imaging artifacts. At the bottleneck, three complementary modules, Global Spatial Attention (GSA), Position Encoding (PE), and Scaled Dot-Product Attention (SDPA), are incorporated to capture long-range dependencies, preserve structural relationships, and model contextual interactions among features. Moreover, a Spatial Feature Enhancement Block (SFEB) is incorporated within the skip connections to refine spatial detail and emphasize tumor-relevant regions, thereby strengthening the decoder's reconstruction capability. To further improve segmentation reliability, a composite loss function is employed by combining Binary Cross-Entropy (BCE) with Jaccard Index loss, ensuring balanced optimization across pixel-level classification and region-level overlap. In comparison to current state-of-the-art (SOTA) approaches, extensive experiments on publicly available BUS datasets show that the proposed HA-Net achieves competitive performance, highlighting its potential as an efficient decision-support tool for radiologists.
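The composite loss described above combines a pixel-level term (BCE) with a region-level term (Jaccard). A minimal sketch on flattened masks, assuming a soft Jaccard formulation with a smoothing constant and equal weights; the weights `w_bce` and `w_jac`, the `smooth` value, and the function names are assumptions for illustration, since the abstract does not state them.

```python
import math


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flattened probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred)


def jaccard_loss(pred, target, smooth=1.0):
    """Soft Jaccard loss: 1 - (intersection + s) / (union + s)."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - inter
    return 1.0 - (inter + smooth) / (union + smooth)


def composite_loss(pred, target, w_bce=1.0, w_jac=1.0):
    """Weighted sum of the pixel-level and region-level terms."""
    return w_bce * bce_loss(pred, target) + w_jac * jaccard_loss(pred, target)
```

BCE penalizes each pixel independently, while the Jaccard term rewards overlap of the whole predicted region, so the sum is less likely to collapse to an all-background prediction when the tumor occupies few pixels.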
Improving Arabic Multi-Label Emotion Classification Using Stacked Embeddings and Hybrid Loss Function Yimei Xu, Muhammad Azeem Aslam, Wang Jun, Nisar Ahmed, Muhammad Imran Zaman, Muhammad Hamza, Saba Aslam IEEE Access, 2025 Multi-label emotion classification (MLEC) for low-resource languages like Arabic faces significant challenges due to class imbalance and label correlations, particularly in accurately predicting minority emotions. This study proposes a novel framework combining stacked contextual embeddings, meta-learning, and a hybrid loss function to address these issues. First, we generate enriched embeddings by fine-tuning and stacking three Arabic language models (ArabicBERT, MarBERT, AraBERT). These embeddings are processed by a Bi-LSTM meta-learner for sequence learning, followed by a fully connected network for classification. To mitigate class imbalance and leverage label dependencies, we introduce a hybrid loss integrating contrastive learning (CL), label correlation matrices (LCM), and class weighting (CW). Extensive experiments on the SemEval-2018 Task 1-Ec-Ar dataset demonstrate state-of-the-art performance, with a Jaccard accuracy of 0.81, F1-score of 0.67, and Hamming loss of 0.15. Ablation studies confirm the contributions of each component, while class-wise analysis shows our hybrid loss reduces disparities between majority and minority classes by up to 22%. Beyond Arabic MLEC, this work offers a generalizable framework adaptable to other languages and domains, advancing emotion analysis in low-resource settings.
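Jaccard accuracy and Hamming loss, two of the metrics reported above, are straightforward to compute for multi-label predictions. A minimal sketch, assuming sample-averaged Jaccard and binary indicator vectors per sample; the toy labels in the usage note are made up for illustration and are not from the SemEval dataset.

```python
def jaccard_accuracy(y_true, y_pred):
    """Mean over samples of |true AND pred| / |true OR pred| (1.0 if both empty)."""
    scores = []
    for t, p in zip(y_true, y_pred):
        inter = sum(1 for a, b in zip(t, p) if a == 1 and b == 1)
        union = sum(1 for a, b in zip(t, p) if a == 1 or b == 1)
        scores.append(1.0 if union == 0 else inter / union)
    return sum(scores) / len(scores)


def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    wrong = sum(a != b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    total = sum(len(t) for t in y_true)
    return wrong / total
```

For example, with `y_true = [[1, 0, 1], [0, 1, 0]]` and `y_pred = [[1, 0, 0], [0, 1, 0]]`, the first sample scores 1/2 and the second 1/1, giving a Jaccard accuracy of 0.75, and one wrong slot out of six gives a Hamming loss of about 0.167.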
QualityNet: A multi-stream fusion framework with spatial and channel attention for blind image quality assessment Muhammad Azeem Aslam, Xu Wei, Hassan Khalid, Nisar Ahmed, Zhu Shuangtong, Xin Liu, Yimei Xu Scientific Reports, 2024 This study introduces a novel Blind Image Quality Assessment (BIQA) approach leveraging a multi-stream spatial and channel attention model. Our method addresses challenges posed by diverse image content and distortions by integrating feature maps from two distinct backbones. Through spatial and channel attention mechanisms, our algorithm prioritizes regions of interest, enhancing its ability to capture crucial image details. Extensive evaluations on four benchmark datasets demonstrate superior performance compared to existing methods, closely aligning with human perceptual assessment. Our approach exhibits exceptional generalization capabilities on both authentic and synthetic distortion databases. Moreover, it demonstrates a distinctive focus on perceptual foreground information, enhancing its practical applicability. Thorough quantitative analyses underscore the algorithm's superior performance, establishing its dominance over existing methods.
VRL-IQA: Visual Representation Learning for Image Quality Assessment Muhammad Azeem Aslam, Xu Wei, Nisar Ahmed, Gulshan Saleem, Tuba Amin, Hui Caixue IEEE Access, 2024 The growing adoption of digital multimedia devices and the greater reliance on compression and wireless channels for data transmission have brought renewed focus to the traditional challenge of evaluating image quality. Image Quality Assessment (IQA) is needed to optimize bit rate, compression, or processing and communication strategies for these multimedia technologies. Visual representation learning enables the model to undertake upstream training on large-scale data and then be fine-tuned on downstream data using fewer training samples. Data annotation for IQA is expensive due to the difficulty of grading a picture’s quality, the need to gather quality labels from numerous observers, and the diversity of perceptual quality and content of the images. This challenge has limited the size of labeled training datasets for IQA to a few thousand images. In this study, a deep Convolutional Neural Network is trained on a large-scale image dataset produced by simulating 165 distortion scenarios on 150,000 images, resulting in 24.75 million distorted images. These images are labeled via an ensemble of full-reference quality assessment models, which assign a quality rating to each distorted image by using its reference image. The trained model is fine-tuned on two datasets containing simulated distortions (TID2013 and Kadid-10K) and two datasets containing authentic distortions (KonIQ-10K and BIQ2021). Fine-tuning yields state-of-the-art IQA performance, with Spearman’s correlation coefficients of 0.921, 0.893, 0.884, and 0.793, respectively. Moreover, comparison with the ImageNet pre-trained model revealed that the proposed VRL-IQA model provides higher performance in terms of Pearson and Spearman’s correlations and achieves the validation criteria with fewer epochs than the ImageNet pre-trained model. These findings contribute to the advancement of IQA, offering a promising approach for robust and accurate quality prediction in various applications.
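Pearson and Spearman correlations between predicted and subjective quality scores are the evaluation measures named above. A minimal standard-library sketch, assuming tie-aware average ranks for Spearman; the function names are illustrative, and in practice a library routine would normally be used instead.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))
```

Spearman measures only monotonic agreement, so predictions that order images correctly score 1.0 even when the relationship to the subjective scores is nonlinear, whereas Pearson also penalizes the nonlinearity.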
TQP: An Efficient Video Quality Assessment Framework for Adaptive Bitrate Video Streaming Muhammad Azeem Aslam, Xu Wei, Nisar Ahmed, Gulshan Saleem, Zhu Shuangtong, Yimei Xu, Hu Hongfei IEEE Access, 2024 The increasing popularity of video streaming services and the widespread accessibility of high-speed internet underscore the importance of delivering cost-effective and seamless streaming experiences. Shared internet connections may lead to varying speeds, impacting Quality of Experience (QoE). Rate adaptation techniques aim to ensure smooth video transmission, but overly optimistic adaptations can compromise user experience. Objective video quality assessment is crucial for efficient rate adaptation to ensure smooth QoE. This research proposes a novel method incorporating temporal channel shifting into Convolutional Neural Networks (CNN) for video quality assessment while maintaining the computational simplicity of a 2D CNN model. The proposed approach relies on the EfficientNet architecture, initially pre-trained on quality-aware images, and fine-tunes it using datasets of rate-adaptive videos. The model is trained and evaluated on two benchmark datasets, namely “Waterloo sQoE III” and “LIVE Netflix II,” which consist of rate-adaptive videos annotated with subjective quality scores. Experimental results encompass the evaluation of Pearson, Spearman, and Kendall correlation coefficients, along with the computation time ratio for the proposed approach. The outcomes reveal competitive scores of 0.795, 0.652, 0.772, and 0.216 for the “LIVE Netflix II” dataset and 0.782, 0.713, 0.721, and 0.230 for the “Waterloo sQoE III” dataset. Our proposed method, compared to 24 approaches for “Waterloo sQoE III” and 25 for “LIVE Netflix II,” attains the highest correlation scores while maintaining near-real-time processing efficiency. These results affirm the efficacy of our approach in accurately predicting human judgment (QoE) with computational efficiency.
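Temporal channel shifting lets a 2D CNN mix information across frames at no extra multiply-add cost. A minimal sketch of the general idea, assuming a bidirectional shift with zero padding in which one fraction of the channels takes values from the next frame and an equal fraction from the previous frame. The `fold_div` parameter and the list-of-frames layout (spatial dimensions omitted) are assumptions for illustration; the abstract does not specify the paper's exact scheme.

```python
def temporal_shift(frames, fold_div=4):
    """Shift a fraction of channels along the time axis.

    frames: list of T frames, each a list of C channel values.
    The first C // fold_div channels read from frame t + 1, the next
    C // fold_div channels read from frame t - 1, and the rest stay in place.
    Positions with no source frame are filled with zeros.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div
    out = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t + 1      # borrow from the future frame
            elif c < 2 * fold:
                src = t - 1      # borrow from the past frame
            else:
                src = t          # unshifted channel
            if 0 <= src < num_frames:
                out[t][c] = frames[src][c]
    return out
```

After the shift, an ordinary per-frame 2D convolution sees channels from neighboring frames, which is how temporal context can be captured while the per-frame cost stays that of a 2D model.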
A Color Image Encryption Scheme Based on Singular Values and Chaos Adnan Malik, Muhammad Ali, Faisal S. Alsubaei, Nisar Ahmed, Harish Kumar CMES Computer Modeling in Engineering and Sciences, 2023 The security of digital images transmitted via the Internet or other public media is of the utmost importance. Image encryption is a method of keeping an image secure while it travels across a non-secure communication medium where it could be intercepted by unauthorized entities. This study provides an approach to color image encryption that could find practical use in various contexts. The proposed method, which combines four chaotic systems, employs singular value decomposition and a chaotic sequence, making it both secure and compression-friendly. The algorithm’s performance is evaluated using the unified average changing intensity (UACI), the number of pixels change rate (NPCR), information entropy analysis, correlation coefficient analysis, compression friendliness, and security against brute-force, statistical, and differential attacks. Following a thorough investigation of the experimental data, it is concluded that the proposed image encryption approach is secure against a wide range of attacks and provides superior compression friendliness when compared to chaos-based alternatives.
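The two differential-attack measures named above compare two cipher images produced from plain images that differ in a single pixel. A minimal sketch using their standard definitions, assuming 8-bit pixel values supplied as flat lists of equal length; the function names are illustrative.

```python
def npcr(cipher_a, cipher_b):
    """Number of pixels change rate: percentage of positions that differ."""
    changed = sum(1 for a, b in zip(cipher_a, cipher_b) if a != b)
    return 100.0 * changed / len(cipher_a)


def uaci(cipher_a, cipher_b, max_value=255):
    """Unified average changing intensity: mean absolute difference,
    normalized by the maximum pixel value and expressed as a percentage."""
    total = sum(abs(a - b) for a, b in zip(cipher_a, cipher_b))
    return 100.0 * total / (max_value * len(cipher_a))
```

For the toy inputs `[0, 0, 255, 10]` and `[0, 255, 255, 20]`, two of four pixels differ, so NPCR is 50.0, and UACI is 100 * 265 / (255 * 4), roughly 25.98. For a strong cipher on 8-bit images, values near 99.6% (NPCR) and 33.5% (UACI) are the usual theoretical expectations.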
BIQ2021: a large-scale blind image quality assessment database Nisar Ahmed, Shahzad Asif Journal of Electronic Imaging, 2022 Perceptual quality assessment of digital images is becoming increasingly important due to the widespread use of digital multimedia devices. Smartphones and high-speed internet are among the technologies that have increased the amount of multimedia content severalfold. The availability of a representative dataset, required for training objective quality assessment models, is therefore an important challenge. We present a blind image quality assessment database (BIQ2021). The dataset addresses the challenge of representative images for no-reference image quality assessment by selecting images with naturally occurring distortions and reliable labeling. The dataset contains three sets of images: images captured without any intention of using them for image quality assessment, images obtained with intentionally introduced natural distortions, and images collected from an open-source image sharing platform. An effort has been made to ensure that the database contains a mix of images from different devices, containing different types of objects, and having varying degrees of foreground and background information. The subjective scoring of these images is carried out in a laboratory environment using the single-stimulus method to obtain reliable scores. The database provides details of subjective scoring, statistics of the human subjects, and the standard deviation of the scores for each image. The mean opinion scores (MOSs) provided with the dataset make it useful for assessment of visual quality. Moreover, existing blind image quality assessment approaches are tested on the proposed database, and the scores are analyzed using Pearson and Spearman’s correlation coefficients. The image database and the MOS along with relevant statistics are freely available for use and benchmarking.
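A mean opinion score and its per-image standard deviation are simple aggregates of the individual observer ratings. A minimal sketch, assuming ratings are stored as a mapping from image identifier to a list of scores and that the sample standard deviation is wanted; the identifiers and ratings in the usage note are hypothetical and are not values from BIQ2021.

```python
from statistics import mean, stdev


def mos_summary(ratings):
    """Return {image_id: (mean opinion score, sample standard deviation)}.

    ratings: dict mapping an image identifier to a list of at least two
    observer scores for that image.
    """
    return {
        image_id: (mean(scores), stdev(scores))
        for image_id, scores in ratings.items()
    }
```

For example, `mos_summary({"img_a": [4, 5, 3, 4]})` gives a MOS of 4 for `img_a` with a standard deviation of about 0.816; a larger deviation signals more disagreement among observers about that image.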