Computer Vision and Pattern Recognition, Computer Science, Multidisciplinary, Information Systems
29
Scopus Publications
BD-KDD: A real-world clinical dataset for kidney disease diagnosis and healthy classification Muhammad Towfiqur Rahman, Salma Akter, Md. Masudul Islam, Md. Shafiqul Islam Data in Brief, 2026 Kidney disease is a major global health concern that requires timely diagnosis and effective monitoring to prevent severe complications and improve patient outcomes. This data article presents BD-KDD, a structured clinical dataset designed to facilitate research on kidney disease diagnosis. The dataset was collected retrospectively from electronic medical records obtained from Popular Diagnostic Center, Savar Branch, Dhaka, Bangladesh, following institutional authorization for academic research. The BD-KDD dataset contains 988 patient records with 26 variables, including demographic attributes, physiological measurements, biochemical laboratory tests, urinalysis indicators, hematological parameters, comorbidity indicators, and clinical symptoms. Key laboratory features include serum creatinine, blood urea, blood glucose, sodium, potassium, hemoglobin, packed cell volume, red blood cell count, and white blood cell count, along with urinalysis indicators such as specific gravity, albumin, and sugar levels. Each record is assigned a binary diagnostic label representing either healthy individuals or kidney disease cases based on clinical evaluation and laboratory findings. The curated dataset includes 481 healthy and 507 kidney disease cases and is provided in CSV format together with a dataset dictionary describing variable definitions and coding schemes. BD-KDD offers a valuable resource for biomedical data analysis, health informatics research, and the development of machine learning-based diagnostic models and clinical decision support systems for renal health assessment.
Lower limb and feet wound image dataset Md Darun Nayeem, Md Sakibul Hassan Rifat, Nusrat Jahan Nisita, Md Masudul Islam, Md Saifur Rahman Data in Brief, 2026 A comprehensive wound-related image repository was developed to address critical gaps in existing medical imaging resources, particularly the lack of balanced datasets representing both healthy and pathological lower-limb conditions. The collection comprises 5443 images sourced from two complementary streams: real-world clinical wound cases and controlled acquisition of healthy feet images. The wound component includes 2686 expertly annotated images representing eight clinically significant wound types: diabetic, pressure, trauma, venous, surgical, arterial, cellulitis, and miscellaneous. These images were gathered across diverse clinical environments between 2015 and 2019 and meticulously annotated by certified wound specialists, ensuring high-quality segmentation masks including peri-wound regions. The healthy-foot component consists of 2757 images captured from volunteer participants in naturalistic settings using consumer-grade smartphone cameras. Each participant contributed eight multi-angle images under consistent protocols, enabling robust representation of anatomical variability across sex, skin tone, and foot structure. All images were standardized through controlled resizing procedures, while the wound dataset underwent additional mask generation and augmentation strategies to support downstream segmentation and classification tasks. This unified dataset provides a balanced foundation for developing machine learning models capable of distinguishing between normal and pathological foot conditions while supporting advanced tasks such as wound segmentation, severity assessment, and clinical decision support. By integrating healthy and wound images within a single accessible collection, the dataset mitigates class imbalance issues prevalent in existing resources and enables scalable, generalizable deep learning research in wound detection, monitoring, and medical image analysis.
AsianVehicle: An image dataset of traditional Asian vehicles Md. Darun Nayeem, Md. Sadikujjaman, Rehnuma Tabassum, Anika Ivnath, Md. Masudul Islam Data in Brief, 2026 The AsianVehicle dataset introduces a comprehensive image collection of four traditional Bangladeshi vehicle types (Auto Rickshaw, Rickshaw, Rickshaw Van, and Leguna), captured to support research in computer vision and cultural informatics. These vehicles, integral to South and Southeast Asian transport systems, represent unique visual and structural characteristics seldom documented in existing global datasets. A total of 4000 RGB images were collected across various urban and semi-urban areas of Mirpur, Dhaka, using smartphone cameras under natural daylight conditions to preserve authentic colors, textures, and environmental diversity. The dataset encompasses variations in viewing angles, backgrounds, and illumination, reflecting real-world scenarios where such vehicles operate. All images are provided in properly processed form, enabling users to apply customized preprocessing and labeling strategies according to their research needs. Beyond supporting machine learning tasks such as vehicle classification, segmentation, or detection, this dataset contributes to the digital preservation of traditional Asian transport designs that are gradually disappearing due to modernization. Its open accessibility facilitates comparative studies on model generalization, cross-domain adaptation, and low-resource visual recognition. By bridging cultural representation and artificial intelligence research, AsianVehicle offers a valuable foundation for both technical innovation and the preservation of regional identity within data-driven applications.
A dual gene-signature framework for glioma survival prediction with multi-cohort validation Romeo Macline D'Costa, Md. Shafiqul Islam, Md. Masudul Islam Immunobiology, 2026 Despite the proliferation of prognostic gene signatures for glioma, clinical translation remains stalled by poor reproducibility and overfitting. In this study, we address this stability crisis by developing a robust “Dual-Signature Framework” using stability selection, a rigorous resampling method, rather than standard regression. Analyzing RNA-seq data from 1351 patients across the TCGA (n = 694) and CGGA (n = 657) cohorts, we constructed two distinct models. The primary 20-gene “Data-Driven” signature achieved superior predictive accuracy (C-index: 0.7392), significantly outperforming 14 published benchmark models and the current best single-gene predictor (HOXA5). In parallel, we derived a 7-gene “Biology-Driven” signature (including HOXA5, CHI3L1, MMP14) that retained 98% of the predictive power (C-index: 0.7252) while prioritizing mechanistic interpretability. Both models successfully stratified patients into distinct risk groups with high statistical significance (Log-rank p < 0.001) in external validation. Comprehensive subgroup analyses across 19 clinical and molecular subgroups demonstrated robust performance (C-index range: 0.59–0.85), with extended calibration analysis confirming excellent probability estimation (Brier score 0.20 for 5-year predictions). By integrating stability-driven feature selection with biological pathway constraints, this study provides a reproducible, high-performance alternative to unstable “black box” models, offering a translation-ready tool for personalized glioma risk assessment.
• Developed a dual-signature framework for glioma survival prediction.
• Achieved strong performance with C-index of 0.7392 in validation.
• Validated model across TCGA and CGGA multi-cohort datasets.
• Proposed a 7-gene interpretable model with near-equal performance.
• Demonstrated robust risk stratification across clinical subgroups.
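For readers unfamiliar with the C-index reported in the abstract above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration only, not the authors' implementation: the function name `concordance_index` and the toy data are hypothetical, and pairs with tied survival times are skipped for simplicity.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Simplified Harrell's C-index (hypothetical helper, not from the paper).

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : model risk scores (higher means worse predicted prognosis)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied times are skipped in this sketch
        # Let a be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk had the shorter survival
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): four patients, one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. On that scale, the 98% retention quoted in the abstract is the ratio 0.7252 / 0.7392 ≈ 0.981.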
BanglaMUSE: A multimodal Bangla sentiment dataset of text–audio pairs for speech and sentiment analysis Md. Darun Nayeem, Zarin Rafa, Tasnuva Tasnim Nova, Yasin Rahman, Abdul Mumeet Pathan, Md. Masudul Islam Data in Brief, 2026 This article describes a publicly available multimodal Bangla sentiment dataset designed to support research in speech processing, sentiment analysis, and low-resource language modeling. The dataset comprises two synchronized modalities: sentiment-annotated Bangla text and corresponding speech recordings. It contains 1,000 manually curated Bangla sentences evenly distributed across positive and negative sentiment classes, alongside 4,000 aligned audio recordings produced by four native speakers. Each sentence is recorded independently by all speakers to ensure speaker diversity while maintaining consistent textual content. The text component reflects natural, everyday Bangla language usage and is structured to facilitate sentiment classification and linguistic analysis. The audio recordings were collected under controlled yet realistic acoustic conditions using multiple recording devices, introducing natural variability relevant for real-world speech applications. All samples underwent manual quality verification to ensure accurate text-audio alignment and to remove noisy or duplicated recordings. The dataset is suitable for a wide range of applications, including multimodal sentiment classification, sentiment-aware speech recognition, audio-text alignment, and benchmarking of multimodal learning approaches for low-resource languages. Its modular structure allows straightforward extension with additional speakers, dialects, or sentiment categories. By providing aligned textual and speech data for Bangla, this dataset contributes a valuable resource to the research community and supports broader efforts toward linguistic diversity in artificial intelligence.
A comprehensive image dataset of American Sign Language hand gestures Md. Famidul Islam Pranto, Md. Rifatul Islam, Md. Ali Akbor, Nabonita Ghosh, Md. Rahatun Alam, Sudipto Chaki, Md. Masudul Islam Data in Brief, 2026 We present ASL-HG, a comprehensive American Sign Language (ASL) image dataset designed to advance gesture recognition and assistive technologies. The collection contains 36,000 static images across 36 classes, covering the full English alphabet (A-Z) and digits (0-9). Data were captured from 10 volunteers in Mirpur, Dhaka, Bangladesh, with each participant contributing 100 samples per class, ensuring a balanced distribution across subjects, genders, and skin tones. Unlike many existing ASL datasets, ASL-HG explicitly distinguishes between the letter "O" and the digit "0" by including the standard two-handed ASL "zero" sign used in practical alphanumeric communication. The dataset is released in two complementary forms: raw images with natural indoor and outdoor backgrounds, and a MediaPipe-processed version with hand-segmented crops and predefined 80-20 train-test splits. This design supports both custom pre-processing and immediate model training. ASL-HG is intended to serve as a benchmark resource for developing robust and fair ASL recognition systems, reducing communication barriers for deaf and speech-impaired users, and enabling broader research in gesture-based human-computer interaction.
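The class arithmetic in the ASL-HG abstract (36 classes × 10 volunteers × 100 samples = 36,000 images) and the 80-20 split can be sketched as follows. This is a hedged illustration that assumes the split is stratified per class; the abstract does not say how the predefined split was actually built (for example, per class or per subject), and `stratified_split` is a hypothetical helper name.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into train/test, keeping the ratio within each class.

    Hypothetical helper; assumes a per-class (stratified) split.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


# 36 classes: letters A-Z plus digits 0-9 (letter "O" and digit "0" stay distinct).
classes = [chr(c) for c in range(ord("A"), ord("Z") + 1)] + [str(d) for d in range(10)]
# 10 volunteers x 100 samples per class = 1,000 images per class.
labels = [c for c in classes for _ in range(10 * 100)]

train_idx, test_idx = stratified_split(labels)
print(len(labels), len(train_idx), len(test_idx))  # 36000 28800 7200
```

Splitting per class keeps every sign equally represented in both partitions. A per-subject split (holding out whole volunteers) would be a stricter test of generalization to unseen signers.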
A comprehensive deep learning framework for rice variety classification with real-time deployment Md. Masudul Islam, Md. Shafiqul Islam, Md. Golam Moazzam, Mohammad Shorif Uddin Smart Agricultural Technology, 2026 Rice is a staple food for more than half of the global population and a cornerstone of agricultural economies. With the growing digitalization of agri-food supply chains, automated rice variety recognition has become essential for breeding, seed authentication, quality assurance, and fraud prevention. However, most existing studies are constrained by small datasets, often including fewer than twenty varieties, which limits model generalization and real-world applicability. To address this gap, we developed a comprehensive deep learning framework that evaluates (i) transfer learning models, (ii) a stacked ensemble of 13 meta-classifiers, and (iii) both raw and pretrained Vision Transformer (ViT-B/16) models on a large-scale and diverse image dataset comprising 62 rice varieties. Experimental results demonstrate that the stacked ensemble achieved the highest classification accuracy, outperforming individual CNN and transformer architectures. For real-world use, three top-performing models (VGG16, the stacked ensemble, and Google’s ViT-B/16) are deployed through a lightweight web platform (RiceVision) and a mobile application, enabling real-time rice variety identification. This study underscores the translational potential of AI-driven computer vision in advancing smart agriculture and digital seed supply chain management.
A comprehensive image dataset of jute diseases Md. Masudul Islam, Md. Ripon Sheikh Data in Brief, 2026 This Data Descriptor presents the Jute Diseases Image Dataset, a curated collection of 1390 high-resolution images aimed at supporting the development of machine learning models for timely identification and accurate diagnosis of jute (Corchorus) plant diseases. The dataset is categorized into five classes: Dieback (300), Holed (300), Mosaic (240), Stem Soft Rot (270), and Fresh (280), the last representing healthy leaves. Images were captured under varied natural lighting and directional conditions across diverse jute cultivation areas to enhance model generalizability. A rigorous pre-processing pipeline was applied, including uniform resizing to 1024 × 1024 pixels and removal of duplicate images to ensure data integrity. The dataset is organized into two components: a raw, pre-processed set and an augmented train-test split version, enabling immediate use in machine learning workflows. Additionally, Grad-CAM and Guided Grad-CAM techniques were applied to sample images to visualize and validate model attention on disease-relevant regions. This resource addresses the lack of labelled jute disease imagery and supports timely disease management, particularly for stakeholders in Bangladesh and other major jute-producing regions.
Artificial Intelligence Literacy and Sustainable Development: An Ethical Governance and Development Goals Framework Md. Masudul Islam, Mirza Niaz Morshed, Md. Shafiqul Islam Sustainable Development, 2026 AI literacy provides foundational competencies that support ethical, transparent, and sustainable technological development, although higher‐order capabilities such as governance, critical evaluation, and strategic decision‐making extend beyond basic literacy into advanced levels of AI competency. This study positions AI literacy as a governance capacity that complements and strengthens all 17 SDGs. It introduces a six‐level taxonomy of artificial intelligence reasoning and ethics that extends traditional learning models by incorporating ethical judgement and strategic foresight. This taxonomy forms the foundation of an integrated framework linking education, governance, and sustainable development. A survey of 300 participants from diverse professional backgrounds within a national context reveals strong technical awareness but limited ethical and governance readiness, highlighting critical gaps in public capacity to manage artificial intelligence responsibly. Findings show that ethical reasoning and reflective thinking are the strongest predictors of sustainable and trustworthy artificial intelligence use. The study proposes embedding literacy‐based competencies into curricula, institutional policies, and governance mechanisms to accelerate equitable and responsible progress toward sustainable development goals.
EcoSupplyChain: A Blockchain-Enabled Dual-Flow Framework for Counterfeit Prevention, Profit Transparency, and Sustainable Recycling Md. Humayan Kabir Rupok, Rakibul Hasan, Md. Masudul Islam, Md. Shafiqul Islam International Conference on Advanced Communication Technology (ICACT), 2026 Modern supply chains promise efficiency and global connectivity, yet continue to face persistent challenges such as counterfeit infiltration, opaque profit distribution, and the lack of sustainable recycling mechanisms. To address these issues, this paper presents EcoSupplyChain, a unified blockchain-based framework that integrates forward (production to consumer) and reverse (recycling) logistics into a transparent and tamper-resistant ecosystem. The system involves five primary actors: government authority, factory, supplier, store, and buyer, operating on the Ethereum blockchain through smart contracts for automated verification and reward distribution. The framework tackles three major challenges: (1) product authenticity verification using cryptographic signatures and QR-based traceability, (2) transparent profit margin validation without compromising commercial confidentiality, and (3) incentivized recycling through cryptocurrency rewards and tax waivers for manufacturers. The forward chain ensures immutable tracking and profit verification across all supply stages, while the reverse chain promotes recycling participation and material recovery. By merging financial fairness, counterfeit prevention, and sustainability within a single decentralized architecture, EcoSupplyChain establishes a scalable model for secure, transparent, and environmentally responsible supply chain management.
State-of-the-art reformation of web programming course curriculum in digital Bangladesh International Journal of Advanced Computer Science and Applications, 2020