Mubarak Albarka Umar

@uaeu.ac.ae

United Arab Emirates University

9

Scopus Publications

506

Scholar Citations

9

Scholar h-index

8

Scholar i10-index

Scopus Publications

  • An explainable artificial intelligence and Internet of Things framework for monitoring and predicting cardiovascular disease
    Mubarak Albarka Umar, Najah AbuAli, Khaled Shuaib, Ali Ismail Awad
    Engineering Applications of Artificial Intelligence, 2025
    Cardiovascular disease (CVD) is a leading cause of death globally. The unpredictability and severity of CVDs, such as sudden cardiac arrests, necessitate real-time monitoring and prediction using the Internet of Things (IoT) and artificial intelligence (AI) for timely intervention. Existing AI models and IoT frameworks for CVD prediction often lack integration of diverse data and fail to provide transparency in predictions, reducing user confidence and treatment effectiveness. We propose an explainable-IoT framework leveraging eXplainable AI (XAI) in the cloud layer and a mobile application to swiftly communicate predictions and explainability to patients, healthcare providers, and other users, facilitating proactive management and informed decisions. The framework integrates sensor data with cloud-based medical records to improve cardiovascular care and build user trust. Using support vector machine (SVM), random forest (RF), k-nearest neighbor (KNN), and deep neural network (DNN), we develop and evaluate CVD prediction models on two datasets: a heart disease dataset (D1) and the Cleveland dataset (D3). Additionally, we use a synthetic dataset (D2) for comparative analysis. The models are evaluated based on accuracy, precision, recall, area under the curve, time, and F1 score. For D3, SVM achieved the best accuracy (84.62%), while RF performed best on D1 and D2 (92.44% and 98.06%, respectively), comparable to state-of-the-art works. Our results highlight the importance of underrepresented physiological features in CVD datasets and the need for comprehensive datasets to enhance CVD model development. Furthermore, while synthetic data (D2) is effective for initial modeling, it requires validation with real-world data for reliable CVD prediction.
      • Developed an AI-IoT framework for real-time CVD prediction using diverse datasets.
      • Compared real and synthetic datasets, emphasizing physiological features for CVD accuracy.
      • Integrated a mobile app for real-time data and improved CVD prevention and management.
  • Effects of feature selection and normalization on network intrusion detection
    Mubarak Albarka Umar, Zhanfang Chen, Khaled Shuaib, Yan Liu
    Data Science and Management, 2025
    The rapid rise of cyberattacks and the gradual failure of traditional defense systems and approaches led to using artificial intelligence (AI) techniques (such as machine learning (ML) and deep learning (DL)) to build more efficient and reliable intrusion detection systems (IDSs). However, the advent of larger IDS datasets has negatively impacted the performance and computational complexity of AI-based IDSs. Many researchers used data preprocessing techniques such as feature selection and normalization to overcome such issues. While most of these researchers reported the success of these preprocessing techniques on a shallow level, very few studies have been performed on their effects on a wider scale. Furthermore, the performance of an IDS model is subject to not only the utilized preprocessing techniques but also the dataset and the ML/DL algorithm used, to which most existing studies give little emphasis. Thus, this study provides an in-depth analysis of feature selection and normalization effects on IDS models built using three IDS datasets: NSL-KDD, UNSW-NB15, and CSE-CIC-IDS2018, and various AI algorithms. A wrapper-based approach, which tends to give superior performance, and min-max normalization methods were used for feature selection and normalization respectively. Numerous IDS models were implemented using the full and feature-selected copies of the datasets with and without normalization. The models were evaluated using popular evaluation metrics in IDS modeling, and intra- and inter-model comparisons were performed between models and with state-of-the-art works. Random forest (RF) models performed better on NSL-KDD and UNSW-NB15 datasets with accuracies of 99.86% and 96.01% respectively, whereas ANN achieved the best accuracy of 95.43% on the CSE-CIC-IDS2018 dataset. The RF models also achieved an excellent performance compared to recent works. The results show that normalization and feature selection positively affect IDS modeling. Furthermore, while feature selection benefits simpler algorithms (such as RF), normalization is more useful for complex algorithms like ANNs and DNNs, and algorithms such as NB are unsuitable for IDS modeling. The study also found that the UNSW-NB15 and CSE-CIC-IDS2018 datasets are more complex and more suitable for building and evaluating modern-day IDS than the NSL-KDD dataset. Our findings suggest that prioritizing robust algorithms like RF, alongside complex models such as ANN and DNN, can significantly enhance IDS performance. These insights provide valuable guidance for managers to develop more effective security measures by focusing on high detection rates and low false alert rates.
      • Research highlight 1 - We underscored the effectiveness of normalization and feature selection in improving Intrusion Detection System (IDS) model performance and computational time, highlighting their differential impact on various AI algorithms.
      • Research highlight 2 - The CSE-CIC-IDS2018 and UNSW-NB15 datasets are identified as more complex and suitable for modern-day IDS development and evaluation compared to the outdated NSL-KDD dataset, thus providing a more realistic and challenging benchmark for IDS research.
      • Research highlight 3 - While feature selection significantly benefits simpler algorithms like RF, normalization proves more effective for complex models such as ANNs and DNNs. Algorithms like NB are found unsuitable for IDS modeling. Notably, neural network-based algorithms exhibit superior performance without explicit feature selection, highlighting their capacity to handle high-dimensional and complex data.
      • Research highlight 4 - Our findings provide actionable insights for managers to improve the efficiency and effectiveness of their security measures. By prioritizing robust algorithms like RF, alongside complex models such as ANN and DNN, and focusing on high detection rates and low false alert rates, managers can develop more effective IDS strategies.
  • Simulating and Evaluating the Performance of a Cloud Computing Datacenter Using Queuing Model
    Mubarak Albarka Umar
    2024 6th International Symposium on Advanced Electrical and Communication Technologies ISAECT 2024, 2024
    Cloud computing and virtualization are fundamental to modern computer system design. As cloud computing adoption grows across organizations, evaluating its performance becomes essential. This study simulates and analyzes a cloud datacenter’s performance using queuing models. Specifically, an Mt/M/1/K queuing system is employed, with arrival parameters estimated from real CPU utilization data from the Bitbrains datacenter, modeled as discrete Homogeneous Poisson Processes (HPPs). The simulation results of the modeled system reveal minimal average waiting time and efficient task processing with low delays. Additionally, the study highlights the significant impact of service rates on the average response times of tasks arriving as discrete HPPs. These findings offer valuable insights into cloud datacenter performance, aiding in informed decisions for service upgrades and optimal resource utilization.
  • Cyber-Attack Detection in Smart Grids: A Comparative Analysis of Supervised and Semi-Supervised Methods
    Mubarak Albarka Umar, Khaled Shuaib
    2024 6th International Symposium on Advanced Electrical and Communication Technologies ISAECT 2024, 2024
    The advancement of smart grids has addressed many challenges of traditional power grids, yet it has also introduced new vulnerabilities to cyber-attacks that can disrupt power, leading to severe socio-economic impacts like blackouts and grid disturbance. While numerous supervised machine learning methods have been proposed to detect cyber-attacks in smart grids, they require a large dataset of normal and attack instances for training. However, gathering sufficient samples of diverse attack scenarios, especially zero-day attacks, is challenging. In this paper, we develop a semi-supervised model to detect grid attacks using phasor measurement unit (PMU) data from a power system dataset. Principal component analysis (PCA) is applied to select optimal components, and the model is trained on a large number of normal-event instances, thus enabling it to identify new, unknown attack patterns. We also developed a supervised model for comparison, evaluating both using key metrics. Results demonstrate that the semi-supervised model is more effective in detecting attack events (with 91.2% precision and 90% accuracy) than the supervised approach (90.7% precision and 91.8% accuracy).
  • Autoencoder-based Arrhythmia Detection using Synthetic ECG Generation Technique
    Ali Nawaz, Mubarak Albarka Umar, Khaled Shuaib, Amir Ahmad, Abdelkader Nasreddine Belkacem
    Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society EMBS, 2024
    With a couple of million lives lost annually, cardiovascular disease (CVD) is the leading cause of death globally; about 80% of which are due to arrhythmia. Electrocardiogram (ECG) signals are important for arrhythmia diagnosis, and researchers have used various ECG datasets in building arrhythmia detection systems to automate the time-consuming manual diagnostic process. However, existing datasets have class imbalance issues, and the traditional oversampling and undersampling techniques prove ineffective in handling the imbalance problem. To address this, we propose a novel approach that treats arrhythmia detection as an anomaly detection problem. In our proposed approach, we first use Generative Adversarial Networks (GANs) to synthetically generate normal training instances from the MIT-BIH arrhythmia dataset, and then we use only the synthetically generated normal data to build the anomaly model using an autoencoder (AE); employing the AE for unsupervised anomaly detection helps in overcoming the GAN convergence issues. We evaluate the model using test data comprising both normal and abnormal samples that are not used by the GAN and compare its performance with other state-of-the-art works. The model achieved improved arrhythmia detection with an AUC-ROC of 0.6768 and an AUC-PR of 0.8537. While effectively tackling data scarcity and imbalance, this work also contributes valuable perspectives to enhance arrhythmia detection systems, providing a foundation for more reliable and adaptable solutions in healthcare.
  • LRCMP: A Sequential Statistical Framework for Predicting Cancer Mortality Rate
    Mubarak Albarka Umar, Ali Nawaz, Tariq Qayyum
    IMTIC 2023 7th International Multi Topic ICT Conference 2023 AI Convergence Towards Sustainable Communications, 2023
    Over 10 million deaths in the world are caused by cancer. Cancer is the second leading cause of death after cardiovascular disease. Additionally, cancer has significant effects on the socioeconomic status of a family. There are several studies about socioeconomic status and cancer. This work first focuses on exploring the relationships between socioeconomic status and cancer mortality rate from disparate open-source data using statistical analysis. Initially, the data consists of 34 features, which are reduced to the 13 most relevant features using the backward selection method. Second, based on the cancer data, we build an appropriate model that can predict the cancer death rate. Specifically, a linear regression model is built and trained for cancer mortality rate prediction. Several models are first built, and linear regression diagnostics are performed on them to check for any assumption violations; finally, the most appropriate model is selected and fine-tuned to provide optimized results. The model's performance is evaluated using R2 and RMSE; the model achieved an R2 of 81.12% and an RMSE score of 12.23 on test data. Our work also highlights the importance of checking regression assumptions in linear regression modeling.
  • A Time Series Regression-based Model for Predicting the Spread of Dengue Disease
    Muhammad Danish Waseem, Ali Nawaz, Uzair Rasheed, Abir Raza, Mubarak Omar Albarka
    2023 International Conference on Robotics and Automation in Industry ICRAI 2023, 2023
    Dengue is a viral disease spread by the mosquito species Aedes aegypti. According to WHO, every year 100-400 million cases of dengue infection are reported worldwide. The dengue mosquito inhabits tropical regions and proliferates in wet climatic conditions. Since it is impossible to rid those regions of the mosquito completely, an analysis of the relationship between different climatic factors and dengue spread is important for forecasting the number of cases ahead of time, so that precautionary measures can be taken beforehand to minimize the spread of the disease. Specifically, to predict the spread we employed two prominent time series models, namely SARIMA and SARIMAX, on the publicly available DengAI dataset. The performance of the models is evaluated using Mean Absolute Error (MAE), achieving MAE scores of 27.39 and 25.52 on SARIMA and SARIMAX respectively, which reveals that our proposed methodology outperformed other existing machine learning methods.
  • Network Intrusion Detection Using Wrapper-based Decision Tree for Feature Selection
    Mubarak Albarka Umar, Chen Zhanfang, Yan Liu
    ACM International Conference Proceeding Series, 2020
    One of the key challenges of the machine learning (ML) based intrusion detection system (IDS) is the expensive computation time, which is largely caused by the redundant, incomplete, and unrelated features contained in the IDS datasets. To overcome such challenges and ensure building efficient and more accurate IDS models, many researchers utilize preprocessing techniques such as normalization and feature selection, and a hybrid modeling approach is typically used. In this work, we propose a hybrid IDS modeling approach with an algorithm for feature selection (FS) and another for building the IDS. The FS method is a wrapper-based FS with a decision tree as the feature evaluator. Five selected ML algorithms are individually used in combination with the proposed FS method to build five IDS models using the UNSW-NB15 dataset. As a baseline, five more IDS models are built, in a single modeling approach, using the full features of the datasets. We evaluate the effectiveness of our proposed method by comparing it with the baseline models and also with state-of-the-art works. Our method achieves the best DR of 97.95% and proved to be quite effective in comparison to state-of-the-art works. We, therefore, recommend its usage especially in IDS modeling with the UNSW-NB15 dataset.
  • Robust estimation and outlier detection based on linear regression model
    Le Cui, Libo Cheng, Xiaoming Jiang, Zhanfang Chen, Albarka
    Journal of Intelligent and Fuzzy Systems, 2019
    Outlier detection has long been an active research topic in statistical diagnosis. Outliers are ubiquitous in current data analysis and may produce erroneous results. In a multivariate linear regression model, the existence of outliers directly affects the modeling, parameter estimation, and prediction. When a set of data contains abnormal values, they have a great impact on the estimation of the mean and standard deviation of the data and also affect the estimation results of the least squares method. In this paper, on the basis of the linear regression model, the deletion model, and the mean shift model, and from the perspective of the residual sum of squares, the sample quantile is introduced to estimate the overall parameters robustly, and a special statistic combined with the sample quantile method is used to detect the outliers. Finally, an example is analyzed and compared with the traditional method. The results show that the method is more effective.
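
As an illustration of the min-max normalization named in the intrusion-detection abstract above, the following minimal Python sketch rescales one feature column to the range [0, 1]. The function name `min_max_normalize` and the choice to map a constant column to zeros are assumptions made for this example; they are not taken from the paper.

```python
from typing import List


def min_max_normalize(column: List[float]) -> List[float]:
    """Rescale a feature column to [0, 1] with x' = (x - min) / (max - min)."""
    if not column:
        return []
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption for this sketch: a constant column carries no
        # information, so every value is mapped to 0.0 instead of dividing by zero.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


if __name__ == "__main__":
    # Hypothetical feature values, not data from any of the listed datasets.
    print(min_max_normalize([2.0, 4.0, 6.0]))
```

In practice the minimum and maximum would be computed on the training split only and reused for the test split, so that no information from the test data leaks into preprocessing.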

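The queuing-model abstract above uses an Mt/M/1/K system with time-varying arrivals. The sketch below is a simplification: it applies the standard stationary M/M/1/K formulas to a single interval in which the arrival rate is treated as constant. The function name, parameter names, and example rates are hypothetical, and this is not the paper's simulation.

```python
from typing import Dict


def mm1k_metrics(arrival_rate: float, service_rate: float, capacity: int) -> Dict[str, float]:
    """Steady-state metrics of an M/M/1/K queue (K = max tasks in the system)."""
    if arrival_rate <= 0 or service_rate <= 0 or capacity < 1:
        raise ValueError("rates must be positive and capacity at least 1")
    rho = arrival_rate / service_rate
    if abs(rho - 1.0) < 1e-12:
        # When arrival and service rates are equal, all states are equally likely.
        probs = [1.0 / (capacity + 1)] * (capacity + 1)
    else:
        p0 = (1.0 - rho) / (1.0 - rho ** (capacity + 1))
        probs = [p0 * rho ** n for n in range(capacity + 1)]
    blocking = probs[capacity]                       # probability an arrival is rejected
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    effective_rate = arrival_rate * (1.0 - blocking)  # rate of accepted arrivals
    mean_response = mean_in_system / effective_rate   # Little's law
    return {
        "blocking_probability": blocking,
        "mean_in_system": mean_in_system,
        "mean_response_time": mean_response,
    }


if __name__ == "__main__":
    # Hypothetical rates (tasks per second) for one interval.
    for mu in (2.0, 4.0):
        print(mu, mm1k_metrics(arrival_rate=1.5, service_rate=mu, capacity=10))
```

Running the loop with different service rates gives a quick feel for how the service rate affects the average response time, which is the relationship the abstract highlights.
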
RECENT SCHOLAR PUBLICATIONS

  • An explainable artificial intelligence and Internet of Things framework for monitoring and predicting cardiovascular disease
    MA Umar, N AbuAli, K Shuaib, AI Awad
    Engineering Applications of Artificial Intelligence 144, 110138, 2025
    Citations: 9
  • Cyber-attack detection in smart grids: A comparative analysis of supervised and semi-supervised methods
    MA Umar, K Shuaib
    2024 6th International Symposium on Advanced Electrical and Communication …, 2024
    Citations: 1
  • Simulating and evaluating the performance of a cloud computing datacenter using queuing model
    MA Umar
    2024 6th International Symposium on Advanced Electrical and Communication …, 2024
    Citations: 1
  • Effects of Feature Selection and Normalization on Network Intrusion Detection
    MA Umar, Z Chen, K Shuaib, Y Liu
    Data Science and Management 8 (01), 23-39, 2024
    Citations: 130
  • Autoencoder-based arrhythmia detection using synthetic ECG generation technique
    A Nawaz, MA Umar, K Shuaib, A Ahmad, AN Belkacem
    2024 46th Annual International Conference of the IEEE Engineering in …, 2024
    Citations: 4
  • LRCMP: a sequential statistical framework for predicting cancer mortality rate
    MA Umar, A Nawaz, T Qayyum
    2023 7th international multi-topic ICT conference (IMTIC), 1-8, 2023
    Citations: 5
  • A time series regression-based model for predicting the spread of dengue disease
    MD Waseem, A Nawaz, U Rasheed, A Raza, MO Albarka
    2023 International Conference on Robotics and Automation in Industry (ICRAI …, 2023
    Citations: 3
  • A Hybrid Intrusion Detection with Decision Tree for Feature Selection
    MA Umar, Z Chen, Y Liu
    Information & Security: An International Journal 49, 2021
    Citations: 18
  • Fighting Crime and Insecurity in Nigeria: An Intelligent Approach
    MA Umar, A Aliyu Machina, M Ibrahim, JA Nasir, A Saheed Salahudeen, ...
    International Journal of Computer Engineering in Research Trends 8 (01), 6-14, 2021
    Citations: 10
  • A Comparative Study of Dynamic Software Testing Techniques
    MA Umar, Z Chen
    International Journal of Advanced Networking and Applications 12 (03), 4575-4584, 2021
    Citations: 33
  • Web Load Balancing Method Based On Resource Request Division
    Z Chen, X Jiang, Y Zhang, MA Umar
    International Journal of Software & Hardware Research in Engineering (IJSHRE …, 2020
  • Network Intrusion Detection Using Wrapper-based Decision Tree for Feature Selection
    MA Umar, Z Chen, Y Liu
    Proceedings of the 2020 International Conference on Internet Computing for …, 2020
    Citations: 43
  • Analysis and Design of Fire Emergency Application (FEAP)
    MA Umar, AS Salahudeen, J Ehi Okoh, M Siddig Mohamed
    International Journal of Computer Science and Mobile Computing 9 (1), 40-51, 2020
    Citations: 8
  • A Study of Automated Software Testing: Automation Tools and Frameworks
    MA Umar, Z Chen
    International Journal of Computer Science Engineering 8 (06), 217-225, 2019
    Citations: 119
  • Student Academic Performance Prediction using Artificial Neural Networks: A Case Study
    MA Umar
    International Journal of Computer Applications 178 (Number 48), 24-29, 2019
    Citations: 30
  • Robust estimation and outlier detection based on linear regression model
    L Cui, L Cheng, X Jiang, Z Chen, MA Umar
    Journal of Intelligent & Fuzzy Systems, 1-8, 2019
    Citations: 8
  • Comprehensive study of software testing: Categories, levels, techniques, and types
    MA Umar
    International Journal of Advance Research, Ideas and Innovations in …, 2019
    Citations: 84

MOST CITED SCHOLAR PUBLICATIONS

  • Effects of Feature Selection and Normalization on Network Intrusion Detection
    MA Umar, Z Chen, K Shuaib, Y Liu
    Data Science and Management 8 (01), 23-39, 2024
    Citations: 130
  • A Study of Automated Software Testing: Automation Tools and Frameworks
    MA Umar, Z Chen
    International Journal of Computer Science Engineering 8 (06), 217-225, 2019
    Citations: 119
  • Comprehensive study of software testing: Categories, levels, techniques, and types
    MA Umar
    International Journal of Advance Research, Ideas and Innovations in …, 2019
    Citations: 84
  • Network Intrusion Detection Using Wrapper-based Decision Tree for Feature Selection
    MA Umar, Z Chen, Y Liu
    Proceedings of the 2020 International Conference on Internet Computing for …, 2020
    Citations: 43
  • A Comparative Study of Dynamic Software Testing Techniques
    MA Umar, Z Chen
    International Journal of Advanced Networking and Applications 12 (03), 4575-4584, 2021
    Citations: 33
  • Student Academic Performance Prediction using Artificial Neural Networks: A Case Study
    MA Umar
    International Journal of Computer Applications 178 (Number 48), 24-29, 2019
    Citations: 30
  • A Hybrid Intrusion Detection with Decision Tree for Feature Selection
    MA Umar, Z Chen, Y Liu
    Information & Security: An International Journal 49, 2021
    Citations: 18
  • Fighting Crime and Insecurity in Nigeria: An Intelligent Approach
    MA Umar, A Aliyu Machina, M Ibrahim, JA Nasir, A Saheed Salahudeen, ...
    International Journal of Computer Engineering in Research Trends 8 (01), 6-14, 2021
    Citations: 10
  • An explainable artificial intelligence and Internet of Things framework for monitoring and predicting cardiovascular disease
    MA Umar, N AbuAli, K Shuaib, AI Awad
    Engineering Applications of Artificial Intelligence 144, 110138, 2025
    Citations: 9
  • Analysis and Design of Fire Emergency Application (FEAP)
    MA Umar, AS Salahudeen, J Ehi Okoh, M Siddig Mohamed
    International Journal of Computer Science and Mobile Computing 9 (1), 40-51, 2020
    Citations: 8
  • Robust estimation and outlier detection based on linear regression model
    L Cui, L Cheng, X Jiang, Z Chen, MA Umar
    Journal of Intelligent & Fuzzy Systems, 1-8, 2019
    Citations: 8
  • LRCMP: a sequential statistical framework for predicting cancer mortality rate
    MA Umar, A Nawaz, T Qayyum
    2023 7th international multi-topic ICT conference (IMTIC), 1-8, 2023
    Citations: 5
  • Autoencoder-based arrhythmia detection using synthetic ECG generation technique
    A Nawaz, MA Umar, K Shuaib, A Ahmad, AN Belkacem
    2024 46th Annual International Conference of the IEEE Engineering in …, 2024
    Citations: 4
  • A time series regression-based model for predicting the spread of dengue disease
    MD Waseem, A Nawaz, U Rasheed, A Raza, MO Albarka
    2023 International Conference on Robotics and Automation in Industry (ICRAI …, 2023
    Citations: 3
  • Cyber-attack detection in smart grids: A comparative analysis of supervised and semi-supervised methods
    MA Umar, K Shuaib
    2024 6th International Symposium on Advanced Electrical and Communication …, 2024
    Citations: 1
  • Simulating and evaluating the performance of a cloud computing datacenter using queuing model
    MA Umar
    2024 6th International Symposium on Advanced Electrical and Communication …, 2024
    Citations: 1
  • Web Load Balancing Method Based On Resource Request Division
    Z Chen, X Jiang, Y Zhang, MA Umar
    International Journal of Software & Hardware Research in Engineering (IJSHRE …, 2020