Monica Danilevicz

@irta.cat

Researcher at Genomics and Biotechnology Program
Institute of Agrifood Research

RESEARCH, TEACHING, or OTHER INTERESTS

Agricultural and Biological Sciences, Artificial Intelligence, Biochemistry, Genetics and Molecular Biology, Plant Science

27 Scopus Publications

  • Trait Association for Flowering Time in Lentil from Global Multi-Environment Data Using GWAS and Machine Learning
    Shriprabha R. Upadhyaya, Hawlader A. Al-Mamun, Monica F. Danilevicz, Shameela Mohamedikbal, Mohammed Bennamoun, Jacqueline Batley, Kirstin E. Bett, David Edwards
    Plants, 2026
    Flowering time is an important developmental stage in plants, influenced by multiple genes and environmental factors. Understanding its genetic basis and interaction with the environment facilitates the development of improved varieties adapted to different environments. Conventional Genome-Wide Association Studies (GWAS) have been widely used to associate genetic markers with heritable traits, but they do not inherently capture interactions among single nucleotide polymorphisms (SNPs) or between SNPs and the environment. Machine Learning (ML) approaches can model these interactions and improve trait prediction even in the presence of noise and missing data. In this study, multi-environment lentil (Lens culinaris Medik.) data were analysed using GWAS and two widely used ML models, Random Forest and XGBoost, to identify genetic markers associated with flowering time. Model interpretability was enhanced using Explainable AI (XAI) techniques, including SHapley Additive exPlanations. GWAS identified eight significant loci across chromosomes one, two, five and seven, with the most significant SNP located at Chr2_530433205, while ML approaches identified nine markers on chromosomes one, two, three, five and seven, with the most significant SNP at Chr7_523220088. The majority of the identified markers were linked to candidate genes for flowering, while ML also identified potential epistasis. These findings highlight ML as a powerful complementary tool to GWAS for trait association.
  • Application of machine learning and genomics for orphan crop improvement
    Tessa R. MacNish, Monica F. Danilevicz, Philipp E. Bayer, Mitchell S. Bestry, David Edwards
    Nature Communications, 2025
    Orphan crops are important sources of nutrition in developing regions and many are tolerant to biotic and abiotic stressors; however, modern crop improvement technologies have not been widely applied to orphan crops due to the lack of resources available. There are orphan crop representatives across major crop types and the conservation of genes between these related species can be used in crop improvement. Machine learning (ML) has emerged as a promising tool for crop improvement. Transferring knowledge from major crops to orphan crops and using machine learning to improve accuracy and efficiency can be used to improve orphan crops. Here, the authors review transferring knowledge from major crops to orphan crops and using machine learning to improve the accuracy and efficiency of orphan crop breeding.
  • Understanding plant phenotypes in crop breeding through explainable AI
    Monica F. Danilevicz, Shriprabha R. Upadhyaya, Jacqueline Batley, Mohammed Bennamoun, Philipp E. Bayer, David Edwards
    Plant Biotechnology Journal, 2025
    Machine learning use in plant phenotyping has grown exponentially. These algorithms empowered the use of image data to measure plant traits rapidly and to predict the effect of genetic and environmental conditions on plant phenotype. However, the lack of interpretability in machine learning models has limited their usefulness in gaining insights into the underlying biological processes that drive plant phenotypes. Explainable AI (XAI) emerges to help understand the ‘why’ behind machine learning model predictions and allow researchers to investigate the most influential features that lead to prediction, classification or segmentation results. Understanding the mechanisms behind model prediction is also central to sanity‐checking models, increasing model reliability and identifying dataset biases that may limit the model's applicability across different conditions. This review introduces the concept of XAI and presents current algorithms, emphasizing their suitability for different data types or machine learning algorithms. The use of XAI to leverage trait information is highlighted, showcasing how recent studies employed model explanations to recognize the features that impact plant phenotype. Overall, this review presents a framework for using XAI to gain insights into intricate biological processes driving plant phenotypes, underscoring the significance of transparency and interpretability in machine learning.
  • Plant disease epidemiology in the age of artificial intelligence and machine learning
    Ting Xiang Neik, Aria Dolatabadian, Monica F. Danilevicz, Shriprabha R. Upadhyaya, Fangning Zhang, Jacqueline Batley, David Edwards
    Agriculture Communications, 2025
    Crop diseases pose a major threat to global food security, causing substantial yield losses and economic damage each year. Plant disease epidemiology studies the dynamics of plant-pathogen interactions and their impact on disease outcomes, considering environmental influences at a population level. While recent advances in artificial intelligence (AI) and machine learning (ML) have introduced innovative tools for disease prediction and management, most applications have focused on plant disease detection, classification and prediction using imaging technologies and sensor-based data. However, their use in plant disease epidemiology, particularly in understanding host-pathogen interactions and the ecology and evolution of the pathosystems remains limited due to the complexity of multi-scale interactions. In this review, we first propose an updated ‘disease pyramid’ plant disease epidemiology model, incorporating ecological and evolutionary components into the traditional ‘disease triangle’ model. Following this, we discuss current ML applications in plant disease epidemiology, further highlighting both challenges and opportunities. We offer insights into potential input datasets that could significantly enhance the predictability and accuracy of ML models, while also outlining future directions for this rapidly evolving field. The aim of this review is to draw the reader’s attention to the knowledge gap in the application of ML in plant disease epidemiology and showcase the vast potential for expanding the scope of more in-depth and comprehensive research in this field in the future.
  • Exploring genomic feature selection: A comparative analysis of GWAS and machine learning algorithms in a large-scale soybean dataset
    Hawlader A. Al‐Mamun, Monica F. Danilevicz, Jacob I. Marsh, Cedric Gondro, David Edwards
    Plant Genome, 2025
    The surge in high‐throughput technologies has empowered the acquisition of vast genomic datasets, prompting the search for genetic markers and biomarkers relevant to complex traits. However, grappling with the inherent complexities of high dimensionality and sparsity within these datasets poses formidable hurdles. The immense number of features and their potential redundancy demand efficient strategies for extracting pertinent information and identifying significant markers. Feature selection is important in large genomic data as it helps in enhancing interpretability and computational efficiency. This study focuses on addressing these challenges through a comprehensive investigation into genomic feature selection methodologies, employing a rich soybean (Glycine max L. Merr.) dataset comprising 966 lines with over 5.5 million single nucleotide polymorphisms. Emphasizing the “small n large p” dilemma prevalent in contemporary genomic studies, we compared the efficacy of traditional genome‐wide association studies (GWAS) with two prominent machine learning tools, random forest and extreme gradient boosting, in pinpointing predictive features. Utilizing the expansive soybean dataset, we assessed the performance of these methodologies in selecting features that optimize predictive modeling for various phenotypes. By constructing predictive models based on the selected features, we ascertain the comparative prediction accuracies, thereby illuminating the strengths and limitations of these feature selection methodologies in the realm of genomic data analysis.
  • Global genotype by environment prediction competition reveals that diverse modeling strategies can deliver satisfactory maize yield estimates
    Jacob D Washburn, José Ignacio Varela, Alencar Xavier, Qiuyue Chen, David Ertl, Joseph L Gage, James B Holland, Dayane Cristina Lima, Maria Cinta Romay, Marco Lopez-Cruz, Gustavo de los Campos, Wesley Barber, Cristiano Zimmer, Ignacio Trucillo Silva, Fabiani Rocha, Renaud Rincent, Baber Ali, Haixiao Hu, Daniel E Runcie, Kirill Gusev, Andrei Slabodkin, Phillip Bax, Julie Aubert, Hugo Gangloff, Tristan Mary-Huard, Theodore Vanrenterghem, Carles Quesada-Traver, Steven Yates, Daniel Ariza-Suárez, Argeo Ulrich, Michele Wyler, Daniel R Kick, Emily S Bellis, Jason L Causey, Emilio Soriano Chavez, Yixing Wang, Ved Piyush, Gayara D Fernando, Robert K Hu, Rachit Kumar, Annan J Timon, Rasika Venkatesh, Kenia Segura Abá, Huan Chen, Thilanka Ranaweera, Shin-Han Shiu, Peiran Wang, Max J Gordon, B Kirtley Amos, Sebastiano Busato, Daniel Perondi, Abhishek Gogna, Dennis Psaroudakis, Chun-Peng James Chen, Hawlader A Al-Mamun, Monica F Danilevicz, Shriprabha R Upadhyaya, David Edwards, Natalia de Leon
    Genetics, 2025
    Predicting phenotypes from a combination of genetic and environmental factors is a grand challenge of modern biology. Slight improvements in this area have the potential to save lives, improve food and fuel security, permit better care of the planet, and create other positive outcomes. In 2022 and 2023, the first open-to-the-public Genomes to Fields initiative Genotype by Environment prediction competition was held using a large dataset including genomic variation, phenotype and weather measurements, and field management notes gathered by the project over 9 years. The competition attracted registrants from around the world with representation from academic, government, industry, and nonprofit institutions as well as unaffiliated. These participants came from diverse disciplines, including plant science, animal science, breeding, statistics, computational biology, and others. Some participants had no formal genetics or plant-related training, and some were just beginning their graduate education. The teams applied varied methods and strategies, providing a wealth of modeling knowledge based on a common dataset. The winner's strategy involved 2 models combining machine learning and traditional breeding tools: 1 model emphasized environment using features extracted by random forest, ridge regression, and least squares, and 1 focused on genetics. Other high-performing teams’ methods included quantitative genetics, machine learning/deep learning, mechanistic models, and model ensembles. The dataset factors used, such as genetics, weather, and management data, were also diverse, demonstrating that no single model or strategy is far superior to all others within the context of this competition.
  • Correction to: The Global Assessment of Oilseed Brassica Crop Species Yield, Yield Stability and the Underlying Genetics (Plants, (2022), 11, 20, (2740), 10.3390/plants11202740)
    Jaco D. Zandberg, Cassandria T. Fernandez, Monica F. Danilevicz, William J. W. Thomas, David Edwards, Jacqueline Batley
    Plants, 2025
    There was an error in the original publication [...]
  • Image-based crop disease detection using machine learning
    Aria Dolatabadian, Ting Xiang Neik, Monica F. Danilevicz, Shriprabha R. Upadhyaya, Jacqueline Batley, David Edwards
    Plant Pathology, 2025
    Crop disease detection is important due to its significant impact on agricultural productivity and global food security. Traditional disease detection methods often rely on labour‐intensive field surveys and manual inspection, which are time‐consuming and prone to human error. In recent years, the advent of imaging technologies coupled with machine learning (ML) algorithms has offered a promising solution to this problem, enabling rapid and accurate identification of crop diseases. Previous studies have demonstrated the potential of image‐based techniques in detecting various crop diseases, showcasing their ability to capture subtle visual cues indicative of pathogen infection or physiological stress. However, the field is rapidly evolving, with advancements in sensor technology, data analytics and artificial intelligence (AI) algorithms continually expanding the capabilities of these systems. This review paper consolidates the existing literature on image‐based crop disease detection using ML, providing a comprehensive overview of cutting‐edge techniques and methodologies. Synthesizing findings from diverse studies offers insights into the effectiveness of different imaging platforms, contextual data integration and the applicability of ML algorithms across various crop types and environmental conditions. The importance of this review lies in its ability to bridge the gap between research and practice, offering valuable guidance to researchers and agricultural practitioners.
  • Genomics-based plant disease resistance prediction using machine learning
    Shriprabha R. Upadhyaya, Monica F. Danilevicz, Aria Dolatabadian, Ting Xiang Neik, Fangning Zhang, Hawlader A. Al‐Mamun, Mohammed Bennamoun, Jacqueline Batley, David Edwards
    Plant Pathology, 2024
    Plant disease outbreaks continuously challenge food security and sustainability. Traditional chemical methods used to treat diseases have environmental and health concerns, raising the need to enhance inherent plant disease resistance mechanisms. Traits, including disease resistance, can be linked to specific loci in the genome and identifying these markers facilitates targeted breeding approaches. Several methods, including genome‐wide association studies and genomic selection, have been used to identify important markers and select varieties with desirable traits. However, these traditional approaches may not fully capture the non‐linear characteristics of the effect of genomic variation on traits. Machine learning, known for its data‐mining abilities, offers an opportunity to enhance the accuracy of the existing trait association approaches. It has found applications in predicting various agronomic traits across several species. However, its use in disease resistance prediction remains limited. This review highlights the potential of machine learning as a complementary tool for predicting the genetic loci contributing to pathogen resistance. We provide an overview of traditional trait prediction methods, summarize machine‐learning applications, and address the challenges and opportunities associated with machine learning‐based crop disease resistance prediction.
  • Local haplotyping reveals insights into the genetic control of flowering time variation in wild and domesticated soybean
    Shameela Mohamedikbal, Hawlader A. Al‐Mamun, Jacob I. Marsh, Shriprabha Upadhyaya, Monica F. Danilevicz, Henry T. Nguyen, Babu Valliyodan, Adam Mahan, Jacqueline Batley, David Edwards
    Plant Genome, 2024
    The timing of flowering in soybean [Glycine max (L.) Merr.], a key legume crop, is influenced by many factors, including daylight length or photoperiodic sensitivity, that affect crop yield, productivity, and geographical adaptation. Despite its importance, a comprehensive understanding of the local linkage landscape and allelic diversity within regions of the genome influencing flowering and contributing to phenotypic variation in subpopulations has been limited. This study addresses these gaps by conducting an in‐depth trait association and linkage analysis coupled with local haplotyping using advanced bioinformatics tools, including crosshap, to characterize genomic variation using a pangenome dataset representing 915 domesticated and wild‐type individuals. The association analysis identified eight significant loci on seven chromosomes. Moving beyond traditional association analysis, local haplotyping of targeted regions on chromosomes 6 and 20 identified distinct haplotype structures, variation patterns, and genomic candidates influencing flowering in subpopulations. These results suggest the action of a network of genomic candidates influencing flowering time and an untapped reservoir of genomic variation for this trait in wild germplasm. Notably, GlymaLee.20G147200 on chromosome 20 was identified as a candidate gene that may cause delayed flowering in soybean, potentially through histone modifications of floral repressor loci as seen in Arabidopsis thaliana (L.) Heynh. These findings support future functional validation of haplotype‐based alleles for marker‐assisted breeding and genomic selection to enhance latitude adaptability of soybean without compromising yield.
  • Multimodal Deep Learning Integration of Image, Weather, and Phenotypic Data Under Temporal Effects for Early Prediction of Maize Yield
    Danial Shamsuddin, Monica F. Danilevicz, Hawlader A. Al-Mamun, Mohammed Bennamoun, David Edwards
    Remote Sensing, 2024
  • Focus on the Crop Not the Weed: Canola Identification for Precision Weed Management Using Deep Learning
    Michael Mckay, Monica F. Danilevicz, Michael B. Ashworth, Roberto Lujan Rocha, Shriprabha R. Upadhyaya, Mohammed Bennamoun, David Edwards
    Remote Sensing, 2024
  • Segmentation of Sandplain Lupin Weeds from Morphologically Similar Narrow-Leafed Lupins in the Field
    Monica F. Danilevicz, Roberto Lujan Rocha, Jacqueline Batley, Philipp E. Bayer, Mohammed Bennamoun, David Edwards, Michael B. Ashworth
    Remote Sensing, 2023
  • DNABERT-based explainable lncRNA identification in plant genome assemblies
    Monica F. Danilevicz, Mitchell Gill, Cassandria G. Tay Fernandez, Jakob Petereit, Shriprabha R. Upadhyaya, Jacqueline Batley, Mohammed Bennamoun, David Edwards, Philipp E. Bayer
    Computational and Structural Biotechnology Journal, 2023
  • The Global Assessment of Oilseed Brassica Crop Species Yield, Yield Stability and the Underlying Genetics
    Jaco D. Zandberg, Cassandria T. Fernandez, Monica F. Danilevicz, William J. W. Thomas, David Edwards, Jacqueline Batley
    Plants, 2022
  • Plant Genotype to Phenotype Prediction Using Machine Learning
    Monica F. Danilevicz, Mitchell Gill, Robyn Anderson, Jacqueline Batley, Mohammed Bennamoun, Philipp E. Bayer, David Edwards
    Frontiers in Genetics, 2022
  • Genetic and Genomic Resources for Soybean Breeding Research
    Jakob Petereit, Jacob I. Marsh, Philipp E. Bayer, Monica F. Danilevicz, William J. W. Thomas, Jacqueline Batley, David Edwards
    Plants, 2022
  • Pangenomes as a Resource to Accelerate Breeding of Under-Utilised Crop Species
    Cassandria Geraldine Tay Fernandez, Benjamin John Nestor, Monica Furaste Danilevicz, Mitchell Gill, Jakob Petereit, Philipp Emanuel Bayer, Patrick Michael Finnegan, Jacqueline Batley, David Edwards
    International Journal of Molecular Sciences, 2022
  • Expanding Gene-Editing Potential in Crop Improvement with Pangenomes
    Cassandria G. Tay Fernandez, Benjamin J. Nestor, Monica F. Danilevicz, Jacob I. Marsh, Jakob Petereit, Philipp E. Bayer, Jacqueline Batley, David Edwards
    International Journal of Molecular Sciences, 2022
  • Machine Learning for Image Analysis: Leaf Disease Segmentation
    Monica F. Danilevicz, Philipp Emanuel Bayer
    Methods in Molecular Biology, 2022
  • Producing High-Quality Single Nucleotide Polymorphism Data for Genome-Wide Association Studies
    Philipp E. Bayer, Mitchell Gill, Monica F. Danilevicz, David Edwards
    Methods in Molecular Biology, 2022
  • The application of pangenomics and machine learning in genomic selection in plants
    Philipp E. Bayer, Jakob Petereit, Monica Furaste Danilevicz, Robyn Anderson, Jacqueline Batley, David Edwards
    Plant Genome, 2021
  • Resources for image-based high-throughput phenotyping in crops and data sharing challenges
    Monica F. Danilevicz, Philipp E. Bayer, Benjamin J. Nestor, Mohammed Bennamoun, David Edwards
    Plant Physiology, 2021
  • Maize yield prediction at an early developmental stage using multispectral images and genotype data for preliminary hybrid selection
    Monica F. Danilevicz, Philipp E. Bayer, Farid Boussaid, Mohammed Bennamoun, David Edwards
    Remote Sensing, 2021
  • High-Throughput Genotyping Technologies in Plant Taxonomy
    Monica F. Danilevicz, Cassandria G. Tay Fernandez, Jacob I. Marsh, Philipp E. Bayer, David Edwards
    Methods in Molecular Biology, 2021
  • Plant pangenomics: approaches, applications and advancements
    Monica Furaste Danilevicz, Cassandria Geraldine Tay Fernandez, Jacob Ian Marsh, Philipp Emanuel Bayer, David Edwards
    Current Opinion in Plant Biology, 2020
  • Copaifera langsdorffii novel putative long non-coding RNAs: Interspecies conservation analysis in adaptive response to different biomes
    Monica F. Danilevicz, Kanhu C. Moharana, Thiago M. Venancio, Luciana O. Franco, Sérgio R. S. Cardoso, Mônica Cardoso, Flávia Thiebaut, Adriana S. Hemerly, Francisco Prosdocimi, Paulo C. G. Ferreira
    Non-Coding RNA, 2018