NGS, Bioinformatics, Next Generation Sequencing, Data Analysis
Scopus Publications: 42
Scholar Citations: 1248
Scholar h-index: 20
Scholar i10-index: 29
Scopus Publications
Exploring the Impact of DNA Methylation on Gene Expression in CRC: A Computational Approach for Identifying Epigenetically Regulated Genes in Multi-Omic Datasets Andrei Stefan Blindu, Silvia Berardelli, Federica De Paoli, Federico Manai, Rossella Tricarico, Susanna Zucca, Paolo Magni Cancers, 2026 Background/Objectives: DNA methylation is a key epigenetic process that regulates gene expression and is often disrupted in colorectal cancer (CRC). Aberrant methylation of promoter CpG islands can silence tumor suppressor genes and drive tumorigenesis. A subset of CRCs exhibits the CpG Island Methylator Phenotype (CIMP), characterized by widespread hypermethylation and distinct clinical outcomes. Identifying genes whose expression is epigenetically regulated by methylation is important for prioritizing candidate biomarkers and therapeutic targets in CRC. Methods: We developed and compared a series of computational approaches to identify genes whose expression is regulated by DNA methylation in The Cancer Genome Atlas (TCGA) cohort of Colon Adenocarcinoma (COAD) patients. Samples were stratified according to their CpG Island Methylator Phenotype (CIMP) level to capture distinct epigenetic subgroups. The proposed framework integrates methylation and transcriptomic data to systematically detect methylation–expression associations indicative of epigenetic regulation. Results: The best-performing method identified gene sets strongly associated with promoter methylation–expression relationships and enriched for pathways relevant to colorectal cancer progression and patient stratification. To evaluate the robustness and transferability of the approach, it was further validated on independent datasets, including Stomach Adenocarcinoma (STAD), Glioblastoma Multiforme (GBM), and Mesothelioma (MESO), supporting its robustness and potential generalizability across multiple tumor types. 
Conclusions: Our study highlights the potential of computational pipelines to uncover epigenetically regulated genes in colorectal cancer. The identified candidate genes provide a hypothesis-generating foundation for refining molecular stratification and guiding future studies aimed at epigenetic biomarker discovery and therapeutic hypothesis development.
Digenic variant interpretation with hypothesis-driven explainable AI Federica De Paoli, Giovanna Nicora, Silvia Berardelli, Andrea Gazzo, Riccardo Bellazzi, Paolo Magni, Ettore Rizzo, Ivan Limongelli, Susanna Zucca NAR Genomics and Bioinformatics, 2025 The digenic inheritance hypothesis holds the potential to enhance diagnostic yield in rare diseases. Computational approaches capable of accurately interpreting and prioritizing digenic combinations of variants based on the proband’s phenotypes and family information can provide valuable assistance during the diagnostic process. We developed diVas, a hypothesis-driven machine learning approach that interprets genomic variants across different gene pairs. DiVas demonstrates strong performance in both classifying and prioritizing causative digenic combinations of rare variants within the top positions across 11 cases with the complete list of variants available (73% sensitivity and a median ranking of 3). Furthermore, it achieves a sensitivity of 0.81 when applied to 645 published causative digenic combinations. Additionally, diVas leverages explainable artificial intelligence to elucidate the digenic disease mechanism for predicted positive pairs.
An AI-based approach driven by genotypes and phenotypes to uplift the diagnostic yield of genetic diseases S. Zucca, G. Nicora, F. De Paoli, M. G. Carta, R. Bellazzi, P. Magni, E. Rizzo, I. Limongelli Human Genetics, 2025 Identifying disease-causing variants in the genomes of rare disease patients is a challenging problem. To accomplish this task, we describe a machine learning framework, which we call “Suggested Diagnosis”, that aims to prioritize genetic variants in an exome/genome based on the probability of being disease-causing. To do so, our method leverages standard guidelines for germline variant interpretation as defined by the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP), inheritance information, phenotypic similarity, and variant quality. Starting from (1) the VCF file containing the proband’s variants, (2) the list of the proband’s phenotypes encoded in Human Phenotype Ontology terms, and optionally (3) the information about family members (if available), the “Suggested Diagnosis” ranks all the variants according to their machine learning prediction. This method significantly reduces the number of variants that need to be evaluated by geneticists by pinpointing causative variants in the very first positions of the prioritized list. Most importantly, our approach proved to be among the top performers within the CAGI6 Rare Genome Project Challenge, where it was able to rank the true causative variant among the first positions and, uniquely among all the challenge participants, increased the diagnostic yield by 12.5% by solving 2 undiagnosed cases.
Critical assessment of variant prioritization methods for rare disease diagnosis within the rare genomes project Sarah L. Stenton, Melanie C. O’Leary, Gabrielle Lemire, Grace E. VanNoy, Stephanie DiTroia, Vijay S. Ganesh, Emily Groopman, Emily O’Heir, Brian Mangilog, Ikeoluwa Osei-Owusu, Lynn S. Pais, Jillian Serrano, Moriel Singer-Berk, Ben Weisburd, Michael W. Wilson, Christina Austin-Tse, Marwa Abdelhakim, Azza Althagafi, Giulia Babbi, Riccardo Bellazzi, Samuele Bovo, Maria Giulia Carta, Rita Casadio, Pieter-Jan Coenen, Federica De Paoli, Matteo Floris, Manavalan Gajapathy, Robert Hoehndorf, Julius O. B. Jacobsen, Thomas Joseph, Akash Kamandula, Panagiotis Katsonis, Cyrielle Kint, Olivier Lichtarge, Ivan Limongelli, Yulan Lu, Paolo Magni, Tarun Karthik Kumar Mamidi, Pier Luigi Martelli, Marta Mulargia, Giovanna Nicora, Keith Nykamp, Vikas Pejaver, Yisu Peng, Thi Hong Cam Pham, Maurizio S. Podda, Aditya Rao, Ettore Rizzo, Vangala G. Saipradeep, Castrense Savojardo, Peter Schols, Yang Shen, Naveen Sivadasan, Damian Smedley, Dorian Soru, Rajgopal Srinivasan, Yuanfei Sun, Uma Sunderam, Wuwei Tan, Naina Tiwari, Xiao Wang, Yaqiong Wang, Amanda Williams, Elizabeth A. Worthey, Rujie Yin, Yuning You, Daniel Zeiberg, Susanna Zucca, Constantina Bakolitsa, Steven E. Brenner, Stephanie M. Fullerton, Predrag Radivojac, Heidi L. Rehm, Anne O’Donnell-Luria Human Genomics, 2024 Background: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years and causal variants are identified in under 50%, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Which tools are most effective remains unclear.
To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. Methods: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams to identify causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. Results: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants. Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. Conclusions: Model methodology and performance were highly variable.
Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria is needed.
RNA expression profiling in lymphoblastoid cell lines from mutated and non-mutated amyotrophic lateral sclerosis patients Jessica Garau, Maria Garofalo, Francesca Dragoni, Eveljn Scarian, Rosalinda Di Gerlando, Luca Diamanti, Susanna Zucca, Matteo Bordoni, Orietta Pansarasa, Stella Gagliardi Journal of Gene Medicine, 2024 Background: Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disease characterized by the death of upper and lower motor neurons with an unknown etiology. The difficulty of recovering biological material from patients led to the use of lymphoblastoid cell lines (LCLs) as a model for ALS because many pathways, typically located in neurons, are also activated in these cells. Methods: To investigate the expression of coding and long non‐coding RNAs in LCLs, transcriptomic profiling of sporadic ALS (SALS) and mutated patients (FUS, TARDBP, C9ORF72 and SOD1) and matched controls was performed. Thus, differentially expressed genes (DEGs) were investigated among the different subgroups of patients. Peripheral blood mononuclear cells (PBMCs) were isolated and immortalized into LCLs via Epstein–Barr virus infection; RNA was extracted, and RNA‐sequencing analysis was performed. Results: Gene expression profiles of LCLs were genetic‐background‐specific; indeed, only 12 genes were commonly deregulated in all groups. Nonetheless, pathways enriched by DEGs in each group were also compared, and a total of 89 Kyoto Encyclopedia of Genes and Genomes (KEGG) terms were shared among all patients. Finally, the similarity of affected pathways was also assessed when our data were matched with a transcriptomic profile obtained from the PBMCs of the same patients. Conclusions: We conclude that LCLs are a good model for the study of RNA deregulation in ALS.
VarChat: the generative AI assistant for the interpretation of human genomic variations Federica De Paoli, Silvia Berardelli, Ivan Limongelli, Ettore Rizzo, Susanna Zucca Bioinformatics, 2024 Motivation: In the modern era of genomic research, the scientific community is witnessing an explosive growth in the volume of published findings. While this abundance of data offers invaluable insights, it also places a pressing responsibility on genetic professionals and researchers to stay informed about the latest findings and their clinical significance. Genomic variant interpretation is currently facing a challenge in identifying the most up-to-date and relevant scientific papers, while also extracting meaningful information to accelerate the process from clinical assessment to reporting. Computer-aided literature search and summarization can play a pivotal role in this context. By synthesizing complex genomic findings into concise, interpretable summaries, this approach facilitates the translation of extensive genomic datasets into clinically relevant insights. Results: To bridge this gap, we present VarChat (varchat.engenome.com), an innovative tool based on generative AI, developed to find and summarize the fragmented scientific literature associated with genomic variants into brief yet informative texts. VarChat provides users with a concise description of specific genetic variants, detailing their impact on related proteins and possible effects on human health. In addition, VarChat offers direct links to related trusted scientific sources, and encourages deeper research. Availability and implementation: varchat.engenome.com.
Cardiovascular Disease Burden, Mortality, and Sudden Death Risk in Epilepsy: A UK Biobank Study Ravi A. Shah, C. Anwar A. Chahal, Shaheryar Ranjha, Ghaith Sharaf Dabbagh, Babken Asatryan, Ivan Limongelli, Mohammed Khanji, Fabrizio Ricci, Federica De Paoli, Susanna Zucca, Martin Tristani-Firouzi, Erik K. St. Louis, Elson L. So, Virend K. Somers Canadian Journal of Cardiology, 2024
MINCR: A long non-coding RNA shared between cancer and neurodegeneration Cecilia Pandini, Maria Garofalo, Federica Rey, Jessica Garau, Susanna Zucca, Daisy Sproviero, Matteo Bordoni, Giulia Berzero, Annalisa Davin, Tino Emanuele Poloni, Orietta Pansarasa, Stephana Carelli, Stella Gagliardi, Cristina Cereda Genomics, 2021
COVID-19 patients and Dementia: Frontal cortex transcriptomic data Maria Garofalo, Stella Gagliardi, Susanna Zucca, Cecilia Pandini, Francesca Dragoni, Daisy Sproviero, Orietta Pansarasa, Tino Emanuele Poloni, Valentina Medici, Annalisa Davin, Silvia Damiana Visonà, Matteo Moretti, Antonio Guaita, Mauro Ceroni, Livio Tronconi, Cristina Cereda Data in Brief, 2021
Different miRNA profiles in plasma derived small and large extracellular vesicles from patients with neurodegenerative diseases Daisy Sproviero, Stella Gagliardi, Susanna Zucca, Maddalena Arigoni, Marta Giannini, Maria Garofalo, Martina Olivero, Michela Dell’Orco, Orietta Pansarasa, Stefano Bernuzzi, Micol Avenali, Matteo Cotta Ramusino, Luca Diamanti, Brigida Minafra, Giulia Perini, Roberta Zangaglia, Alfredo Costa, Mauro Ceroni, Nora I. Perrone-Bizzozero, Raffaele A. Calogero, Cristina Cereda International Journal of Molecular Sciences, 2021
Predictable design in biological engineering: Debugging of synthetic circuits by in vivo and in silico approaches Synthetic Biology: Engineering, Evolution and Design Conference 2015 (SEED 2015), 2015
Exploring the Impact of DNA Methylation on Gene Expression in CRC: A Computational Approach for Identifying Epigenetically Regulated Genes in Multi-Omic Datasets AS Blindu, S Berardelli, F De Paoli, F Manai, R Tricarico, S Zucca, ... Cancers 18 (2), 211, 2026. Citations: 1
P229: The eVai suggested diagnosis and VarChat: The enGenome AI ecosystem for variant interpretation S Zucca, F De Paoli, S Berardelli, E Rizzo Genetics in Medicine Open 4, 103723, 2026
A bioinformatics approach to evaluating familial relationships through genetic similarity at selected SNP sites G Cerchia, S Zucca, I Limongelli, E Rizzo European Journal of Human Genetics 33, 1014-1014, 2025
Evaluation of SNV and CNV calling on the clinically relevant high-homology gene PMS2 V Andrioletti, M Sauer, T Risch, A Benet-Pages, B Klink, E Holinski-Feder, ... European Journal of Human Genetics 33, 1019-1019, 2025
AI Models Predicting Methylation Status from DNA Sequence: What is Missing? AS Blindu, S Berardelli, F De Paoli, R Tricarico, S Zucca, P Magni International Conference on Artificial Intelligence in Medicine, 40-45, 2025
Generative AI Meets Genomics: VarChat, a RAG-Based Approach for Literature-Driven Variant Summarization F De Paoli, S Berardelli, A Tudisco, A Blindu, E Parimbelli, S Zucca International Conference on Artificial Intelligence in Medicine, 127-131, 2025. Citations: 4
Digenic variant interpretation with hypothesis-driven explainable AI F De Paoli, G Nicora, S Berardelli, A Gazzo, R Bellazzi, P Magni, E Rizzo, ... NAR Genomics and Bioinformatics 7 (2), lqaf029, 2025. Citations: 5
Cross-tissue MiRNA profiling of extracellular vesicles and PBMCs from amyotrophic lateral sclerosis patients F Dragoni, RD Gerlando, L Diamanti, B Rizzo, M Bordoni, E Scarian, ... Scientific Reports 15 (1), 14976, 2025. Citations: 5
An AI-based approach driven by genotypes and phenotypes to uplift the diagnostic yield of genetic diseases S Zucca, G Nicora, F De Paoli, MG Carta, R Bellazzi, P Magni, E Rizzo, ... Human Genetics 144 (2), 159-171, 2025. Citations: 19
Chromoanagenesis of chromosome 22 in a subject with obesity and borderline cognitive performance F Baldan, E Demori, C Gnan, N Passon, G Damante, C Mio, L Allegri, ... Gene 933, 148956, 2025. Citations: 1
Phenotypes extraction from clinical descriptions using Large Language Models S Berardelli, A Gazzo, F De Paoli, G Briere, B Loire, I Limongelli, E Rizzo, ... Proceedings of ESHG 2025, 2025
Validation of Twist CNV backbone panels at different probe densities for large pathological CNV detection I Limongelli, V Andrioletti, T Han, E Rizzo, A Lee, A Davassi, T Tannous, ... European Journal of Human Genetics 32, 1643-1643, 2024
Evaluation of structural variants calling performances using short and long reads sequencing V Andrioletti, F De Paoli, I Limongelli, S Zucca, E Rizzo European Journal of Human Genetics 32, 1630-1630, 2024
Extracting phenotypes from clinical descriptions using large language models: a comparison between automated and manual approach S Berardelli, A Gazzo, F De Paoli, I Limongelli, E Rizzo, P Magni, S Zucca European Journal of Human Genetics 32, 1630-1631, 2024
A systematic investigation of the role of the oligogenic/digenic inheritance in Amyotrophic lateral sclerosis with machine learning tools on WGS data L Corrado, F Caushi, A Bottrighi, N Pomella, F De Marchi, S Zucca, ... European Journal of Human Genetics 32, 1532-1532, 2024
In-depth variant interpretation: AI-powered tools for advancing genomic understanding F De Paoli, S Berardelli, G Nicora, E Rizzo, I Limongelli, S Zucca European Journal of Human Genetics 32, 1631-1631, 2024
A novel VEP plugin to annotate Short Tandem Repeats with HGVS nomenclature G Cerchia, F De Paoli, V Andrioletti, S Zucca, E Rizzo, I Limongelli European Journal of Human Genetics 32, 1631-1631, 2024
RNA expression profiling in lymphoblastoid cell lines from mutated and non‐mutated amyotrophic lateral sclerosis patients J Garau, M Garofalo, F Dragoni, E Scarian, R Di Gerlando, L Diamanti, ... The Journal of Gene Medicine 26 (7), e3711, 2024. Citations: 1
Predictive method for determining the pathogenicity of combinations of digenic or oligogenic variants I Limongelli, S Zucca, F De Paoli, E Rizzo, P Magni, F Baccalini US Patent App. 18/550,662, 2024. Citations: 1
Critical assessment of variant prioritization methods for rare disease diagnosis within the rare genomes project SL Stenton, MC O’Leary, G Lemire, GE VanNoy, S DiTroia, VS Ganesh, ... Human Genomics 18 (1), 44, 2024. Citations: 27
MOST CITED SCHOLAR PUBLICATIONS
Long non-coding and coding RNAs characterization in peripheral blood mononuclear cells and spinal cord from amyotrophic lateral sclerosis patients S Gagliardi, S Zucca, C Pandini, L Diamanti, M Bordoni, D Sproviero, ... Scientific reports 8 (1), 2378, 2018. Citations: 105
A machine learning approach based on ACMG/AMP guidelines for genomic variant classification and prioritization G Nicora, S Zucca, I Limongelli, R Bellazzi, P Magni Scientific reports 12 (1), 2517, 2022. Citations: 91
Fermentation of lactose to ethanol in cheese whey permeate and concentrated permeate by engineered Escherichia coli L Pasotti, S Zucca, M Casanova, G Micoli, MG Cusella De Angelis, ... BMC biotechnology 17 (1), 48, 2017. Citations: 86
Different miRNA profiles in plasma derived small and large extracellular vesicles from patients with neurodegenerative diseases D Sproviero, S Gagliardi, S Zucca, M Arigoni, M Giannini, M Garofalo, ... International Journal of Molecular Sciences 22 (5), 2737, 2021. Citations: 83
Half-life measurements of chemical inducers for recombinant gene expression N Politi, L Pasotti, S Zucca, M Casanova, G Micoli, MG Cusella De Angelis, ... Journal of biological engineering 8 (1), 5, 2014. Citations: 73
Alzheimer’s, Parkinson’s disease and amyotrophic lateral sclerosis gene expression patterns divergence reveals different grade of RNA metabolism involvement M Garofalo, C Pandini, M Bordoni, O Pansarasa, F Rey, A Costa, ... International Journal of Molecular Sciences 21 (24), 9500, 2020. Citations: 64
Molecular genetics and interferon signature in the Italian Aicardi Goutières syndrome cohort: report of 12 new cases and literature review J Garau, V Cavallera, M Valente, D Tonduti, D Sproviero, S Zucca, ... Journal of clinical medicine 8 (5), 750, 2019. Citations: 50
Bottom-up engineering of biological systems through standard bricks: a modularity study on basic parts and devices L Pasotti, N Politi, S Zucca, MG Cusella De Angelis, P Magni PloS one 7 (7), e39407, 2012. Citations: 50
Characterization of a synthetic bacterial self-destruction device for programmed cell death and for recombinant proteins release L Pasotti, S Zucca, M Lupotto, MG Cusella De Angelis, P Magni Journal of biological engineering 5 (1), 8, 2011. Citations: 49
Advances and computational tools towards predictable design in biological engineering L Pasotti, S Zucca Computational and mathematical methods in medicine 2014 (1), 369681, 2014. Citations: 48
A standard vector for the chromosomal integration and characterization of BioBrick™ parts in Escherichia coli S Zucca, L Pasotti, N Politi, MG Cusella De Angelis, P Magni Journal of biological engineering 7 (1), 12, 2013. Citations: 47
RNA-Seq profiling in peripheral blood mononuclear cells of amyotrophic lateral sclerosis patients and controls S Zucca, S Gagliardi, C Pandini, L Diamanti, M Bordoni, D Sproviero, ... Scientific Data 6 (1), 1-8, 2019. Citations: 45
VarChat: the generative AI assistant for the interpretation of human genomic variations F De Paoli, S Berardelli, I Limongelli, E Rizzo, S Zucca Bioinformatics 40 (4), btae183, 2024. Citations: 40
Leukocyte derived microvesicles as disease progression biomarkers in slow progressing amyotrophic lateral sclerosis patients D Sproviero, S La Salvia, F Colombo, S Zucca, O Pansarasa, L Diamanti, ... Frontiers in Neuroscience 13, 344, 2019. Citations: 39
Curcumin and novel synthetic analogs in cell-based studies of Alzheimer’s disease S Gagliardi, V Franco, S Sorrentino, S Zucca, C Pandini, P Rota, ... Frontiers in Pharmacology 9, 1404, 2018. Citations: 35
Characterization of an inducible promoter in different DNA copy number conditions S Zucca, L Pasotti, G Mazzini, MG Cusella De Angelis, P Magni BMC bioinformatics 13 (Suppl 4), S11, 2012. Citations: 35
Extracellular vesicles derived from plasma of patients with neurodegenerative disease have common transcriptomic profiling D Sproviero, S Gagliardi, S Zucca, M Arigoni, M Giannini, M Garofalo, ... Frontiers in aging neuroscience 14, 785741, 2022. Citations: 34
Multi-Faceted Characterization of a Novel LuxR-Repressible Promoter Library for Escherichia coli S Zucca, L Pasotti, N Politi, M Casanova, G Mazzini, ... PLoS One 10 (5), e0126264, 2015. Citations: 28
Critical assessment of variant prioritization methods for rare disease diagnosis within the rare genomes project SL Stenton, MC O’Leary, G Lemire, GE VanNoy, S DiTroia, VS Ganesh, ... Human Genomics 18 (1), 44, 2024. Citations: 27
A synthetic close-loop controller circuit for the regulation of an extracellular molecule by engineered bacteria L Pasotti, M Bellato, N Politi, M Casanova, S Zucca, MGC De Angelis, ... IEEE transactions on biomedical circuits and systems 13 (1), 248-258, 2018. Citations: 20