Pooja Rani

Scopus Publications

Multi language models for on-the-fly syntax highlighting
Marco Edoardo Palma, Pooja Rani, Harald C. Gall
Journal of Systems and Software, 2026
Key-augmented neural triggers for knowledge sharing
Alex Wolf, Marco Edoardo Palma, Pooja Rani, Harald C. Gall
Journal of Systems and Software, 2026
Repository-level code comprehension and knowledge sharing remain core challenges in software engineering. Large language models (LLMs) have shown promise by generating explanations of program structure and logic. Retrieval-Augmented Generation (RAG), the state-of-the-art (SOTA), improves relevance by injecting context at inference time. However, these approaches still face limitations. First, semantic fragmentation across structural boundaries impairs comprehension, as relevant knowledge is distributed across multiple files within a repository. Second, retrieval inefficiency and attention saturation degrade performance in RAG workflows, where long, weakly aligned contexts overwhelm model attention. Third, repository-specific training data is scarce and often outdated, incomplete, or misaligned. Finally, proprietary LLMs hinder industrial adoption due to privacy and deployment constraints. To address these issues, we propose Key-Augmented Neural Triggers (KANT), a novel approach that embeds knowledge anchors, symbolic cues linking code regions to semantic roles, into both training and inference. Unlike prior methods, KANT enables internal access to repository-specific knowledge, reducing fragmentation and grounding inference in localized, semantically structured memory. Moreover, we synthesize specialized instruction tuning data directly from code, eliminating reliance on noisy or outdated documentation and comments. At inference, knowledge anchors replace verbose context, reducing token overhead and latency while supporting efficient, on-premise deployment. We evaluate KANT via a qualitative human evaluation of the synthesized dataset’s intent coverage and quality across five dimensions; a comparison against SOTA baselines across five qualitative dimensions and inference speed; and a replication across different LLMs to assess generalizability.
Results show that the synthetic training data aligned with information-seeking needs: over 90% of questions and answers were rated relevant and understandable; 77.34%, 69.53%, and 64.58% of answers were considered useful, accurate, and complete, respectively. KANT achieved over 60% preference from human annotators and a LocalStack expert over the baselines (e.g., 21% for RAG), and notably the expert preferred KANT in over 79% of cases. Also, KANT reduced inference latency by up to 85% across all models. Overall, KANT demonstrated its effectiveness across all evaluated areas, implying that it is well-suited for scalable, low-latency, on-premise deployments, providing a strong foundation for repository-level code comprehension.
The price of precision: the cost of preprocessing for automated code revision in code review
Shirin Pirouzkhah, Pooja Rani, Francesco Sovrano, Vincent Hellendoorn, Alberto Bacchelli
Empirical Software Engineering, 2026
Code review is a widespread practice in software engineering during which developers examine each other’s source code changes to identify potential issues and improve code quality. Among the automated techniques proposed by researchers to reduce the manual workload of code review, Automated Code Revision (ACR) aims to automatically address reviewers’ feedback by producing a revised version of the code. Transformer-based language models have demonstrated state-of-the-art results in ACR. The performance of these models, however, is significantly influenced by the quality and preparation of the training and evaluation data. We present several systematic analyses of prevalent preprocessing steps, examined both cumulatively and in isolation, across three established preprocessing pipelines and two dataset splitting strategies (time-level vs. project-level). Our study spans models of different scales: OpenNMT (small), T5 and CodeReviewer (mid-sized), LoRA-tuned CodeLLaMA-7B (large), and GPT-3.5-Turbo (large, black-box). Using datasets of up to 496k training records, we evaluate and statistically compare models’ performance using exact match ratio (EXM), CodeBLEU, and Levenshtein ratio. Our findings show that preprocessing may be a significant component in the success of the different techniques: OpenNMT relies on heavy preprocessing; T5 benefits from light filtering (selective removal of records); CodeReviewer performs best when trained on larger, less aggressively filtered data; CodeLLaMA-7B and GPT-3.5-Turbo are largely indifferent to preprocessing. Overall, the effectiveness of ACR tools depends on aligning preprocessing with model scale and training setup. In general, small models need abstraction, mid-sized ones benefit from light filtering, and large-scale models perform best when trained on the original, unprocessed form of the code.
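Two of the metrics named in this abstract, exact match ratio and Levenshtein ratio, can be illustrated with a minimal sketch. The abstract does not define them precisely, so the definitions here are assumptions: exact match compares whitespace-stripped strings, and the Levenshtein ratio is taken as one minus the edit distance divided by the longer string's length. All function names are hypothetical and are not taken from the paper's replication package.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insert, delete, substitute), computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def levenshtein_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; this normalization is an assumption, not the paper's."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def exact_match_ratio(predictions, references) -> float:
    """Share of predicted revisions identical to the reference revision."""
    matches = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return matches / len(references)


# Toy revised-code predictions against their references.
predictions = ["return a + b", "x = 1"]
references = ["return a + b", "x = 2"]

print(exact_match_ratio(predictions, references))        # 0.5
print(round(levenshtein_ratio("x = 1", "x = 2"), 2))     # 0.8
```

Exact match rewards only a perfect revision, while the Levenshtein ratio gives partial credit for near misses, which is why the two are typically reported together.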
A Roadmap for Simulation-Based Testing of Autonomous Cyber-Physical Systems: Challenges and Future Direction
Christian Birchler, Sajad Khatiri, Pooja Rani, Timo Kehrer, Sebastiano Panichella
ACM Transactions on Software Engineering and Methodology, 2025
As the era of autonomous cyber-physical systems (ACPSs), such as unmanned aerial vehicles and self-driving cars, unfolds, the demand for robust testing methodologies is key to realizing the adoption of such systems in real-world scenarios. However, traditional software testing paradigms face unprecedented challenges in ensuring the safety and reliability of these systems. In response, this article pioneers a strategic roadmap for simulation-based system-level testing of ACPSs, specifically focusing on autonomous systems. Our article discusses the relevant challenges and obstacles of ACPSs, focusing on test automation and quality assurance, hence advocating for tailored solutions to address the unique demands of autonomous systems. While providing concrete definitions of test cases within simulation environments, we also accentuate the need to create new benchmark assets and the development of automated tools tailored explicitly for autonomous systems in the software engineering community. This article not only highlights the relevant, pressing issues the software engineering community should focus on (in terms of practices, expected automation, and paradigms), but it also outlines ways to tackle them. By outlining the various domains and challenges of simulation-based testing/development for ACPSs, we provide directions for future research efforts.
The NLBSE'25 Tool Competition
Ali Al-Kaswan, Giuseppe Colavito, Nataliia Stulova, Pooja Rani
Proceedings 2025 IEEE ACM International Workshop on Natural Language Based Software Engineering NLBSE 2025, 2025
We report on the organization and results of the tool competition of the fourth International Workshop on Natural Language-based Software Engineering (NLBSE'25). As in prior editions, we organized the competition on automated code comment classification, with a larger dataset. In this tool competition edition, six teams submitted multiple classification models to automatically classify code comments. The submitted models were fine-tuned and evaluated on a benchmark dataset of 14,875 code comments. This paper reports details of the competition, including the rules, the teams and contestant models, and the ranking of models based on their average classification performance across issue report and code comment types.
Code Review Comprehension: Reviewing Strategies Seen Through Code Comprehension Theories
Pavlína Wurzel Gonçalves, Pooja Rani, Margaret-Anne Storey, Diomidis Spinellis, Alberto Bacchelli
IEEE International Conference on Program Comprehension, 2025
Despite the popularity and importance of modern code review, the understanding of the cognitive processes that enable reviewers to analyze code and provide meaningful feedback is lacking. To address this gap, we observed and interviewed ten experienced reviewers while they performed 25 code reviews from their review queue. Since comprehending code changes is essential to perform code review and the primary challenge for reviewers, we focused our analysis on this cognitive process. Using Letovsky's model of code comprehension, we performed a theory-driven thematic analysis to investigate how reviewers apply code comprehension to navigate changes and provide feedback. Our findings confirm that code comprehension is fundamental to code review. We extend Letovsky's model to propose the Code Review Comprehension Model and demonstrate that code review, like code comprehension, relies on opportunistic strategies. These strategies typically begin with a context-building phase, followed by code inspection involving code reading, testing, and discussion management. To interpret and evaluate the proposed change, reviewers construct a mental model of the change as an extension of their understanding of the overall software system and contrast mental representations of expected and ideal solutions against the actual implementation. Based on our findings, we discuss how review tools and practices can better support reviewers in employing their strategies and in forming understanding. Data and material: https://doi.org/10.5281/zenodo.14748996
On Refining the SZZ Algorithm with Bug Discussion Data
Pooja Rani, Fernando Petrulio, Alberto Bacchelli
Empirical Software Engineering, 2024
Context: Researchers testing hypotheses related to factors leading to low-quality software often rely on historical data, specifically on details regarding when defects were introduced into a codebase of interest. The prevailing techniques to determine the introduction of defects revolve around variants of the SZZ algorithm. This algorithm leverages information on the lines modified during a bug-fixing commit and finds when these lines were last modified, thereby identifying bug-introducing commits. Objectives: Despite several improvements and variants, SZZ struggles with accuracy, especially in cases of unrelated modifications or modifications that touch files not involved in the introduction of the bug in the version control system (aka tangled commits and ghost commits). Methods: Our research investigates whether and how incorporating content retrieved from bug discussions can address these issues by identifying the related and external files and thus improve the efficacy of the SZZ algorithm. Results: To conduct our investigation, we take advantage of the links manually inserted by Mozilla developers in bug reports to signal which commits inserted bugs. Thus, we prepared the dataset, RoTEB, comprising 12,472 bug reports. We first manually inspect a sample of 369 bug reports related to these bug-fixing or bug-introducing commits and investigate whether the files mentioned in these reports could be useful for SZZ. After we found evidence that the mentioned files are relevant, we augment SZZ with this information, using different strategies, and evaluate the resulting approach against multiple SZZ variations. Conclusion: We define a taxonomy outlining the rationale behind developers’ references to diverse files in their discussions. We observe that bug discussions often mention files relevant to enhancing the SZZ algorithm’s efficacy. Then, we verify that integrating these file references augments the precision of SZZ in pinpointing bug-introducing commits. Yet, it does not markedly influence recall. These results deepen our comprehension of the usefulness of bug discussions for SZZ. Future work can leverage our dataset and explore other techniques to further address the problem of tangled commits and ghost commits. Data & material: https://zenodo.org/records/11484723.
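The core SZZ step described in this abstract (take the lines changed by a bug-fixing commit and find the commit that last modified them) can be sketched on a toy, in-memory history. This is a simplified illustration, not the authors' implementation: the `Commit` class, the `(path, line_id)` line model, and the `mentioned_files` filter are all assumptions, and the filter is only one conceivable way to use files named in a bug discussion. The paper's actual strategies are not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    sha: str
    # Lines this commit wrote or changed, as (path, line_id) pairs (toy model).
    touched: set = field(default_factory=set)


def last_modifier(history, index, line):
    """Return the sha of the latest commit before history[index] that touched `line`."""
    for commit in reversed(history[:index]):
        if line in commit.touched:
            return commit.sha
    return None


def szz_candidates(history, fix_sha, mentioned_files=None):
    """Blame every line changed by the fix; optionally keep only mentioned files."""
    index = next(i for i, c in enumerate(history) if c.sha == fix_sha)
    candidates = set()
    for path, line_id in history[index].touched:
        # Hypothetical filter: skip files the bug discussion never mentions.
        if mentioned_files is not None and path not in mentioned_files:
            continue
        sha = last_modifier(history, index, (path, line_id))
        if sha is not None:
            candidates.add(sha)
    return candidates


# Oldest commit first; the fix is tangled with an unrelated README edit.
history = [
    Commit("a1", {("parser.c", 10), ("util.c", 3)}),
    Commit("b2", {("parser.c", 10)}),
    Commit("c3", {("README", 1)}),
    Commit("fix", {("parser.c", 10), ("README", 1)}),
]

print(sorted(szz_candidates(history, "fix")))                # ['b2', 'c3']
print(sorted(szz_candidates(history, "fix", {"parser.c"})))  # ['b2']
```

In this toy case the unfiltered run also blames `c3`, which only edited the README; restricting blame to the file named in the discussion drops that false positive, consistent with the reported gain in precision rather than recall.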
Beyond code: Is there a difference between comments in visual and textual languages?
Alexander Boll, Pooja Rani, Alexander Schultheiß, Timo Kehrer
Journal of Systems and Software, 2024
Code comments are crucial for program comprehension and maintenance. To better understand the nature and content of comments, previous work proposed taxonomies of comment information for textual languages, notably classical programming languages. However, paradigms such as model-driven or model-based engineering often promote the use of visual languages, to which existing taxonomies are not directly applicable. Taking MATLAB/Simulink as a representative of a sophisticated and widely used modeling environment, we extend a multi-language comment taxonomy onto new (visual) comment types and two new languages: Simulink and MATLAB. Furthermore, we outline Simulink commenting practices and compare them to textual languages. We analyze 259,267 comments from 9,095 Simulink models and 17,792 MATLAB scripts. We identify the comment types, their usage frequency, classify comment information, and analyze their correlations with model metrics. We manually analyze 757 comments to extend the taxonomy. We also analyze commenting guidelines and developer adherence to them. Our extended taxonomy, SCoT (Simulink Comment Taxonomy), contains 25 categories. We find that Simulink comments, although often duplicated, are used at all model hierarchy levels. Of all comment types, Annotations are used most often; Notes scarcely. Our results indicate that Simulink developers, instead of extending comments, add new ones, and rarely follow commenting guidelines. Overall, we find Simulink comment information comparable to textual languages, which highlights commenting practice similarity across languages.
• Overview of Simulink and MATLAB comments in diverse open-source projects/models.
• Taxonomy for Simulink and MATLAB comments, applicable to other languages.
• Comparison of Simulink and MATLAB comments to previously studied languages.
• Public dataset of classified comments and scripts in the replication package.
• Quantitatively and qualitatively, visual and textual comments are very similar.
The NLBSE'24 Tool Competition
Rafael Kallis, Giuseppe Colavito, Ali Al-Kaswan, Luca Pascarella, Oscar Chaparro, Pooja Rani
Proceedings 2024 ACM IEEE International Workshop on NL Based Software Engineering NLBSE 2024, 2024
We report on the organization and results of the tool competition of the third International Workshop on Natural Language-based Software Engineering (NLBSE’24). As in prior editions, we organized the competition on automated issue report classification, with a focus on small repositories, and on automated code comment classification, with a larger dataset. In this tool competition edition, six teams submitted multiple classification models to automatically classify issue reports and code comments. The submitted models were fine-tuned and evaluated on a benchmark dataset of 3 thousand issue reports or 82 thousand code comments, respectively. This paper reports details of the competition, including the rules, the teams and contestant models, and the ranking of models based on their average classification performance across issue report and code comment types.
Energy Patterns for Web: An Exploratory Study
Pooja Rani, Jonas Zellweger, Veronika Kousadianos, Luis Cruz, Timo Kehrer, Alberto Bacchelli
Proceedings International Conference on Software Engineering, 2024
As the energy footprint generated by software is increasing at an alarming rate, understanding how to develop energy-efficient applications has become a necessity. Previous work has introduced catalogs of coding practices, also known as energy patterns. These patterns are, however, limited to Mobile or third-party libraries. In this study, we focus on the Web domain, a main source of energy consumption. First, we investigated whether and how Mobile energy patterns can be ported to this domain and found that 20 patterns could be ported. Then, we interviewed six expert web developers from different companies to challenge the ported patterns. Most developers expressed concerns for antipatterns, specifically with functional antipatterns, and were able to formulate guidelines to locate these patterns in the source code. Finally, to quantify the effect of Web energy patterns on energy consumption, we set up an automated pipeline to evaluate two ported patterns: ‘Dynamic Retry Delay’ (DRD) and ‘Open Only When Necessary’ (OOWN). With this, we found no evidence that the DRD pattern consumes less energy than its antipattern, while the opposite is true for OOWN. Data and Material: https://doi.org/10.5281/zenodo.8404487
CCS CONCEPTS: Software and its engineering → Empirical software validation.
LAY ABSTRACT: The information technology sector significantly affects the climate. With our increasing online activities, from chatting to accessing medical history, software powering these services needs to be energy-efficient. Researchers in software engineering have been exploring green coding practices, or energy-specific design patterns (aka energy patterns), to make software more eco-friendly. While such energy practices have been explored for other domains including Mobile, Web applications have been somewhat overlooked, despite our daily heavy internet use. We focused on porting the existing energy patterns from Mobile applications to Web applications. To validate these ported energy patterns, we interviewed six professional web developers from various companies. Then, we tested some patterns to see if these energy patterns indeed save any energy. Our results showed that developers are unaware of the energy practices and some patterns did not make a noticeable difference, while others consume more energy than their counterparts. In a nutshell, our work highlights the knowledge gap between green coding research and industry and emphasizes the need to understand the trade-offs in energy practices for a sustainable digital future.
Variable Discovery with Large Language Models for Metamorphic Testing of Scientific Software
Christos Tsigkanos, Pooja Rani, Sebastian Müller, Timo Kehrer
Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2023
Large Language Models: The Next Frontier for Variable Discovery within Metamorphic Testing?
Christos Tsigkanos, Pooja Rani, Sebastian Müller, Timo Kehrer
Proceedings 2023 IEEE International Conference on Software Analysis Evolution and Reengineering SANER 2023, 2023
The NLBSE'23 Tool Competition
Rafael Kallis, Maliheh Izadi, Luca Pascarella, Oscar Chaparro, Pooja Rani
Proceedings 2023 IEEE ACM 2nd International Workshop on Natural Language Based Software Engineering NLBSE 2023, 2023
LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain
Joel Niklaus, Veton Matoshi, Pooja Rani, Andrea Galassi, Matthias Stürmer, Ilias Chalkidis
Findings of the Association for Computational Linguistics EMNLP 2023, 2023
A decade of code comment quality assessment: A systematic literature review
Pooja Rani, Arianna Blasi, Nataliia Stulova, Sebastiano Panichella, Alessandra Gorla, Oscar Nierstrasz
Journal of Systems and Software, 2023
Can We Automatically Generate Class Comments in Pharo?
CEUR Workshop Proceedings, 2022
Speculative Analysis for Quality Assessment of Code Comments
Pooja Rani
Proceedings International Conference on Software Engineering, 2021
What do class comments tell us? An investigation of comment evolution and practices in Pharo Smalltalk
Pooja Rani, Sebastiano Panichella, Manuel Leuenberger, Mohammad Ghafari, Oscar Nierstrasz
Empirical Software Engineering, 2021
How to identify class comment types? A multi-language approach for class comment classification
Pooja Rani, Sebastiano Panichella, Manuel Leuenberger, Andrea Di Sorbo, Oscar Nierstrasz
Journal of Systems and Software, 2021
Makar: A Framework for Multi-source Studies based on Unstructured Data
Mathias Birrer, Pooja Rani, Sebastiano Panichella, Oscar Nierstrasz
Proceedings 2021 IEEE International Conference on Software Analysis Evolution and Reengineering SANER 2021, 2021
Do Comments follow Commenting Conventions? A Case Study in Java and Python
Pooja Rani, Suada Abukar, Nataliia Stulova, Alexandre Bergel, Oscar Nierstrasz
Proceedings IEEE 21st International Working Conference on Source Code Analysis and Manipulation SCAM 2021, 2021
What Do Developers Discuss about Code Comments?
Pooja Rani, Mathias Birrer, Sebastiano Panichella, Mohammad Ghafari, Oscar Nierstrasz
Proceedings IEEE 21st International Working Conference on Source Code Analysis and Manipulation SCAM 2021, 2021

MOST CITED SCHOLAR PUBLICATIONS

LEXTREME: A multi-lingual and multi-task benchmark for the legal domain
J Niklaus, V Matoshi, P Rani, A Galassi, M Stürmer, I Chalkidis
Findings of the Association for Computational Linguistics: EMNLP 2023, 3016-3054, 2023
Citations: 110
A decade of code comment quality assessment: A systematic literature review
P Rani, A Blasi, N Stulova, S Panichella, A Gorla, O Nierstrasz
Journal of Systems and Software 195, 111515, 2023
Citations: 74
How to identify class comment types? A multi-language approach for class comment classification
P Rani, S Panichella, M Leuenberger, A Di Sorbo, O Nierstrasz
Journal of Systems and Software 181, 111047, 2021
Citations: 59
The NLBSE'24 Tool Competition
R Kallis, G Colavito, A Al-Kaswan, L Pascarella, O Chaparro, P Rani
Proceedings of the Third ACM/IEEE International Workshop on NL-based …, 2024
Citations: 55
Large Language Models: The Next Frontier for Variable Discovery within Metamorphic Testing?
C Tsigkanos, P Rani, S Müller, T Kehrer
2023 IEEE International Conference on Software Analysis, Evolution and …, 2023
Citations: 34
How does Simulation-based Testing for Self-driving Cars match Human Perception?
C Birchler, TK Mohammed, P Rani, T Nechita, T Kehrer, S Panichella
Proceedings of the ACM on Software Engineering 1 (FSE), 929-950, 2024
Citations: 28
Energy Patterns for Web: An Exploratory Study
P Rani, J Zellweger, V Kousadianos, L Cruz, T Kehrer, A Bacchelli
Proceedings of the 46th International Conference on Software Engineering …, 2024
Citations: 25
What do developers discuss about code comments?
P Rani, M Birrer, S Panichella, M Ghafari, O Nierstrasz
2021 IEEE 21st International Working Conference on Source Code Analysis and …, 2021
Citations: 24
Variable Discovery with Large Language Models for Metamorphic Testing of Scientific Software
C Tsigkanos, P Rani, S Müller, T Kehrer
International Conference on Computational Science, 321-335, 2023
Citations: 23
What do class comments tell us? An investigation of comment evolution and practices in Pharo Smalltalk
P Rani, S Panichella, M Leuenberger, M Ghafari, O Nierstrasz
Empirical Software Engineering 26 (6), 112, 2021
Citations: 22
Greening AI-enabled systems with software engineering: A research agenda for environmentally sustainable AI practices
L Cruz, JP Fernandes, MH Kirkeby, S Martínez-Fernández, J Sallou, ...
ACM SIGSOFT Software Engineering Notes 50 (3), 14-23, 2025
Citations: 20
Do Comments follow Commenting Conventions? A Case Study in Java and Python
P Rani, S Abukar, N Stulova, A Bergel, O Nierstrasz
2021 IEEE 21st International Working Conference on Source Code Analysis and …, 2021
Citations: 19
A roadmap for simulation-based testing of autonomous cyber-physical systems: Challenges and future direction
C Birchler, S Khatiri, P Rani, T Kehrer, S Panichella
ACM Transactions on Software Engineering and Methodology 34 (5), 1-9, 2025
Citations: 18
Code Review Comprehension: Reviewing Strategies Seen Through Code Comprehension Theories
PW Gonçalves, P Rani, MA Storey, D Spinellis, A Bacchelli
2025 IEEE/ACM 33rd International Conference on Program Comprehension (ICPC), 2025
Citations: 11
Beyond Code: Is There a Difference between Comments in Visual and Textual Languages?
A Boll, P Rani, A Schultheiß, T Kehrer
Available at SSRN 4650661, 2024
Citations: 9
Speculative Analysis for Quality Assessment of Code Comments
P Rani
2021 IEEE/ACM 43rd International Conference on Software Engineering …, 2021
Citations: 7
Can We Make Code Green? Understanding Trade-Offs in LLMs vs. Human Code Optimizations
P Rani, JA Bard, J Sallou, A Boll, T Kehrer, A Bacchelli
arXiv preprint arXiv:2503.20126, 2025
Citations: 6
On Refining the SZZ Algorithm with Bug Discussion Data
P Rani, F Petrulio, A Bacchelli
Empirical Software Engineering 29 (5), 115, 2024
Citations: 5
The NLBSE'25 Tool Competition
A Al-Kaswan, G Colavito, N Stulova, P Rani
2025 IEEE/ACM International Workshop on Natural Language-Based Software …, 2025
Citations: 4
The price of precision: the cost of preprocessing for automated code revision in code review
S Pirouzkhah, P Rani, F Sovrano, V Hellendoorn, A Bacchelli
Empirical Software Engineering 31 (2), 47, 2026
Citations: 1
