Johanna Fleckenstein

@uni-hildesheim.de

Technology-based Learning and Instruction, Department of Educational and Social Sciences
Universität Hildesheim

Scopus Publications: 44

Scholar Citations: 2954

Scholar h-index: 24

Scholar i10-index: 40

Scopus Publications

  • LLM feedback for academic writing: Effects on students’ performance and engagement
    Robert Glüsing, Johanna Fleckenstein, Fabian T.C. Schmidt, Jens Möller
    Contemporary Educational Psychology, 2026
    Writing and revising academic texts is a demanding task that benefits significantly from feedback provided by teachers or peers. However, providing elaborated formative feedback on students’ academic writing is time-intensive and therefore hard to implement in educational practice. As a supplementary resource, large language models (LLMs) offer the potential to support the writing process by generating automated feedback to help students enhance their texts. The present study examined the accuracy of LLM-generated feedback on student texts and its effectiveness in improving university students’ revision performance and engagement in academic writing. In a randomized controlled experiment, a sample of N = 144 university students wrote an abstract summarizing a research article. All participants were then instructed to revise their abstracts; half received individualized feedback generated by GPT-4 using a standardized prompting procedure. Controlling for the quality of the initial drafts, regression analyses revealed that LLM-generated feedback led to higher revision quality and increased behavioral engagement, as measured by revision time and edit distance. Furthermore, behavioral engagement partially mediated the effect of feedback on revision quality. These findings demonstrate that LLMs can provide high-accuracy, effective feedback on academic writing. The study discusses the potential applications and implications of this technology within higher education contexts.
  • On the role of engagement in automated feedback effectiveness: Insights from keystroke logging
    Ronja Schiller, Johanna Fleckenstein, Lars Höft, Andrea Horbach, Jennifer Meyer
    Computers and Education, 2025
    Feedback research increasingly focuses on the role of learners’ engagement in the feedback process. Process measures from technology-based learning environments that reflect writing behavior can provide new insights into the mechanisms underlying feedback effectiveness by making engagement visible. Previous research has shown that log data and similarity measures mediate the effects of automated feedback on learners’ revision performance. In the present study, we aimed to replicate and extend previous research using measures obtained from keystroke logging that represent the revision process on a more fine-grained level. We considered behavioral engagement (i.e., number of keystrokes and typing time) and writing pauses as potential indicators of cognitive engagement. In a classroom experiment, N = 453 English-as-a-foreign-language (EFL) learners (M age = 16.11) completed a writing task and revised their draft, receiving either feedback generated by a large language model (i.e., GPT 3.5 Turbo) or no feedback. A second writing task served as a transfer task. All texts were scored automatically to assess performance. The effect of automated feedback on learners’ revision and transfer performance was mediated through the different indicators of behavioral engagement during the text revision, although the direct effect of automated feedback on the transfer task was not significant. We found small effects of feedback on pause length and the number of pauses, but the indirect effects were not significant. The study provides further evidence on the role of learning engagement in feedback effectiveness and illustrates how online measures (i.e., keystroke logging) can be used to gain new insights into the effectiveness of automated feedback. The use of different process measures to assess learning engagement is discussed.
  • Self-assessment accuracy in the age of artificial Intelligence: Differential effects of LLM-generated feedback
    Lucas W. Liebenow, Fabian T.C. Schmidt, Jennifer Meyer, Johanna Fleckenstein
    Computers and Education, 2025
    Feedback is a promising intervention to foster students’ self-assessment accuracy (SAA), but the effect can vary depending on students' initial skill levels or prior performance. In particular, lower-performing students who are less accurate might benefit more from feedback in terms of SAA. To deepen our understanding, the present study investigated the mechanism and dependencies of feedback effects on SAA in the realm of large language models (LLMs). Within a randomized control experiment, we examined the effect of LLM-generated feedback on SAA by considering students’ initial performance and initial SAA as potential moderators. A sample of N = 459 upper secondary students wrote an argumentative essay in English as a foreign language and revised their text. After finishing their first draft (pretest) and revision (posttest) of the draft, students self-assessed their writing performance. Students in the experimental group received GPT-3.5-turbo-generated feedback on their first draft during their revision. In the control group, students could revise their text without feedback. Our results indicated no significant main effect of LLM-generated feedback on students’ SAA. Furthermore, we found a significant interaction effect between feedback and students' pretest SAA on SAA changes, indicating that lower-calibrated students improved their SAA with feedback more, compared to students with similar pretest SAA and without feedback. Exploratory analyses revealed that students with higher pretest SAA did not improve their SAA with feedback and decreased their SAA. We discuss this nuanced evidence and draw implications for research and practice using LLM-generated feedback in education.
      • LLM-generated feedback did not improve self-assessment accuracy (SAA) on average.
      • Feedback effectiveness depended on students' initial SAA, not performance.
      • Students with lower initial SAA improved their SAA after LLM feedback.
      • LLM-generated feedback offers a scalable way to support students who need it most.
  • Neural Networks or Linguistic Features? - Comparing Different Machine-Learning Approaches for Automated Assessment of Text Quality Traits Among L1- and L2-Learners’ Argumentative Essays
    Julian F. Lohmann, Fynn Junge, Jens Möller, Johanna Fleckenstein, Ruth Trüb, Stefan Keller, Thorben Jansen, Andrea Horbach
    International Journal of Artificial Intelligence in Education, 2025
    Recent investigations in automated essay scoring research imply that hybrid models, which combine feature engineering and the powerful tools of deep neural networks (DNNs), reach state-of-the-art performance. However, most of these findings are from holistic scoring tasks. In the present study, we use a total of four prompts from two different corpora consisting of both L1 and L2 learner essays annotated with trait scores (e.g., content, organization, and language quality). In our main experiments, we compare three variants of trait-specific models using different inputs: (1) models based on 220 linguistic features, (2) models using essay-level contextual embeddings from the distilled version of the pre-trained transformer BERT (DistilBERT), and (3) a hybrid model using both types of features. Results imply that when trait-specific models are trained based on a single resource, the feature-based models slightly outperform the embedding-based models. These differences are most prominent for the organization traits. The hybrid models outperform the single-resource models, indicating that linguistic features and embeddings indeed capture partially different aspects relevant for the assessment of essay traits. To gain more insights into the interplay between both feature types, we run addition and ablation tests for individual feature groups. Trait-specific addition tests across prompts indicate that the embedding-based models can most consistently be enhanced in content assessment when combined with morphological complexity features. Most consistent performance gains in the organization traits are achieved when embeddings are combined with length features, and most consistent performance gains in the assessment of the language traits when combined with lexical complexity, error, and occurrence features. Cross-prompt scoring again reveals slight advantages for the feature-based models.
  • (De)motivating Zero-Performing Students With Negative Feedback: Does the Salience of Performance Information Matter?
    Marlene Steinbach, Johanna Fleckenstein, Livia Kuklick, Jennifer Meyer
    Journal of Computer Assisted Learning, 2025
    Background: Providing students with information on their current performance could help them improve by stimulating their reflection, but negative feedback that saliently mirrors task‐related failure can harm motivation. In the context of automated scoring based on artificial intelligence, we explored how feedback on written texts might be designed to be least detrimental for zero‐performing students who are likely to receive negative feedback frequently and might suffer from its motivational consequences. Objectives: This experiment set out to investigate whether making the negative performance information in automated feedback messages less salient reduces the potential threat of negative feedback for zero‐performing students' task‐specific self‐concept, intrinsic value, and performance. Methods: A sample of 105 (M age = 13.97 years) zero‐performing students received negative feedback with either more or less salient performance information after completing an English writing task. We used regression analysis to examine pre–post effects and group differences in self‐concept, intrinsic value, and performance. Results and Conclusions: The analyses showed that zero‐performing students' performance improved but their self‐concept and intrinsic value declined over the course of two writing tasks, with feedback provided after the initial task. Contrary to expectations, our findings showed that students' task‐specific self‐concept and intrinsic value declined more in the condition with less salient performance information (i.e., without a red cross as a salient visual performance cue). Our findings highlight the motivational potential of performance information and are discussed in terms of the need for further research into how negative feedback can be designed to effectively motivate and support zero‐performing learners.
  • “Can (A)I do this task?” The role of AI as a socializer of students' self-beliefs of their abilities
    Thorben Jansen, Jennifer Meyer, Johanna Fleckenstein, Allan Wigfield, Jens Möller
    Learning and Individual Differences, 2025
    Students' beliefs about their own academic abilities – their answers to the question “Can I do this task?” – are crucial to their success. Learning within AI-supported environments, alongside AI agents, influences students' beliefs about their abilities. Studies show enhancing and diminishing influences that remain unexplained by motivation theory, limiting theories' explanatory effect in AI-supported learning environments, and leaving educational technology research without a solid theoretical foundation. The following article specifies the situated expectancy-value theory (SEVT) for students' self-belief formation in the context of an AI-driven society. The expanded theory conceptualizes AI as becoming an artificial socializer, capturing the role of AI as an instrumental tool and social agents making up students' individual environments. Bridging AI and motivational research provides a framework for systematically investigating students' self-beliefs in AI-supported contexts and how educational technology can support positive self-beliefs, considering students' contexts and individual differences.
      • Summarizes and explains empirical influences of AI on students' ability self-beliefs.
      • Integrates AI into situated-expectancy value theory.
      • Provides a framework to investigate AI effects on students' ability self-beliefs.
      • Describe potential mechanisms in the self-belief formation considering AI.
  • Nonengagement and unsuccessful engagement with feedback in lower secondary education: The role of student characteristics
    Jennifer Meyer, Thorben Jansen, Johanna Fleckenstein
    Contemporary Educational Psychology, 2025
    Feedback can be a powerful learning intervention and learners’ active engagement is assumed to be one of the most important determinants of feedback effectiveness. But not all students successfully engage with feedback. In the present study, we aimed to make students’ engagement with feedback visible by focusing on their text revisions as an indicator of feedback response. On the basis of theoretical models of feedback processing, we differentiated between behavioral nonengagement (i.e., not revising at all after receiving feedback) and unsuccessful engagement (i.e., revising after receiving feedback, but not improving in the process). Capitalizing on this distinction, we compared the characteristics of students in both groups with those of students who (successfully) engaged with the feedback. We provided automated computer-based feedback on a writing task to a sample of 937 students in lower secondary education in Germany (49% female, Grades 7 [28%], 8 [29%], and 9 [43%]), asking students to revise their texts according to the feedback. We found that 20% of the students did not make any revisions to their text after receiving feedback (nonengagement) and that 47% of the students did not improve their performance after working with the feedback during a text revision (unsuccessful engagement). Male students and students with lower cognitive abilities were more likely to show nonengagement. For unsuccessful engagement, cognitive abilities and the English grade were relevant predictors, hinting at the role that domain-specific competencies play in translating feedback into effective revision. We also found significant positive associations of intrinsic task value with successful feedback engagement. We discuss how future research could advance understanding of feedback processing by taking a more fine-grained approach to investigating feedback response.
      • We investigated feedback engagement in a sample of lower-secondary students.
      • Findings show that 20% of students did not engage, 47% unsuccessfully engaged in a text revision.
      • We focused on the role of individual differences in feedback engagement.
      • We considered the role of gender, cognitive and noncognitive variables.
  • Understanding individual differences in students’ responses to technology-based feedback on a writing task: the role of achievement motives and initial task performance
    Jennifer Meyer, Thorben Jansen, Martin Daumiller, Johanna Fleckenstein
    Journal of Research on Technology in Education, 2025
    Computer-based feedback interventions are generally effective—but not for all students. Students’ achievement motives (hopes for success, fear of failure) might explain how students respond to feedback in interplay with initial task performance. In a sample of 949 secondary school students in Germany (Grades 7–9) we found that when the task criterion was initially not met, higher hopes for success were positively associated with students’ subsequent task performance after receiving automated feedback. When the criterion was initially met, a higher fear of failure was negatively related to the subsequent task performance. Our results suggest that achievement motives can play a complex role at different levels of initial task performance. These insights could inform personalized feedback design to enhance feedback effectiveness in cognitively demanding tasks.
  • Data extraction by generative artificial intelligence: Assessing determinants of accuracy using human-extracted data from systematic review databases.
    Thorben Jansen, Lucas W. Liebenow, Ute Mertens, Fabian T. C. Schmidt, Julian F. Lohmann, Johanna Fleckenstein, Jennifer Meyer
    Psychological Bulletin, 2025
    Psychological science requires reliable measures. Within systematic literature reviews, reliability hinges on high interrater agreement during data extraction. Yet, the extraction process has been time-consuming. Efforts to accelerate the process using technology have shown limited success until generative artificial intelligence (genAI), particularly large language models (LLMs), accurately extracted variables from medical studies. Nonetheless, for psychological researchers, it remains unclear how to utilize genAI for data extraction, given the range of tested variables, the medical context, and the variability in accuracy. We systematically assessed extraction accuracy and error patterns across domains in psychology by comparing genAI-extracted and human-extracted data from 22 systematic review databases published in the Psychological Bulletin. Eight LLMs extracted 312,329 data points from 2,179 studies on 186 variables. LLM extractions achieved unacceptable accuracy on all metrics for 20% of variables. For 46% of variables, accuracy was acceptable for some metrics and unacceptable for others. LLMs reached acceptable but not high accuracy on all metrics in 15%, high but not excellent in 8%, and excellent accuracy in 12% of variables. Accuracy varied most between variables, less between systematic reviews, and least between LLMs. Moderator analyses using a hierarchical logistic regression, hierarchical linear model, and meta-analysis revealed that accuracy was higher for variables describing studies' context and moderator variables compared to variables for effect size calculation. Also, accuracy was higher in systematic reviews with more detailed variable descriptions and positively correlated with model sizes. We discuss directions for investigating ways to use genAI to accelerate data extractions while ensuring meaningful human control. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
  • Understanding the effectiveness of automated feedback: Using process data to uncover the role of behavioral engagement
    Ronja Schiller, Johanna Fleckenstein, Ute Mertens, Andrea Horbach, Jennifer Meyer
    Computers and Education, 2024
    In the last couple of years, feedback research has shifted towards a feedback-as-process approach, taking a learner-centered perspective and focusing on the proactive role of the learner in feedback effectiveness. Process measures can provide new insights into the role of the learner by making learners’ actual behavioral engagement visible. We conducted an experimental study, comparing two groups (feedback vs. no feedback) of English-as-a-foreign-language learners in lower secondary schools (N = 189). The learners completed a writing task and revised it with or without feedback. A second writing task served as a transfer task. Performance was automatically assessed using a scoring algorithm. To determine the level of learners’ behavioral engagement during the text revision, we used the revision time and the edit distance (i.e., a similarity measure) as behavioral measures. Our analyses showed a positive effect of feedback on text revision. We found a full mediation of the effect of feedback on text revision through revision time with an estimated portion of mediation (POM) of .63*** and a partial mediation of the feedback effect on text revision through the edit distance with a POM of .30**. We did not find significant mediation effects of either engagement variable regarding performance in a transfer task. Our findings contribute to the understanding of feedback effectiveness, highlighting the central role of learner engagement in the feedback process.
      • We investigate feedback effectiveness from a process-oriented perspective.
      • Log-data and computer-linguistic features as objective indicators of engagement.
      • Feedback is associated with higher levels of engagement during text revision.
      • Behavioral engagement mediates positive effect of feedback on revision performance.
      • We did not find an effect of the feedback on a transfer task.
  • How am I going? Behavioral engagement mediates the effect of individual feedback on writing performance
    Johanna Fleckenstein, Thorben Jansen, Jennifer Meyer, Ruth Trüb, Emily E. Raubach, Stefan D. Keller
    Learning and Instruction, 2024
  • Language quality, content, structure: What analytic ratings tell us about EFL writing skills at upper secondary school level in Germany and Switzerland
    Stefan D. Keller, Julian Lohmann, Ruth Trüb, Johanna Fleckenstein, Jennifer Meyer, Thorben Jansen, Jens Möller
    Journal of Second Language Writing, 2024
  • Do teachers spot AI? Evaluating the detectability of AI-generated texts among student essays
    Johanna Fleckenstein, Jennifer Meyer, Thorben Jansen, Stefan D. Keller, Olaf Köller, Jens Möller
    Computers and Education: Artificial Intelligence, 2024
  • Using LLMs to bring evidence-based feedback into the classroom: AI-generated feedback increases secondary students’ text revision, motivation, and positive emotions
    Jennifer Meyer, Thorben Jansen, Ronja Schiller, Lucas W. Liebenow, Marlene Steinbach, Andrea Horbach, Johanna Fleckenstein
    Computers and Education: Artificial Intelligence, 2024
  • Individualizing goal-setting interventions using automated writing evaluation to support secondary school students’ text revisions
    Thorben Jansen, Jennifer Meyer, Johanna Fleckenstein, Andrea Horbach, Stefan Keller, Jens Möller
    Learning and Instruction, 2024
  • Two-way immersion promotes additional language learning: performance of bilingual sixth-grade students in English as a third language
    Sandra Preusler, Johanna Fleckenstein, Steffen Zitzmann, Jürgen Baumert, Jens Möller
    International Journal of Bilingual Education and Bilingualism, 2024
  • Conscientiousness and Cognitive Ability as Predictors of Academic Achievement: Evidence of Synergistic Effects From Integrative Data Analysis
    Jennifer Meyer, Oliver Lüdtke, Fabian T. C. Schmidt, Johanna Fleckenstein, Ulrich Trautwein, Olaf Köller
    European Journal of Personality, 2024
  • Comparing Generative AI and Expert Feedback to Students’ Writing: Insights from Student Teachers
    Thorben Jansen, Lars Höft, Luca Bahr, Johanna Fleckenstein, Jens Möller, Olaf Köller, Jennifer Meyer
    Psychologie in Erziehung und Unterricht, 2024
  • Machine Learning in the educational context: Evidence of prediction accuracy considering essays in English as a foreign language
    Jennifer Meyer, Thorben Jansen, Johanna Fleckenstein, Stefan Keller, Olaf Köller
    Zeitschrift für Pädagogische Psychologie, 2023
  • A closer look at the domain-specific associations of openness with language achievement: Evidence on the role of intrinsic value from two large-scale longitudinal studies
    Jennifer Meyer, Fabian T. C. Schmidt, Johanna Fleckenstein, Olaf Köller
    British Journal of Educational Psychology, 2023
  • Automated feedback and writing: a multi-level meta-analysis of effects on students' performance
    Johanna Fleckenstein, Lucas W. Liebenow, Jennifer Meyer
    Frontiers in Artificial Intelligence, 2023
  • Sequence Tagging in EFL Email Texts as Feedback for Language Learners
    Proceedings of the 12th Workshop on Natural Language Processing for Computer Assisted Language Learning (NLP4CALL 2023), 2023
  • Read at home to do well at school: informal reading predicts achievement and motivation in English as a foreign language
    Jennifer Meyer, Johanna Fleckenstein, Maleika Krüger, Stefan Daniel Keller, Nicolas Hübner
    Frontiers in Psychology, 2023
  • Judgment accuracy of German student texts: Do teacher experience and content knowledge matter?
    Jens Möller, Thorben Jansen, Johanna Fleckenstein, Nils Machts, Jennifer Meyer, Raja Reble
    Teaching and Teacher Education, 2022
  • Correction to: Studies on the Acculturation of Young Refugees in the Educational Domain: A Scoping Review of Research and Methods (Adolescent Research Review, (2021), 6, 1, (15-31), 10.1007/s40894-019-00129-7)
    Débora B. Maehler, Steffen Pötzschke, Howard Ramos, Paul Pritchard, Johanna Fleckenstein
    Adolescent Research Review, 2021
  • Studies on the Acculturation of Young Refugees in the Educational Domain: A Scoping Review of Research and Methods
    Débora B. Maehler, Steffen Pötzschke, Howard Ramos, Paul Pritchard, Johanna Fleckenstein
    Adolescent Research Review, 2021
  • The Long-Term Proficiency of Early, Middle, and Late Starters Learning English as a Foreign Language at School: A Narrative Review and Empirical Study
    Jürgen Baumert, Johanna Fleckenstein, Michael Leucht, Olaf Köller, Jens Möller
    Language Learning, 2020
  • Is a Long Essay Always a Good Essay? The Effect of Text Length on Writing Assessment
    Johanna Fleckenstein, Jennifer Meyer, Thorben Jansen, Stefan Keller, Olaf Köller
    Frontiers in Psychology, 2020
  • Is younger always better? Early foreign language learning at primary school
    Johanna Fleckenstein, Jens Möller, Jürgen Baumert
    Zeitschrift für Pädagogische Psychologie, 2020
  • English writing skills of students in upper secondary education: Results from an empirical study in Switzerland and Germany
    Stefan D. Keller, Johanna Fleckenstein, Maleika Krüger, Olaf Köller, André A. Rupp
    Journal of Second Language Writing, 2020
  • Linking TOEFL iBT® writing rubrics to CEFR levels: Cut scores and validity evidence from a standard setting study
    Johanna Fleckenstein, Stefan Keller, Maleika Krüger, Richard J. Tannenbaum, Olaf Köller
    Assessing Writing, 2020
  • Writing skills in English as a foreign language in upper secondary school
    Olaf Köller, Johanna Fleckenstein, Jennifer Meyer, Anna Lara Paeske, Maleika Krüger, Andre A. Rupp, Stefan Keller
    Zeitschrift für Erziehungswissenschaft, 2019
  • Expectancy value interactions and academic achievement: Differential relationships with achievement measures
    Jennifer Meyer, Johanna Fleckenstein, Olaf Köller
    Contemporary Educational Psychology, 2019
  • Measuring grit: A German validation and a domain-specific approach to grit
    Fabian T. C. Schmidt, Johanna Fleckenstein, Jan Retelsdorf, Lauren Eskreis-Winkler, Jens Möller
    European Journal of Psychological Assessment, 2019
  • Promoting mathematics achievement in one-way immersion: Performance development over four years of elementary school
    Johanna Fleckenstein, Sandra Kristina Gebauer, Jens Möller
    Contemporary Educational Psychology, 2019
  • The relationship of personality traits and different measures of domain-specific achievement in upper secondary education
    Jennifer Meyer, Johanna Fleckenstein, Jan Retelsdorf, Olaf Köller
    Learning and Individual Differences, 2019
  • Same Same, but Different? Relations Between Facets of Conscientiousness and Grit
    Fabian T.C. Schmidt, Gabriel Nagy, Johanna Fleckenstein, Jens Möller, Jan Retelsdorf
    European Journal of Personality, 2018
  • Multilingualism as a resource: Dual-immersion students’ achievement in English as a third language
    Johanna Fleckenstein, Jens Möller, Jürgen Baumert
    Zeitschrift für Erziehungswissenschaft, 2018
  • Editorial
    Jens Möller, Johanna Fleckenstein, Sandra Preusler, Isabell Paulick, Jürgen Baumert
    Zeitschrift für Erziehungswissenschaft, 2018
  • Variations and effects of bilingual education in schools
    Jens Möller, Johanna Fleckenstein, Friederike Hohenstein, Sandra Preusler, Isabell Paulick, Jürgen Baumert
    Zeitschrift für Erziehungswissenschaft, 2018
  • Teachers’ Judgement Accuracy Concerning CEFR Levels of Prospective University Students
    Johanna Fleckenstein, Michael Leucht, Olaf Köller
    Language Assessment Quarterly, 2018
  • Proficient beyond borders: assessing non-native speakers in a native speakers’ framework
    Johanna Fleckenstein, Michael Leucht, Hans Anand Pant, Olaf Köller
    Large Scale Assessments in Education, 2016
  • What works in school? Expert and novice teachers’ beliefs about school effectiveness
    Johanna Fleckenstein, Friederike Zimmermann, Olaf Köller, Jens Möller
    Frontline Learning Research, 2015
  • Who's got Grit? Perseverance and consistency of interest in pre-service teachers. A German adaptation of the 12-Item Grit Scale
    Johanna Fleckenstein, Fabian T.C. Schmidt, Jens Möller
    Psychologie in Erziehung und Unterricht, 2014

RECENT SCHOLAR PUBLICATIONS

  • Measuring Task-Level Behavioral Learning Engagement During Text Revision
    R Schiller, J Fleckenstein, U Mertens, J Meyer
    Computers & Education, 105656, 2026
  • The Future of Feedback: How Can AI Help Transform Feedback to Be More Engaging, Effective, and Scalable?
    J Meyer, O Köller, T Jansen, J Fleckenstein, MW Asher, S Bichler, ...
    arXiv preprint arXiv:2603.12463, 2026
  • On the role of engagement in automated feedback effectiveness: Insights from keystroke logging
    R Schiller, J Fleckenstein, L Höft, A Horbach, J Meyer
    Computers & Education 238, 105386, 2025
    Citations: 7
  • Self-assessment accuracy in the age of artificial Intelligence: Differential effects of LLM-generated feedback
    LW Liebenow, FTC Schmidt, J Meyer, J Fleckenstein
    Computers & Education 237, 105385, 2025
    Citations: 15
  • Data extraction by generative artificial intelligence: Assessing determinants of accuracy using human-extracted data from systematic review databases.
    T Jansen, LW Liebenow, U Mertens, FTC Schmidt, JF Lohmann, ...
    Psychological Bulletin 151 (10), 1280, 2025
    Citations: 15
  • Neural networks or linguistic features?-Comparing different machine-learning approaches for automated assessment of text quality traits among L1-and L2-learners’ argumentative …
    JF Lohmann, F Junge, J Möller, J Fleckenstein, R Trüb, S Keller, T Jansen, ...
    International Journal of Artificial Intelligence in Education 35 (3), 1178-1217, 2025
    Citations: 9
  • Testing teacher judgments comprehensively: Accuracy, halo, frame of reference, strategy, and personality effects in holistic and analytic assessments of student essays.
    JF Lohmann, F Lötscher, F Junge, S Keller, T Jansen, J Fleckenstein, ...
    Journal of Educational Psychology, 2025
    Citations: 2
  • “Can (A)I do this task?” The role of AI as a socializer of students' self-beliefs of their abilities
    T Jansen, J Meyer, J Fleckenstein, A Wigfield, J Möller
    Learning and Individual Differences 122, 102731, 2025
    Citations: 8
  • (De)motivating Zero‐Performing Students With Negative Feedback: Does the Salience of Performance Information Matter?
    M Steinbach, J Fleckenstein, L Kuklick, J Meyer
    Journal of Computer Assisted Learning 41 (4), e70070, 2025
    Citations: 2
  • Nonengagement and unsuccessful engagement with feedback in lower secondary education: The role of student characteristics
    J Meyer, T Jansen, J Fleckenstein
    Contemporary Educational Psychology 81, 102363, 2025
    Citations: 22
  • Understanding individual differences in students’ responses to technology-based feedback on a writing task: the role of achievement motives and initial task performance
    J Meyer, T Jansen, M Daumiller, J Fleckenstein
    Journal of Research on Technology in Education, 1-31, 2025
    Citations: 8
  • LLM feedback for academic writing: Effects on students’ performance and engagement
    R Glüsing, J Fleckenstein, F Schmidt, J Möller
    Available at SSRN 5445319, 2025
    Citations: 3
  • Negative Feedback: Does the Salience of Performance Information Matter?
    M Steinbach, J Fleckenstein, L Kuklick, J Meyer
    2025
  • Understanding the effectiveness of automated feedback: Using process data to uncover the role of behavioral engagement
    R Schiller, J Fleckenstein, U Mertens, A Horbach, J Meyer
    Computers & Education 223, 105163, 2024
    Citations: 30
  • How am I going? Behavioral engagement mediates the effect of individual feedback on writing performance
    J Fleckenstein, T Jansen, J Meyer, R Trüb, EE Raubach, SD Keller
    Learning and Instruction 93, 101977, 2024
    Citations: 22
  • Language quality, content, structure: What analytic ratings tell us about EFL writing skills at upper secondary school level in Germany and Switzerland
    SD Keller, J Lohmann, R Trüb, J Fleckenstein, J Meyer, T Jansen, J Möller
    Journal of Second Language Writing 65, 101129, 2024
    Citations: 19
  • Two-way immersion promotes additional language learning: performance of bilingual sixth-grade students in English as a third language
    S Preusler, J Fleckenstein, S Zitzmann, J Baumert, J Möller
    International Journal of Bilingual Education and Bilingualism 27 (7), 910-922, 2024
    Citations: 7
  • Do teachers spot AI? Evaluating the detectability of AI-generated texts among student essays
    J Fleckenstein, J Meyer, T Jansen, SD Keller, O Köller, J Möller
    Computers and Education: Artificial Intelligence 6, 100209, 2024
    Citations: 225
  • Using LLMs to bring evidence-based feedback into the classroom: AI-generated feedback increases secondary students’ text revision, motivation, and positive emotions
    J Meyer, T Jansen, R Schiller, LW Liebenow, M Steinbach, A Horbach, ...
    Computers and Education: Artificial Intelligence 6, 100199, 2024
    Citations: 486
  • Empirische Arbeit: Comparing generative AI and expert feedback to students’ writing: Insights from student teachers
    T Jansen, L Höft, L Bahr, J Fleckenstein, J Möller, O Köller, J Meyer
    Psychologie in Erziehung und Unterricht 71 (2), 80-92, 2024
    Citations: 67

MOST CITED SCHOLAR PUBLICATIONS

  • Using LLMs to bring evidence-based feedback into the classroom: AI-generated feedback increases secondary students’ text revision, motivation, and positive emotions
    J Meyer, T Jansen, R Schiller, LW Liebenow, M Steinbach, A Horbach, ...
    Computers and Education: Artificial Intelligence 6, 100199, 2024
    Citations: 486
  • Measuring grit
    FTC Schmidt, J Fleckenstein, J Retelsdorf, L Eskreis-Winkler, J Möller
    European Journal of Psychological Assessment, 2017
    Citations: 306
  • Same same, but different? Relations between facets of conscientiousness and grit
    FTC Schmidt, G Nagy, J Fleckenstein, J Möller, J Retelsdorf
    European Journal of Personality 32 (6), 705-720, 2018
    Citations: 227
  • Do teachers spot AI? Evaluating the detectability of AI-generated texts among student essays
    J Fleckenstein, J Meyer, T Jansen, SD Keller, O Köller, J Möller
    Computers and Education: Artificial Intelligence 6, 100209, 2024
    Citations: 225
  • Expectancy value interactions and academic achievement: Differential relationships with achievement measures
    J Meyer, J Fleckenstein, O Köller
    Contemporary Educational Psychology 58, 58-74, 2019
    Citations: 204
  • Automated feedback and writing: a multi-level meta-analysis of effects on students' performance
    J Fleckenstein, L Liebenow, J Meyer
    Frontiers in Artificial Intelligence 6, 2023
    Citations: 171
  • The relationship of personality traits and different measures of domain-specific achievement in upper secondary education
    J Meyer, J Fleckenstein, J Retelsdorf, O Köller
    Learning and Individual Differences 69, 45-59, 2019
    Citations: 126
  • The long‐term proficiency of early, middle, and late starters learning English as a foreign language at school: A narrative review and empirical study
    J Baumert, J Fleckenstein, M Leucht, O Köller, J Möller
    Language Learning 70 (4), 1091-1135, 2020
    Citations: 94
  • Linking TOEFL iBT® writing rubrics to CEFR levels: Cut scores and validity evidence from a standard setting study
    J Fleckenstein, S Keller, M Krüger, RJ Tannenbaum, O Köller
    Assessing Writing 43, 100420, 2020
    Citations: 80
  • Erfolgreich integrieren - die Staatliche Europa-Schule Berlin
    J Möller, F Hohenstein, J Fleckenstein, O Köller, J Baumert
    Waxmann Verlag, 2017
    Citations: 73
  • Empirische Arbeit: Comparing generative AI and expert feedback to students’ writing: Insights from student teachers
    T Jansen, L Höft, L Bahr, J Fleckenstein, J Möller, O Köller, J Meyer
    Psychologie in Erziehung und Unterricht 71 (2), 80-92, 2024
    Citations: 67
  • Is a long essay always a good essay? The effect of text length on writing assessment
    J Fleckenstein, J Meyer, T Jansen, S Keller, O Köller
    Frontiers in Psychology 11, 562462, 2020
    Citations: 67
  • English writing skills of students in upper secondary education: Results from an empirical study in Switzerland and Germany
    SD Keller, J Fleckenstein, M Krüger, O Köller, AA Rupp
    Journal of Second Language Writing 48, 100700, 2020
    Citations: 63
  • Pädagogische und didaktische Anforderungen an die häusliche Aufgabenbearbeitung
    O Köller, J Fleckenstein, K Guill, J Meyer
    Langsam vermisse ich die Schule…“. Schule während und nach der Corona …, 2020
    Citations: 48
  • Conscientiousness and cognitive ability as predictors of academic achievement: Evidence of synergistic effects from integrative data analysis
    J Meyer, O Lüdtke, FTC Schmidt, J Fleckenstein, U Trautwein, O Köller
    European Journal of Personality 38 (1), 36-52, 2024
    Citations: 46
  • Teachers’ judgement accuracy concerning CEFR levels of prospective university students
    J Fleckenstein, M Leucht, O Köller
    Language Assessment Quarterly 15 (1), 90-101, 2018
    Citations: 40
  • Wer hat Biss? Beharrlichkeit und beständiges Interesse von Lehramtsstudierenden
    J Fleckenstein, FTC Schmidt, J Möller
    Psychologie in Erziehung und Unterricht 61 (4), 281-286, 2014
    Citations: 40
  • Mehrsprachigkeit als Ressource
    J Fleckenstein, J Möller, J Baumert
    Zeitschrift für Erziehungswissenschaft 21 (1), 97-120, 2018
    Citations: 38
  • Proficient beyond borders: assessing non-native speakers in a native speakers’ framework
    J Fleckenstein, M Leucht, HA Pant, O Köller
    Large-scale Assessments in Education 4 (1), 19, 2016
    Citations: 38
  • Promoting mathematics achievement in one-way immersion: Performance development over four years of elementary school
    J Fleckenstein, SK Gebauer, J Möller
    Contemporary Educational Psychology 56, 228-235, 2019
    Citations: 37