Jakab Mate Buda

@elte.hu

Instructor, Department of Statistics, Faculty of Social Sciences
Eötvös Loránd University

RESEARCH INTERESTS

Natural Language Processing, Statistics

6

Scopus Publications

94

Scholar Citations

4

Scholar h-index

3

Scholar i10-index

SCOPUS PUBLICATIONS

  • The language of discrimination: assessing attention discrimination by Hungarian local governments
    Jakab Buda, Renáta Németh, Bori Simonovits, Gábor Simonovits
    Language Resources and Evaluation, 2023
    In our study we assess the responsiveness of Hungarian local governments to requests for information by Roma and non-Roma clients, relying on a nationwide correspondence study. Our paper has both methodological and substantive relevance. The methodological novelty is that we treat discrimination as a classification problem and study to what extent emails written to Roma and non-Roma clients can be distinguished, which in turn serves as a metric of discrimination in general. We show that it is possible to detect discrimination in textual data in an automated way without human coding, and that machine learning (ML) may detect features of discrimination that human coders may not recognize. To the best of our knowledge, our study is the first attempt to assess discrimination using ML techniques. From a substantive point of view, our study focuses on linguistic features the algorithm detects behind the discrimination. Our models worked significantly better compared to random classification (the accuracy of the best of our models was 61%), confirming the differential treatment of Roma clients. The most important predictors showed that the answers sent to ostensibly Roma clients are not only shorter, but their tone is less polite and more reserved, supporting the idea of attention discrimination, in line with the results of Bartos et al. (2016). A higher level of attention discrimination is detectable against male senders, and in smaller settlements. Also, our results can be interpreted as digital discrimination in the sense in which Edelman and Luca (2014) use this term.
  • The impact of depression forums on illness narratives: a comprehensive NLP analysis of socialization in e-mental health communities
    Domonkos Sik, Márton Rakovics, Jakab Buda, Renáta Németh
    Journal of Computational Social Science, 2023
    While depression is globally on the rise, the mental health sector struggles with handling the increased number of cases, especially since the pandemic. These circumstances have resulted in an increased interest in the e-mental health sector. The dataset consists of 67,857 posts from the most popular English-language online health forums between 15 February 2016 and 15 February 2019. The posts were first automatically labelled (biomedical vs. psy framing) via deep learning; second, the time series of framing types of recurring forum users were analysed; third, the clusters of biomedical and psy patterns were analysed; fourth, the discursive characteristics of each cluster were analysed with the help of topic modelling. Five ideal-typical patterns of forum socialization are described: the first and the second clusters express the development of a ‘recovery helper’ role, either by opposing expert discourses or by identifying with the psy discourses; the third cluster expresses the acquisition of a substantively diffuse, uncertain role; the fourth and fifth clusters refer to a trajectory leading to the incorporation of a biomedically framed patient role, or a therapeutic psy subjectivity. Several elements of the data collection potentially undermine representativeness: online forum users, open and public forums, keyword search. The trajectories identified in our study represent various phases of a general forum socialization process: newcomers (cluster 3); settled patient role (cluster 4) or psy subjectivity (cluster 5); recovery helpers (clusters 1 and 2).
  • Trust in the household
    Endre Sik, Jakab Buda
    Szociológiai Szemle, 2023
    In this study we analyse a model that includes every household member individually but examines them jointly. Our question: how are the level and the inequality of trust among household members related to the members' sociological characteristics, the inequalities between them, and the characteristics of the household as a whole? The analysis is based on the 2015 EU-SILC database. To estimate the level of trust we used institutional and generalised trust, as well as the mean of the variable formed from them; to estimate the inequality of trust we used the dispersion of these variables. We found that a higher level of trust goes together with higher social status (good residential environment, good financial situation, higher educational attainment and more friendships) and with a higher share of women living in the household. Among households struggling with the problems of everyday life (presence of a household member with a limitation, poor housing and residential environment), the level of trust is lower. Trust is distributed more unequally among household members where educational attainment is high and friendships are numerous. With the exception of educational attainment, every inequality between household members increases the heterogeneity of trust, which suggests that a kind of "inequality package" operates within the household, and that trust is part of it.
  • Using N-grams and statistical features to identify Hate Speech Spreaders on Twitter
    CEUR Workshop Proceedings, 2021
  • An Ensemble Model Using N-grams and Statistical Features to Identify Fake News Spreaders on Twitter. Notebook for PAN at CLEF 2020
    CEUR Workshop Proceedings, 2020
  • Bot or not: A two-level approach in author profiling. Notebook for PAN at CLEF 2019
    CEUR Workshop Proceedings, 2019

RECENT SCHOLAR PUBLICATIONS

  • From Polarization to Consensus? A Comparative Analysis of Refugee and Migrant Discourse in Belgium and Hungary Across Parliamentary, Media, and Social Media Layers (2015/16 and …
    S Kiyak, J Buda, M Gosztonyi, C Meeusen, R Németh, I Barna, ...
    Etmaal 2025, Date: 2025/02/03-2025/02/04, Location: Bruges, 2025
  • A felügyelt gépi tanulás alkalmazási lehetőségei szöveges adatokon. A magyar országgyűlésben 1998–2018 között elhangzott beszédek elemzése = The application of supervised …
    JM Buda, R Németh
    STATISZTIKAI SZEMLE 102 (11), 1087-1103, 2024
  • The language of discrimination: assessing attention discrimination by Hungarian local governments
    J Buda, R Németh, B Simonovits, G Simonovits
    Language Resources and Evaluation 57 (4), 1547-1570, 2023
    Citations: 5
  • The impact of depression forums on illness narratives: a comprehensive NLP analysis of socialization in e-mental health communities
    D Sik, M Rakovics, J Buda, R Németh
    Journal of Computational Social Science 6 (2), 781-802, 2023
    Citations: 11
  • The language of discrimination: assessing attention discrimination by Hungarian local governments using machine learning
    R Nemeth, J Buda, B Simonovits
    XX ISA World Congress of Sociology (June 25-July 1, 2023), 2023
    Citations: 1
  • Using N-grams and Statistical Features to Identify Hate Speech Spreaders on Twitter.
    E Katona, J Buda, F Bolonyai
    CLEF (Working Notes), 2025-2034, 2021
    Citations: 11
  • An Ensemble Model Using N-grams and Statistical Features to Identify Fake News Spreaders on Twitter.
    J Buda, F Bolonyai
    CLEF (Working Notes), 2020
    Citations: 64
  • Bot Or Not: A Two-Level Approach In Author Profiling.
    F Bolonyai, J Buda, E Katona
    CLEF (Working Notes), 2019
    Citations: 2

MOST CITED SCHOLAR PUBLICATIONS

  • An Ensemble Model Using N-grams and Statistical Features to Identify Fake News Spreaders on Twitter.
    J Buda, F Bolonyai
    CLEF (Working Notes), 2020
    Citations: 64
  • The impact of depression forums on illness narratives: a comprehensive NLP analysis of socialization in e-mental health communities
    D Sik, M Rakovics, J Buda, R Németh
    Journal of Computational Social Science 6 (2), 781-802, 2023
    Citations: 11
  • Using N-grams and Statistical Features to Identify Hate Speech Spreaders on Twitter.
    E Katona, J Buda, F Bolonyai
    CLEF (Working Notes), 2025-2034, 2021
    Citations: 11
  • The language of discrimination: assessing attention discrimination by Hungarian local governments
    J Buda, R Németh, B Simonovits, G Simonovits
    Language Resources and Evaluation 57 (4), 1547-1570, 2023
    Citations: 5
  • Bot Or Not: A Two-Level Approach In Author Profiling.
    F Bolonyai, J Buda, E Katona
    CLEF (Working Notes), 2019
    Citations: 2
  • The language of discrimination: assessing attention discrimination by Hungarian local governments using machine learning
    R Nemeth, J Buda, B Simonovits
    XX ISA World Congress of Sociology (June 25-July 1, 2023), 2023
    Citations: 1
  • From Polarization to Consensus? A Comparative Analysis of Refugee and Migrant Discourse in Belgium and Hungary Across Parliamentary, Media, and Social Media Layers (2015/16 and …
    S Kiyak, J Buda, M Gosztonyi, C Meeusen, R Németh, I Barna, ...
    Etmaal 2025, Date: 2025/02/03-2025/02/04, Location: Bruges, 2025
  • A felügyelt gépi tanulás alkalmazási lehetőségei szöveges adatokon. A magyar országgyűlésben 1998–2018 között elhangzott beszédek elemzése = The application of supervised …
    JM Buda, R Németh
    STATISZTIKAI SZEMLE 102 (11), 1087-1103, 2024