Stylometry recognizes human and LLM-generated texts in short samples

Karol Przystalski, Jan K. Argasiński, Iwona Grabska-Gradzińska, Jeremi K. Ochab

A recent study co-authored by Jan K. Argasiński from the Computational Neuroscience group at the Sano Centre for Computational Medicine investigates the potential of stylometry — a technique traditionally applied in authorship attribution — to differentiate between human-written content and texts produced by Large Language Models (LLMs).

The researchers developed a benchmark dataset sourced from Wikipedia, featuring:

Summaries authored by humans
Summaries generated by various LLMs (GPT-3.5/4, LLaMa 2/3, Orca, Falcon)
Summaries altered by automatic summarisation tools (T5, BART, Gensim, Sumy)
Texts reworded using paraphrasing systems (Dipper, T5)

By applying decision trees and LightGBM classifiers, and incorporating both manually crafted features (via the StyloMetrix toolkit) and n-gram-based stylometric indicators, the models achieved notable results:

Up to 0.87 Matthews correlation coefficient (MCC) in a multi-class (7-way) classification task
Between 0.79 and 1.0 accuracy in binary classification, with detection of GPT-4 texts reaching 98% accuracy on balanced datasets

Interpretability analysis showed that LLM-generated texts tend to be more grammatically uniform and exhibit distinctive lexical patterns compared to human writing. These findings demonstrate that, despite the growing sophistication of AI-generated text, distinguishing between machine and human authorship remains feasible — especially for structured formats like encyclopaedia entries.

Fig. 1. Explanations for binary classification between the Wikipedia and GPT-4.

Source: www.sciencedirect.com/science/article/pii/S0957417425026181#fig0002

READ HERE

Stylometry recognizes human and LLM-generated texts in short samples
Authors: Karol Przystalski, Jan K. Argasiński, Iwona Grabska-Gradzińska, Jeremi K. Ochab
DOI: 10.1016/j.eswa.2025.129001

Stylometry recognizes human and LLM-generated texts in short samples

Karol Przystalski, Jan K. Argasiński, Iwona Grabska-Gradzińska, Jeremi K. Ochab

Artificial Intelligence Predicts GBA1 Mutated Status in Parkinson's Disease Patients

Functional and structural reorganization in brain tumors: a machine learning approach using desynchronized functional oscillations

Stability of Machine Learning Predictive Features Under Limited Data

Subscribe to our Newsletter