This is the official repository for our EMNLP 2024 paper, "How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection".
We find that current detectors are brittle to changes in the instruction used for text generation, and we highlight the need to ensure prompt diversity when creating a detection benchmark.
To combat the misuse of Large Language Models (LLMs), many recent studies have presented LLM-generated-text detectors with highly promising performance. When users instruct LLMs to generate texts, the instruction can include different constraints depending on the user’s need. However, most recent studies do not cover such diverse instruction patterns when creating datasets for LLM detection.
In this paper, we find that even task-oriented constraints --- constraints that would naturally be included in an instruction and are not related to detection-evasion --- cause existing powerful detectors to have a large variance in detection performance. We focus on student essay writing as a realistic domain and manually create task-oriented constraints based on several factors for essay quality.
Our experiments and analysis show that:
- A task-oriented constraint has a more significant effect on the detection performance than the baseline randomness caused by generating texts multiple times (via sampling) or paraphrasing the instruction.
- The standard deviation (SD) of current detector performance on texts generated by an instruction with such a constraint reaches up to 14.4 points in F1-score.
- Overall, the constraints make LLM detection more challenging than it is without them (a drop of up to 40.3 points in F1-score).
- The high instruction-following ability of LLMs fosters the large impact of such constraints on detection performance.
Considering the remarkable speed of recent LLM development, the instruction-following ability of LLMs is likely to improve further, amplifying the effects of the constraints. Therefore, in an era of evolving LLMs, our findings call even more strongly for the development of LLM detectors that are robust to distribution shifts caused by a constraint in the instruction.
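The spread of detector performance across constraints, as summarized in the findings above, can be illustrated with a small sketch. It assumes binary labels (1 = LLM-generated, 0 = human-written) and one list of detector predictions per constraint. The constraint names, labels, and predictions are hypothetical toy values, not results from the paper, and the use of the population standard deviation is an assumption rather than the paper's confirmed protocol.

```python
from statistics import mean, pstdev


def f1_score(y_true, y_pred, positive=1):
    """F1 of the positive class, computed from raw label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_spread(y_true, predictions_by_constraint):
    """Return per-constraint F1 (in points), their mean, and their SD."""
    scores = {
        name: 100 * f1_score(y_true, preds)
        for name, preds in predictions_by_constraint.items()
    }
    values = list(scores.values())
    # Population SD is an assumption; swap in statistics.stdev if the
    # sample SD is the intended quantity.
    return scores, mean(values), pstdev(values)


# Hypothetical toy data: 4 LLM-generated and 4 human-written essays.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
predictions_by_constraint = {
    "constraint_a": [1, 1, 1, 1, 0, 0, 0, 0],
    "constraint_b": [1, 1, 0, 0, 0, 0, 0, 0],
    "constraint_c": [1, 1, 1, 0, 1, 0, 0, 0],
}

scores, avg, sd = f1_spread(y_true, predictions_by_constraint)
for name, score in scores.items():
    print(f"{name}: F1 = {score:.1f}")
print(f"mean = {avg:.1f}, SD = {sd:.1f}")
```

A large SD relative to the one obtained from repeated sampling or paraphrased instructions would indicate that the constraint itself, rather than generation randomness, drives the variation.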
- Sep 2024: 🎉 Accepted to EMNLP 2024 Findings! See you in Miami🇺🇸
- March 2024: Our constrained essay datasets are now available!
Our dataset is based on 500 pairs of essay problem statements and human-written (native-student) essays that are part of the OUTFOX dataset. The native students range from 6th to 12th grade in the U.S.
We instruct three LLMs to generate essays: GPT-4 (gpt-4-0613), ChatGPT (gpt-3.5-turbo-0613), and GPT-3 (davinci-002).
- data/llm_essays/Multiple/ includes essays in the Multiple setting: generating texts multiple times (via sampling).
- data/llm_essays/Paraphrase/ includes essays in the Paraphrase setting: generating texts via each paraphrased instruction.
- data/llm_essays/Constraint/ includes essays in the Constraint setting: generating texts via an instruction with each different constraint.
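As a starting point for exploring the data, the following sketch enumerates the files under the three setting directories. Only the directory paths come from this README; the helper name `list_essay_files` is hypothetical, and the sketch makes no assumption about file names or file format inside each directory, so it only lists and counts files.

```python
from pathlib import Path

# Setting directories named in this README.
SETTINGS = ("Multiple", "Paraphrase", "Constraint")


def list_essay_files(repo_root="."):
    """Map each setting name to a sorted list of the files found under it.

    Hypothetical helper: a missing directory yields an empty list
    rather than an error.
    """
    base = Path(repo_root) / "data" / "llm_essays"
    files = {}
    for setting in SETTINGS:
        setting_dir = base / setting
        if setting_dir.is_dir():
            files[setting] = sorted(
                p for p in setting_dir.rglob("*") if p.is_file()
            )
        else:
            files[setting] = []
    return files


if __name__ == "__main__":
    # Run from the repository root.
    for setting, paths in list_essay_files().items():
        print(f"{setting}: {len(paths)} file(s)")
```

Inspecting one of the listed files by hand is the safest way to determine the actual format before writing a loader.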
@inproceedings{Koike:EMNLPFindings2024,
title={How You Prompt Matters! {E}ven Task-Oriented Constraints in Instructions Affect {LLM}-Generated Text Detection},
author={Ryuto Koike and Masahiro Kaneko and Naoaki Okazaki},
booktitle="Findings of the Association for Computational Linguistics: EMNLP 2024",
series={EMNLP},
pages="14384--14395",
year="2024",
month=nov,
address={Miami, USA},
}