DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature

Mitchell, Eric; Lee, Yoonho; Khazatsky, Alexander; Manning, Christopher D.; Finn, Chelsea

arXiv:2301.11305 (cs)

[Submitted on 26 Jan 2023 (v1), last revised 23 Jul 2023 (this version, v2)]

Abstract: The increasing fluency and widespread usage of large language models (LLMs) highlight the desirability of corresponding tools aiding detection of LLM-generated text. In this paper, we identify a property of the structure of an LLM's probability function that is useful for such detection. Specifically, we demonstrate that text sampled from an LLM tends to occupy negative curvature regions of the model's log probability function. Leveraging this observation, we then define a new curvature-based criterion for judging if a passage is generated from a given LLM. This approach, which we call DetectGPT, does not require training a separate classifier, collecting a dataset of real or generated passages, or explicitly watermarking generated text. It uses only log probabilities computed by the model of interest and random perturbations of the passage from another generic pre-trained language model (e.g., T5). We find DetectGPT is more discriminative than existing zero-shot methods for model sample detection, notably improving detection of fake news articles generated by 20B parameter GPT-NeoX from 0.81 AUROC for the strongest zero-shot baseline to 0.95 AUROC for DetectGPT. See this https URL for code, data, and other project information.
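The curvature-based criterion described in the abstract can be illustrated with a minimal sketch: score a passage by how far its log probability sits above the average log probability of randomly perturbed copies. Everything named here is an assumption made for illustration and is not taken from the authors' code: `toy_log_prob` stands in for the model of interest, `toy_perturb` stands in for a mask-filling model such as T5, and the word lists, `n_perturbations`, and `seed` are arbitrary.

```python
import random
from statistics import mean
from typing import Callable

# Hypothetical toy vocabulary; a real setup would query a language model instead.
COMMON_WORDS = {"the", "cat", "sat", "on", "mat"}
REPLACEMENT_POOL = ["the", "zyx"]


def toy_log_prob(text: str) -> float:
    """Toy stand-in for a model's log probability.

    Words in COMMON_WORDS cost -1.0, every other word costs -5.0.
    """
    return sum(-1.0 if word in COMMON_WORDS else -5.0 for word in text.split())


def toy_perturb(text: str, rng: random.Random) -> str:
    """Toy stand-in for a perturbation model: swap one random word."""
    words = text.split()
    position = rng.randrange(len(words))
    words[position] = rng.choice(REPLACEMENT_POOL)
    return " ".join(words)


def perturbation_discrepancy(
    text: str,
    log_prob: Callable[[str], float],
    perturb: Callable[[str, random.Random], str],
    n_perturbations: int = 20,
    seed: int = 0,
) -> float:
    """Log probability of `text` minus the mean log probability of its perturbations.

    A clearly positive value means the passage sits near a local maximum of the
    scorer's log probability, i.e. a region of negative curvature.
    """
    rng = random.Random(seed)
    perturbed_scores = [log_prob(perturb(text, rng)) for _ in range(n_perturbations)]
    return log_prob(text) - mean(perturbed_scores)


# A passage the toy scorer likes (every word is "common") ...
likely_text = "the cat sat on the mat"
# ... and one it dislikes (every word is "rare"), with the same word count.
unlikely_text = "qux vop zed wib qux nar"

score_likely = perturbation_discrepancy(likely_text, toy_log_prob, toy_perturb)
score_unlikely = perturbation_discrepancy(unlikely_text, toy_log_prob, toy_perturb)
print(f"likely text:   {score_likely:+.2f}")
print(f"unlikely text: {score_unlikely:+.2f}")
```

In this toy setting a perturbation can only lower or preserve the score of the passage the scorer already favours, so its discrepancy is non-negative, while the disfavoured passage can only gain, so its discrepancy is non-positive. With a real model, the two callables would wrap the model's log-probability computation and a mask-filling perturbation model, and a threshold on the score would turn it into a detection decision.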

Comments:	ICML 2023
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2301.11305 [cs.CL]
	(or arXiv:2301.11305v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2301.11305

Submission history

From: Eric A Mitchell
[v1] Thu, 26 Jan 2023 18:44:06 UTC (3,041 KB)
[v2] Sun, 23 Jul 2023 04:18:36 UTC (1,229 KB)
