Overview

LUSR provides style representations trained with the approach described in "Unsupervised Style Representation Learning for AI-Text Detection via Paraphrase Inversion". The representations are trained in an unsupervised manner, that is, without any authorship labels. We've found them to perform well for machine-text detection in particular, and they also show some transfer to the task of authorship verification (see below).

We expect to release more performant versions of LUSR in the future. We'll link all such versions here.

Few-Shot Machine-text Detection

The following table shows machine-text detection performance on the M4 dataset, using the same setup as "Few-Shot Detection of Machine-Generated Text using Style Representations".

Zero-Shot Approaches      AUROC(1)
Binoculars                69
FastDetectGPT             65
Rank                      50
LogRank                   50
LRR                       50
Revise-Detect             60
DNA-GPT                   51

Supervised Classifiers    AUROC(1)
Rank                      50
Longformer                58
RADAR                     50
RemoDetect                64

Few-Shot Approaches       k=1    k=5
LUAR CRUD                 60     87
LUAR Multi-LLM            61     88
LUAR Multidomain          60     89
CISR                      58     84
ProtoNet                  61     87
SBERT                     52     62
LUSR                      69     96
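
To make the few-shot columns concrete, here is a minimal sketch of one common way to use style representations for few-shot detection: average the embeddings of k known machine-written documents into a prototype, score each query by cosine similarity to it, and summarize with AUROC. This is an assumption for illustration, not the evaluation code behind the table. The embeddings are synthetic random vectors rather than LUSR outputs, and the helper names (`few_shot_scores`, `auroc`) are hypothetical.

```python
import numpy as np

def l2_normalize(x):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def few_shot_scores(support, queries):
    """Cosine similarity between each query and the mean of the k support embeddings."""
    prototype = l2_normalize(support.mean(axis=0))
    return l2_normalize(queries) @ prototype

def auroc(scores, labels):
    """AUROC as the probability that a positive outscores a negative (ties count half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

# Synthetic stand-ins for style embeddings (assumption: NOT real LUSR outputs).
rng = np.random.default_rng(0)
dim = 32
machine_style = rng.normal(size=dim)

# k = 5 example documents from the machine source.
support = machine_style + 0.5 * rng.normal(size=(5, dim))
machine_queries = machine_style + 0.5 * rng.normal(size=(100, dim))
human_queries = rng.normal(size=(100, dim))

queries = np.vstack([machine_queries, human_queries])
labels = np.array([1] * 100 + [0] * 100)  # 1 = machine, 0 = human

demo_auroc = auroc(few_shot_scores(support, queries), labels)
print(f"AUROC on synthetic data: {demo_auroc:.2f}")
```

With real data, `support` and `queries` would be replaced by embeddings of actual documents; the k=1 case corresponds to a support set with a single row.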

Authorship Verification

AUROC is averaged across PAN13/14/15/20/21 authorship verification tasks. The comparison includes supervised and unsupervised style representations, with LUSR evaluated without any training on authorship labels.

Model             AUROC

Supervised
LUAR              78
MSR               73
CISR              70
StyleDistance     73

Unsupervised
LUSR              74
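
As a rough illustration of how a style representation can be applied to authorship verification, the sketch below scores a pair of documents by the cosine similarity of their embeddings and applies a threshold. Both the scoring rule and the `threshold` value are assumptions made for illustration; the card does not state how the PAN scores were computed, and the function names are hypothetical.

```python
import numpy as np

def verification_score(emb_a, emb_b):
    """Cosine similarity between two document embeddings; higher suggests the same author."""
    a = np.asarray(emb_a, dtype=float)
    b = np.asarray(emb_b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_author(emb_a, emb_b, threshold=0.5):
    """Hypothetical decision rule: the threshold is illustrative and would need tuning."""
    return verification_score(emb_a, emb_b) >= threshold

# Toy vectors standing in for embeddings (assumption: not real model outputs).
doc_a = [1.0, 0.0]
doc_b = [2.0, 0.1]   # points in nearly the same direction as doc_a
doc_c = [0.0, 1.0]   # orthogonal to doc_a

print(same_author(doc_a, doc_b), same_author(doc_a, doc_c))
```

Because AUROC depends only on the ranking of scores, the raw `verification_score` values can be evaluated directly without choosing a threshold.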

Model size: 0.1B params (Safetensors, tensor type F32)