I’ve been reading student essays for fifteen years now, and something shifted around 2022. Not gradually. Suddenly. One day I’m grading papers that feel like papers–messy, human, occasionally brilliant in unexpected ways. The next semester, I’m staring at submissions that read like they’ve been processed through some kind of linguistic smoothing machine. Polished. Sterile. Suspiciously perfect.
The rise of ChatGPT and similar language models has forced educators, employers, and anyone who cares about authentic writing to develop new instincts. I don’t mean paranoia. I mean actual discernment. And I’ve learned that detecting AI-written essays isn’t about running text through some magic detector tool. Those exist, sure, but they’re unreliable. It’s about understanding how human writing actually works and recognizing when something fundamental is missing.
The Texture Problem
Here’s what I notice first: texture. Human writing has friction. It has moments where the writer is clearly thinking, revising mid-sentence, doubling back. When I read an essay written by a real person, I can almost feel the writer’s uncertainty. They’ll start a sentence one way, catch themselves, and pivot. They’ll use a word that’s slightly imprecise because it’s the closest thing they could grab from their vocabulary in that moment.
AI writing doesn’t have this. It’s too smooth. Too considered. Every transition flows into the next one with an almost mathematical precision. The vocabulary is consistently sophisticated without ever feeling strained. There’s no moment where you think, “Oh, they were reaching for a word they weren’t quite sure about.” Everything lands exactly where it’s supposed to.
I started paying attention to what linguists call “disfluency markers”–those little verbal tics that humans produce when they’re actually composing. Words like “actually,” “I mean,” “sort of,” “kind of.” These aren’t filler. They’re evidence of thought happening in real time. AI tends to minimize these because they’re technically redundant. But that’s exactly why their absence is so telling.
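The idea of counting disfluency markers can be made concrete. The sketch below is purely illustrative: the marker list, the per-1,000-words normalization, and the sample sentences are my own assumptions, not an established linguistic threshold.

```python
import re

# Hypothetical marker list -- a few of the disfluencies mentioned above.
# Real disfluency analysis would use a much richer inventory.
MARKERS = ["actually", "i mean", "sort of", "kind of", "you know", "well,"]

def disfluency_rate(text: str) -> float:
    """Return disfluency markers per 1,000 words (0.0 for empty text)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    lowered = text.lower()
    hits = sum(lowered.count(marker) for marker in MARKERS)
    return 1000 * hits / len(words)

human = "Well, I mean, the data sort of suggests otherwise, actually."
smooth = "The data suggests otherwise, a conclusion supported by the evidence."
print(disfluency_rate(human) > disfluency_rate(smooth))  # True
```

A low rate proves nothing on its own; careful writers also edit these markers out. It only flags text worth a closer human read.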
Argument Architecture and the Missing Mess
When I ask students to write an essay, I’m not just asking them to present information. I’m asking them to think. That means they’ll often start with one thesis and discover halfway through that they need to revise it. They’ll realize their second point actually contradicts their first point, and they’ll have to work through that contradiction. Real thinking is messy.
AI-generated essays don’t have this problem because they’re not thinking. They’re predicting. They’re generating the most statistically likely next sequence of words based on patterns in their training data. This means the argument structure is almost always perfectly balanced. Three main points. Each point gets roughly equal development. The conclusion restates the thesis without adding anything new.
This is actually a red flag. Human writers rarely produce such symmetrical arguments. We get passionate about one point and spend three times as much space on it as the others. We realize midway through that we need to concede something to the opposing view. We find ourselves defending a position we didn’t expect to defend when we started writing.
I’ve also noticed that AI essays almost never contain genuine intellectual struggle. There’s no moment where the writer says, “This is harder to defend than I thought.” There’s no moment of real vulnerability or uncertainty. Everything is presented with the same level of confidence, which is actually a sign that nothing is being genuinely questioned.
The Specificity Paradox
Here’s something counterintuitive: AI essays often include specific details, but those details are frequently generic. They’ll cite a study without quite getting the methodology right. They’ll reference a historical event but miss the crucial context that would make it meaningful. They’ll use a quote that sounds right but isn’t quite accurate.
I started checking. I’d see a reference to a particular research study and look it up. The citation would be close but not exact. The details would be plausible but slightly off. It’s as if the AI is hallucinating specificity. It knows that specific details are what make writing convincing, so it generates them. But it doesn’t actually know whether they’re true.
Real student writing has a different problem. When students include specific details, they’re usually either accurate or obviously wrong. They either did the research or they didn’t. There’s rarely this uncanny valley of plausibility-without-accuracy.
Voice and the Absence of Personality
Every writer has a voice. Even bad writers. Even students who are just starting to develop their writing skills have some quirk, some preference, some way of approaching language that’s distinctly theirs. I can usually identify a student’s essay just by reading the first paragraph because I know how they think and how they express that thinking.
AI doesn’t have a voice. It has a style. The style is professional, measured, and completely interchangeable. You could swap the AI-written essay from one student with the AI-written essay from another student, and they’d be nearly indistinguishable except for the topic.
This is why I’ve started asking students to write in class, under timed conditions. It’s not a perfect solution, but it forces them to produce writing that’s closer to their actual voice because there’s no time for extensive revision or external assistance. The difference between what they produce in class and what they submit online can be revealing.
Red Flags in Structure and Language
I’ve compiled a mental checklist of things that make me suspicious:
- Excessive use of transition phrases like “Furthermore,” “In addition,” and “It is important to note that” appearing more than once per paragraph
- Every paragraph beginning with a topic sentence that directly restates the thesis
- Vocabulary that’s consistently sophisticated but never quite surprising or unexpected
- Absence of contractions (AI tends to avoid these)
- Perfect grammar throughout, including in places where a human writer might intentionally break grammar for effect
- No citations or sources, combined with claims that sound authoritative
- Paragraphs that are exactly the same length, suggesting algorithmic generation
- Conclusions that add nothing new and simply restate the introduction
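A few items on this checklist lend themselves to automation. The sketch below encodes three of them as rough heuristics; the phrase list, the thresholds, and the flag names are all my assumptions, and any output should prompt a manual read, never serve as a verdict on its own.

```python
import re
import statistics

# Stock transitions and a contraction pattern -- illustrative, not exhaustive.
TRANSITIONS = ["furthermore", "in addition", "it is important to note"]
CONTRACTION = re.compile(r"\b\w+'(?:s|t|re|ve|ll|d|m)\b", re.IGNORECASE)

def red_flags(essay: str) -> list[str]:
    """Return a list of heuristic flags raised by the essay text."""
    flags = []
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]
    lower = essay.lower()

    # More stock transition phrases than paragraphs: transition-heavy prose.
    if paragraphs and sum(lower.count(t) for t in TRANSITIONS) > len(paragraphs):
        flags.append("transition-heavy")

    # No contractions anywhere in the essay.
    if not CONTRACTION.search(essay):
        flags.append("no-contractions")

    # Suspiciously uniform paragraph lengths (low relative spread).
    lengths = [len(p.split()) for p in paragraphs]
    if len(lengths) >= 3:
        spread = statistics.pstdev(lengths) / statistics.mean(lengths)
        if spread < 0.15:  # assumed cutoff, chosen for illustration only
            flags.append("uniform-paragraphs")
    return flags
```

Each flag mirrors a checklist item, but every one has benign explanations, which is exactly why this kind of screen belongs at the start of the process, not the end.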
The Context Matters
I should be clear: not all polished writing is AI-generated. Some students are genuinely skilled writers. Some have worked with tutors or a campus writing center. The difference is that when I talk to these students about their essays, they can discuss the choices they made. They can explain why they structured an argument a particular way. They can defend a claim they made.
When I ask a student who submitted an AI-generated essay to discuss their work, there’s often a disconnect. They can’t quite articulate why they made certain choices because they didn’t make them. They can explain the general topic, but not the specific reasoning behind the specific sentences.
I’ve also learned that essay-writing services have become more sophisticated. Some now offer essays that are designed to look more human, more flawed. They’ll intentionally include minor grammatical errors or slightly awkward phrasing. But even these have a quality of intentional imperfection that’s different from genuine human writing. It’s like someone trying to act natural–you can usually tell.
Detection Tools and Their Limitations
There are tools available now. Turnitin has added AI detection. OpenAI released a detection tool, then quietly deprecated it. Various startups have built detectors. But here’s what I’ve learned: they’re all imperfect. False positives are common. A student with an ESL background might score high on AI detection because their writing is unusually formal. A student using sophisticated vocabulary might trigger alerts.
I’ve also seen AI-generated text that passes these detectors. The generation technology is evolving faster than the detection methods. It’s an arms race, and the detectors are losing.
| Detection Method | Reliability | False Positive Rate | Best Used For |
|---|---|---|---|
| Turnitin AI Detection | Moderate | High | Initial screening |
| Manual reading and analysis | High | Low | Confirmation and context |
| Student interviews about work | Very High | Very Low | Determining authorship |
| In-class writing samples | Very High | Very Low | Establishing baseline voice |
| Third-party AI detectors | Low to Moderate | Very High | Not recommended alone |
The Bigger Picture
I think about this differently now than I did two years ago. When I first started noticing AI-generated essays, I was angry. I felt like students were cheating, taking shortcuts, avoiding the actual work of thinking and writing. I still think that’s true in some cases.
But I’ve also started to wonder whether the problem is partly with how we assign writing. If students are so desperate to avoid the work that they’re willing to risk academic integrity violations, maybe we need to reconsider what we’re asking them to do. Maybe writing a lab report shouldn’t be something they can outsource. Maybe the essay assignment itself needs to evolve.
That doesn’t mean accepting AI-generated work. It means being more intentional about what we’re actually trying to teach when we assign writing. It means building in processes that make it harder to cheat and more rewarding to engage authentically.
The reality is that AI isn’t going away. Students will continue to have access to these tools. The question isn’t how to eliminate AI from education. It’s how to teach students to think critically about when and how to use it.
