The Rebuttal

Reviewer 2 was brutal. Draft the response that wins the day.

Turn a scorched-earth peer review into a calm, surgical, point-by-point response — with AI as your adversarial pre-reviewer.

You opened the decision email, scrolled to Reviewer 2, and felt your heart sink. This challenge is about turning that moment into one of the most persuasive documents you’ll write as a researcher.

35 min · Intermediate · Teams of 3–5 · Any chat AI

The goal

You’re holding a harsh-but-fair peer review of a (fictional) pain-imaging manuscript. Your team’s job: produce a point-by-point response letter that an editor would read and think, “These authors are reasonable, rigorous, and worth publishing.” It must concede what’s genuinely wrong, defend what’s genuinely defensible, and stay unfailingly professional even where the reviewer was not. Bring the reviewer around without groveling, and without picking a fight. How you get there is up to you.

Why it matters

Peer review is a constant of academic life, and the response-to-reviewers letter is one of the highest-leverage documents you will write — a strong one can move a paper from “reject” to “accept,” and a defensive one can sink a salvageable manuscript. Yet almost no graduate program teaches it. In pain science especially, reviewers come from different traditions (fMRI skeptics, QST purists, clinical-translation advocates), so a single manuscript can draw critiques that pull in opposite directions. Learning to triage criticism, separate the fixable from the fatal, and write back with calm authority is a lasting career skill. AI is well suited to help here: it never takes the review personally, it can role-play a skeptical reviewer to stress-test your rebuttal before the editor does, and it can help you find gracious phrasing when you’re still too annoyed to find it yourself.

Run of show

0:00–0:05 · Challenge introduction (5 min)
0:05–0:20 · Work in your group (15 min)
0:20–0:22 · Post your best prompt (2 min)
0:22–0:32 · Share & debrief (10 min)
0:32–0:35 · Reset (3 min)

Bad prompt to better prompt

Weak prompt

Write a response to these reviewer comments. [paste review]

Why it disappoints: you get a generic, slightly defensive letter that thanks the reviewer in the abstract, restates the critiques without really answering them, promises vague “additional analyses,” and never tells you which points to concede versus contest. It has no structure an editor can follow and no awareness of what’s actually fixable in a revision.

Strong prompt

You are a senior pain-neuroimaging researcher who has handled 200+ revisions as a corresponding author. Below is Reviewer 2’s report on our fMRI study of placebo analgesia (n=24, healthy adults, thermal pain).

First, for EACH numbered comment: (a) classify it as VALID (we must change the manuscript), PARTIALLY VALID (concede the point but it doesn’t undermine our conclusions), or MISDIRECTED (reviewer misread us — we clarify, we don’t change findings); (b) give one sentence of reasoning.

Then draft a point-by-point response letter. For each point: open by thanking/agreeing where honest, state precisely what we changed or will add (or why no change is warranted), and cite where in the manuscript the change appears (page and line). Tone: warm, confident, never defensive, never groveling. Flag any place where we’d need real new data versus a reframing. Use “we” throughout.

Why it works: it assigns an expert identity, forces triage before drafting (so concession and defense are deliberate, not accidental), demands manuscript-anchored specifics instead of vague promises, fixes the tone explicitly, and separates “needs new data” from “needs better wording,” which is exactly the call you have to make in a real revision.

Prompting moves to try

Decompose before you draft. Ask the AI to first triage every comment (valid / partially valid / misdirected) and only then write. Drafting and judging at the same time is what produces mushy letters.
Role-prompt the author voice. “You are a senior corresponding author known for gracious, surgical rebuttals.” Then, separately, “You are the editor — would this letter make you want to send the paper back to Reviewer 2?”
Use AI as an adversarial pre-reviewer. Paste your draft response and prompt: “You are Reviewer 2 reading our reply. Where would you push back, feel dismissed, or stay unconvinced? Be harsh.” Then patch the holes before a human finds them.
Ask for a confidence score. “For each of your responses, rate 1–5 how persuasive it is and flag the weakest one.” Self-evaluation can surface the point most likely to sink you.
Tone surgery. “Rewrite this paragraph to remove any defensiveness or sarcasm while keeping the scientific point intact.” Great for the comment that made your blood boil.
Let it improve your prompt. “Before answering, what three pieces of context about our study would make your response letter stronger?” — then feed those back.
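
The first three moves chain naturally: triage, then draft, then adversarial review. Below is a minimal sketch of that chain. It assumes a hypothetical `ask` callable that sends one prompt to whatever chat AI you use and returns the reply as text; the function names and prompt wording are illustrative, not a prescribed workflow.

```python
from typing import Callable

# Labels taken from the strong prompt above.
TRIAGE_LABELS = ("VALID", "PARTIALLY VALID", "MISDIRECTED")


def build_triage_prompt(review: str) -> str:
    """Step 1: judge every comment before any drafting happens."""
    labels = ", ".join(TRIAGE_LABELS)
    return (
        "You are a senior corresponding author known for gracious, "
        "surgical rebuttals.\n"
        f"For EACH numbered comment below, classify it as one of: {labels}. "
        "Give one sentence of reasoning per comment. Do not draft a reply yet.\n\n"
        f"REVIEW:\n{review}"
    )


def build_draft_prompt(review: str, triage: str) -> str:
    """Step 2: draft the letter, constrained by the triage from step 1."""
    return (
        "Using the triage below, draft a point-by-point response letter. "
        "Tone: warm, confident, never defensive, never groveling. "
        "Flag any point that needs new data rather than a reframing. "
        'Use "we" throughout.\n\n'
        f"TRIAGE:\n{triage}\n\nREVIEW:\n{review}"
    )


def build_adversarial_prompt(draft: str) -> str:
    """Step 3: have the model read the draft as Reviewer 2 would."""
    return (
        "You are Reviewer 2 reading our reply. Where would you push back, "
        "feel dismissed, or stay unconvinced? Be harsh.\n\n"
        f"REPLY:\n{draft}"
    )


def run_rebuttal_chain(review: str, ask: Callable[[str], str]) -> dict[str, str]:
    """Run triage, draft, and critique in order, feeding each result forward.

    `ask` is a hypothetical stand-in for your chat AI of choice.
    """
    triage = ask(build_triage_prompt(review))
    draft = ask(build_draft_prompt(review, triage))
    critique = ask(build_adversarial_prompt(draft))
    return {"triage": triage, "draft": draft, "critique": critique}
```

In a plain chat window the same chain is three separate messages; the point of the sketch is only that each step receives the previous step's output instead of being asked to judge and draft at once.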

Starter materials

Paste the mock review below into your AI of choice. It concerns a fictional manuscript so you can focus entirely on the craft of the response.

The manuscript under review (one-paragraph summary)

Title: “Predicting Placebo Analgesia from Resting-State Connectivity of the Periaqueductal Gray.”
Design: 24 healthy adults underwent a conditioned placebo-analgesia paradigm with thermal pain. Resting-state fMRI was collected before conditioning. The authors report that baseline PAG–rostral ACC connectivity correlates with the magnitude of placebo analgesia (r = 0.52, p = 0.009) and conclude that resting connectivity is a “biomarker” that could “identify placebo responders in clinical trials.”

Reviewer 2 — for revision (harsh but fair)

The manuscript tackles an interesting question, but in its current form the conclusions substantially outrun the evidence. I have several serious concerns.

1. The sample is alarmingly small (n = 24) for a brain-behavior correlation, yet the authors report a single r = 0.52 as though it were stable. With this n, the 95% confidence interval on that correlation is enormous and almost certainly spans values that would change the story entirely. There is no power analysis, no bootstrap, no cross-validation, nothing. Calling an in-sample correlation a “biomarker” is, frankly, exactly the kind of overclaiming that has made this literature so hard to trust.

2. The PAG is roughly the size of a few voxels at the field strength reported, sits in a region notorious for physiological noise from cardiac and respiratory pulsatility, and the Methods say nothing about physiological noise correction. How do the authors know they are measuring PAG connectivity and not a heartbeat?

3. The placebo-analgesia outcome is built on conditioned expectations, but the design appears to confound expectation with simple habituation to repeated thermal stimulation. Without a no-treatment control arm, I do not see how the “analgesia” can be attributed to placebo rather than the well-known decline in pain ratings over repeated trials.

4. The framing throughout is breathless. The Discussion claims this could “identify placebo responders in clinical trials,” but nothing here was tested prospectively or out of sample. At minimum the clinical-translation language must be removed. I am also not convinced the authors have read the relevant prior work, as two obvious precedents go uncited.

Response scaffold (fill this in; works for any comment)

For each point, draft three moves in order:

Acknowledge — one honest sentence of agreement or thanks (“The reviewer is right that…”).
Act — the concrete change: a new analysis, a softened claim, an added limitation, a clarification with a manuscript location (“We have added… see p. X, lines Y”).
Anchor — quote or paraphrase the revised text so the reviewer sees the result without hunting for it.

Triage cheat-sheet (decide before you draft)

Concede & fix (cheap, strengthens the paper): the overclaiming/“biomarker” language (#1, #4), missing power/CI reporting (#1), missing physiological-noise detail (#2).
Defend with evidence or reframing: whether the effect survives cross-validation (#1 — run it and report it), whether PAG signal is plausibly real given your acquisition (#2).
The genuinely hard one: the habituation confound (#3) may need new data or a strong design-based argument — be honest if it’s a stated limitation rather than something you can fully rebut. Reviewers respect “you’re right, and here is exactly how far we can go” far more than a dodge.
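
Before conceding the confidence-interval point in comment #1, it helps to know how wide the interval really is. The sketch below computes an approximate 95% interval for r = 0.52 at n = 24 using the Fisher z-transform. It assumes roughly bivariate-normal data, which the manuscript summary does not confirm, and `fisher_ci` is an illustrative helper, not the fictional authors' analysis.

```python
import math
from statistics import NormalDist


def fisher_ci(r: float, n: int, level: float = 0.95) -> tuple[float, float]:
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transform: atanh(r) is treated as approximately
    normal with standard error 1 / sqrt(n - 3). Assumes bivariate normality.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                             # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)                   # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


low, high = fisher_ci(0.52, 24)
print(f"r = 0.52, n = 24: 95% CI roughly [{low:.2f}, {high:.2f}]")
# prints: r = 0.52, n = 24: 95% CI roughly [0.15, 0.76]
```

Under these assumptions the interval excludes zero but runs from a weak to a strong correlation. That supports an honest middle position in the letter: concede that the estimate is imprecise and report the interval, without abandoning the finding.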

Debrief questions

Which critique did your team decide to fully concede, and did conceding it make the letter weaker or stronger in the editor’s eyes?
Comment #3 (the habituation confound) is the one you probably can’t fix without new data. How did your AI handle a critique that has no clean answer — did it bluff, or did it tell the truth gracefully?
When you ran the AI as an adversarial Reviewer 2 against your own draft, what hole did it find that you’d missed?
Where did the AI’s tone go wrong — too obsequious, too combative, or too vague — and what prompt fixed it?
Did any group get the AI to invent a citation, statistic, or analysis that doesn’t exist? How would you catch that before it reached an editor?

Level up

Make it bilingual in register. Produce both the formal point-by-point letter to the editor AND the tracked-changes-style summary of what actually moved in the manuscript — and check they tell the same story.
Stress-test against a second reviewer. Have the AI generate a plausible Reviewer 1 whose preferences contradict Reviewer 2’s (e.g., wants more clinical framing, not less), then write a response that satisfies both without contradicting yourself.
Quantify the rescue. Ask the AI to score your manuscript’s odds of acceptance before and after your response, with the reasoning — then argue with its estimate.
