Google Research Shows Prompt Repetition Boosts Accuracy by Up to 76 Points

Repeating an instruction or question twice in a prompt can dramatically improve the correctness of large language model (LLM) outputs. Google Research tested this simple copy‑paste trick on several popular non‑reasoning models and observed accuracy gains of up to 76 percentage points, turning low‑performing answers into near‑perfect results with no retraining or architectural changes.

Experiment Overview: Repeating Prompts

The study evaluated seven widely used non‑reasoning LLMs: Google Gemini 2.0 Flash, Gemini 2.0 Flash Lite, OpenAI GPT‑4o, GPT‑4o‑mini, Anthropic Claude 3 Haiku, Claude 3.7 Sonnet, and DeepSeek V3. Each model received the same query in two formats: a single‑prompt baseline and a duplicated version in which the exact question appeared twice, back‑to‑back.

Example transformation:

How many columns are at the entrance of St. Peter’s Basilica in the Vatican? How many columns are at the entrance of St. Peter’s Basilica in the Vatican?
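
In code, the transformation is trivial. A minimal sketch (the helper name and whitespace separator are illustrative choices, not from the paper):

    def duplicate_prompt(prompt: str, separator: str = " ") -> str:
        """Return the prompt repeated back-to-back, matching the study's duplicated format."""
        return f"{prompt}{separator}{prompt}"

    question = "How many columns are at the entrance of St. Peter's Basilica in the Vatican?"
    print(duplicate_prompt(question))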

Benchmarks and Test Suites

  • ARC
  • OpenBookQA
  • GSM8K
  • MMLU‑Pro
  • MATH
  • Custom NameIndex challenge
  • Custom MiddleMatch challenge
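
A hypothetical evaluation harness for comparing the two prompt formats on any of these benchmarks might look like the sketch below; load_benchmark, ask_model, and the containment‑based grading are stand‑ins for whatever loader, client, and grader a given suite actually uses.

    from typing import Callable

    def accuracy(model: Callable[[str], str],
                 items: list[tuple[str, str]],
                 repeat: bool) -> float:
        """Score a benchmark split using single or duplicated prompts."""
        correct = 0
        for question, answer in items:
            prompt = f"{question} {question}" if repeat else question
            # Crude grading: check whether the gold answer appears in the reply.
            if answer.lower() in model(prompt).lower():
                correct += 1
        return correct / len(items)

    # items = load_benchmark("GSM8K")          # hypothetical loader
    # single = accuracy(ask_model, items, repeat=False)
    # doubled = accuracy(ask_model, items, repeat=True)
    # print(f"single: {single:.2%}  duplicated: {doubled:.2%}")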

Key Results

Across 70 distinct test configurations, the duplicated‑prompt approach outperformed the single‑prompt baseline in 47 cases (67%) and never reduced accuracy. The most striking improvement was a jump from 21.33% to 97.33% on a single task, a gain of 76 percentage points.

Why Repetition Works

Non‑reasoning transformer models generate tokens left‑to‑right, attending only to previously seen tokens. Presenting the prompt twice gives the model a second look at the full context: every token in the second copy can attend to the entire first copy, reinforcing the semantic cues and correcting early misinterpretations, as the toy attention mask below illustrates. Reasoning‑oriented models already re‑phrase the problem internally, so they gain less from external repetition.
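
A small numpy sketch of this intuition, using a toy causal mask (the token count n is arbitrary):

    import numpy as np

    # Toy causal attention mask for a prompt of n tokens repeated twice.
    # Entry [i, j] is 1 when position i may attend to position j (j <= i).
    n = 4
    mask = np.tril(np.ones((2 * n, 2 * n), dtype=int))

    # Rows n..2n-1 hold the second copy of the prompt; columns 0..n-1 hold the
    # first copy. Every second-copy token sees the whole first copy, so the
    # model effectively re-reads the question with the full context in view.
    assert mask[n:, :n].all()
    print(mask)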

Implications for Developers and Enterprises

The method requires no architectural changes, fine‑tuning, or decoding tweaks, and it adds little latency: the duplicated prompt tokens are processed in parallel during a single prefill pass, although input token counts (and per‑token costs) roughly double. It offers an immediate accuracy uplift for real‑time chatbots, code assistants, and low‑latency search augmentation that depend on fast, cost‑effective LLM inference.
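
A quick way to see the cost side of that trade‑off, assuming an OpenAI‑style tokenizer via the tiktoken library:

    import tiktoken  # pip install tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    question = "How many columns are at the entrance of St. Peter's Basilica in the Vatican?"

    single = len(enc.encode(question))
    doubled = len(enc.encode(f"{question} {question}"))
    # Prompt tokens roughly double; the generated answer length is unchanged.
    print(f"single: {single} tokens, duplicated: {doubled} tokens")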

Practical Prompt Engineering Tip

When using a non‑reasoning model, simply copy the instruction or question and paste it immediately after itself before sending the request. This low‑effort tweak can be combined with other strategies, such as few‑shot examples, for further gains; a sketch follows.
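
One hypothetical way to fold the trick into an existing prompt builder (the function name, Q/A formatting, and newline separators are all assumptions):

    def build_prompt(instruction: str,
                     examples: list[tuple[str, str]] | None = None) -> str:
        """Prepend optional few-shot examples, then state the instruction twice."""
        doubled = f"{instruction}\n{instruction}"
        if not examples:
            return doubled
        shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
        return f"{shots}\n{doubled}"

    # build_prompt("How many columns are at the entrance of St. Peter's Basilica?")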

Caveats and Future Directions

The research focused on non‑reasoning models; effects on larger, reasoning‑heavy systems remain untested. Performance may vary with prompt length, token limits, or domain‑specific language. Future work could explore optimal repetition counts, interactions with other prompting techniques, and applicability to multimodal generators.

Bottom Line

Google Research demonstrates that simply stating an instruction twice, a straightforward copy‑paste, can transform mediocre LLM answers into near‑perfect ones for many fast, non‑reasoning models. The finding is a reminder that some of the most powerful engineering solutions are also the simplest.