LLM Output Contract Breaks

Why begging an LLM to "only return valid JSON" will eventually crash your backend.

The idea

When you build an app that relies on an LLM (like an AI travel planner), your code expects the LLM to return data in a specific format (an "Output Contract"), usually JSON. You prompt the LLM: "Return a JSON array of cities. Do not include any other text." Most of the time, it works. But occasionally, the LLM drifts from the instructions and adds "Here is your JSON: ```json..." before the actual data, or leaves a trailing comma that strict JSON does not allow. When your backend calls JSON.parse(), it throws a SyntaxError, and unless that error is caught, the request fails and the user sees a broken app.
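To make the failure concrete, here is a minimal sketch that feeds three replies to JSON.parse(). The reply strings are invented for illustration, and `parses` is a hypothetical helper, not part of any library.

```javascript
// Three hypothetical model replies (invented for illustration).
const replies = {
  clean: '{"cities": ["Lisbon", "Porto"]}',
  wrapped: 'Here is your JSON:\n```json\n{"cities": ["Lisbon", "Porto"]}\n```',
  trailingComma: '{"cities": ["Lisbon", "Porto",]}',
};

// Returns true when JSON.parse accepts the text, false when it throws.
function parses(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

for (const [name, text] of Object.entries(replies)) {
  console.log(name, parses(text) ? "parses" : "throws SyntaxError");
}
```

Only the first reply parses. The other two contain the right data, but the contract is all-or-nothing: one stray character is enough to make the whole payload unusable.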

Step 1: The happy path. The LLM follows instructions perfectly, returning raw JSON. The app parses it and renders.

How it works (Structured Outputs & Retries)

Better prompting lowers the failure rate, but it cannot eliminate it. LLMs are probabilistic text generators; given enough requests, they will eventually break the contract. You must solve it with engineering:

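  1. Constrained Decoding (Structured Outputs): Modern APIs (like OpenAI's response_format with a JSON Schema) restrict the LLM at the token level so it can ONLY emit tokens that fit the schema. Syntactically invalid JSON is ruled out for a completed response, although a reply cut off by the token limit can still be incomplete, and the values inside valid JSON can still be wrong.
  2. Defensive Parsing: Always wrap your parsing logic in a try/catch. If it fails, use a regular expression to strip out conversational filler (like Markdown code fences) and try again.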
  3. Auto-Retry Loop: If the parsed JSON fails schema validation (e.g., missing a required "price" field), catch the error, and automatically send the error message back to the LLM so it can fix its own mistake.
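
As a rough illustration of item 1, the sketch below builds a request body in the shape OpenAI's Chat Completions API accepts for Structured Outputs (response_format with type "json_schema" and strict set to true). The model name, the schema name "travel_plan", and the fields cities, name, and price are assumptions made for this example. The body is only constructed here, not sent, and other providers use different parameter names.

```javascript
// JSON Schema for the travel plan. Field names are assumptions for this example.
// Strict mode expects every property to be listed in "required" and
// additionalProperties to be false on each object.
const travelPlanSchema = {
  type: "object",
  properties: {
    cities: {
      type: "array",
      items: {
        type: "object",
        properties: {
          name: { type: "string" },
          price: { type: "number" },
        },
        required: ["name", "price"],
        additionalProperties: false,
      },
    },
  },
  required: ["cities"],
  additionalProperties: false,
};

// Builds (but does not send) a Chat Completions request body.
function buildRequestBody(userPrompt, model = "gpt-4o-mini") {
  return {
    model, // placeholder; use whichever model your provider supports
    messages: [{ role: "user", content: userPrompt }],
    response_format: {
      type: "json_schema",
      json_schema: { name: "travel_plan", strict: true, schema: travelPlanSchema },
    },
  };
}

console.log(JSON.stringify(buildRequestBody("Plan three days in Portugal."), null, 2));
```

The array is wrapped in a top-level object because strict mode expects the schema root to be an object. Even with this in place, it is worth checking whether the response was truncated before parsing it.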
// Pseudocode for a robust LLM call.
// `llm.generate` stands in for your provider's SDK and is assumed to resolve to a string.

async function getTravelPlanWithRetry(prompt, retries = 3) {
    let currentPrompt = prompt;

    for (let attempt = 1; attempt <= retries; attempt++) {
        // 1. Ask for JSON at the API level (pass a full schema if the provider supports one)
        const response = await llm.generate(currentPrompt, { format: "json" });

        try {
            // 2. Strip markdown fences just in case
            const cleanText = response.replace(/```(?:json)?/g, "").trim();
            const data = JSON.parse(cleanText);

            // 3. Validate the shape (a schema library such as Zod does this more thoroughly)
            if (!Array.isArray(data.cities) || data.cities.length === 0) {
                throw new Error("Missing or empty cities array");
            }
            return data;

        } catch (error) {
            // 4. Auto-correction: tell the LLM what went wrong
            currentPrompt = `${prompt}\n\nYour previous reply was rejected: ${error.message}. Return only corrected JSON.`;
        }
    }
    // Report the actual attempt count instead of a hard-coded number
    throw new Error(`LLM failed to adhere to contract after ${retries} attempts.`);
}
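
Stripping code fences alone does not remove a preamble such as "Here is your JSON:". One possible approach, sketched below, is to slice out the span between the first opening brace or bracket and the last closing one before parsing. `extractJson` is a hypothetical helper written for this example, and it is a heuristic: prose that itself contains braces or brackets will defeat it.

```javascript
// Hypothetical helper: pull the first JSON-looking span out of a chatty reply.
// Heuristic only: it slices from the first "{" or "[" to the last "}" or "]".
function extractJson(text) {
  const starts = [text.indexOf("{"), text.indexOf("[")].filter((i) => i !== -1);
  if (starts.length === 0) throw new SyntaxError("No JSON object or array found");
  const start = Math.min(...starts);
  const end = Math.max(text.lastIndexOf("}"), text.lastIndexOf("]"));
  if (end < start) throw new SyntaxError("No closing brace or bracket found");
  // JSON.parse still throws if the extracted span is not valid JSON.
  return JSON.parse(text.slice(start, end + 1));
}

const reply = 'Here is your JSON:\n```json\n{"cities": ["Lisbon", "Porto"]}\n```\nEnjoy!';
console.log(extractJson(reply).cities);
```

Because the helper throws a SyntaxError on failure, it can be used in place of the replace-and-parse step inside a retry loop's try block without changing the error handling.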

Cost

Implementing auto-retry loops increases latency and API costs. If the LLM fails on the first try, the user waits for at least two full round trips while the backend silently tries again, and you pay for every attempt. Constrained Decoding (Structured Outputs) is usually the better first line of defense because it removes syntax failures on the first pass, though it requires an LLM provider that supports it. It does not check business rules, so a validation step (and occasionally a retry) is still worth keeping.
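
A back-of-envelope way to size the retry overhead is sketched below. It assumes every attempt fails independently with the same probability p, which is a simplification (a prompt that fails once may well be more likely to fail again), and the 5% rate is an invented figure, not a measurement.

```javascript
// Expected number of LLM calls per request when each attempt fails
// independently with probability p and we stop after maxAttempts.
// Attempt k + 1 happens only if the previous k attempts all failed.
function expectedCalls(p, maxAttempts) {
  let total = 0;
  for (let k = 0; k < maxAttempts; k++) {
    total += Math.pow(p, k);
  }
  return total;
}

// Probability that every attempt fails and the user sees an error.
function exhaustionRate(p, maxAttempts) {
  return Math.pow(p, maxAttempts);
}

const p = 0.05; // assumed 5% contract-break rate, for illustration only
console.log(expectedCalls(p, 3).toFixed(4)); // 1.0525
console.log(exhaustionRate(p, 3).toFixed(6)); // 0.000125
```

Under these assumptions the average cost rises by only about 5%, but the average hides the tail: the requests that do fail take at least twice as long, which is what users notice.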

Watch out for