Why begging an LLM to "only return valid JSON" will eventually crash your backend.
When you build an app that relies on an LLM (like an AI travel planner), your code expects the LLM to return data in a specific format (an "Output Contract"), usually JSON. You prompt the LLM: "Return a JSON array of cities. Do not include any other text." Most of the time, it works. But occasionally, the LLM adds "Here is your JSON: ```json..." before the actual data, or leaves in a trailing comma that strict JSON forbids. When your backend calls JSON.parse(), it throws a SyntaxError, and if nothing catches it, the user sees a broken app.
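To make the failure concrete, the sketch below feeds JSON.parse() a reply with a chatty preamble and a reply with a trailing comma. The reply strings are made-up examples rather than real model output, and tryParse is a hypothetical helper used only for the demonstration.

```javascript
// Build the markdown fence programmatically so this example stays readable.
const FENCE = "`".repeat(3);

// Made-up replies of the kind described above (not real model output).
const chattyReply = `Here is your JSON: ${FENCE}json\n["Paris", "Rome"]\n${FENCE}`;
const trailingCommaReply = '["Paris", "Rome",]';
const cleanReply = '["Paris", "Rome"]';

// Hypothetical helper: report whether JSON.parse accepted the text.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (error) {
    return { ok: false, errorName: error.name };
  }
}

const results = [chattyReply, trailingCommaReply, cleanReply].map(tryParse);
console.log(results);
```

Only the last reply parses; the first two are rejected with a SyntaxError even though a human reader would understand them.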
You cannot solve this with better prompting. LLMs are probabilistic text generators; they will eventually break the contract. You must solve it with engineering:
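Use constrained decoding (structured outputs). Some APIs (for example through a response_format parameter) force the LLM at the token level to output only tokens that fit a JSON Schema, so the reply cannot be syntactically invalid JSON unless it is cut off at a token limit.
Parse defensively. Wrap JSON.parse() in a try/catch. If parsing fails, use a regex to strip out conversational filler (like markdown backticks) and parse again.
// Pseudocode for a robust LLM call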
// `llm` stands in for your provider's client; generate() is assumed to resolve to the reply text.
async function getTravelPlanWithRetry(prompt, retries = 3) {
  let currentPrompt = prompt;
  for (let attempt = 0; attempt < retries; attempt++) {
    // 1. Ask for JSON output at the API level
    const response = await llm.generate(currentPrompt, { format: "json" });
    try {
      // 2. Strip markdown wrappers just in case
      const cleanText = String(response).replace(/```json|```/g, "").trim();
      const data = JSON.parse(cleanText);
      // 3. Validate the structure (a schema library such as Zod can replace this check)
      if (!Array.isArray(data.cities) || data.cities.length === 0) {
        throw new Error("Missing or empty cities array");
      }
      return data;
    } catch (error) {
      // 4. Auto-correction: tell the LLM what went wrong
      currentPrompt = `${prompt}\n\nYour previous reply was rejected: ${error.message}. Fix it and return only valid JSON.`;
    }
  }
  // Report the actual attempt count instead of a hardcoded number
  throw new Error(`LLM failed to adhere to contract after ${retries} attempts.`);
}
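The regex in step 2 only removes markdown fences; it does not remove a sentence placed before or after the data. One possible alternative, sketched below with a hypothetical extractJson helper, is to slice from the first opening bracket to the last matching closing bracket. This is a heuristic under the assumption that the filler itself contains no braces or brackets; it is not a full JSON scanner.

```javascript
// Hypothetical helper: salvage the JSON object or array from a chatty reply.
function extractJson(text) {
  // Fast path: the reply is already clean JSON.
  try {
    return JSON.parse(text);
  } catch {
    // Fall through to the salvage path below.
  }

  // Find where the first object or array starts.
  const starts = [text.indexOf("{"), text.indexOf("[")].filter((i) => i !== -1);
  if (starts.length === 0) {
    throw new SyntaxError("No JSON object or array found in reply");
  }
  const start = Math.min(...starts);

  // Slice up to the last closing bracket of the same kind.
  const closer = text[start] === "{" ? "}" : "]";
  const end = text.lastIndexOf(closer);
  if (end < start) {
    throw new SyntaxError("JSON in reply is not closed");
  }

  // This may still throw if the slice is not valid JSON; callers should catch it.
  return JSON.parse(text.slice(start, end + 1));
}

const fence = "`".repeat(3);
const reply = `Here is your JSON: ${fence}json\n{"cities": [{"destination": "Paris"}]}\n${fence} Enjoy!`;
console.log(extractJson(reply).cities[0].destination); // prints "Paris"
```

Because the helper still ends in JSON.parse(), it belongs inside the same try/catch as before; it widens what can be salvaged but does not guarantee success.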
Implementing auto-retry loops increases latency and API costs. If the LLM fails on the first try, the user waits roughly twice as long while the backend silently tries again, and each retry is another billed call. Constrained Decoding (Structured Outputs) is the better first line of defense because it removes syntax failures on the first pass, though it requires an LLM provider that supports it.
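As a rough illustration of what a structured-outputs request can look like, the sketch below builds (but does not send) a request body. It assumes a provider that accepts a response_format of type json_schema, which is how OpenAI's Chat Completions API exposes the feature at the time of writing; other providers use different parameter names, so check your provider's documentation. The schema name travel_plan, the helper buildRequestBody, and the model name are placeholders, and the field names follow this article's travel-planner example.

```javascript
// JSON Schema for the travel plan; field names follow the article's example.
const travelPlanSchema = {
  type: "object",
  properties: {
    cities: {
      type: "array",
      items: {
        type: "object",
        properties: { destination: { type: "string" } },
        required: ["destination"],
        additionalProperties: false,
      },
    },
  },
  required: ["cities"],
  additionalProperties: false,
};

// Hypothetical helper: assemble a request body in the assumed OpenAI-style shape.
function buildRequestBody(userPrompt, model) {
  return {
    model,
    messages: [{ role: "user", content: userPrompt }],
    response_format: {
      type: "json_schema",
      json_schema: { name: "travel_plan", strict: true, schema: travelPlanSchema },
    },
  };
}

// "your-model-name" is a placeholder, not a real model identifier.
const body = buildRequestBody("Plan a three-city trip through Italy.", "your-model-name");
console.log(JSON.stringify(body, null, 2));
```

The schema constrains the shape of the reply, not whether its contents make sense, so the business-logic check from step 3 is still worth keeping.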
"destination" to "city". Your JSON.parse() will succeed, but your frontend will still crash when it tries to render item.destination.toUpperCase(). Always validate the structure of the parsed object using a library like Zod or Pydantic.