These are the load-bearing moments of the chat-first sandbox after the user submits their attempt: streaming (mid-evaluation), hint expanded (after pressing Hint 1), success passed (perfect attempt, lesson complete), and error / network failure. Stacked here for review; each is the same chat shell with different content rendered into the conversation column. Style B (Luminous Dark Glass).
You are a senior marketing analyst at a B2B SaaS company. Given: our last 3 product launches under-performed against pipeline targets by 30–40%. We're preparing for Q3 and need to avoid a 4th miss…
Solid structure. You hit four of the five elements cleanly. Here's the breakdown|
● AINA is composing feedback. You can keep typing — your next attempt will queue.
Look at what's missing, not what's there.
You named the role and the task — the model knows who it's playing and what to do. But re-read your prompt and ask: would a brand-new analyst at this company have enough to start? What would they want to know first?
"Add 1–2 sentences of business situation before the task. Market, competition, timing — pick what matters most."
Take a swing. I'll be here.
You are a senior marketing analyst at a B2B SaaS company in the sales-engagement category… Context: We've launched 3 products in 18 months. Outreach shipped a similar feature 60 days ago. Pipeline targets are owned by RevOps and updated quarterly. Task: Identify the top 3 root causes our last 3 launches missed pipeline targets by 30–40%… Success criteria: A senior PM should be able to act on findings within one week. No vague recommendations. Return as: 1-page exec brief. TLDR (3 sentences) → Findings (3 bullets each) → 1-week action plan.
There it is.
All five elements present, all five strong. The Context and Output Format upgrades were the unlock — the model now has both the situation AND a definition of "good" to self-check against.
Your AI Readiness Score moved from 68 to 71. Path locked at 44% complete.
Take this skill to a real Tuesday task.
You came in hoping AI would help with your launch playbook analysis. The prompt you just wrote is exactly the one you'd use in production. Want to run it now? I'll save the conversation to your private workspace.
You are a senior marketing analyst…
I couldn't reach the evaluator just now.
Your attempt is saved — I have the full prompt locally and your draft in the editor isn't lost. This usually clears in 10–30 seconds. I'll auto-retry, or you can press retry yourself.
If this keeps happening: something is wrong on our side, not yours. Status page · Try again later
All four states share the same chat shell, the same composer, the same sidebars. The differences are entirely what gets rendered into the conversation column. That's the whole point of the chat-first move.