AI Sales Architecture May 3, 2025 · 10 min read

How We Built a 7-Node LangGraph Pipeline for Cold Email (And Why Linear Chains Weren't Enough)

A behind-the-scenes look at why CarcMail uses LangGraph for AI cold email orchestration instead of a simple prompt chain — and what each of the seven nodes actually does to a lead before an email sends.

When we built the first version of CarcMail’s AI email generation, we used a simple sequential prompt chain: take lead data, generate an email, send it. It worked until it didn’t — spam scores were unpredictable, low-quality drafts shipped without review, and there was no clean way to retry a specific failure without re-running the whole chain.

We moved to LangGraph for the second generation, and it changed how we think about AI email orchestration entirely.

Why LangGraph Instead of a Prompt Chain

A prompt chain is just a sequence of LLM calls where the output of step N becomes the input to step N+1. It’s fine for simple pipelines. But cold email preparation isn’t simple:

  • Some leads need more data enrichment than others
  • Spam detection might require a rewrite loop (run spam check → rewrite if too risky → re-check)
  • Some steps (lead validation, spam scoring) don’t need an LLM at all — they’re deterministic
  • Failures in step 3 shouldn’t force a restart from step 1

LangGraph treats the pipeline as a directed graph. Each node is a function (an LLM call or deterministic logic), and each edge is a transition that can be fixed or chosen at runtime from the current state. A node only runs once its predecessor has completed successfully. The graph state persists across the run, so any node can read what previous nodes produced.
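To make the shared-state idea concrete, here is a minimal plain-Python sketch (not LangGraph itself) in which each node returns a partial update that is merged into the running state. The `run_node` helper, the `path` key, and both example nodes are hypothetical names chosen for illustration.

```python
from typing import Callable


def run_node(state: dict, name: str, node: Callable[[dict], dict]) -> dict:
    """Run one node and merge its partial update into a copy of the state."""
    update = node(state)
    return {**state, **update, "path": [*state.get("path", []), name]}


def draft(state: dict) -> dict:
    # Stand-in for an LLM node: writes a body based on the lead.
    return {"body": f"Hi {state['lead']['first_name']}, quick question."}


def count_words(state: dict) -> dict:
    # Deterministic node that reads what the previous node produced.
    return {"word_count": len(state["body"].split())}


state = {"lead": {"first_name": "Ana"}}
state = run_node(state, "draft", draft)
state = run_node(state, "count_words", count_words)
print(state["word_count"], state["path"])
```

The second node never receives the draft as an argument; it reads it from state, which is the property the rest of the pipeline relies on.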

This is what makes the spam check's rewrite loop possible, and the loop is what makes the spam check useful.

The 7-Node Pipeline: What Each Node Does

The graph is compiled once at application startup via a registry and reused for every campaign run. Here’s each node:

Node 1: validate_lead

Type: Deterministic (no LLM)

Checks that the lead record has the minimum required fields: first name, company name, and a syntactically valid email address. Flags leads with missing data before they waste any LLM credits or affect deliverability.

Output: { valid: boolean, reason?: string }

If valid is false, the lead is marked as invalid in the SendLog and the graph terminates early for that lead. No email is drafted, no credit is consumed.
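A validation node along these lines might look like the sketch below. The field names and the deliberately loose email regex are assumptions; the post only says the check is syntactic and needs no LLM.

```python
import re

# Simple syntactic check; an assumption, not full RFC 5322 validation.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
REQUIRED_FIELDS = ("first_name", "company", "email")  # hypothetical field names


def validate_lead(lead: dict) -> dict:
    """Return {"valid": True} or {"valid": False, "reason": ...} with no LLM call."""
    for field in REQUIRED_FIELDS:
        if not str(lead.get(field) or "").strip():
            return {"valid": False, "reason": f"missing {field}"}
    if not EMAIL_RE.match(lead["email"].strip()):
        return {"valid": False, "reason": "invalid email syntax"}
    return {"valid": True}
```

Returning a reason string alongside the boolean gives the log something specific to record when a lead is rejected.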

Node 2: prepare_email

Type: LLM (AI)

This is the main generation node. It receives:

  • The validated lead record (name, role, company)
  • The user’s AI Identity (company story, services, case studies, target persona, brand voice)
  • The campaign’s goal (book a call, get a response, share a resource)

The prompt instructs the model to write a cold email that references the lead’s specific role and company context, grounds the value proposition in one of the uploaded case studies, and ends with a single clear CTA. Hallucination is constrained by explicitly limiting the model to content available in the AI Identity: if no relevant case study exists, the model is told to omit the reference rather than invent one.

Output: { subject: string, body: string, personalization_signals: string[] }
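The grounding rule can be illustrated with a small prompt builder. This is a sketch under assumptions: the dictionary keys (`role`, `brand_voice`, `case_studies`) and the prompt wording are invented for the example and are not CarcMail's actual prompt.

```python
def build_prompt(lead: dict, identity: dict, goal: str) -> str:
    """Assemble a generation prompt restricted to facts in the AI Identity."""
    case_studies = identity.get("case_studies") or []
    if case_studies:
        grounding = "Ground the value proposition in exactly one of these case studies:\n"
        grounding += "\n".join(f"- {study}" for study in case_studies)
    else:
        # No case study uploaded: tell the model to omit it, never to invent one.
        grounding = "No case study is available. Do not mention or invent one."
    return "\n\n".join([
        f"Write a cold email to {lead['first_name']}, {lead['role']} at {lead['company']}.",
        f"Campaign goal: {goal}. End with a single clear call to action.",
        f"Brand voice: {identity.get('brand_voice', 'neutral')}",
        grounding,
        "Use only facts stated above. If a detail is missing, leave it out.",
    ])


prompt = build_prompt(
    {"first_name": "Ana", "role": "Head of Support", "company": "Acme"},
    {"brand_voice": "direct", "case_studies": []},
    "book a call",
)
```

The branch on an empty case-study list is the important part: the fallback instruction is decided in code before the model sees anything.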

Node 3: check_spam

Type: Deterministic (regex + scoring model)

Runs the draft through a spam-language scoring model. Checks for:

  • Trigger phrases (free, urgent, guaranteed, no risk, limited time, etc.)
  • Excessive capitalisation
  • Link density relative to body length
  • Excessive punctuation (!!!, ???)
  • Deceptive header patterns

Returns a score from 0–10. Scores above 4.0 trigger the rewrite path.

Output: { score: number, flags: string[] }
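A deterministic scorer in this spirit could be as small as the sketch below. The weights, thresholds, and flag names are assumptions picked for illustration, and the deceptive-header check is left out for brevity; only the 0–10 range and the trigger phrases come from the description above.

```python
import re

TRIGGER_PHRASES = ("free", "urgent", "guaranteed", "no risk", "limited time")


def check_spam(subject: str, body: str) -> dict:
    """Score a draft from 0 to 10 using regex heuristics only (no LLM)."""
    text = f"{subject}\n{body}"
    lowered = text.lower()
    score, flags = 0.0, []

    hits = [p for p in TRIGGER_PHRASES if re.search(rf"\b{re.escape(p)}\b", lowered)]
    if hits:
        score += 1.5 * len(hits)  # assumed weight per phrase
        flags.append("trigger_phrases: " + ", ".join(hits))

    words = re.findall(r"[A-Za-z]{3,}", text)
    shouting = [w for w in words if w.isupper()]
    if words and len(shouting) / len(words) > 0.2:
        score += 2.0
        flags.append("excessive_capitalisation")

    links = re.findall(r"https?://\S+", text)
    if len(links) > 1 + len(text) // 500:  # allow one link per ~500 characters
        score += 2.0
        flags.append("link_density")

    if re.search(r"[!?]{3,}", text):
        score += 1.5
        flags.append("excessive_punctuation")

    return {"score": min(score, 10.0), "flags": flags}
```

Because the flags name the specific problems, they can be handed directly to the rewrite step as instructions.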

Node 4: rewrite_email (conditional)

Type: LLM (Claude Sonnet 4.6) — only runs if spam score > 4.0

Receives the draft plus the specific flags raised by the spam check. Instructs Claude to rewrite only the flagged sections while preserving the personalization signals. The rewrite prompt also tells the model not to introduce any claim that is absent from the AI Identity.

After rewriting, the graph routes back to check_spam for re-scoring. This loop runs a maximum of two times — if the score is still too high after two rewrites, the lead is flagged for human review rather than sending a potentially risky email.

Output: { subject: string, body: string } (updated draft)

Node 5: score

Type: LLM (Claude Sonnet 4.6 with structured output)

Quality-scores the final draft on three dimensions:

  • Personalization depth (1–10): Does the email reference specific, accurate details about the lead’s role and company?
  • Value clarity (1–10): Is the value proposition concrete and specific to the lead’s likely pain?
  • CTA specificity (1–10): Is the ask clear and singular?

Drafts scoring below 5.0 on any single dimension are held for review, regardless of how the other dimensions score. This gate catches emails that pass the spam check but are still generic or poorly structured.

Output: { score: number, breakdown: { personalization, value_clarity, cta } }
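The hold rule is a minimum over dimensions, not an average, which a short sketch makes explicit. Computing the overall `score` as the mean of the three dimensions is an assumption; the post does not say how the headline number is derived.

```python
HOLD_THRESHOLD = 5.0  # from the post: any dimension below 5.0 holds the draft


def quality_gate(breakdown: dict) -> dict:
    """Decide whether a scored draft proceeds to send or is held for review."""
    weak = sorted(name for name, value in breakdown.items() if value < HOLD_THRESHOLD)
    overall = sum(breakdown.values()) / len(breakdown)  # assumed aggregation
    return {
        "score": round(overall, 1),
        "hold_for_review": bool(weak),
        "weak_dimensions": weak,
    }


result = quality_gate({"personalization": 8, "value_clarity": 4, "cta": 9})
print(result)
```

In this example the average is a healthy 7.0, yet the draft is still held because value clarity alone falls below the threshold.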

Node 6: send

Type: Deterministic (SMTP / OAuth)

Routes the email through the connected sender identity — Google OAuth or SMTP — using the pacing configuration for the campaign. Enforces the daily send cap for the user’s plan tier (40 / 250 / 900). Attaches the List-Unsubscribe header to every outgoing message.

Output: { delivered: boolean, message_id: string, timestamp: string }
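The two deterministic duties of this node, the cap check and the unsubscribe header, can be sketched with the standard library. The tier names are hypothetical (the post gives only the numbers 40 / 250 / 900), and the unsubscribe URL is a placeholder.

```python
from email.message import EmailMessage

# Caps come from the post; the tier names are hypothetical labels.
DAILY_CAPS = {"starter": 40, "growth": 250, "scale": 900}


def can_send(tier: str, sent_today: int) -> bool:
    """True while the sender is still under the plan's daily cap."""
    return sent_today < DAILY_CAPS[tier]


def build_message(sender: str, recipient: str, subject: str,
                  body: str, unsubscribe_url: str) -> EmailMessage:
    """Build an outgoing message with a List-Unsubscribe header attached."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg["List-Unsubscribe"] = f"<{unsubscribe_url}>"
    msg.set_content(body)
    return msg


msg = build_message(
    "sales@sender.example", "ana@lead.example",
    "Quick question", "Hi Ana, ...", "https://sender.example/unsubscribe/abc",
)
```

Actual delivery (SMTP or the Gmail API via OAuth) is left out here; the sketch covers only what must be true of every message before it is handed to a transport.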

Node 7: log

Type: Deterministic (database write)

Writes the complete run record to the SendLog table:

  • Lead ID and campaign ID
  • Final subject and body hash
  • Spam score and quality score
  • Node path taken (e.g., whether rewrite ran)
  • Delivery status and message ID
  • Timestamp

The log is immutable. It can be queried via the API or exported for compliance review. This is what makes the audit trail reliable — every email has a full provenance record.
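One way to picture a log row is a frozen dataclass that stores a hash of the body rather than the body itself. The class name, field names, and the choice of SHA-256 are assumptions; note also that a frozen dataclass only prevents in-memory mutation, while database-level immutability needs its own enforcement.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SendLogEntry:
    lead_id: str
    campaign_id: str
    subject: str
    body_sha256: str       # hash of the final body, not the body itself
    spam_score: float
    quality_score: float
    node_path: tuple       # which nodes ran, in order
    delivered: bool
    message_id: str
    timestamp: str


entry = SendLogEntry(
    lead_id="lead_1",
    campaign_id="camp_1",
    subject="Quick question",
    body_sha256=hashlib.sha256("Hi Ana".encode("utf-8")).hexdigest(),
    spam_score=2.5,
    quality_score=7.3,
    node_path=("validate_lead", "prepare_email", "check_spam", "score", "send", "log"),
    delivered=True,
    message_id="<hypothetical-id@sender.example>",
    timestamp="2025-05-03T09:00:00+00:00",
)
```

Whether `rewrite_email` appears in `node_path` is what lets a reviewer tell, after the fact, if a given email went through the rewrite loop.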

The Spam Check Loop in Practice

The most useful structural benefit of using LangGraph is the conditional edge between check_spam and rewrite_email. In a linear chain you’d have to decide statically whether to include the rewrite step. With a graph, the decision is made at runtime based on the actual spam score.

prepare_email → check_spam → [score > 4.0] → rewrite_email → check_spam (again)
                           → [score ≤ 4.0] → score → send → log

In practice, about 15–20% of first drafts fail the spam check and go through at least one rewrite. The two-pass loop resolves the majority of those. Drafts that fail twice are queued for manual review rather than blocked entirely — a human reviewer can approve or discard them.
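The routing decision described in this section reduces to one small function. In LangGraph, a function like this would typically be registered with `add_conditional_edges` on the graph builder; the version below is plain Python so it runs on its own, and the `rewrite_count` key, the `human_review` label, and the `run_loop` simulator are hypothetical.

```python
SPAM_THRESHOLD = 4.0  # from the post
MAX_REWRITES = 2      # from the post


def route_after_spam_check(state: dict) -> str:
    """Pick the next node based on the latest spam score and rewrite count."""
    if state["spam_score"] <= SPAM_THRESHOLD:
        return "score"
    if state.get("rewrite_count", 0) >= MAX_REWRITES:
        return "human_review"
    return "rewrite_email"


def run_loop(scores: list) -> list:
    """Simulate the loop with pre-baked spam scores, one per check_spam run."""
    state = {"rewrite_count": 0}
    path = []
    for spam_score in scores:
        state["spam_score"] = spam_score
        path.append("check_spam")
        next_node = route_after_spam_check(state)
        path.append(next_node)
        if next_node != "rewrite_email":
            break
        state["rewrite_count"] += 1
    return path


print(run_loop([6.2, 3.1]))
print(run_loop([7.0, 6.0, 5.0]))
```

The first run passes after one rewrite; the second exhausts both rewrites and ends at human review instead of send.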

What LangGraph Gives You That Prompt Chains Don’t

  • Conditional routing without code duplication — the spam check loop is defined once as a conditional edge
  • Shared state across nodes — Node 7 can see exactly what path the graph took to get to the send stage
  • Per-node retries — if Node 6 (send) fails due to a transient SMTP error, only that node retries, not the whole pipeline
  • Real-time visibility — each node emits a graph_node SSE event as it activates, so the UI shows live progress
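The per-node retry idea from the list above can be sketched as a wrapper around a single node function. This is plain Python rather than LangGraph's own retry mechanism, and `TransientSendError`, the attempt count, and the backoff values are assumptions.

```python
import time


class TransientSendError(Exception):
    """Hypothetical marker for errors worth retrying, such as an SMTP timeout."""


def with_retries(node, attempts: int = 3, base_delay: float = 0.5, sleep=time.sleep):
    """Wrap one node so only that node is retried, with exponential backoff."""
    def wrapped(state: dict) -> dict:
        for attempt in range(1, attempts + 1):
            try:
                return node(state)
            except TransientSendError:
                if attempt == attempts:
                    raise  # out of attempts: surface the failure
                sleep(base_delay * 2 ** (attempt - 1))
    return wrapped


calls = {"count": 0}


def flaky_send(state: dict) -> dict:
    # Fails twice, then succeeds, to exercise the retry path.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientSendError("temporary SMTP failure")
    return {"delivered": True}


delays = []
result = with_retries(flaky_send, sleep=delays.append)({"subject": "Hi"})
```

Because the draft, spam score, and quality score already live in state, retrying the send does not repeat any LLM call.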

The pipeline is compiled once at startup (app/graphs/registry.py) and handles concurrency through async node execution. Multiple campaigns can run simultaneously against the same compiled graph.
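As a rough picture of many leads sharing one pipeline object, the sketch below fans out leads with `asyncio.gather` behind a semaphore. `run_pipeline` is a stand-in for invoking the compiled graph, and the concurrency limit is an assumed value, not a CarcMail setting.

```python
import asyncio


async def run_pipeline(lead: dict) -> dict:
    # Stand-in for invoking the shared compiled graph on one lead.
    await asyncio.sleep(0)
    return {"lead_id": lead["id"], "status": "done"}


async def run_campaign(leads: list, max_concurrency: int = 5) -> list:
    """Process leads concurrently while bounding how many run at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(lead: dict) -> dict:
        async with semaphore:
            return await run_pipeline(lead)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(lead) for lead in leads))


results = asyncio.run(run_campaign([{"id": 1}, {"id": 2}, {"id": 3}]))
print(results)
```

Each lead gets its own state dictionary, so sharing the pipeline definition across runs does not mean sharing data between them.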

The Tradeoff

LangGraph adds complexity. The graph definition, state schema, and conditional edge logic are more code than a sequential chain. For a genuinely simple pipeline — generate text, return it — a chain is the right choice.

Cold email preparation isn’t simple. The spam check loop alone justifies the graph structure. And the operational benefits (per-node retries, shared state, live progress events) matter at campaign scale in ways that only become apparent once you’re running hundreds of sends a day.

If you’re building any AI workflow that has conditional branches, retry loops, or needs a persistent run record, LangGraph is worth the learning curve.
