
From Prompts to Workflow Skills in Umbraco


Most teams still try to scale AI with prompts.

I understand why. Prompting is fast, accessible, and gives instant feedback. I still use it for quick one-off tasks. But if you run growth or marketing operations, prompt quality is not your real bottleneck.

Your real bottleneck is operational consistency.

You need repeatable outputs across people, channels, and deadlines. That is why this guide focuses on simple, skill-based workflows in Umbraco, not agent hype.

I will show you how to move from prompt hacking to workflow design with one orchestrator and chained skills using Codex and Claude. If you want supporting implementation references, both Anthropic’s agent engineering guide and the OpenAI Agents guide are useful baselines. You will also get concrete workflows for:

  • SEO planning and production (Ahrefs + Search Console)
  • content research and briefing
  • growth experimentation loops
  • paid ads performance analysis from LinkedIn or Meta CSV exports

Before we go deeper, one important clarification:

The GitHub setup I reference is a starter framework, not a final one-size-fits-all system. You should adapt workflows, thresholds, QA rules, and integrations to your own Umbraco setup.

Who this is for

This guide is for marketing operators, growth teams, and content leads who already use AI but need more consistency across recurring work.

If you only need occasional one-off prompt output, you can keep it simple. If your work repeats every week, workflow design is usually the better leverage point.

Key takeaways

  • Prompt quality matters, but process design matters more once work repeats.
  • Start with one orchestrator, one workflow, one owner, and clear QA thresholds.
  • Use explicit keep/change/discard logic so decisions are consistent between people.
  • Treat models as components in a workflow, not as the workflow itself.
  • Close the loop with outcomes so the system improves over time.

Why prompt-driven marketing does not scale

Prompting feels productive because it compresses effort into one interaction. The problem is that your team still has to rebuild context every time.

In real execution, that creates familiar failure modes:

  • outputs vary heavily between operators
  • structure and formatting drift over time
  • proof requirements and claim checks get skipped
  • decisions are not logged, so quality regressions are hard to trace
  • lessons from previous campaigns rarely feed into future runs

This is why teams using “better prompts” still feel stuck. They improved one layer, but not the system layer.

If your work repeats weekly, you need repeatable architecture.

The shift: prompts -> workflows -> skills

Here is the practical model I use:

  • Prompt: one isolated generation request
  • Workflow: ordered sequence of deterministic steps
  • Skill workflow: workflow plus tool access, constraints, logging, and handoffs
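To make the distinction concrete, here is a minimal sketch of how the three levels might be expressed as data. Every name in it (stage names, tool labels, the log path, the approver role) is illustrative, not part of any specific framework:

```python
# Prompt: one isolated generation request.
prompt = "Write five title options for the keyword '{keyword}'."

# Workflow: an ordered sequence of steps.
workflow = ["cluster_keywords", "map_search_console", "write_brief", "draft", "qa"]

# Skill workflow: the same sequence plus tool access, constraints, logging, and a handoff.
skill_workflow = {
    "stages": workflow,
    "tools": ["ahrefs_export", "search_console_api", "umbraco_cms"],
    "constraints": {"max_draft_words": 2500, "evidence_required": True},
    "logging": "runs.jsonl",
    "handoff": {"approver": "content_lead", "gate_before": "publish"},
}
```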

This shift changes your planning questions.

Instead of asking “what is the best prompt?”, you start asking:

  • Which steps are deterministic?
  • Which steps require model judgment?
  • Which steps need strict QA gates?
  • Where should a human approve before publish or budget moves?

That is the moment AI becomes useful in production, not just impressive in demos.

What this is actually for

If you landed here, your intent is usually a mix of:

  • educational: “what is this and how is it different from prompting?”
  • tactical: “how do I implement this with my existing team?”
  • problem-solution: “how do I get consistent quality without adding headcount?”

So I am not giving you theory alone. I am giving you an implementation pattern you can run this week in a normal Umbraco team.

The simple setup I use in Umbraco

You do not need a multi-agent platform. You need a clear operating setup:

  • One orchestrator: runs stages in order and logs each run
  • Workflow skills: research, SEO, data analysis, CRO, strategy, copywriting, social content, PPC/paid search, QA, and analytics
  • Model routing: use Claude or Codex based on task type
  • Tools you already use: Ahrefs, Search Console, ad CSV exports, analytics, Umbraco
  • One approval point: human check before risky actions
  • Feedback loop: outcomes feed next week’s priorities

If this feels like overkill, start smaller: one workflow, one owner, one SLA. Expand after you stabilize quality.

The point is not to run every skill at once. The point is to chain the ones that match the job in front of you.
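If it helps to picture the orchestrator, here is a minimal sketch in Python: it runs stages in order, logs each run, and pauses at the approval point. The stage structure, routing hint, and log path are assumptions for illustration; the starter repo may organize this differently.

```python
import json
import time
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    name: str
    run: Callable[[dict], dict]      # takes the shared context, returns updates to merge back
    model: str = "claude"            # simple routing hint: "claude" or "codex" by task type
    requires_approval: bool = False  # human gate before risky actions (publish, budget moves)

def run_workflow(stages: list[Stage], context: dict, log_path: str = "runs.jsonl") -> dict:
    for stage in stages:
        started = time.time()
        context.update(stage.run(context))
        if stage.requires_approval:
            input(f"[{stage.name}] review the output, then press Enter to continue...")
        with open(log_path, "a") as log:
            log.write(json.dumps({
                "stage": stage.name,
                "model": stage.model,
                "seconds": round(time.time() - started, 1),
            }) + "\n")
    return context
```

Each `run` callable wraps whatever the skill actually does: a model call, a pandas transformation, or a CMS update. The orchestrator only cares about order, logging, and the approval gate.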

Workflow example 1: SEO workflow (Ahrefs + Search Console + content production)

This is usually the highest leverage starting point for content and organic growth teams.

Inputs

  • Ahrefs keyword export (keyword, volume, difficulty, intent, trend)
  • Search Console page + query export (impressions, CTR, average position) from the Performance report
  • Existing content inventory
  • Internal linking targets
  • Brand voice and evidence requirements

Stage flow

  1. Keyword clustering skill
  • clusters semantically related terms
  • scores opportunities by demand, difficulty, and business fit
  • proposes one primary keyword + mapped secondary terms
  2. Search Console opportunity skill (sketched in code after this list)
  • maps clusters to live page/query data
  • flags “striking distance” opportunities (for example, positions 6-20)
  • identifies high-impression / low-CTR pages for title and intro rewrites
  • flags cannibalization risk where multiple pages compete for the same intent
  3. Content brief skill
  • creates structured brief with:
    • intent target
    • audience and objection profile
    • section-level keyword map
    • evidence requirements
    • internal link opportunities
  4. Drafting skill
  • writes complete long-form draft in your house style
  • includes practical examples, not generic filler
  5. SEO + editorial QA skill
  • validates heading hierarchy and keyword placement
  • checks unsupported claims and readability
  • verifies internal links and publish metadata
  6. Refresh recommendation skill
  • outputs keep / update / merge / drop recommendations based on performance trend and topic overlap
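As a concrete illustration of stage 2, here is how the opportunity checks could look in pandas. The column names assume a typical page + query export; treat them as placeholders and adjust to your own file:

```python
import pandas as pd

# assumed columns: page, query, clicks, impressions, ctr, position
df = pd.read_csv("search_console_page_query.csv")

# striking distance: ranking close enough that on-page work can move it
striking_distance = df[df["position"].between(6, 20)]

# high impressions, weak CTR: candidates for title and intro rewrites
low_ctr = df[(df["impressions"] >= 1000) & (df["ctr"] < 0.01)]

# cannibalization risk: the same query served by more than one page
cannibalization = (
    df.groupby("query")["page"].nunique()
      .loc[lambda pages: pages > 1]
      .sort_values(ascending=False)
)
```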

Keep / update / discard logic

Use explicit rules so decisions are consistent across editors:

  • Keep: stable rankings, strong CTR, still aligned to business priority
  • Update: meaningful impressions but weak CTR, weak depth, or outdated examples
  • Discard or merge: low value, overlapping intent, no strategic role

That is where pairing Ahrefs with Search Console gets powerful. Ahrefs gives demand and difficulty signals. Search Console tells you where your current asset base is leaking value.

If you automate this ingestion, the Search Analytics API keeps your workflow refreshable without manual exports.
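A minimal ingestion sketch with the Google API client, assuming a service account with read access to the property; the site URL, date range, and credentials path are placeholders:

```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
creds = service_account.Credentials.from_service_account_file(
    "service-account.json", scopes=SCOPES
)
service = build("searchconsole", "v1", credentials=creds)

request = {
    "startDate": "2024-01-01",
    "endDate": "2024-01-28",
    "dimensions": ["page", "query"],
    "rowLimit": 25000,
}
response = service.searchanalytics().query(
    siteUrl="https://www.example.com/", body=request
).execute()

rows = response.get("rows", [])  # each row carries keys, clicks, impressions, ctr, position
```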

Workflow example 2: content research workflow

Prompting can generate text quickly. It does not automatically create differentiated positioning.

This workflow is for finding better angles before writing.

Inputs

  • competitor pages in your topic cluster
  • sales call notes or objections
  • support ticket themes
  • existing positioning statements

Stage flow

  1. Extraction skill (a code sketch follows this list)
  • pulls thesis statements, section structure, claims, and proof style from competitor pages
  2. Pattern skill
  • finds repeated talking points and blind spots across the set
  3. Angle generation skill
  • proposes 3-5 differentiated angles tied to your market position
  4. Briefing skill
  • outputs production-ready brief with thesis, structure, examples, and risk notes
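For stage 1, a minimal extraction sketch using the Anthropic Python SDK; the prompt, model id, and expected JSON keys are assumptions you should adapt:

```python
import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

EXTRACTION_PROMPT = """Extract from the competitor page below:
- thesis statements
- section outline (H2/H3)
- explicit claims and the type of proof offered for each
Respond with JSON only, using the keys: thesis, outline, claims.

PAGE:
{page_text}
"""

def extract_page(page_text: str) -> dict:
    message = client.messages.create(
        model="claude-sonnet-4-20250514",  # substitute your team's standard model id
        max_tokens=2000,
        messages=[{"role": "user", "content": EXTRACTION_PROMPT.format(page_text=page_text)}],
    )
    # in production you would validate the JSON and retry on parse errors
    return json.loads(message.content[0].text)
```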

This keeps your content from sounding like everyone else repeating the same list post with different wording.

Workflow example 3: growth experimentation workflow

Most teams have no shortage of test ideas. They have a shortage of test discipline.

This workflow makes experimentation compounding instead of random.

Inputs

  • experiment history
  • baseline conversion metrics
  • segment-level data
  • constraints (sample size, confidence threshold, risk limits)

Stage flow

  1. Hypothesis skill
  • generates prioritized hypotheses by impact x confidence x effort
  2. Design validation skill
  • checks power assumptions, instrumentation coverage, and guardrails
  3. Results analysis skill (sketched after this list)
  • reads test output with both significance and effect size
  4. Next-step planner skill
  • recommends next tests based on observed interaction effects
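For the results analysis stage, here is a small sketch of reading a test with both significance and effect size, using statsmodels; the counts are made-up example numbers:

```python
from statsmodels.stats.proportion import proportions_ztest, proportion_effectsize

# made-up example: conversions and visitors for control (A) and variant (B)
conversions = [480, 556]
visitors = [12000, 12050]

z_stat, p_value = proportions_ztest(count=conversions, nobs=visitors)

rate_a = conversions[0] / visitors[0]
rate_b = conversions[1] / visitors[1]
relative_lift = (rate_b - rate_a) / rate_a
cohens_h = proportion_effectsize(rate_b, rate_a)  # standardized effect size

print(f"p-value: {p_value:.4f} | lift: {relative_lift:+.1%} | Cohen's h: {cohens_h:.3f}")
```

Reporting effect size next to the p-value keeps the team from shipping statistically significant but practically trivial wins.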

If you are already using the A/B Test Lab, this workflow plugs directly into your interpretation process.

Workflow example 4: paid ads performance workflow (LinkedIn or Meta CSV)

This is the one paid teams usually ask for first, and for good reason.

You already export campaign data. The gap is usually decision quality and consistency after export.

For teams new to this, LinkedIn documents the export flow in Campaign Manager report exports.

Inputs

  • LinkedIn Ads or Meta Ads CSV export
  • naming convention map (campaign, ad set, ad, objective)
  • KPI hierarchy (pipeline, CPA, ROAS, CTR, CPL)
  • optional CRM outcome mapping

Stage flow

  1. Normalization skill (a code sketch follows this list)
  • standardizes fields across platforms
  • aligns metric names and units (spend, impressions, clicks, conversions, CPL/CPA)
  2. Diagnostics skill
  • computes performance by objective, audience, creative, placement, and time window
  • flags instability and outliers
  3. Decision skill
  • outputs explicit keep / change / discard classification
  • uses your threshold rules, not random model preference
  4. Action recommendation skill
  • proposes specific next moves:
    • pause inefficient placements
    • shift budget to stronger cohorts
    • rotate fatigued creatives
    • refine audience definitions
    • generate new hooks based on highest CTR themes
  5. Execution brief skill
  • produces a weekly action plan with owner, due date, expected impact, and validation metric
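To illustrate stage 1, here is a normalization sketch in pandas. The source column names are assumptions about what LinkedIn and Meta exports typically look like; map them to whatever your actual files contain:

```python
import pandas as pd

# assumed export headers; replace with the columns in your own files
COLUMN_MAPS = {
    "linkedin": {
        "Campaign Name": "campaign",
        "Total Spent": "spend",
        "Impressions": "impressions",
        "Clicks": "clicks",
        "Conversions": "conversions",
    },
    "meta": {
        "Campaign name": "campaign",
        "Amount spent": "spend",
        "Impressions": "impressions",
        "Link clicks": "clicks",
        "Results": "conversions",
    },
}

def normalize(path: str, platform: str) -> pd.DataFrame:
    df = pd.read_csv(path).rename(columns=COLUMN_MAPS[platform])
    df = df[["campaign", "spend", "impressions", "clicks", "conversions"]].copy()
    df["platform"] = platform
    df["ctr"] = df["clicks"] / df["impressions"]
    # avoid divide-by-zero: rows with no conversions get a missing CPL
    df["cpl"] = df["spend"] / df["conversions"].where(df["conversions"] > 0)
    return df

combined = pd.concat(
    [normalize("linkedin_export.csv", "linkedin"), normalize("meta_export.csv", "meta")],
    ignore_index=True,
)
```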

Keep / change / discard rules (example)

You can start with simple policy logic:

  • Keep: above target efficiency for 2+ windows with acceptable volatility
  • Change: near threshold but one bottleneck is clear (creative, audience, placement)
  • Discard: below threshold across multiple windows with no valid recovery signal
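Expressed as code, that policy might look like the sketch below; the target CPA, tolerance, and two-window lookback are illustrative placeholders, not recommendations:

```python
def classify_ad_entity(windows: list[dict], target_cpa: float, tolerance: float = 1.15) -> str:
    """Keep / change / discard for one campaign, ad set, or creative.

    windows: per-reporting-window dicts with at least a 'cpa' key,
    ordered oldest to newest. Thresholds here are placeholders.
    """
    recent = windows[-2:]  # last two reporting windows
    if len(recent) < 2:
        return "change"  # not enough history to judge; keep watching
    if all(w["cpa"] <= target_cpa for w in recent):
        return "keep"
    if any(w["cpa"] <= target_cpa * tolerance for w in recent):
        return "change"  # near threshold: diagnose creative, audience, or placement
    return "discard"
```

Because the rules live in one place, two different operators reviewing the same export reach the same classification.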

This turns ad optimization into a system. Not a Monday spreadsheet debate.

Prompting vs workflow architecture

Here is the simple difference:

  • Prompt-only mode: ask model -> get output -> edit manually -> repeat
  • Workflow mode: structured inputs -> staged skill steps -> human approval -> publish -> learn from results

Prompting helps you produce one output. Workflow architecture helps you produce consistent outputs.

GitHub setup: how to use the framework the right way

Reference repo: nclaursen/agentic-marketing-repo

Important: this repository is a suggested starter setup, not a finished universal implementation.

Use it as a base architecture. Then adapt:

  • skill roles
  • workflow stages
  • KPI thresholds
  • QA policies
  • approval flow
  • tool integrations
A practical adoption sequence:

  1. Clone the repo and review the structure
  2. Set environment variables and model/provider keys
  3. Configure one workflow only (SEO or ads analysis)
  4. Run it on historical data first
  5. Compare workflow recommendations with human decisions
  6. Tune thresholds and prompts
  7. Expand to additional workflows

If you skip tuning and go straight to full automation, quality drift will catch you.

Practical implementation checklist

If you want a no-excuses starting point, use this checklist:

  • define one workflow owner and one backup owner
  • set input file naming standards for Ahrefs, Search Console, and ad exports
  • create one shared schema document for required columns
  • define QA thresholds before the first workflow run
  • set a weekly review slot to compare recommendations vs real outcomes
  • keep a short failure log with root cause and fix

This removes most coordination friction in small teams. It also makes onboarding much easier when new people join the workflow.

What to track so workflows actually improve

Workflow skills only compound when performance data feeds back into system design.

At minimum, track:

  • cycle time: time from input to publish-ready output
  • QA pass rate: percent of runs that pass without manual rework
  • decision accuracy: keep/change/discard recommendations validated by later outcomes
  • business impact: CTR, conversion rate, CPL/CPA, or pipeline contribution depending on workflow
  • stability: how often outputs vary for equivalent inputs

If these metrics do not improve after a few iterations, the issue is usually process design, not model capability. Tighten contracts and thresholds first.

Also keep your editorial quality bar aligned with Google’s helpful, reliable, people-first content guidance, especially when workflow output scales faster than human review.

Common mistakes to avoid

  • treating model choice as strategy
  • skipping schemas and contracts
  • automating before policy is defined
  • shipping outputs without QA
  • forgetting to feed performance data back into planning

Think of this as your marketing preflight check. If the process is weak, faster generation just helps you scale inconsistency.

Conclusion

The biggest shift is not from one model to another model.

The real shift is from isolated prompting to repeatable execution systems.

If you are serious about making AI useful in your team, start with one recurring process, design it as a workflow, apply strict QA, and adapt the starter repo to your own reality.

That is how you get leverage that compounds.

Quick summary

  • Move from isolated prompts to structured workflows when output consistency matters.
  • Keep deterministic steps explicit and protect risky steps with human approval.
  • Use shared schemas, QA gates, and policy logic before scaling automation.
  • Track cycle time, QA pass rate, decision accuracy, and business impact together.
  • Start small, tune, then expand once quality is stable.