Last updated: 2026-04-21

| File | Status |
|---|---|
| `scripts/hubspot-posts-cache.json` | ✅ Done: ~1,200 posts cached |
| `scripts/hubspot-posts-content.json` | ✅ Done: full HTML bodies |
| `scripts/hubspot-posts-enriched.json` | ✅ Done: enriched metadata |
| `scripts/translation-groups-ai.json` | ✅ Done: posts grouped by language |
| `scripts/hubspot-users.json` | ✅ Done: authors/users |
| `scripts/hubspot-tags.json` | ✅ Done: tags |
| `scripts/hubspot-blogs.json` | ✅ Done: blog configs |
| `scripts/all-domains-assets.json` | ✅ Done: asset inventory per domain |
| `scripts/hubspot-posts-blocks.json` | 🔄 Partial: Claude block analysis (resumable) |
| `scripts/taxonomy-output/` | 🔄 Partial: de-de run complete; remaining domains pending |
| `scripts/seo-scores.json` | 🔄 Partial: en-us scored; re-run after each domain migrates |
| `scripts/seo-scores-bad.json` | 🔄 Partial: failing posts below threshold (en-us) |
| `scripts/seo-scores.csv` | 🔄 Partial: spreadsheet export (en-us) |

Testing flag: all scripts accept `--limit <n>` to process only `n` items. Use this for spot-checks only, never for a full domain run.

- `legacyHtml` field: needs to be added to the `blogPost.ts` schema; confirm before Phase 6.
- Block types: are the seven types (`bodyText`, `bodyImage`, `bodyVideo`, `bodyBlockquote`, `comparisonTable`, `faqItem`, `howToModule`) the full list, or are there HubSpot modules (sliders, CTAs, forms) that need new schema types before analysis runs?

All per-domain runs use the full locale code from HubSpot (the `language` field in `hubspot-blogs.json`).

| Country | Domain code | HubSpot blog name | HubSpot domain |
|---|---|---|---|
| USA / English | `en-us` | The Cold Jet Blog | blog.coldjet.com |
| Germany | `de-de` | Cold Jet Germany Blog | blog-de.coldjet.com |
| France | `fr-fr` | Cold Jet France Blog | blog-fr.coldjet.com |
| Netherlands | `nl-nl` | Cold Jet Netherlands Blog | blog-nl.coldjet.com |
| Belgium (French) | `fr-be` | Cold Jet Belgium (French) Blog | blog-fr-be.coldjet.com |
| Mexico | `es-mx` | Cold Jet Mexico Blog | blog-mx.coldjet.com |
| China | `zh-cn` | Cold Jet China Blog | blog-cn.coldjet.com |
| Poland | `pl-pl` | Cold Jet Poland Blog | blog-pl.coldjet.com |
| Japan | `ja-jp` | Cold Jet Japan Blog | blog-ja.coldjet.com |
| Brazil | `pt-br` | Cold Jet Brazil Blog | blog-pt-br.coldjet.com |
| Belgium (Dutch) | `nl-be` | Cold Jet Belgium (Dutch) Blog | blog-nl-be.coldjet.com |
| Spain | `es-es` | Cold Jet Spain Blog | blog-es.coldjet.com |
| Portugal | `pt-pt` | Cold Jet Portugal Blog | (no domain set) |

Phase 1 — Data (mostly already done)
└── npm run enrich:posts # refresh if needed
Phase 2 — Excel taxonomy → JSON → Sanity
└── Copy excels to scripts/excels/
└── Copy & fill scripts/excel-convert-config.json from example
└── npm install # installs xlsx package
└── npm run convert:taxonomy:excel -- --domain de-de # one domain
└── npm run convert:taxonomy:excel # all domains
└── Review scripts/taxonomy-output/merged/
└── npm run push:taxonomy -- --dataset staging
Phase 3 — Asset dedup (manual)
└── Edit scripts/all-domains-assets.json manually
└── npm run check:dom-url
Phase 4 — Upload assets → Brandfolder
└── npm run upload:brandfolder
Phase 5 — Block analysis (Claude AI)
└── npm run analyze:post-blocks:claude -- --domain de-de # one domain
└── npm run analyze:post-blocks:claude # all domains
Phase 6 — Blog posts → Sanity (per domain)
└── npm run push:posts:sanity -- --domain de-de --dataset staging
└── [QA] spot-check in Sanity Studio
└── npm run push:posts:sanity -- --domain de-de --dataset production
└── Repeat: fr-fr → en-us → es-es → pl-pl → ...
Phase 7 — URL validation
└── npm run check:dom-url
Phase 8 — QA
└── npx tsc --noEmit && npm run lint
└── Manual review in Sanity Studio
All data is already cached. Verify it is complete and up to date before any writes.
# Prints the total number of cached posts to confirm the cache is not empty or truncated
# No output files — console only
node -e "const d=require('./scripts/hubspot-posts-cache.json'); console.log(d.length, 'posts')"
# Fetches all blog configurations (names, IDs, domains) from HubSpot API
# Output → scripts/hubspot-blogs.json
npm run fetch:blogs
# Fetches all HubSpot portal users (name, email, userId)
# Output → scripts/hubspot-users.json
npm run fetch:users
# Fetches all HubSpot blog tags / taxonomy
# Output → scripts/hubspot-tags.json
npm run fetch:taxonomy
# Fetches the full HTML body of every blog post (paginated, concurrent)
# Output → scripts/hubspot-posts-content.json
npm run fetch:post-bodies
# Merges post metadata + bodies + blog info into one enriched dataset
# Reads → hubspot-posts-cache.json + hubspot-posts-content.json + hubspot-blogs.json
# Output → scripts/hubspot-posts-enriched.json
npm run enrich:posts
# Groups posts that are translations of each other using AI matching
# Reads → scripts/hubspot-posts-enriched.json
# Output → scripts/translation-groups-ai.json (array of groups, each group = same article in N languages)
# → scripts/language-orphans-ai.json (posts with no matched translation)
# → scripts/coverage-matrix.csv (language × domain coverage table)
npm run check:language-coverage:ai
# Prints all fields of one post to the console — useful for debugging field mappings
# No output files — console only
npm run inspect:post -- --id <hubspot_post_id>
There are 4 tag types + authors that come from Excel files (one per domain), not from HubSpot. They are richer and more categorised than anything in HubSpot.

Tags sheet (de-de only, shared globally)

| Tag type | Sanity doc type | Field on blogPost |
|---|---|---|
| Technology Tags | `technologyTag` | `technologyTags[]` → array of references |
| Product Model Tags | `productModelTag` | `productModelTags[]` → array of references |
| Industry Tags | `industryTag` | `industryTags[]` → array of references |
| Application Tags | `applicationTag` | `applicationTags[]` → array of references |

Authors - Person Schema sheet (de-de only)

Extracted fields per author:

| Excel column | Output key | Notes |
|---|---|---|
| `Keep as Author?` | `keepAsAuthor` | `Replace` or `Yes` |
| `Full Name (for Sanity)` | `name` | Used as the Sanity person display name + slug source |
| `Job Title` | `jobTitle` | |
| `Bio (50-100 words)` | `bio` | |
| `Headshot Available?` | `headshotAvailable` | |
| `LinkedIn URL` | `linkedIn` | |
| `Knows About (3-5 topics)` | `knowsAbout` | Split by comma into array |
| `Credentials` | `credentials` | |
| `Education` | `education` | |

Template/instruction rows are automatically filtered out using skipIfColumnEmpty: "#" and excludeNames: ["Map to Created By"] in the config.
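
The row filtering described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the code in `convert-taxonomy-excel.ts`: the `Row` type, the `filterAuthorRows` helper, and the `nameColumn` setting (which column `excludeNames` is compared against) are hypothetical.

```typescript
// Hypothetical sketch: drop template/instruction rows from the Authors sheet.
type Row = Record<string, string | undefined>;

interface RowFilter {
  skipIfColumnEmpty: string; // e.g. "#": rows with no value in this column are skipped
  excludeNames: string[];    // e.g. ["Map to Created By"]
  nameColumn: string;        // assumed: the column that excludeNames is matched against
}

function filterAuthorRows(rows: Row[], filter: RowFilter): Row[] {
  return rows.filter((row) => {
    // Assumption: real data rows always have a value in the marker column.
    const marker = (row[filter.skipIfColumnEmpty] || "").trim();
    if (marker === "") return false;
    const name = (row[filter.nameColumn] || "").trim();
    return filter.excludeNames.indexOf(name) === -1;
  });
}

const sampleRows: Row[] = [
  { "#": "1", "Full Name (for Sanity)": "Matt Caminiti" },
  { "#": "", "Full Name (for Sanity)": "Fill in one row per author" },
  { "#": "2", "Full Name (for Sanity)": "Map to Created By" },
];

const kept = filterAuthorRows(sampleRows, {
  skipIfColumnEmpty: "#",
  excludeNames: ["Map to Created By"],
  nameColumn: "Full Name (for Sanity)",
});
// kept contains only the "Matt Caminiti" row
```
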
Plus the existing HubSpot sources:

| Type | Sanity doc type | Source |
|---|---|---|
| Blog categories | `category` | `hubspot-blogs.json` + `hubspot-tags.json` |

# Installs all npm packages including xlsx (SheetJS) which is needed to read .xlsx files
# No output files
npm install
One Excel file per domain. The config (excel-convert-config.json) already covers the first 5 — add an entry for each additional domain you have an Excel file for:
scripts/excels/
blogs-de-de.xlsx ← Tags sheet + Authors sheet + Blog Posts sheet
blogs-fr-fr.xlsx ← Blog Posts sheet only
blogs-en-us.xlsx
blogs-es-es.xlsx
blogs-pl-pl.xlsx
blogs-es-mx.xlsx ← add to config if available
blogs-fr-be.xlsx
blogs-nl-nl.xlsx
blogs-nl-be.xlsx
blogs-pt-br.xlsx
blogs-ja-jp.xlsx
blogs-pt-pt.xlsx
# Copies the example config — sheet names and column headers are already filled in
# Output → scripts/excel-convert-config.json (ready to use, just add your domain file paths)
cp scripts/excel-convert-config.example.json scripts/excel-convert-config.json
The config is pre-filled with the real sheet and column structure:
Sheet: Tags — read from de-de only (identical across all files)

| Excel column | Output key | Tag type |
|---|---|---|
| `industries-tags` | `name` | `industryTags` |
| `Technology Tags` | `name` | `technologyTags` |
| `Product Model Tags` | `name` | `productModelTags` |
| `Application Tags` | `name` | `applicationTags` |

Each column is extracted as a separate outputType — 4 passes over the same sheet, one per tag type.
Sheet: Authors - Person Schema — read from de-de only, outputs authors in all-tags.json

| Excel column | Output key |
|---|---|
| `Keep as Author?` | `keepAsAuthor` |
| `Full Name (for Sanity)` | `name` (also generates `slug`) |
| `Job Title` | `jobTitle` |
| `Bio (50-100 words)` | `bio` |
| `Headshot Available?` | `headshotAvailable` |
| `LinkedIn URL` | `linkedIn` |
| `Knows About (3-5 topics)` | `knowsAbout` (array) |
| `Credentials` | `credentials` |
| `Education` | `education` |

Sheet: Blog Posts — read from every domain file

| Excel column | Output key |
|---|---|
| `URL` | `url` (post identifier) |
| `Technology Tags` | `technologyTags` |
| `Product Model Tags` | `productModelTags` |
| `Industry Tags` | `industryTags` |
| `Application Tags` | `applicationTags` |

Tag columns contain comma-separated tag names. push-taxonomy-to-sanity.ts splits them and resolves each name to a Sanity document ID.
Only the de-de entry includes the Tags sheet. All other domains include only the Blog Posts sheet; the tag lists are shared globally, and deduplication in merged/ handles any overlap.
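
Splitting a comma-separated tag cell might be sketched as follows. The `splitTagCell` helper is hypothetical and only illustrates the idea; it assumes tag names never contain a comma themselves, which would need confirming against the real Excel data.

```typescript
// Hypothetical helper: turn one Excel tag cell into a clean list of tag names.
function splitTagCell(cell: string | undefined): string[] {
  if (!cell) return [];
  const names: string[] = [];
  for (const part of cell.split(",")) {
    const name = part.trim();
    // Skip blanks (e.g. trailing commas) and names repeated within the same cell.
    if (name !== "" && names.indexOf(name) === -1) names.push(name);
  }
  return names;
}

const tags = splitTagCell("CO2 Cleaning, Dry Ice, , CO2 Cleaning");
// tags is ["CO2 Cleaning", "Dry Ice"]
```
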
# Opens the Excel file and prints every sheet name + its column headers, then exits
# No output files — console only
npm run convert:taxonomy:excel -- --file scripts/excels/blogs-de-de.xlsx
# Converts a single domain's Excel file to JSON
# Reads → scripts/excels/blogs-de-de.xlsx (via config)
# Output → scripts/taxonomy-output/technologyTags-de-de.json
# → scripts/taxonomy-output/productModelTags-de-de.json
# → scripts/taxonomy-output/industryTags-de-de.json
# → scripts/taxonomy-output/applicationTags-de-de.json
# → scripts/taxonomy-output/posts-de-de.json (post → tag mapping rows)
npm run convert:taxonomy:excel -- --domain de-de
# Converts all domains defined in the config, then deduplicates across all languages
# Reads → all Excel files listed in scripts/excel-convert-config.json
# Output → scripts/taxonomy-output/<type>-<domain>.json (one per sheet per domain)
# → scripts/taxonomy-output/merged/technologyTags.json (deduplicated — seed into Sanity)
# → scripts/taxonomy-output/merged/productModelTags.json
# → scripts/taxonomy-output/merged/industryTags.json
# → scripts/taxonomy-output/merged/applicationTags.json
npm run convert:taxonomy:excel
# Same as above but also writes .xlsx files alongside the JSON (optional review format)
# Output → same as above + scripts/taxonomy-output/<type>-<domain>.xlsx
npm run convert:taxonomy:excel -- --format excel
scripts/taxonomy-output/
all-tags.json ← single object with all 5 types (seed into Sanity)
posts-de-de.json ← one row per post: url + 4 tag columns (arrays of tag names)
posts-fr-fr.json
posts-en-us.json
...
all-tags.json structure:
{
"industryTags": [ { "name": "Automotive", "slug": "automotive" }, ... ],
"technologyTags": [ { "name": "CO2 Cleaning", "slug": "co2-cleaning" }, ... ],
"productModelTags": [ { "name": "i3 MicroClean", "slug": "i3-microclean" }, ... ],
"applicationTags": [ { "name": "Surface Cleaning", "slug": "surface-cleaning" }, ... ],
"authors": [
{
"keepAsAuthor": "Replace",
"name": "Matt Caminiti",
"jobTitle": "Director, Corporate Marketing Communications & Strategy",
"bio": "",
"headshotAvailable": "",
"linkedIn": "https://www.linkedin.com/in/matt-caminiti/",
"knowsAbout": [],
"credentials": "",
"education": "",
"slug": "matt-caminiti",
"language": "de-de"
}
]
}
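
The `slug` values above look like lowercased, hyphenated names, and the merged tag lists are deduplicated across domains. A minimal sketch of both steps follows. The slug rule is an assumption that happens to reproduce the sample slugs shown here; the real script may treat accented or non-Latin names differently, and `slugify` and `mergeTags` are hypothetical names.

```typescript
interface Tag {
  name: string;
  slug: string;
}

// Assumed slug rule: lowercase, runs of non-alphanumerics collapse to one hyphen.
function slugify(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Merge tag names from several domains, keeping the first name seen per slug.
function mergeTags(namesPerDomain: string[][]): Tag[] {
  const seen: Record<string, boolean> = Object.create(null);
  const merged: Tag[] = [];
  for (const names of namesPerDomain) {
    for (const name of names) {
      const slug = slugify(name);
      if (slug === "" || seen[slug]) continue;
      seen[slug] = true;
      merged.push({ name: name.trim(), slug });
    }
  }
  return merged;
}

const mergedTags = mergeTags([
  ["CO2 Cleaning", "i3 MicroClean"],
  ["co2 cleaning", "Surface Cleaning"],
]);
// mergedTags has three entries; the second "co2 cleaning" is dropped as a duplicate
```
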
Each post row:
{
"domain": "de-de",
"url": "https://www.coldjet.com/de/blog/article-slug/",
"technologyTags": ["CO2 Cleaning", "Dry Ice"],
"productModelTags": ["i3 MicroClean"],
"industryTags": ["Automotive", "Aerospace"],
"applicationTags": ["Surface Cleaning"]
}
New tag schema files (shape: name, slug, description):
src/sanity/schemaTypes/technologyTag.ts
src/sanity/schemaTypes/productModelTag.ts
src/sanity/schemaTypes/industryTag.ts
src/sanity/schemaTypes/applicationTag.ts
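
Because the four tag types share one shape (name, slug, description), their schema files could be produced by a small factory. The sketch below uses plain schema objects and local type definitions so that it stands alone; the `makeTagSchema` helper and the titles are hypothetical, and the real files may prefer Sanity's `defineType` / `defineField` helpers.

```typescript
// Hypothetical factory for the four tag document types (plain schema objects).
interface FieldDef {
  name: string;
  title: string;
  type: string;
  options?: { source: string };
}

interface DocumentDef {
  name: string;
  title: string;
  type: "document";
  fields: FieldDef[];
}

function makeTagSchema(name: string, title: string): DocumentDef {
  return {
    name,
    title,
    type: "document",
    fields: [
      { name: "name", title: "Name", type: "string" },
      // The slug field derives its value from the name field via options.source.
      { name: "slug", title: "Slug", type: "slug", options: { source: "name" } },
      { name: "description", title: "Description", type: "text" },
    ],
  };
}

const tagSchemas: DocumentDef[] = [
  makeTagSchema("technologyTag", "Technology Tag"),
  makeTagSchema("productModelTag", "Product Model Tag"),
  makeTagSchema("industryTag", "Industry Tag"),
  makeTagSchema("applicationTag", "Application Tag"),
];
```
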
person.ts — ensure it has fields matching the Excel author columns:
name, slug, jobTitle, bio, headshotAvailable, linkedIn,
knowsAbout (array of string), credentials, education, language
blogPost.ts — add a taxonomy group with 4 new reference array fields:
technologyTags → array of reference → technologyTag
productModelTags → array of reference → productModelTag
industryTags → array of reference → industryTag
applicationTags → array of reference → applicationTag
blogPost.ts — add legacyHtml field for rollback safety:
{ name: 'legacyHtml', type: 'text', readOnly: true, hidden: true }
index.ts — register all 4 new types.
# Creates category, person, and all 4 tag type documents in the Sanity staging dataset
# Reads → scripts/taxonomy-output/merged/technologyTags.json
# → scripts/taxonomy-output/merged/productModelTags.json
# → scripts/taxonomy-output/merged/industryTags.json
# → scripts/taxonomy-output/merged/applicationTags.json
# → scripts/hubspot-tags.json (→ category documents)
# → scripts/hubspot-users.json (→ person documents)
# Output → scripts/sanity-id-map.json (maps every hubspot/slug id → sanity document _id)
npm run push:taxonomy -- --dataset staging
# After reviewing staging in Sanity Studio — promote to production
# Reads → same files as above
# Output → updates scripts/sanity-id-map.json with production _ids
npm run push:taxonomy -- --dataset production
scripts/sanity-id-map.json structure:
{
"categories": { "<hubspot_tag_id>": "<sanity_doc_id>" },
"persons": { "<hubspot_user_id>": "<sanity_doc_id>" },
"technologyTags": { "<slug>": "<sanity_doc_id>" },
"productModelTags": { "<slug>": "<sanity_doc_id>" },
"industryTags": { "<slug>": "<sanity_doc_id>" },
"applicationTags": { "<slug>": "<sanity_doc_id>" }
}
The posts-<domain>.json files contain which tag names belong to which post. The taxonomy push script consolidates these (using the sanity-id-map.json to resolve names → Sanity IDs) into:
# Produced automatically at the end of npm run push:taxonomy
# Reads → scripts/taxonomy-output/posts-*.json (all domains)
# → scripts/sanity-id-map.json
# Output → scripts/post-tag-map.json
scripts/post-tag-map.json structure:
{
"<hubspot_post_id>": {
"technologyTags": ["<sanity_id>", ...],
"productModelTags": ["<sanity_id>", ...],
"industryTags": ["<sanity_id>", ...],
"applicationTags": ["<sanity_id>", ...]
}
}
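
The name-to-ID resolution behind this map might look like the sketch below. It is not the actual `push-taxonomy-to-sanity.ts` logic: `resolveTagNames` and `toSlug` are hypothetical, the slug rule is assumed from the sample data, and the Sanity ID in the example is made up. Collecting unresolved names instead of dropping them makes spelling mismatches between the Blog Posts sheet and the Tags sheet visible.

```typescript
type IdMap = Record<string, string>; // slug → Sanity document _id

interface ResolveResult {
  ids: string[];
  unresolved: string[];
}

// Assumed slug rule: lowercase, non-alphanumeric runs become a single hyphen.
function toSlug(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

function resolveTagNames(names: string[], idMap: IdMap): ResolveResult {
  const result: ResolveResult = { ids: [], unresolved: [] };
  for (const name of names) {
    const slug = toSlug(name);
    if (Object.prototype.hasOwnProperty.call(idMap, slug)) {
      result.ids.push(idMap[slug]);
    } else {
      result.unresolved.push(name); // report instead of silently dropping
    }
  }
  return result;
}

// Illustrative ID only; real _ids come from scripts/sanity-id-map.json.
const technologyTagIds: IdMap = { "co2-cleaning": "technologyTag-co2-cleaning" };
const resolved = resolveTagNames(["CO2 Cleaning", "Dry Ice"], technologyTagIds);
// resolved.ids is ["technologyTag-co2-cleaning"], resolved.unresolved is ["Dry Ice"]
```
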
scripts/all-domains-assets.json already exists — every image, video, and document URL found in blog content, grouped by domain.
# Prints the list of domain keys inside the asset file — shows which domains have assets
# No output files — console only
node -e "const d=require('./scripts/all-domains-assets.json'); console.log(Object.keys(d))"
Then open scripts/all-domains-assets.json and for each duplicate entry add:

- `"dedupOf": "<canonical_url>"` points to the authoritative version
- `"skip": true` marks assets that should not be migrated at all

Brandfolder detects binary duplicates on upload automatically. Keep the admin view open at Settings → General Settings → Advanced → Manage Deleted Assets to catch and resolve those.
# Sends a HEAD request to every asset URL and records the HTTP status
# Reads → scripts/all-domains-assets.json
# Output → scripts/check-valid-all.json (url + status + response time per asset)
# → scripts/check-valid-all.csv (same, spreadsheet-friendly)
npm run check:dom-url
Only upload assets with status 200. Fix or skip anything else before Phase 4.
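
Combining the dedup markers with the status check could be sketched like this. The `Asset` and `UrlCheck` shapes are assumptions about the entries in `all-domains-assets.json` and `check-valid-all.json`, and `selectUploadable` is a hypothetical helper, not the upload script itself.

```typescript
interface Asset {
  url: string;
  dedupOf?: string; // set on duplicates, points at the canonical URL
  skip?: boolean;   // set on assets that should not be migrated
}

interface UrlCheck {
  url: string;
  status: number;
}

// Keep only canonical, non-skipped assets whose URL check returned 200.
function selectUploadable(assets: Asset[], checks: UrlCheck[]): Asset[] {
  const statusByUrl: Record<string, number> = Object.create(null);
  for (const check of checks) statusByUrl[check.url] = check.status;
  return assets.filter(
    (asset) => !asset.skip && !asset.dedupOf && statusByUrl[asset.url] === 200
  );
}

// Example URLs are placeholders, not real asset locations.
const uploadable = selectUploadable(
  [
    { url: "https://legacy.example.com/a.png" },
    { url: "https://legacy.example.com/a-copy.png", dedupOf: "https://legacy.example.com/a.png" },
    { url: "https://legacy.example.com/old-banner.png", skip: true },
    { url: "https://legacy.example.com/missing.pdf" },
  ],
  [
    { url: "https://legacy.example.com/a.png", status: 200 },
    { url: "https://legacy.example.com/a-copy.png", status: 200 },
    { url: "https://legacy.example.com/old-banner.png", status: 200 },
    { url: "https://legacy.example.com/missing.pdf", status: 404 },
  ]
);
// uploadable contains only the a.png entry
```
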
# Uploads every non-skipped, validated asset to Brandfolder
# Reads → scripts/all-domains-assets.json (asset list with dedup markers)
# → scripts/check-valid-all.json (skip anything not 200)
# Output → scripts/brandfolder-url-map.json (old hubspot url → new brandfolder CDN url)
# → Sanity migrationAssetLog documents (one per uploaded asset: sourceUrl, brandfolderId, CDN url, uploadedAt)
npm run upload:brandfolder
`scripts/brandfolder-url-map.json` is the critical handoff file to Phase 6. Every HubSpot CDN link in post content will be replaced using this map.
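
The URL swap this map enables can be sketched as a plain string replacement. This is an illustration only: `replaceAssetUrls` is a hypothetical helper, the example URLs are placeholders, and the real Phase 6 script may rewrite structured blocks rather than raw HTML.

```typescript
type UrlMap = Record<string, string>; // old HubSpot URL → new Brandfolder CDN URL

function replaceAssetUrls(html: string, urlMap: UrlMap): string {
  // Replace longer URLs first so a URL that is a prefix of another cannot clobber it.
  const oldUrls = Object.keys(urlMap).sort((a, b) => b.length - a.length);
  let output = html;
  for (const oldUrl of oldUrls) {
    // split/join replaces every literal occurrence without regex escaping.
    output = output.split(oldUrl).join(urlMap[oldUrl]);
  }
  return output;
}

const exampleMap: UrlMap = {
  "https://legacy.example.com/img/a.png": "https://cdn.example.org/a.png",
  "https://legacy.example.com/img/a.png?width=600": "https://cdn.example.org/a-600.png",
};

const swapped = replaceAssetUrls(
  '<img src="https://legacy.example.com/img/a.png?width=600"><img src="https://legacy.example.com/img/a.png">',
  exampleMap
);
// swapped is '<img src="https://cdn.example.org/a-600.png"><img src="https://cdn.example.org/a.png">'
```
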
Transform raw HubSpot HTML into typed Sanity bodySections blocks using the Anthropic Batches API.
# Submits all posts for the given domain to Claude via the Batches API, then polls for results
# Reads → scripts/hubspot-posts-enriched.json (post metadata + domain info)
# → scripts/hubspot-posts-content.json (raw HTML bodies)
# → scripts/post-blocks-cache.json (resume state — skip already-processed posts)
# Output → scripts/hubspot-posts-blocks.json (structured bodySections array per post ID)
# → scripts/post-blocks-cache.json (updated cache — safe to resume from here)
# → scripts/post-blocks-batches.json (Anthropic batch IDs and status)
# → scripts/post-blocks-errors.json (posts Claude could not parse — fix manually)
npm run analyze:post-blocks:claude -- --domain de-de
# Same but processes all domains in sequence
# Output → same files as above, populated for all posts across all domains
npm run analyze:post-blocks:claude
The script is fully resumable — if it is stopped mid-run, re-running picks up from the cache.
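
Resuming from the cache amounts to skipping every post that already has a result. A minimal sketch follows, assuming the cache is a map from post ID to a processed flag; the actual layout of `post-blocks-cache.json` is not documented here, and `pendingPostIds` is a hypothetical helper.

```typescript
// Assumed cache shape: post ID → true once that post has been processed.
type BlocksCache = Record<string, boolean>;

function pendingPostIds(allPostIds: string[], cache: BlocksCache): string[] {
  return allPostIds.filter((id) => cache[id] !== true);
}

const stillToProcess = pendingPostIds(["101", "102", "103"], { "101": true });
// stillToProcess is ["102", "103"]
```
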

| Block type | Source trigger |
|---|---|
| `bodyText` | Every H2/H3 section and surrounding paragraphs |
| `bodyImage` | `<img>` tags extracted from content flow |
| `bodyVideo` | YouTube / Vimeo / Wistia iframes |
| `bodyBlockquote` | `<blockquote>` elements |
| `comparisonTable` | `<table>` elements |
| `faqItem` | FAQ patterns detected by Claude |
| `howToModule` | Numbered step guides |

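
A quick sanity check on the analysis output is to count blocks per type and flag any `_type` outside the seven listed above. The sketch below assumes each block carries a `_type` string; `summarizeBlocks` and the `ctaBanner` example value are hypothetical.

```typescript
const BLOCK_TYPES: string[] = [
  "bodyText",
  "bodyImage",
  "bodyVideo",
  "bodyBlockquote",
  "comparisonTable",
  "faqItem",
  "howToModule",
];

interface Block {
  _type: string;
}

function summarizeBlocks(blocks: Block[]): { counts: Record<string, number>; unknown: string[] } {
  const counts: Record<string, number> = {};
  const unknown: string[] = [];
  for (const block of blocks) {
    if (BLOCK_TYPES.indexOf(block._type) === -1) {
      // Unknown types usually mean a HubSpot module without a schema type yet.
      if (unknown.indexOf(block._type) === -1) unknown.push(block._type);
      continue;
    }
    counts[block._type] = (counts[block._type] || 0) + 1;
  }
  return { counts, unknown };
}

const summary = summarizeBlocks([
  { _type: "bodyText" },
  { _type: "bodyImage" },
  { _type: "bodyText" },
  { _type: "ctaBanner" },
]);
// summary.counts is { bodyText: 2, bodyImage: 1 }, summary.unknown is ["ctaBanner"]
```
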
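`scripts/push-posts-to-sanity.ts` (new script)

Accepts `--domain` and `--dataset` flags. For each post in the target domain:

1. Read `translation-groups-ai.json` to find all posts for that domain
2. Map the enriched HubSpot fields to `blogPost` fields
3. Resolve author and category references via `sanity-id-map.json`
4. Attach the four tag arrays from `post-tag-map.json`
5. Swap asset URLs using `brandfolder-url-map.json`
6. Attach `bodySections` from `hubspot-posts-blocks.json`
7. Store the raw HTML in the `legacyHtml` field
8. Create the Sanity document with `_type: 'blogPost'`
9. Write the result to `scripts/sanity-migration-log-<domain>.json` for audit

# Creates all blog post documents for de-de in the staging dataset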
# Reads → scripts/translation-groups-ai.json (which posts belong to de-de)
# → scripts/hubspot-posts-enriched.json (fields: title, slug, publishedAt, author...)
# → scripts/hubspot-posts-blocks.json (bodySections blocks from Phase 5)
# → scripts/hubspot-posts-content.json (raw HTML for legacyHtml field)
# → scripts/sanity-id-map.json (resolve author, category, tag refs)
# → scripts/post-tag-map.json (resolve 4 tag arrays)
# → scripts/brandfolder-url-map.json (swap image URLs)
# Output → Sanity blogPost documents in the staging dataset
# → scripts/sanity-migration-log-de-de.json (log of every created/skipped/failed doc)
npm run push:posts:sanity -- --domain de-de --dataset staging
# After QA passes in Sanity Studio — promote to production
# Reads → same files as above
# Output → Sanity blogPost documents in the production dataset
# → scripts/sanity-migration-log-de-de.json (updated with production doc _ids)
npm run push:posts:sanity -- --domain de-de --dataset production
de-de → fr-fr → en-us → es-es → pl-pl → es-mx → fr-be → nl-nl → nl-be → pt-br → ja-jp → zh-cn → pt-pt
Start with de-de — one of the largest domains — so issues surface early.
Academy blogs (en, pl, de, fr, ja) should be confirmed in scope separately before migrating.

- `legacyHtml` field populated on each document
- Run `npm run check:dom-url` on a sample of migrated Brandfolder URLs

# Re-validates all URLs that appear in migrated content and catches any Brandfolder CDN issues
# Reads → scripts/brandfolder-url-map.json (all new CDN URLs to check)
# Output → scripts/check-valid-all.json (updated — any non-200 URLs flagged for manual fix)
# → scripts/check-valid-all.csv
npm run check:dom-url
Use scripts/brandfolder-url-map.json to find any remaining HubSpot CDN URLs not yet replaced and fix them before the next domain runs.
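
One way to do that check is to look for any old URL (a key of the map) that still appears in migrated content. The sketch below assumes the content has been serialized to a string first; `findUnreplacedUrls` is a hypothetical helper and the URLs are placeholders.

```typescript
// Returns every old URL from the map that still occurs in the given content.
function findUnreplacedUrls(content: string, urlMap: Record<string, string>): string[] {
  return Object.keys(urlMap).filter((oldUrl) => content.indexOf(oldUrl) !== -1);
}

const leftovers = findUnreplacedUrls(
  '<img src="https://legacy.example.com/img/b.png"><img src="https://cdn.example.org/a.png">',
  {
    "https://legacy.example.com/img/a.png": "https://cdn.example.org/a.png",
    "https://legacy.example.com/img/b.png": "https://cdn.example.org/b.png",
  }
);
// leftovers is ["https://legacy.example.com/img/b.png"]
```
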
# Checks TypeScript types across the entire project — must pass with zero errors
# No output files — exits non-zero if there are type errors
npx tsc --noEmit
# Runs ESLint across the project — must pass with zero errors before committing
# No output files — exits non-zero on lint errors
npm run lint

- Are the `bodySections` blocks correct?
- Does every `bodyVideo` block have the correct embed URL?
- Check `scripts/post-blocks-errors.json` and manually fix any posts Claude could not parse

These two scripts run after migration is complete and posts are live in Sanity. They are optional but strongly recommended before launching each country.
Script: scripts/score-seo.ts ✅ Written
Command: npm run score:seo
Scores every post against 11 SEO best-practice rules and writes the results to JSON/CSV. No content is changed — read-only analysis. Max score is 90 pts.

| Rule | Check | Points |
|---|---|---|
| Title length | Between 50–60 characters | 10 |
| Title has primary keyword | Keyword (derived from slug) appears in title | 10 |
| Meta description length | Between 120–160 characters | 10 |
| H1 present | At least one `<h1>` in body HTML | 5 |
| H2/H3 structure | At least 2 section headings in body | 10 |
| Word count | Body content ≥ 300 words | 10 |
| Image alt text | All `<img>` tags in body have non-empty alt text | 10 |
| Featured image | `featuredImage` field is populated | 5 |
| Keyword density | Primary keyword appears 1–3% of body word count | 10 |
| Slug quality | Lowercase, hyphens only, ≤ 75 chars | 5 |
| Internal links | At least 1 internal link (coldjet.com or relative) in body | 5 |

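
To make the scoring concrete, here is a sketch of four of the eleven rules (title length, meta description length, word count, slug quality). It is not the code in `score-seo.ts`: the `Post` shape, the helper names, and the slug regular expression are assumptions, and the remaining rules are omitted, so this subset can award at most 35 of the 90 points.

```typescript
interface Post {
  title: string;
  slug: string;
  metaDescription: string;
  bodyText: string; // body with HTML tags already stripped
}

interface RuleResult {
  pass: boolean;
  points: number;
}

function rule(pass: boolean, maxPoints: number): RuleResult {
  return { pass, points: pass ? maxPoints : 0 };
}

function countWords(text: string): number {
  return text.split(/\s+/).filter((word) => word !== "").length;
}

function scorePost(post: Post): { score: number; rules: Record<string, RuleResult> } {
  const metaLength = post.metaDescription.length;
  const rules: Record<string, RuleResult> = {
    titleLength: rule(post.title.length >= 50 && post.title.length <= 60, 10),
    metaLength: rule(metaLength >= 120 && metaLength <= 160, 10),
    wordCount: rule(countWords(post.bodyText) >= 300, 10),
    // Assumed reading of "lowercase, hyphens only": a-z, 0-9, single hyphens between parts.
    slugQuality: rule(/^[a-z0-9]+(-[a-z0-9]+)*$/.test(post.slug) && post.slug.length <= 75, 5),
  };
  let score = 0;
  for (const key of Object.keys(rules)) score += rules[key].points;
  return { score, rules };
}

const sample = scorePost({
  title: "3 ways dry ice blasting is used in the automotive industry",
  slug: "3-ways-dry-ice-blasting-is-used-in-the-automotive-industry",
  metaDescription: "Dry ice blasting in automotive.",
  bodyText: new Array(219).join("word "), // 218 words, below the 300-word minimum
});
// sample.score is 15: title (10) and slug (5) pass, meta length and word count fail
```
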
# Score all posts across all domains
# Reads → scripts/hubspot-posts-enriched.json
# → scripts/hubspot-posts-content.json
# Output → scripts/seo-scores.json (one entry per post with score + per-rule breakdown)
# → scripts/seo-scores.csv (same, spreadsheet-friendly)
npm run score:seo
# Score one domain only
npm run score:seo -- --domain de-de
# Show only posts below threshold + print worst 10 to console (default threshold: 60)
# Output → scripts/seo-scores-bad.json (failing posts sorted worst-first)
npm run score:seo -- --domain de-de --threshold 60 --bad-only
scripts/seo-scores.json — array of post results:
[
{
"postId": "7359675231",
"domain": "en-us",
"title": "3 ways dry ice blasting is used in the automotive industry",
"slug": "3-ways-dry-ice-blasting-is-used-in-the-automotive-industry",
"score": 65,
"passed": 8,
"failed": 3,
"rules": {
"titleLength": { "pass": true, "note": "57 chars — ideal (50–60)" },
"wordCount": { "pass": false, "note": "218 words — below 300 minimum" },
"imageAltText": { "pass": false, "note": "2 of 4 images missing alt text" },
"internalLinks": { "pass": true, "note": "3 internal links found" }
}
}
]
scripts/seo-scores-bad.json — only failing posts (score < threshold), sorted worst-first. Same shape as above.
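
The selection behind this file is a filter plus a sort. A minimal sketch, with a hypothetical `selectFailing` helper and made-up post IDs:

```typescript
interface ScoredPost {
  postId: string;
  score: number;
}

// Posts strictly below the threshold, lowest score first.
function selectFailing(posts: ScoredPost[], threshold: number = 60): ScoredPost[] {
  // filter returns a new array, so the in-place sort does not touch the input.
  return posts.filter((post) => post.score < threshold).sort((a, b) => a.score - b.score);
}

const failing = selectFailing([
  { postId: "a", score: 65 },
  { postId: "b", score: 40 },
  { postId: "c", score: 55 },
]);
// failing is [b (40), c (55)]
```
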
Script to write: scripts/enhance-content-claude.ts
Command to add: npm run enhance:content
Uses Claude to improve post content based on the SEO score results. This script is gated behind an explicit flag — it will refuse to run unless --ai-enhance is passed. This prevents accidental bulk rewrites.
Enhanced posts are flagged in both the JSON output and in Sanity so editors know to review them before publishing.

| Failing rule | Enhancement |
|---|---|
| `titleLength` | Rewrites title to fit 50–60 chars while keeping meaning |
| `metaLength` | Rewrites meta description to hit 120–160 chars |
| `imageAltText` | Generates descriptive alt text from image URL + post context |
| `keywordDensity` | Adjusts keyword placement in bodyText blocks |
| `wordCount` | Flags post as too short and suggests expansion topics (does not auto-expand) |

Claude never auto-expands short posts; those require human judgement. It only rewrites content where the fix is narrowly scoped (title, meta description, alt text, keyword placement).
# Without --ai-enhance: the script exits immediately with an explanation
npm run enhance:content -- --domain de-de
# ❌ This script rewrites Sanity content. Pass --ai-enhance to confirm.
# With the flag: processes all posts below --threshold (default 60) for the domain
# Reads → scripts/seo-scores.json (which posts need enhancement and which rules failed)
# → Sanity blogPost documents (current field values)
# Output → scripts/enhance-log-de-de.json (what was changed, old value vs new value)
# → Sanity blogPost documents (updated fields + aiEnhanced: true flag)
npm run enhance:content -- --domain de-de --ai-enhance
# Dry run — shows what would change without writing anything to Sanity
# Output → scripts/enhance-preview-de-de.json (proposed changes only)
npm run enhance:content -- --domain de-de --ai-enhance --dry-run
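
The gating behaviour shown in these commands could be implemented along the lines of the sketch below. Since `enhance-content-claude.ts` is not written yet, this is only one possible shape; `resolveRunMode` and the `RunMode` names are hypothetical.

```typescript
type RunMode = "refuse" | "preview" | "apply";

// Decide what the script may do based on the flags it was started with.
function resolveRunMode(argv: string[]): RunMode {
  // Without the explicit gate flag, nothing is read or written.
  if (argv.indexOf("--ai-enhance") === -1) return "refuse";
  // With the gate flag plus --dry-run, only a preview file is produced.
  return argv.indexOf("--dry-run") !== -1 ? "preview" : "apply";
}

const modeWithoutFlag = resolveRunMode(["--domain", "de-de"]);
const modeDryRun = resolveRunMode(["--domain", "de-de", "--ai-enhance", "--dry-run"]);
const modeApply = resolveRunMode(["--domain", "de-de", "--ai-enhance"]);
// "refuse", "preview", "apply" respectively
```

Note that in this sketch `--dry-run` without `--ai-enhance` still refuses, which matches the rule that the script never proceeds unless the gate flag is present.
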
A field aiEnhanced (boolean, hidden from editors by default) is set to true on every post that Claude touches. This lets editors filter and review all AI-enhanced posts in Sanity Studio before sign-off:
// In blogPost.ts — add to schema:
{
name: 'aiEnhanced',
title: 'AI Enhanced — Needs Review',
type: 'boolean',
initialValue: false,
description: 'Set automatically when Claude rewrites any field. Must be reviewed before publishing.',
}
Editors can query all enhanced posts in Sanity Studio:
*[_type == "blogPost" && aiEnhanced == true] | order(publishedAt desc)
scripts/enhance-log-de-de.json — full audit trail of every change:
[
{
"postId": "blogPost-abc123",
"domain": "de-de",
"field": "title",
"before": "How Dry Ice Cleaning Works In Industrial Applications And Why You Should Care",
"after": "How Dry Ice Cleaning Works: Industrial Guide",
"rule": "titleLength",
"enhancedAt": "2026-04-21T14:30:00Z"
}
]
# 1. Score all posts after migration
npm run score:seo -- --domain de-de
# 2. Review scripts/seo-scores-bad.json — decide which posts are worth enhancing
# 3. Run enhancement (gated — requires explicit flag)
npm run enhance:content -- --domain de-de --ai-enhance --dry-run # preview first
npm run enhance:content -- --domain de-de --ai-enhance # apply
# 4. Re-score to verify improvement
npm run score:seo -- --domain de-de
# 5. In Sanity Studio — review all aiEnhanced posts before publishing
# GROQ: *[_type == "blogPost" && aiEnhanced == true && domain == "de-de"]

| Script | Command | Gate flag | Reads | Writes |
|---|---|---|---|---|
| `score-seo.ts` | `npm run score:seo` | None (read-only) | `hubspot-posts-enriched.json` + `hubspot-posts-content.json` | `seo-scores.json`, `seo-scores-bad.json`, `seo-scores.csv` |
| `enhance-content-claude.ts` | `npm run enhance:content` | `--ai-enhance` required | `seo-scores.json` + Sanity | `enhance-log-<domain>.json` + Sanity updates + `aiEnhanced` flag |


| Script | Command | Status | Reads | Writes |
|---|---|---|---|---|
| `fetch-hubspot-blogs.ts` | `npm run fetch:blogs` | ✅ Exists | HubSpot API | `hubspot-blogs.json` |
| `fetch-hubspot-users.ts` | `npm run fetch:users` | ✅ Exists | HubSpot API | `hubspot-users.json` |
| `fetch-hubspot-taxonomy.ts` | `npm run fetch:taxonomy` | ✅ Exists | HubSpot API | `hubspot-tags.json` |
| `fetch-post-bodies.ts` | `npm run fetch:post-bodies` | ✅ Exists | HubSpot API | `hubspot-posts-content.json` |
| `enrich-hubspot-posts.ts` | `npm run enrich:posts` | ✅ Exists | posts-cache + content + blogs | `hubspot-posts-enriched.json` |
| `check-language-coverage-ai.ts` | `npm run check:language-coverage:ai` | ✅ Exists | posts-enriched | `translation-groups-ai.json`, `language-orphans-ai.json`, `coverage-matrix.csv` |
| `extract-domain-assets.ts` | `npm run extract:assets` | ✅ Exists | posts-content | `all-domains-assets.json` |
| `check-dom-url.ts` | `npm run check:dom-url` | ✅ Exists | all-domains-assets / brandfolder-url-map | `check-valid-all.json`, `check-valid-all.csv` |
| `upload-to-brandfolder.ts` | `npm run upload:brandfolder` | ✅ Exists | all-domains-assets + check-valid | `brandfolder-url-map.json` + Sanity `migrationAssetLog` |
| `analyze-posts-blocks-claude.ts` | `npm run analyze:post-blocks:claude` | ✅ Exists | posts-enriched + posts-content | `hubspot-posts-blocks.json`, `post-blocks-cache.json`, `post-blocks-errors.json` |
| `convert-taxonomy-excel.ts` | `npm run convert:taxonomy:excel` | ✅ Written | Excel files + config | `taxonomy-output/<type>-<domain>.json`, `taxonomy-output/merged/<type>.json` |
| `push-taxonomy-to-sanity.ts` | `npm run push:taxonomy` | ⏳ To write | merged taxonomy JSONs + hubspot-tags/users | `sanity-id-map.json`, `post-tag-map.json` + Sanity docs |
| `push-posts-to-sanity.ts` | `npm run push:posts:sanity` | ⏳ To write | all JSON files above | `sanity-migration-log-<domain>.json` + Sanity blogPost docs |
| `score-seo.ts` | `npm run score:seo` | ✅ Written | `hubspot-posts-enriched.json` + `hubspot-posts-content.json` | `seo-scores.json`, `seo-scores-bad.json`, `seo-scores.csv` |
| `enhance-content-claude.ts` | `npm run enhance:content` | ⏳ To write | `seo-scores.json` + Sanity | `enhance-log-<domain>.json` + Sanity updates |

All scripts support `--limit <n>` for testing with a small sample. Never use `--limit` in a full domain run.
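
How a script might read and apply `--limit` is sketched below. The `parseLimit` and `applyLimit` helpers are hypothetical; the existing scripts may parse their arguments differently.

```typescript
// Reads "--limit <n>" from an argument list; returns undefined when the flag is absent.
function parseLimit(argv: string[]): number | undefined {
  const index = argv.indexOf("--limit");
  if (index === -1) return undefined;
  const value = Number(argv[index + 1]);
  // Rejects a missing value, non-numbers, zero, negatives, and fractions.
  if (!(value > 0) || Math.floor(value) !== value) {
    throw new Error("--limit expects a positive integer");
  }
  return value;
}

// Applies the limit to a work list; with no limit the full list is processed.
function applyLimit<T>(items: T[], limit: number | undefined): T[] {
  return limit === undefined ? items : items.slice(0, limit);
}

const limited = applyLimit(["p1", "p2", "p3", "p4"], parseLimit(["--domain", "de-de", "--limit", "2"]));
// limited is ["p1", "p2"]
```
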