Seven locales, two pipelines, and why I bothered to translate at all

May 22, 2026 · 9 min read

How one person maintains seven locales without hand-editing every JSON, and which hiring markets made it worth building.

Tags: i18n, portfolio, frontend

A portfolio in English only is a valid choice. So is shipping seven locales when you are one person maintaining the repo: it is not “free,” but it is legible if you treat language as product scope, not a checkbox, and you automate the boring parts.

This post is the long version: which languages, why I chose them, and exactly what the code does (packages/types, packages/i18n, scripts/translate.mjs, apps/web/scripts/translate-blog.mjs).

Why these languages (strategy, not vibes)

I did not pick locales because a template shipped with them. I did not optimize for “most speakers worldwide” as a single number. I started from hiring geography: I looked at countries and regions where developers are hired (remote-friendly employers, strong domestic tech markets, EU hubs) and asked which languages I could realistically ship so that those readers could use the site in their mother tongue, not only in English. That is a product choice about who might evaluate this portfolio, not decoration.

English as the default is still non-negotiable: it is the widest common denominator for tech. The six non-English locales in this repo are Portuguese, Spanish, German, Polish, Dutch, and Estonian, listed in LOCALE_CONFIG in packages/types/src/index.ts. Together with en, that is seven surface locales.

The intent is blunt: if someone who might hire or collaborate with me lands here from Brazil or Portugal; Spain or Latin America; Germany, Austria, or Switzerland; Poland; the Netherlands; or Estonia, I want them to read in a language that maps to how they actually read when it matters. I accept that one es covers many countries and one pt covers two very different markets. That is a trade-off, not ignorance. One locale code cannot cover every cultural register of those languages, and I am not pretending it can.

Does every sentence sound like a human translator spent an hour on it? No. Machine translation and LLM-assisted batches are part of the stack. The promise is narrower and, I think, more honest: deliberate coverage, reviewable diffs in git, and copy that is not silently English-only for readers who prefer another first language.

What “seven locales” means in this codebase

Everything that needs to know “which languages exist” should derive from LOCALE_CONFIG: keys en, pt, es, de, pl, nl, et, each with name, bcp47, and a franc code for language detection helpers. The i18n package re-exports that config and builds locales as Object.keys(LOCALE_CONFIG); see packages/i18n/src/index.ts.

Runtime dictionaries are static JSON per locale under packages/i18n/src/messages/, imported through a small importer map so bundlers can resolve each file. The htmlLangFromLocale helper maps a locale to a BCP 47 tag for <html lang>. That is the “single source of truth” story: types + JSON files + Next app all line up on the same set of locale keys.

If you add a locale, you extend LOCALE_CONFIG, add a new JSON file, and wire the importer, then run the translate pipeline. There is no hidden runtime list elsewhere that can drift.

// packages/types/src/index.ts
export const LOCALE_CONFIG = {
  en: { name: "English",    bcp47: "en", franc: "eng" },
  pt: { name: "Português",  bcp47: "pt", franc: "por" },
  es: { name: "Español",    bcp47: "es", franc: "spa" },
  de: { name: "Deutsch",    bcp47: "de", franc: "deu" },
  pl: { name: "Polski",     bcp47: "pl", franc: "pol" },
  nl: { name: "Nederlands", bcp47: "nl", franc: "nld" },
  et: { name: "Eesti",      bcp47: "et", franc: "est" },
} as const;

export type Locale = keyof typeof LOCALE_CONFIG;
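
As a rough illustration of how the rest of the code can derive everything from that one object, here is a minimal sketch. The config is abbreviated to three entries so the block stands alone, and the bodies of htmlLangFromLocale and isLocale are assumptions about how such helpers might look, not the actual contents of packages/i18n/src/index.ts.

```typescript
// Abbreviated copy of LOCALE_CONFIG so this sketch is self-contained.
const LOCALE_CONFIG = {
  en: { name: "English", bcp47: "en", franc: "eng" },
  pt: { name: "Português", bcp47: "pt", franc: "por" },
  et: { name: "Eesti", bcp47: "et", franc: "est" },
} as const;

type Locale = keyof typeof LOCALE_CONFIG;

// Object.keys returns string[], so the cast narrows it back to the Locale union.
const locales = Object.keys(LOCALE_CONFIG) as Locale[];

// Assumed implementation: read the BCP 47 tag for <html lang> from the config.
function htmlLangFromLocale(locale: Locale): string {
  return LOCALE_CONFIG[locale].bcp47;
}

// Hypothetical type guard for untrusted input such as a route param.
function isLocale(value: string): value is Locale {
  return Object.prototype.hasOwnProperty.call(LOCALE_CONFIG, value);
}
```

Because Locale is computed from the object's keys, adding a key to the config widens the type everywhere at once, which is what keeps a second list from drifting.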

Pipeline A: UI strings, en.json, lockfile, hybrid MT/LLM

Authoritative UI copy lives in packages/i18n/src/messages/en.json. Other languages are generated; the project rule is explicit: edit English only, then run pnpm translate from the repo root, which executes scripts/translate.mjs with environment loaded from apps/web/.env.local (see the root package.json script).

Incremental translation and the lockfile

translate.mjs is long for good reasons. It maintains scripts/translate.lock.json: per-locale hashes of sections of the English source so that when you change one part of en.json, you do not blindly re-send the entire dictionary to an API. That matters when en.json grows (about pages, projects, experience) and you care about cost and repeatability.
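
The idea behind the lockfile can be sketched in a few lines. This is not the code in translate.mjs: the section granularity (top-level keys), the function names, and the hash are all assumptions, and a small FNV-1a hash is used only to keep the example dependency-free.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type SectionHashes = Record<string, string>;

// Non-cryptographic 32-bit FNV-1a hash; a stand-in for whatever the real script uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("00000000" + (hash >>> 0).toString(16)).slice(-8);
}

// Hash each top-level section of the English dictionary.
// JSON.stringify is order-sensitive, so reordering keys counts as a change.
function hashSections(source: Record<string, Json>): SectionHashes {
  const hashes: SectionHashes = {};
  for (const [section, value] of Object.entries(source)) {
    hashes[section] = fnv1a(JSON.stringify(value));
  }
  return hashes;
}

// Sections whose hash differs from, or is missing in, the locked hashes need retranslation.
function changedSections(current: SectionHashes, locked: SectionHashes): string[] {
  return Object.keys(current).filter((section) => current[section] !== locked[section]);
}
```

In a per-locale lockfile, the locked hashes would be stored once per target locale, so a locale added later starts with an empty lock and gets everything translated on its first run.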

The script also prunes keys that disappeared from English so locale files do not accumulate orphan branches, and it can strip known untranslatable terms (proper nouns, tech brands) before a request, then merge them back; see the DO_NOT_TRANSLATE set and related helpers in the file.
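
Both ideas are small enough to sketch. The following assumes a nested string dictionary and a `__T0__`-style placeholder format; PROTECTED_TERMS is a hypothetical stand-in for the DO_NOT_TRANSLATE set, and none of these helper names come from the real script.

```typescript
type Tree = { [key: string]: string | Tree };

// Keep only the keys that still exist in the English source, recursively.
function pruneToSource(target: Tree, source: Tree): Tree {
  const pruned: Tree = {};
  for (const [key, sourceValue] of Object.entries(source)) {
    const targetValue = target[key];
    if (targetValue === undefined) continue; // not translated yet
    if (typeof sourceValue === "string") {
      if (typeof targetValue === "string") pruned[key] = targetValue;
    } else if (typeof targetValue !== "string") {
      pruned[key] = pruneToSource(targetValue, sourceValue);
    }
  }
  return pruned;
}

// Hypothetical list of terms that must survive translation unchanged.
const PROTECTED_TERMS = ["Next.js", "TypeScript", "Groq"];

// Replace protected terms with numbered placeholders before sending text to an engine.
function maskTerms(text: string): { masked: string; terms: string[] } {
  const terms: string[] = [];
  let masked = text;
  for (const term of PROTECTED_TERMS) {
    if (masked.indexOf(term) === -1) continue;
    masked = masked.split(term).join(`__T${terms.length}__`);
    terms.push(term);
  }
  return { masked, terms };
}

// Put the original terms back into the translated text.
function unmaskTerms(text: string, terms: string[]): string {
  return terms.reduce((out, term, i) => out.split(`__T${i}__`).join(term), text);
}
```

A translation engine can still alter or drop a placeholder, so a real pipeline would want to check that every placeholder came back before unmasking.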

Hybrid routing: LibreTranslate vs Groq

The choice of tools was driven by cost and fit. LibreTranslate is self-hostable via Docker with no per-token cost, which makes it the right fit for short, informational strings like nav, meta, and errors; those keys do not need LLM-level nuance, and paying per character to DeepL or Google Translate for them would add up without adding quality. Groq handles the LLM path because its inference is fast and cheap for batch work compared to OpenAI or Anthropic at the same volume; the default TRANSLATE_MODEL is a small model (llama-4-scout) selected specifically to keep per-run spend low, not the same model used for the chat.

The script does not send everything to Groq. A CONFIG object encodes the real policy:

  • Narrative keys matching patterns in groqKeyPatterns (e.g. about.sections.*.body, home.hero.subheadline, projects.items.*.problem, experience.items.*.impact) go through Groq for every locale via shouldUseGroqForKey. Job titles (experience.items.*.role) are also always Groq, because pure MT often returns English-looking titles unchanged.
  • Shorter or more informational namespaces are biased toward LibreTranslate via libreKeyPrefixes (e.g. meta, nav, errors, experience.labels), with LIBRETRANSLATE_URL defaulting to a local Docker-friendly host unless you override it.

So: every locale gets LLM quality on narrative fields; LibreTranslate handles the informational keys where machine translation is accurate enough. That is a cost and quality trade-off baked into code, not a comment in a README.

// scripts/translate.mjs
function shouldUseGroqForKey(localeCode, key) {
  return (
    matchesDotPattern(key, "experience.items.*.role") ||
    CONFIG.groqKeyPatterns.some((pat) => matchesDotPattern(key, pat))
  );
}
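
The excerpt relies on matchesDotPattern and CONFIG, which are not shown. A self-contained sketch of the routing might look like the following; the matcher semantics (`*` matches exactly one dot-separated segment), the pickEngine name, and the rule that unmatched keys default to LibreTranslate are assumptions. The patterns are the examples named above, not the full list in the script.

```typescript
// Assumed semantics: "*" matches exactly one dot-separated segment.
function matchesDotPattern(key: string, pattern: string): boolean {
  const keyParts = key.split(".");
  const patternParts = pattern.split(".");
  if (keyParts.length !== patternParts.length) return false;
  return patternParts.every((part, i) => part === "*" || part === keyParts[i]);
}

// Example patterns taken from the post; the real CONFIG may contain more.
const CONFIG = {
  groqKeyPatterns: [
    "about.sections.*.body",
    "home.hero.subheadline",
    "projects.items.*.problem",
    "experience.items.*.impact",
  ],
};

type Engine = "groq" | "libretranslate";

// Hypothetical router: narrative keys and job titles go to the LLM, the rest to MT.
function pickEngine(key: string): Engine {
  const isJobTitle = matchesDotPattern(key, "experience.items.*.role");
  const isNarrative = CONFIG.groqKeyPatterns.some((pattern) => matchesDotPattern(key, pattern));
  return isJobTitle || isNarrative ? "groq" : "libretranslate";
}
```

With a one-segment wildcard, a deeper key such as about.sections.a.b.body would not match about.sections.*.body, which keeps the patterns predictable as the dictionary grows.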

Env and operations you actually touch

The two Groq keys are intentionally separate. GROQ_API_KEY lives in production and serves the portfolio chat at runtime; translation batch runs that spike token usage would eat into the same rate limit and affect real users. TRANSLATE_GROQ_API_KEY is a different key used only in local batch runs, with its own quota. The fallback to GROQ_API_KEY exists for convenience when no dedicated key is configured, not as the intended setup.

TRANSLATE_MODEL picks the model for batch translation (default meta-llama/llama-4-scout-17b-16e-instruct in the script header), independent of GROQ_MODEL used by the chat. The default is deliberately a small, fast model to keep per-run spend low. There are knobs for batch size, 429 retries, and logging; read the top-of-file comment block in scripts/translate.mjs before you tune it.
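
The key fallback and model default described above can be sketched as a pure function over an environment object. The function name, the return shape, and the error message are hypothetical; only the variable names and the default model string come from the post.

```typescript
type Env = Record<string, string | undefined>;

const DEFAULT_TRANSLATE_MODEL = "meta-llama/llama-4-scout-17b-16e-instruct";

interface TranslateConfig {
  apiKey: string;
  usesDedicatedKey: boolean;
  model: string;
}

// Prefer the dedicated batch key; fall back to the chat key only when it is missing.
function resolveTranslateConfig(env: Env): TranslateConfig {
  const model = env["TRANSLATE_MODEL"] || DEFAULT_TRANSLATE_MODEL;
  const dedicated = env["TRANSLATE_GROQ_API_KEY"];
  if (dedicated) return { apiKey: dedicated, usesDedicatedKey: true, model };
  const shared = env["GROQ_API_KEY"];
  if (shared) return { apiKey: shared, usesDedicatedKey: false, model };
  throw new Error("Set TRANSLATE_GROQ_API_KEY (or GROQ_API_KEY as a fallback).");
}
```

Returning usesDedicatedKey makes it easy to log a warning when a batch run is about to share quota with the production chat.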

API usage for Groq (and hosting LibreTranslate if you run it locally) is real money. The win here is control and automation, not magic. The lockfile is what makes it manageable: sections that did not change in English do not go to any API on the next run.

Pipeline B: Blog posts, English MDX, then Groq-only batch

Blog posts are not translated with the same hybrid router. pnpm translate:blog runs apps/web/scripts/translate-blog.mjs, which reads apps/web/content/blog/en/*.mdx and, for each target locale in TARGET_LOCALES (pt, es, de, pl, nl, et), calls Groq with a strict prompt: preserve YAML frontmatter structure, keep date and hero identical to the English source, translate title, description, tags, and body, do not mangle URLs. Output files land under apps/web/content/blog/<locale>/.
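
A prompt can ask the model to keep date and hero identical, but a cheap check after the fact is what catches the cases where it does not. The sketch below is an assumption about how such a check could look, not something the post says translate-blog.mjs does; the reader handles only flat `key: value` frontmatter lines, not full YAML.

```typescript
// Minimal frontmatter reader: flat `key: value` lines between the first pair of --- markers.
function readFrontmatter(mdx: string): Record<string, string> {
  const fields: Record<string, string> = {};
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(mdx);
  if (!match) return fields;
  for (const line of match[1].split(/\r?\n/)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    fields[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return fields;
}

// Fields that must be byte-identical to the English source.
const LOCKED_FIELDS = ["date", "hero"];

// Returns the names of locked fields that differ between source and translation.
function lockedFieldMismatches(englishMdx: string, translatedMdx: string): string[] {
  const english = readFrontmatter(englishMdx);
  const translated = readFrontmatter(translatedMdx);
  return LOCKED_FIELDS.filter((field) => english[field] !== translated[field]);
}
```

An empty result means the locked fields survived translation; anything else is a reason to reject the file and retry rather than commit it.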

There is no LibreTranslate path in that script today; blog translation is LLM batch per file, with --locale, --file, --force, and --delay to reduce rate-limit pain. At read time, Next just loads the MDX for the route’s locale; no Groq call happens when a visitor opens a blog page.

Runtime: routes, dictionaries, and blog paths

The Next.js app uses the [locale] segment under apps/web/app/[locale]/. Server components call normalizeLocale on the param so unknown values fall back to en, then getDictionary(locale) loads the matching JSON module from packages/i18n. That is why every page template looks the same structurally: they all read copy from the dictionary, not hardcoded English strings in JSX (with rare exceptions the linter flags).
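
The fallback-then-load flow can be sketched as follows. The locale list is abbreviated and hardcoded here only to keep the block self-contained (in the repo it derives from LOCALE_CONFIG), the inline dictionaries stand in for the JSON imports, and the function bodies are assumptions rather than the actual code in packages/i18n.

```typescript
// Abbreviated, hardcoded list for the sketch.
const SUPPORTED = ["en", "pt", "et"] as const;
type Locale = (typeof SUPPORTED)[number];

// Unknown or missing route params fall back to English.
function normalizeLocale(param: string | undefined): Locale {
  const candidate = param ?? "";
  return (SUPPORTED as readonly string[]).indexOf(candidate) !== -1
    ? (candidate as Locale)
    : "en";
}

type Dictionary = Record<string, unknown>;

// Record<Locale, ...> makes a missing importer a compile-time error.
// Each entry would be a dynamic JSON import in a real package.
const importers: Record<Locale, () => Promise<Dictionary>> = {
  en: async () => ({ nav: { home: "Home" } }),
  pt: async () => ({ nav: { home: "Início" } }),
  et: async () => ({ nav: { home: "Avaleht" } }),
};

// Load the dictionary for an already-normalized locale.
function getDictionary(locale: Locale): Promise<Dictionary> {
  return importers[locale]();
}
```

A server component would then do something like `const dict = await getDictionary(normalizeLocale(params.locale))` and never branch on the raw param again.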

Blog routes combine that locale with the slug: published posts are read from content/blog/<locale>/<slug>.mdx when the slug is enabled in publish.json. So “supporting seven locales” is not only translation scripts; it is also static routes + static content files that must exist for each language you care about.
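
To make the “files that must exist” point concrete, here is a small sketch that lists the MDX paths a set of locales and published slugs implies. The shape of publish.json is not shown in this post, so the slug-to-boolean map below is an assumption, as are the function names.

```typescript
// Assumed shape of publish.json: slug -> enabled flag.
type PublishMap = Record<string, boolean>;

// Path of a post for one locale, or null when the slug is not published.
function blogPostPath(locale: string, slug: string, publish: PublishMap): string | null {
  if (publish[slug] !== true) return null;
  return `content/blog/${locale}/${slug}.mdx`;
}

// Every file that has to exist for the given locales and published slugs.
function expectedBlogFiles(locales: string[], publish: PublishMap): string[] {
  const files: string[] = [];
  for (const slug of Object.keys(publish)) {
    for (const locale of locales) {
      const path = blogPostPath(locale, slug, publish);
      if (path !== null) files.push(path);
    }
  }
  return files;
}
```

Comparing this list against the files actually on disk is one way to catch a post that was published before its translations were generated.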

The promise on quality is narrower than literary polish: consistent pipelines, reviewable output in git, and honest prioritization. Narrative fields and job titles go through Groq for every locale; everything else goes through MT. That is not a failure mode; it is a stated scope.

If you are building the same thing

Start from who you want to read you and which hiring markets matter to you, not from “how many flags.” The list of locales is a product decision; treat it like one.

Start small. One extra locale (probably the one where you have professional connections or are actively job searching) is enough to validate the pipeline. Add a LOCALE_CONFIG entry, add the JSON file, run pnpm translate. If the output quality is acceptable for that locale, you have a repeatable path. You do not need six non-English locales to prove the architecture works.

Decide your hybrid policy early. LibreTranslate is fast and has no per-token cost if you run it locally; Groq adds quality on narrative fields but costs tokens and adds latency. If you are only shipping one or two non-English locales, Groq-only for the whole dictionary is simpler than building the routing logic in shouldUseGroqForKey. The hybrid complexity in translate.mjs only pays off when you have enough keys and enough locales that MT quality variation actually matters across them.

Keep the English source canonical and the output diff-reviewable. The lockfile in scripts/translate.lock.json is not optional ceremony; it is what lets you run pnpm translate without re-spending tokens on sections that did not change. Any pipeline that retranslates the entire dictionary on every run will hit cost and rate-limit problems before your portfolio has a second reader. Commit the generated locale files so you can review every translation change in a normal pull request.

Then implement one English source, generated satellite files, and scripts you can run in CI or locally, so your portfolio stays a product you can evolve, not a pile of hand-edited copies.

The takeaway

The implementation is not the hard part. LOCALE_CONFIG, two batch scripts, a lockfile: any decent engineer can build that in a weekend. The harder part is deciding upfront that language is a product constraint, not a feature you add when the rest is “done.”

If you treat i18n as decoration, you will ship it exactly once and never touch it again. If you treat it as scope, with automation, reviewable diffs, and an honest list of what it does not cover, it stays maintainable, and you stay in control of what your portfolio communicates to readers you might never meet in English.

That is the actual bet here: not that every sentence sounds like a human translator spent an hour on it, but that the site works for someone in Warsaw or Tallinn who prefers not to read in English when it matters. Whether that bet pays off is TBD.