Building a Custom Dictionary for Accurate Transcription

Speech recognition models are trained on massive datasets of general-purpose audio and text. They handle everyday language remarkably well. But every profession, every industry, and every individual has vocabulary that falls outside the "everyday" bucket. Medical terminology, legal Latin, brand names, project codenames, technical acronyms, and people's names all present challenges for even the best ASR models.

SuperSpeech's custom dictionary solves this problem at the post-processing level. After the model produces its best transcription, the dictionary scans the output and corrects known misrecognitions. The result is clean, accurate text that reflects your domain's vocabulary -- without requiring you to retrain the underlying model.

This guide walks you through building an effective custom dictionary from scratch, with practical examples and best practices for different professions.

How the Custom Dictionary Works

SuperSpeech's transcription pipeline has three stages:

Speech recognition: The Parakeet-TDT model converts audio to raw text
Dictionary correction: The custom dictionary scans the raw text and replaces known misrecognitions
Grammar correction (optional): A local LLM applies light grammar fixes

The dictionary operates on the text output, not on the audio. This is an important distinction: the dictionary does not change how the model listens to your speech. Instead, it corrects the text after the model has done its best interpretation. This approach is simple, predictable, and easy to configure.

The Correction Process

When SuperSpeech processes a transcription, it takes the raw text output and checks it against every enabled dictionary entry. For each entry, it looks for any of the defined variants in the text. When it finds a match, it replaces that variant with the specified output word. The process runs in milliseconds and adds negligible time to the transcription pipeline.
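
The loop described above can be sketched in a few lines of Python. This is an illustration only, not SuperSpeech's actual code: the function name apply_dictionary is hypothetical, and the sketch assumes variants are matched as literal substrings and that entries are applied in file order.

```python
import re

def apply_dictionary(text, entries):
    """Replace every variant of every enabled entry with its output."""
    for entry in entries:
        if not entry.get("enabled", True):
            continue  # disabled entries are skipped entirely
        flags = 0 if entry.get("caseSensitive", False) else re.IGNORECASE
        for variant in entry["variants"]:
            # re.escape treats the variant as literal text, not a regex.
            # The lambda keeps the output literal as well.
            text = re.sub(
                re.escape(variant),
                lambda _m, out=entry["output"]: out,
                text,
                flags=flags,
            )
    return text

entries = [
    {"output": "SuperSpeech", "variants": ["super speech", "super speach"]},
    {"output": "API", "variants": ["a p i"], "enabled": True},
]
print(apply_dictionary("Super speech exposes an a p i", entries))
# SuperSpeech exposes an API
```

Because each pass is a simple text replacement, the cost grows with the number of variants, which is consistent with the negligible overhead described above for dictionaries of typical size.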

Dictionary File Format

The custom dictionary uses a JSON array of entries. Each entry has the following fields:

[
  {
    "output": "SuperSpeech",
    "variants": ["super speech", "super speach", "souper speech"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "API",
    "variants": ["a p i", "a pi", "ay p i"],
    "caseSensitive": false,
    "enabled": true
  }
]

Fields explained:

output (string): The correct text you want to appear in your transcription. This is the replacement value.
variants (array of strings): A list of ways the model might mishear or misrender your target word. Each variant is a pattern that will be matched and replaced with the output.
caseSensitive (boolean): When false, the matching ignores case. When true, the variant must match the exact casing in the transcription. Default: false.
enabled (boolean): Allows you to temporarily disable an entry without deleting it. Useful for troubleshooting or seasonal terms. Default: true.
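
If you generate or edit the file with a script, a small loader that checks the field types and fills in the documented defaults can catch mistakes early. The sketch below is an assumption about reasonable validation, not a description of how SuperSpeech itself parses the file; normalize_entries is a hypothetical helper name.

```python
import json

# Defaults as documented in the field list above.
DEFAULTS = {"caseSensitive": False, "enabled": True}

def normalize_entries(raw_json):
    """Parse dictionary JSON, check field types, and fill in defaults."""
    entries = json.loads(raw_json)
    if not isinstance(entries, list):
        raise ValueError("top level must be a JSON array")
    normalized = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index}: must be a JSON object")
        if not isinstance(entry.get("output"), str):
            raise ValueError(f"entry {index}: 'output' must be a string")
        variants = entry.get("variants")
        if not isinstance(variants, list) or not all(
            isinstance(v, str) for v in variants
        ):
            raise ValueError(f"entry {index}: 'variants' must be an array of strings")
        # Values present in the entry override the defaults.
        normalized.append({**DEFAULTS, **entry})
    return normalized

print(normalize_entries('[{"output": "API", "variants": ["a p i"]}]'))
```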

File Location

The dictionary file is stored at:

macOS: ~/Library/Application Support/SuperSpeech/custom_dictionary.json
Windows: %LOCALAPPDATA%/SuperSpeech/custom_dictionary.json

You can edit the file directly with any text editor, or use SuperSpeech's built-in Dictionary editor in the Settings panel.
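
If you script your edits, you can resolve the file location per platform instead of hard-coding it. A minimal sketch, assuming the two locations listed above are the only ones; dictionary_path is a hypothetical helper, and other platforms are rejected because this guide does not document a location for them.

```python
import os
import sys
from pathlib import Path

def dictionary_path(platform=None, env=None):
    """Return the documented dictionary location for macOS or Windows."""
    platform = platform or sys.platform
    env = os.environ if env is None else env
    if platform == "darwin":
        base = Path.home() / "Library" / "Application Support"
    elif platform.startswith("win"):
        base = Path(env["LOCALAPPDATA"])
    else:
        raise OSError(f"no documented dictionary location for {platform}")
    return base / "SuperSpeech" / "custom_dictionary.json"

print(dictionary_path("darwin"))
```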

Step-by-Step Setup

Step 1: Identify Your Problem Words

Before building the dictionary, spend a day or two using SuperSpeech without it. Pay attention to which words consistently get misrecognized. These fall into predictable categories:

Acronyms and abbreviations: "API," "SaaS," "HIPAA," "GDPR"
Brand and product names: "SuperSpeech," "PostgreSQL," "Kubernetes"
People's names: "Müller," "Nakamura," "O'Brien"
Technical jargon: "microservice," "tokenizer," "refactored"
Foreign words used in English: "schadenfreude," "zeitgeist," "vis-a-vis"

Write down each problem word and how the model transcribes it. The model's misrecognition is what you will enter as a variant.

Step 2: Create Your First Entries

Start small. Begin with 10-20 entries covering your most frequently used problem words. Here is an example starter dictionary for a software developer:

[
  {
    "output": "API",
    "variants": ["a p i", "a pi", "ay p i"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "PostgreSQL",
    "variants": ["postgres q l", "post gres q l", "postgres sequel", "post gress"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "Kubernetes",
    "variants": ["kubernetes", "cooper net ease", "kuber net ease"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "GitHub",
    "variants": ["git hub", "get hub"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "refactoring",
    "variants": ["re factoring", "reef factoring"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "microservices",
    "variants": ["micro services", "micro service is"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "DevOps",
    "variants": ["dev ops", "dev op's", "devops"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "CI/CD",
    "variants": ["c i c d", "c i / c d", "c.i. c.d."],
    "caseSensitive": false,
    "enabled": true
  }
]

Step 3: Test and Iterate

After creating your initial dictionary, test each entry by dictating sentences that include the target words. Speak naturally -- do not over-enunciate or slow down. The goal is to capture how you actually say these words in your normal workflow.

For each entry, verify:

The correction triggers when you say the word naturally
The correction does not trigger on unrelated words (false positives)
All common misrecognitions are covered by variants

If you find the model produces a misrecognition you did not anticipate, add it as a new variant to the existing entry.
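
Once you have collected a few raw transcriptions, you can replay them against your entries without dictating again. The sketch below is a rough offline check, assuming literal substring matching in file order; correct and run_checks are hypothetical helpers, and the sample sentences are invented for illustration.

```python
import re

def correct(text, entries):
    """Simplified stand-in for the dictionary stage (literal substring matching)."""
    for e in entries:
        if e.get("enabled", True):
            flags = 0 if e.get("caseSensitive", False) else re.IGNORECASE
            for v in e["variants"]:
                text = re.sub(re.escape(v), lambda _m, out=e["output"]: out,
                              text, flags=flags)
    return text

def run_checks(entries, cases):
    """Return (raw, expected, actual) for every case that fails."""
    failures = []
    for raw, expected in cases:
        actual = correct(raw, entries)
        if actual != expected:
            failures.append((raw, expected, actual))
    return failures

entries = [{"output": "GitHub", "variants": ["git hub", "get hub"]}]
cases = [
    ("push it to git hub", "push it to GitHub"),              # should trigger
    ("get hubcaps for the car", "get hubcaps for the car"),   # should not trigger
]
for raw, expected, actual in run_checks(entries, cases):
    print(f"FAIL: {raw!r} -> {actual!r} (expected {expected!r})")
# FAIL: 'get hubcaps for the car' -> 'GitHubcaps for the car' (expected 'get hubcaps for the car')
```

The second case fails because the variant "get hub" also occurs at the start of "get hubcaps", which is exactly the kind of false positive the checklist above asks you to look for.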

Step 4: Expand Gradually

Resist the urge to build a 200-entry dictionary on day one. Add entries as you encounter new misrecognitions in your actual workflow. A dictionary that grows organically from real usage will be more accurate and less prone to false positives than one built speculatively.

Best Practices

Use Phonetic Variants

The most effective variants are phonetic approximations of how the model mishears your word. The model produces text based on what it thinks it heard, so your variants should match that phonetic interpretation.

Good variants (phonetically motivated):

{
  "output": "N8N",
  "variants": ["n acht n", "n 8 n", "n eight n", "n-8-n"]
}

Less effective variants (spelling errors the model would not produce):

{
  "output": "N8N",
  "variants": ["N8n", "n8N", "N-8-N"]
}

The model outputs what it hears phonetically, not random misspellings. Focus your variants on phonetic interpretations.

Beware of Short Variants

Short variants (one or two characters) are dangerous because they can match unintended parts of other words. For example, if you create an entry with the variant "ai" to correct to "AI," it would incorrectly match inside words like "said," "main," or "detail."

Solutions for short terms:

Use space-delimited variants: "a i" instead of "ai"
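Add word boundary context: " ai " (with spaces) instead of "ai". Pad the output the same way (" AI ") so the surrounding spaces are not swallowed by the replacement, and keep in mind that this form will not match at the start or end of the text or next to punctuation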
Use longer variants that include surrounding context
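
The effect is easy to reproduce with plain string replacement. The snippet below assumes the substring matching described in this section; the \b word-boundary pattern is Python regex syntax used only to preview whole-word hits, not a feature of the dictionary format.

```python
import re

text = "I said the main ai detail"

# Substring replacement: the two letters also match inside longer words.
naive = text.replace("ai", "AI")

# Space-padded variant: safe from substrings, but the output needs the
# same padding or the neighboring words get glued together.
glued = text.replace(" ai ", "AI")
padded = text.replace(" ai ", " AI ")

# \b is Python regex syntax, used here only to preview whole-word hits.
whole_word = re.sub(r"\bai\b", "AI", text)

print(naive)       # I sAId the mAIn AI detAIl
print(glued)       # I said the mainAIdetail
print(padded)      # I said the main AI detail
print(whole_word)  # I said the main AI detail
```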

Handle Plural and Verb Forms Separately

If you need both "API" and "APIs," create a separate entry for each form:

[
  {
    "output": "API",
    "variants": ["a p i", "a pi"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "APIs",
    "variants": ["a p i s", "a p eyes", "a pis"],
    "caseSensitive": false,
    "enabled": true
  }
]
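
One thing to watch with entries like these: the variant "a p i" is a prefix of "a p i s". Whether SuperSpeech reorders entries internally is not stated in this guide, so the sketch below simply assumes replacements run in file order and shows why the longer variant should then be tried first. replace_all is a hypothetical helper.

```python
def replace_all(text, pairs):
    """Apply (variant, output) pairs in order as literal replacements."""
    for variant, output in pairs:
        text = text.replace(variant, output)
    return text

pairs = [("a p i", "API"), ("a p i s", "APIs")]
raw = "both a p i s return json"

# File order: the singular entry fires first and strands the trailing "s".
file_order = replace_all(raw, pairs)

# Longest variant first: the plural is consumed before the singular can match.
longest_first = replace_all(
    raw, sorted(pairs, key=lambda p: len(p[0]), reverse=True)
)

print(file_order)     # both API s return json
print(longest_first)  # both APIs return json
```

If your own tests show the stranded "s", moving the plural entry above the singular one in the file is the first thing to try.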

Test for False Positives

After adding a new entry, dictate several sentences that do not contain the target word but include similar sounds. If the correction triggers incorrectly, your variants are too broad. Narrow them by making the variant strings more specific.

Example of a false positive risk:

{
  "output": "REACT",
  "variants": ["react"]
}

This would rewrite the common English verb "react" as "REACT" everywhere it appears. A better approach is to match only phrases that signal the framework:

{
  "output": "React",
  "variants": ["react js", "react j s", "react framework"]
}
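
You can screen for this kind of problem before dictating anything by running control sentences, ones that should pass through unchanged, against your variants. A minimal sketch, assuming literal substring matching; firing_variants is a hypothetical helper and the control sentence is invented.

```python
import re

def firing_variants(sentence, entries):
    """List (output, variant) pairs whose variant occurs in the sentence."""
    hits = []
    for e in entries:
        flags = 0 if e.get("caseSensitive", False) else re.IGNORECASE
        for v in e["variants"]:
            if re.search(re.escape(v), sentence, flags):
                hits.append((e["output"], v))
    return hits

broad = [{"output": "REACT", "variants": ["react"]}]
narrow = [{"output": "React",
           "variants": ["react js", "react j s", "react framework"]}]
control = "users react badly to slow pages"

print(firing_variants(control, broad))   # [('REACT', 'react')]
print(firing_variants(control, narrow))  # []
```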

Use the Enabled Flag for Seasonal Terms

If you work on projects with codenames, client names, or terms that are only relevant for a period of time, use the enabled flag instead of deleting entries:

{
  "output": "Project Falcon",
  "variants": ["project falcon", "project falken"],
  "caseSensitive": false,
  "enabled": false
}

When the project becomes relevant again, flip enabled back to true. This saves you from recreating entries you have already tested and refined.
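
If you toggle entries often, a short script saves opening the file by hand. This is a sketch under the assumption that entries are identified by their output value; set_enabled is a hypothetical helper, and writing the result back to the dictionary file is left out.

```python
import json

def set_enabled(entries, output, enabled):
    """Return a copy of entries with the matching entry's flag changed."""
    return [
        {**e, "enabled": enabled} if e["output"] == output else e
        for e in entries
    ]

entries = json.loads(
    '[{"output": "Project Falcon",'
    ' "variants": ["project falcon", "project falken"],'
    ' "caseSensitive": false, "enabled": false}]'
)
updated = set_enabled(entries, "Project Falcon", True)
print(json.dumps(updated, indent=2))
```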

Examples by Profession

Medical Practice

[
  {
    "output": "metformin",
    "variants": ["met foreman", "met form in", "met for men"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "HbA1c",
    "variants": ["h b a one c", "h b a 1 c", "hemoglobin a one c"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "echocardiogram",
    "variants": ["echo cardiogram", "echo cardio gram"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "bilateral",
    "variants": ["by lateral", "bi lateral"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "mg/dL",
    "variants": ["milligrams per deciliter", "mg per d l"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "prn",
    "variants": ["p r n", "as needed p r n"],
    "caseSensitive": false,
    "enabled": true
  }
]

Legal Practice

[
  {
    "output": "habeas corpus",
    "variants": ["hay bees corpus", "habeas corpus"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "voir dire",
    "variants": ["vwa deer", "war dire", "voy deer"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "plaintiff",
    "variants": ["plane tiff", "plain tif"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "Fed. R. Civ. P.",
    "variants": ["federal rule of civil procedure", "federal rules of civil procedure", "f r c p"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "LLC",
    "variants": ["l l c", "l.l.c."],
    "caseSensitive": false,
    "enabled": true
  }
]

Finance and Accounting

[
  {
    "output": "EBITDA",
    "variants": ["e bit da", "e b i t d a", "ebitda"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "YoY",
    "variants": ["y o y", "year over year y o y"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "GAAP",
    "variants": ["g a a p", "gap accounting", "gaap"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "P&L",
    "variants": ["p and l", "p & l", "p n l"],
    "caseSensitive": false,
    "enabled": true
  },
  {
    "output": "ROI",
    "variants": ["r o i", "r.o.i.", "are oh eye"],
    "caseSensitive": false,
    "enabled": true
  }
]

Advanced Configuration

Case Sensitivity

By default, matching is case-insensitive, which works for most use cases. Enable case sensitivity when you need to distinguish between a common word and a proper noun:

{
  "output": "Swift",
  "variants": ["swift programming", "swift language"],
  "caseSensitive": true,
  "enabled": true
}

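The context words in the variants keep the entry from touching the common adjective "swift" in general text, and with caseSensitive set to true the entry matches only text cased exactly as the variants are written. Because the whole variant is replaced, "swift programming" becomes "Swift"; put the context word in the output (for example "Swift programming") if you want to keep it.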

Combining with Grammar Correction

If you have SuperSpeech's optional grammar correction enabled, the processing order matters: the custom dictionary runs first, then grammar correction. This means:

The model transcribes: "the patient was prescribed met foreman 500 milligrams"
The dictionary corrects: "the patient was prescribed metformin 500 milligrams"
Grammar correction cleans up: "The patient was prescribed metformin 500 milligrams."

This ordering ensures that the grammar model sees correctly-spelled domain terms, which helps it make better grammatical decisions.
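
The ordering can be pictured as a list of text-to-text stages applied in sequence. Both stage functions below are toy stand-ins written for this illustration (one hard-coded correction, and a capitalize-and-punctuate step in place of the grammar model); they are not SuperSpeech's implementation.

```python
def dictionary_stage(text):
    # Stand-in for the custom dictionary: one hard-coded correction.
    return text.replace("met foreman", "metformin")

def grammar_stage(text):
    # Stand-in for the optional grammar model: capitalize and add a period.
    if not text:
        return text
    text = text[0].upper() + text[1:]
    return text if text.endswith(".") else text + "."

def run_pipeline(raw_text, stages):
    """Feed the output of each stage into the next one."""
    for stage in stages:
        raw_text = stage(raw_text)
    return raw_text

raw = "the patient was prescribed met foreman 500 milligrams"
print(run_pipeline(raw, [dictionary_stage, grammar_stage]))
# The patient was prescribed metformin 500 milligrams.
```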

Sharing Dictionaries Across a Team

The dictionary file is a simple JSON file that can be shared, version-controlled, and distributed. Practical approaches for teams:

Shared network drive: Place a master dictionary on a shared drive and have team members copy it to their local SuperSpeech directory.
Git repository: Track the dictionary in version control alongside your other team resources. This gives you change history, code review, and easy distribution.
Template approach: Maintain a base dictionary for your organization and let individuals add their own entries. Periodically merge individual additions back into the base.
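
For the template approach, the merge step can be automated. The sketch below is one possible policy, not a SuperSpeech feature: entries with the same output are combined, variants are de-duplicated in order, and personal settings override the base. merge_dictionaries is a hypothetical helper.

```python
def merge_dictionaries(base, personal):
    """Merge two entry lists; entries with the same output are combined."""
    merged = {}
    for entry in base + personal:
        key = entry["output"]
        if key not in merged:
            # Copy the entry so the input lists are never modified.
            merged[key] = {**entry, "variants": list(entry["variants"])}
            continue
        target = merged[key]
        for variant in entry["variants"]:
            if variant not in target["variants"]:
                target["variants"].append(variant)
        # Later (personal) settings override the base ones.
        for field in ("caseSensitive", "enabled"):
            if field in entry:
                target[field] = entry[field]
    return list(merged.values())

base = [{"output": "API", "variants": ["a p i"]}]
personal = [
    {"output": "API", "variants": ["a p i", "ay p i"]},
    {"output": "GitHub", "variants": ["git hub"]},
]
print(merge_dictionaries(base, personal))
```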

Maintaining Your Dictionary

Schedule a brief review of your dictionary every few months:

Remove stale entries: Disable or delete entries for projects, clients, or terms you no longer use
Add new entries: Review recent transcriptions for recurring misrecognitions you have not addressed
Refine variants: If an entry is not triggering consistently, add new variants based on the actual model output
Check for conflicts: Look for entries with overlapping variants that might cause unexpected replacements
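
The conflict check in particular is tedious to do by eye. A rough sketch of an automated version follows, assuming that a conflict means a variant in one enabled entry equals or contains a variant from a different entry; find_conflicts is a hypothetical helper, and it compares lowercase text, so it ignores the caseSensitive flag.

```python
from itertools import combinations

def find_conflicts(entries):
    """Find variants that equal or contain a variant from another entry."""
    tagged = [
        (e["output"], v.lower())
        for e in entries
        if e.get("enabled", True)
        for v in e["variants"]
    ]
    conflicts = []
    for (out_a, var_a), (out_b, var_b) in combinations(tagged, 2):
        if out_a != out_b and (var_a in var_b or var_b in var_a):
            conflicts.append((out_a, var_a, out_b, var_b))
    return conflicts

entries = [
    {"output": "API", "variants": ["a p i"]},
    {"output": "APIs", "variants": ["a p i s"]},
    {"output": "GitHub", "variants": ["git hub"]},
]
print(find_conflicts(entries))
# [('API', 'a p i', 'APIs', 'a p i s')]
```

A reported pair is not necessarily a bug, since singular and plural entries overlap by design, but each one deserves a quick test.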

Troubleshooting

Entry Is Not Triggering

If a dictionary entry does not correct a word you expected it to catch:

Dictate the word and check the raw output (before dictionary correction) to see exactly how the model transcribes it
Add the model's exact output as a new variant
Check that the entry is enabled
Verify the JSON syntax is valid (a missing comma or bracket will prevent the file from loading)

False Positives

If a dictionary entry is replacing words it should not:

Make the variants more specific by adding context words
Use case sensitivity to limit matching
Consider splitting one broad entry into multiple narrow entries

Dictionary File Will Not Load

If SuperSpeech reports an error loading the dictionary, validate your JSON syntax with an online JSON validator. Common issues include trailing commas after the last array entry, curly/smart quotes instead of straight double quotes, and non-UTF-8 file encoding.
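
If you would rather not paste your dictionary into an online tool, the same checks can be run locally. A minimal sketch using Python's standard json module; diagnose is a hypothetical helper. The curly-quote check is deliberately blunt and will also flag a curly apostrophe inside a legitimate value, so treat that message as a hint to inspect the line.

```python
import json

# Left/right double and single curly quotes.
SMART_QUOTES = "\u201c\u201d\u2018\u2019"

def diagnose(raw_bytes):
    """Return a list of human-readable problems found in the file contents."""
    try:
        text = raw_bytes.decode("utf-8")
    except UnicodeDecodeError as err:
        return [f"not valid UTF-8 at byte {err.start}"]
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(ch in SMART_QUOTES for ch in line):
            problems.append(f"line {number}: curly quote found")
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        problems.append(f"line {err.lineno}, column {err.colno}: {err.msg}")
    return problems

# A trailing comma after the last entry is reported with its position.
for problem in diagnose(b'[{"output": "API", "variants": ["a p i"]},]'):
    print(problem)
```

To check your real file, read it in binary mode and pass the bytes to diagnose.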

The Bottom Line

Building a custom dictionary is the single most effective step you can take to improve SuperSpeech's accuracy for your specific use case. A solid initial dictionary takes about 30 minutes to set up, and each new entry takes a few seconds to add. The investment pays off immediately in cleaner transcriptions and less time spent editing.

Start with your 10-20 most common problem words. Test each entry. Add new entries as they arise. Within a week, your dictionary will cover the vocabulary that matters most to your work, and SuperSpeech's output will read as though the model was trained specifically for your domain.

Ready to get started? Download SuperSpeech, try the free online demo, or explore our pricing to find a plan that fits your workflow. Every plan includes full custom dictionary support.