Quality Score

How we make sure your translation is actually good

You're not a translation expert — and you shouldn't have to be. 43 AI agents work like a full translation team, checking meaning, terminology, fluency, and technical structure, then giving you an objective score.

What gets checked

  • You got a translation back. How do you know if it's actually good? Now you can check with a number.
  • A high score isn't the whole story. The report shows exactly what's strong and what still needs work.
  • 43 specialized AI agents check meaning, terminology, audience fit, technical structure, fluency, locale conventions, and tone — the same things a professional translation team would review.
  • Whether you're shipping a game, listing products, or localizing a SaaS app, the quality bar is the same.
  • The scoring is built on MQM, the international translation quality standard — translated into a score and grade anyone can understand.

How the quality score works

The score isn't about whether the translation "sounds okay." It's about finding actual errors, classifying how serious they are, and turning that into one clear number. Under the hood, this uses MQM (Multidimensional Quality Metrics) — the same framework used by professional translation companies and international standards bodies.

Quality Score = 100 − (Weighted Error Penalties / Word Count × 1000)

Each error gets a penalty based on severity: critical errors cost 25 points, major errors cost 5 points, minor errors cost 1 point. The framework follows ISO standards, so scores are consistent across any language pair.
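The formula above reduces to a few lines of code. This is a minimal sketch of the calculation, not the production implementation; the error list, function name, and input shape are assumptions for illustration.

```python
def quality_score(errors, word_count):
    """MQM-style score: 100 minus weighted error penalties per 1,000 words.

    `errors` is a list of severity labels; weights match the penalties
    described above (critical 25, major 5, minor 1).
    """
    weights = {"critical": 25, "major": 5, "minor": 1}
    penalty = sum(weights[severity] for severity in errors)
    return 100 - (penalty / word_count * 1000)

# 1 major + 2 minor errors in a 1,000-word document:
# penalty = 5 + 1 + 1 = 7, so the score is 100 - 7 = 93.0
print(quality_score(["major", "minor", "minor"], 1000))
```

Because penalties are normalized per 1,000 words, one minor error hurts a 200-word string file far more than it hurts a 20,000-word manual, which is the intended behavior.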

Critical — This would make your customer think your product is broken

The meaning is wrong, a number is incorrect, or essential information is missing. Think: a payment amount displayed wrong in your game, or a medication dosage that changed. Must be fixed before release. (Penalty: 25 points)

Major — This sounds noticeably off to a native speaker

The meaning comes through, but a native speaker would definitely notice something is off — or the same term is translated two different ways in the same document. Like 'Settings' appearing as both '설정' and '환경설정' in your app. Shippable, but it hurts perceived quality. (Penalty: 5 points)

Minor — This is fine but could be more natural

Nothing is wrong per se, but the phrasing could be smoother. Slightly awkward word order, a comma in an odd place. Worth polishing if you have time. (Penalty: 1 point)

7 areas checked by 43 AI agents

Translation quality isn't just "does it sound natural." A professional translation team checks seven core areas — and that's exactly what our AI agents do automatically.

Meaning preservation

Checks that the original meaning came through accurately. Catches missing content, reversed meanings, and subtle nuance shifts.

Examples: mistranslation, omission, nuance distortion, false equivalence

Terminology consistency

Makes sure the same term is translated the same way throughout the document. Inconsistent terminology confuses users.

Examples: 'Settings' translated as both 'Einstellungen' and 'Konfiguration', product name variations

Audience fit

Checks that the tone and complexity match the actual reader. You wouldn't use legal jargon for gamers or slang for doctors.

Examples: expertise-level mismatch, cultural inappropriateness, wrong register for target audience

Technical structure

Verifies HTML tags, variables (like {name} placeholders), and formatting survived the translation intact. Broken structure means broken UI.

Examples: missing tags, lost placeholders, broken formatting
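A structure check like this is largely mechanical. Here is a minimal sketch of one such check, assuming curly-brace placeholders like {name}; the function name and regex are illustrative, not the actual agent's implementation.

```python
import re

def placeholders_intact(source, target):
    """Return True if every {name}-style placeholder in the source
    also appears in the translation (order may legitimately change)."""
    pattern = r"\{[^{}]+\}"
    return sorted(re.findall(pattern, source)) == sorted(re.findall(pattern, target))
```

A translation that renders "Hi {name}" as "Hi name" would fail this check: the placeholder was translated away, and the UI would show the literal word instead of the user's name.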

Fluency

Checks that the text reads naturally in the target language. Grammatically correct but awkward still counts as an error.

Examples: translationese, grammar errors, literal translation artifacts

Locale conventions

Verifies that dates, numbers, currencies, and units follow local conventions. The US writes $100; Korea writes 100달러 — details like this matter.

Examples: date format errors (MM/DD vs DD/MM), missing currency symbols, unconverted units

Style and tone

Checks that the brand voice and document purpose are reflected consistently. Mixing formal and casual in the same document undermines professionalism.

Examples: tone inconsistency, register mixing, brand voice drift

What the score means

Grade   Score range   What it means
A+      ≥ 98          Ship immediately
A       95–97         Good for publication
B+      90–94         Needs a second look
B       85–89         Needs partial review
C+      80–84         Needs careful review
C       75–79         Rework recommended
D       70–74         Partial retranslation needed
F       < 70          Should be retranslated
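The grade bands in the table are simple floors on the score. A sketch of the mapping, with the band cutoffs taken from the table above (the function name is illustrative):

```python
def grade(score):
    """Map a quality score to its letter grade, per the table above."""
    bands = [(98, "A+"), (95, "A"), (90, "B+"), (85, "B"),
             (80, "C+"), (75, "C"), (70, "D")]
    for cutoff, letter in bands:
        if score >= cutoff:
            return letter
    return "F"
```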

Ship it vs. Needs another look

Ship it

Target score met, zero critical errors, terminology consistent. Good to go.

Needs another look

Score below target, critical or major errors unresolved, or terminology inconsistencies remain. We recommend fixing the flagged issues and re-checking.
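The two verdicts reduce to a simple gate. A sketch of that decision, assuming the report exposes a score, a target, and counts of unresolved critical errors and terminology issues (names are illustrative):

```python
def verdict(score, target, critical_errors, terminology_issues):
    """Collapse a quality report into the two release states above."""
    if score >= target and critical_errors == 0 and terminology_issues == 0:
        return "Ship it"
    return "Needs another look"
```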

Why this matters for your content

"The translation looks fine to me" and "the translation is actually fine" are two different things. Without an objective quality standard, you're guessing — and guessing doesn't scale.

Game developers: verify localization quality yourself

Your Japanese and Chinese translations came back, but you can't read them. The score and error report tell you exactly where the problems are — no language skills required.

Ecommerce sellers: know your listings land

Product descriptions, review replies, CS templates — check translation quality with a number before you publish to international marketplaces.

SaaS teams: keep UI text quality consistent

Every string in your app, checked against the same quality bar. When terminology drifts, users get confused. This catches it.

Content teams: guarantee multilingual quality

Blog posts, help docs, marketing pages — verify that each language version reads naturally before it goes live.

Frequently Asked Questions