Redaction Languages

Detailed list of languages that are supported.

Cobalt features core support for 14 languages and extended support for 39 additional languages, with core languages featuring the highest level of performance. The complete list of supported languages below details which languages have core support, which have extended or beta support, and which are upcoming additions. New languages are continually being added; please contact us if you require a language that is not in the list below.

In addition to supporting 50+ languages, Cobalt supports regional language varieties, in recognition of the large differences in vocabulary and grammar that can exist within the same language when it is spoken in different regions. So far, this includes varieties of English (US, UK, Canada and Australia), Spanish (Spain and Mexico), French (France and Canada), and Portuguese (Portugal and Brazil), among others; the Core Support table lists the regional varieties covered for each core language. Cobalt also supports code-switching, the mixing of different languages within a single utterance. This means that, in a phrase such as "J’ai payé 76,88RM por ein Haarschnitt da 范玉菲 habang ko ay nasa Україна", multilingual PII is accurately de-identified. The selection of supported regional language varieties is continually being expanded; please let us know if you have a specific request.

Cobalt’s supported entity types work in every supported language: multilingual equivalents of PII (Personally Identifiable Information), PHI (Protected Health Information), and PCI (Payment Card Industry) entities are detected in each language. Our Supported Entity Types page provides a more detailed look at our coverage of language- and region-specific entity equivalents. Cobalt is also sensitive to cross-linguistic differences in how names are structured, how place names are referred to, and how monetary units are described, among other differences.

Core Support

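| Language | ISO Code | Supported Regional Varieties | Support Level | Text Support | Audio Support | File Support | Labels |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Dutch | nl | The Netherlands | Core | | | | |
| English | en | Australia, Canada, United Kingdom, United States | Core | | | | |
| French | fr | Canada (Quebec), France, Switzerland | Core | | | | |
| German | de | Germany, Belgium, Austria, Switzerland | Core | | | | |
| Hindi | hi | India | Core | | | | |
| Italian | it | Italy, Switzerland | Core | | | | |
| Japanese | ja | Japan | Core | | | | |
| Korean | ko | Korea | Core | | | | |
| Mandarin (simplified) | zh-Hans | China, Singapore | Core | | | | |
| Portuguese | pt | Brazil, Portugal | Core | | | | |
| Russian | ru | Russia | Core | | | | |
| Spanish | es | Mexico, Spain | Core | | | | |
| Tagalog | tl | Philippines | Core | | | | |
| Ukrainian | uk | Ukraine | Core | | | | |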

Extended Support

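| Language | ISO Code | Support Level | Text Support | Audio Support | File Support | Labels |
| --- | --- | --- | --- | --- | --- | --- |
| Afrikaans | af | Extended | | | | |
| Arabic | ar | Extended | | | | |
| Bambara | bm | Extended | | | | |
| Bengali | bn | Extended | | | | |
| Belarusian | be | Extended | | | | |
| Bulgarian | bg | Extended | | | | |
| Burmese | my | Extended | | | | |
| Cantonese (traditional) | zh-Hant | Extended | | | | |
| Catalan | ca | Extended | | | | |
| Croatian | hr | Extended | | | | |
| Czech | cs | Extended | | | | |
| Danish | da | Extended | | | | |
| Estonian | et | Extended | | | | |
| Finnish | fi | Extended | | | | |
| Georgian | ka | Extended | | | | |
| Greek | el | Extended | | | | |
| Hebrew | he | Extended | | | | |
| Hungarian | hu | Extended | | | | |
| Icelandic | is | Extended | | | | |
| Indonesian | id | Extended | | | | |
| Khmer | km | Extended | | | | |
| Latvian | lv | Extended | | | | |
| Lithuanian | lt | Extended | | | | |
| Luxembourgish | lb | Extended | | | | |
| Malay | ms | Extended | | | | |
| Moldovan | ro | Extended | | | | |
| Norwegian (Bokmål) | nb | Extended | | | | |
| Persian (Farsi) | fa | Extended | | | | |
| Polish | pl | Extended | | | | |
| Punjabi | pa | Extended | | | | |
| Romanian | ro | Extended | | | | |
| Slovak | sk | Extended | | | | |
| Slovenian | sl | Extended | | | | |
| Swahili | sw | Extended | | | | |
| Swedish | sv | Extended | | | | |
| Tamil | ta | Extended | | | | |
| Thai | th | Extended | | | | |
| Turkish | tr | Extended | | | | |
| Vietnamese | vi | Extended | | | | |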
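
The two tables on this page can be treated as a simple lookup from language code to support level. The sketch below is illustrative only and is not part of Cobalt's API: the names `CORE`, `EXTENDED`, and `support_level` are hypothetical, and the codes are copied from the tables above. It assumes that a tag with extra subtags (for example `en-US`) should fall back to its shorter prefix, which is a design choice of this sketch rather than documented Cobalt behaviour.

```python
# Hypothetical helper, not a Cobalt API. Codes are copied from the
# Core Support and Extended Support tables and stored in lowercase.
CORE = {
    "nl", "en", "fr", "de", "hi", "it", "ja", "ko",
    "zh-hans", "pt", "ru", "es", "tl", "uk",
}

# Moldovan and Romanian share the code "ro", so 39 table rows
# yield 38 distinct codes here.
EXTENDED = {
    "af", "ar", "bm", "bn", "be", "bg", "my", "zh-hant", "ca", "hr",
    "cs", "da", "et", "fi", "ka", "el", "he", "hu", "is", "id",
    "km", "lv", "lt", "lb", "ms", "ro", "nb", "fa", "pl", "pa",
    "sk", "sl", "sw", "sv", "ta", "th", "tr", "vi",
}


def support_level(tag: str) -> str:
    """Return "core", "extended", or "unsupported" for a language tag.

    The tag is lowercased, underscores are treated as hyphens, and
    progressively shorter prefixes are tried, so "zh-Hans-CN" matches
    "zh-hans" and "en-US" matches "en".
    """
    parts = tag.strip().lower().replace("_", "-").split("-")
    for end in range(len(parts), 0, -1):
        candidate = "-".join(parts[:end])
        if candidate in CORE:
            return "core"
        if candidate in EXTENDED:
            return "extended"
    return "unsupported"
```

With this sketch, `support_level("en-US")` and `support_level("zh-Hans-CN")` both resolve to a core language, while a bare `zh` is reported as unsupported because the tables list only the script-qualified codes `zh-Hans` and `zh-Hant`.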