<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Cobalt Speech: API Documentation – Voice Intelligence</title>
    <link>/docs/voice_intelligence/</link>
    <description>Recent content in Voice Intelligence on Cobalt Speech: API Documentation</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/docs/voice_intelligence/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Docs: Privacy Screen</title>
      <link>/docs/voice_intelligence/privacy_screen/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/docs/voice_intelligence/privacy_screen/</guid>
      <description>
        
        
        &lt;p&gt;&lt;a href=&#34;https://demo.cobaltspeech.com/privacyscreen/&#34;&gt;Demo&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Cobalt&amp;rsquo;s Privacy Screen engine can redact various categories of sensitive
information automatically from text and audio. Every business that collects or
deals with personal data should redact sensitive information in order to protect
customer privacy, comply with laws and regulations, and discover new business
opportunities.&lt;/p&gt;
&lt;p&gt;Privacy Screen makes real-time audio and text redaction possible by
combining our low-latency, accurate speech recognition engine,
&lt;a href=&#34;../transcribe&#34;&gt;Transcribe&lt;/a&gt;, with a robust redaction backend that
identifies several categories of sensitive or confidential information:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Personally Identifiable Information (PII), such as names, addresses, and phone numbers&lt;/li&gt;
&lt;li&gt;Protected Health Information (PHI), such as medical conditions, injuries, and names of medications&lt;/li&gt;
&lt;li&gt;Payment Card Industry (PCI) data, such as credit card and bank details&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A detailed list of all the categories that are identified by Privacy Screen can
be found &lt;a href=&#34;./redaction_categories&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;how-does-redaction-work&#34;&gt;How does redaction work?&lt;/h2&gt;
&lt;p&gt;Sensitive information redaction typically works as a two-step process. First, a machine learning model detects and classifies the desired entities in the text. Then, this classification is used to determine whether each entity needs to be redacted; if it does, the entity is replaced with an entity label in the redacted transcript. Currently, Cobalt uses a state-of-the-art deep neural network (DNN) model for PII, PHI, and PCI redaction.&lt;/p&gt;
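The two-step process above can be sketched with a toy rule-based detector standing in for the DNN model. The patterns, labels, and function names here are illustrative only, not Cobalt's actual classifier or API:

```python
# Minimal sketch of two-step redaction. A real deployment uses a DNN
# classifier; this rule-based detector is a hypothetical stand-in.
import re

# Toy detectors for two entity classes (illustrative, not Cobalt's).
PATTERNS = {
    "SSN": re.compile(r"\b\d{9}\b"),
    "CREDIT_CARD": re.compile(r"\b\d{16}\b"),
}

def detect(text):
    """Step 1: detect and classify entities; returns (start, end, label) spans."""
    spans = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), label))
    return sorted(spans)

def redact(text, enabled_classes):
    """Step 2: replace each detected entity of an enabled class with its label."""
    out, last = [], 0
    for start, end, label in detect(text):
        if label in enabled_classes:
            out.append(text[last:start])
            out.append("[" + label + "]")
            last = end
    out.append(text[last:])
    return "".join(out)
```

Passing the set of enabled classes mirrors the config-file selection of redaction classes described below.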
&lt;p&gt;There are three ways to use Cobalt&amp;rsquo;s redaction solution:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Redact PII from a text transcript&lt;/li&gt;
&lt;li&gt;Redact PII from an audio file&lt;/li&gt;
&lt;li&gt;Redact PII from an audio file with a text transcript&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each of these services can be used in two operating modes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Streaming mode:&lt;/strong&gt; Redaction runs utterance by utterance, and each result is streamed out as soon as it is ready.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Batch mode:&lt;/strong&gt; All input audio/transcript is processed and redacted in one batch, and the output is available at the end of the process.&lt;/li&gt;
&lt;/ul&gt;
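The two operating modes can be contrasted with a small sketch. The `redact_utterance` helper is hypothetical, standing in for whichever redaction service is in use:

```python
# Hypothetical per-utterance redaction step (stand-in for the real service).
def redact_utterance(utterance):
    return utterance.replace("999999999", "[SSN]")

def redact_streaming(utterances):
    # Streaming mode: yield each redacted utterance as soon as it is ready.
    for utterance in utterances:
        yield redact_utterance(utterance)

def redact_batch(utterances):
    # Batch mode: process the whole input, return all results at the end.
    return [redact_utterance(u) for u in utterances]
```

A consumer of the streaming generator can act on each utterance while later audio is still being processed, which is what enables the real-time behavior described above.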
&lt;h3 id=&#34;redact-pii-from-a-text-transcript&#34;&gt;Redact PII from a text transcript&lt;/h3&gt;
&lt;p&gt;In this use case, you can identify and redact sensitive PII from an input text transcript. Detected PII entities are replaced with an appropriate PII token in the redacted text transcript. Both the input and redacted transcripts are specified as JSON with a list of utterances. Each utterance has a list of words, and each word has:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Text&lt;/li&gt;
&lt;li&gt;Redaction class&lt;/li&gt;
&lt;li&gt;Redaction confidence score&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can specify the redaction classes applicable to your use case in the config file.&lt;/p&gt;
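A minimal input transcript in this shape might look as follows. The exact field names are illustrative, not Cobalt's published schema:

```json
{
  "utterances": [
    {
      "words": [
        {"text": "my"},
        {"text": "name"},
        {"text": "is"},
        {"text": "Robert"}
      ]
    }
  ]
}
```

In the redacted output, each word would additionally carry its redaction class and a redaction confidence score, as listed above.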
&lt;h3 id=&#34;redact-pii-from-an-audio-file&#34;&gt;Redact PII from an audio file&lt;/h3&gt;
&lt;p&gt;In this use case, the input audio file is first transcribed using Cobalt&amp;rsquo;s Transcribe API, and text redaction is then applied to the ASR-generated transcript. Detected PII entities are replaced with an appropriate PII token in the redacted text transcript. The output can include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Redacted text transcript&lt;/li&gt;
&lt;li&gt;Unredacted text transcript&lt;/li&gt;
&lt;li&gt;Redacted audio file where the PII has been masked with a beep sound&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The redacted text transcript contains a redaction confidence score, an ASR confidence score, and associated start and end timestamps for each utterance and/or word.&lt;/p&gt;
&lt;h3 id=&#34;redact-pii-from-both-an-audio-file-with-a-text-transcript&#34;&gt;Redact PII from both an audio file and a text transcript&lt;/h3&gt;
&lt;p&gt;In this use case, an audio file and its associated transcript are given as input, and the redacted transcript and redacted audio file are returned as output. The input transcript should be specified as JSON with a list of utterances:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Each utterance has:
&lt;ul&gt;
&lt;li&gt;The audio channel in the audio file, indexed from 0&lt;/li&gt;
&lt;li&gt;A list of words. Each word has:
&lt;ul&gt;
&lt;li&gt;Text&lt;/li&gt;
&lt;li&gt;Timestamp in the audio file where this word starts (in milliseconds)&lt;/li&gt;
&lt;li&gt;Duration of this word in the audio file (in milliseconds)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The output transcript has the same format as the input, except that each word has extra fields such as &amp;ldquo;redaction_class&amp;rdquo;, &amp;ldquo;redaction_confidence&amp;rdquo;, and &amp;ldquo;is_redacted&amp;rdquo;.&lt;/p&gt;
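An output word entry might therefore look like the fragment below. The `redaction_class`, `redaction_confidence`, and `is_redacted` fields come from the description above; the channel and timing field names are illustrative assumptions:

```json
{
  "channel": 0,
  "words": [
    {
      "text": "Robert",
      "start_ms": 1200,
      "duration_ms": 400,
      "redaction_class": "NAME",
      "redaction_confidence": 0.97,
      "is_redacted": true
    }
  ]
}
```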
&lt;h2 id=&#34;text-redaction&#34;&gt;Text Redaction&lt;/h2&gt;
&lt;p&gt;Here is an example of text redaction:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Raw text&lt;/th&gt;
&lt;th&gt;Redacted text&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Good morning, everybody. My name is Robert, and today I am going to share some personal information with you. I live at 123 Park Ave Apt 123 New York City, NY 10002. My Social Security number is 999999999, credit card number is 6666666666666666, and CVV code is 777. I love cats.&lt;/td&gt;
&lt;td&gt;Good morning, everybody. My name is [NAME], and today I am going to share some personal information with you. I live at [LOCATION_ADDRESS] [LOCATION_CITY], [LOCATION_ZIP]. My Social Security number is [SSN], credit card number is [CREDIT_CARD], and CVV code is [CVV]. I love cats.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;system-requirements&#34;&gt;System requirements&lt;/h2&gt;
&lt;h3 id=&#34;minimum-requirements&#34;&gt;Minimum requirements&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Minimum&lt;/th&gt;
&lt;th&gt;Recommended (Text only)&lt;/th&gt;
&lt;th&gt;Recommended (All Features)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;Any x86 (Intel or AMD) processor with 6GB RAM and 50GB disk volume&lt;/td&gt;
&lt;td&gt;Intel Sapphire Rapids or newer CPUs supporting AMX with 16GB RAM and 50GB disk volume&lt;/td&gt;
&lt;td&gt;Intel Sapphire Rapids or newer CPUs supporting AMX with 64GB RAM and 100GB disk volume&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPU&lt;/td&gt;
&lt;td&gt;Any x86 (Intel or AMD) processor with 28GB RAM. Nvidia GPU with compute capability 7.0 or higher (Volta or newer) and at least 16GB VRAM. 100GB disk volume&lt;/td&gt;
&lt;td&gt;Any x86 (Intel or AMD) processor with 32GB RAM and Nvidia Tesla T4 GPU. 100GB disk volume&lt;/td&gt;
&lt;td&gt;Any x86 (Intel or AMD) processor with 64GB RAM and Nvidia Tesla T4 GPU. 100GB disk volume&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&#34;recommended-requirements-for-cpu-container&#34;&gt;Recommended requirements for CPU container&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Recommended Instance Type (Text only)&lt;/th&gt;
&lt;th&gt;Recommended Instance Type (All Features)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Azure&lt;/td&gt;
&lt;td&gt;Standard_E2_v5 (2 vCPUs, 16GB RAM)&lt;/td&gt;
&lt;td&gt;Standard_E8_v5 (8 vCPUs, 64GB RAM)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS&lt;/td&gt;
&lt;td&gt;m7i.large (2 vCPUs, 8GB RAM)&lt;/td&gt;
&lt;td&gt;m7i.4xlarge (16 vCPUs, 64GB RAM)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GCP&lt;/td&gt;
&lt;td&gt;n2-standard-2 (2 vCPUs, 8GB RAM)&lt;/td&gt;
&lt;td&gt;n2-standard-16 (16 vCPUs, 64GB RAM)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&#34;recommended-requirements-for-gpu-container&#34;&gt;Recommended requirements for GPU container&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Recommended Instance Type (Text only)&lt;/th&gt;
&lt;th&gt;Recommended Instance Type (All Features)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Azure&lt;/td&gt;
&lt;td&gt;Standard_NC8as_T4_v3&lt;/td&gt;
&lt;td&gt;Standard_NC8as_T4_v3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS&lt;/td&gt;
&lt;td&gt;g4dn.2xlarge&lt;/td&gt;
&lt;td&gt;g4dn.4xlarge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GCP&lt;/td&gt;
&lt;td&gt;n1-standard-8 + Tesla T4&lt;/td&gt;
&lt;td&gt;n1-standard-16 + Tesla T4&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

      </description>
    </item>
    
    <item>
      <title>Docs: VoiceBio</title>
      <link>/docs/voice_intelligence/voicebio/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/docs/voice_intelligence/voicebio/</guid>
      <description>
        
        
        
      </description>
    </item>
    
  </channel>
</rss>
