<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Cobalt Speech: API Documentation – Voice Intelligence</title>
    <link>/docs/voice_intelligence/</link>
    <description>Recent content in Voice Intelligence on Cobalt Speech: API Documentation</description>
    <generator>Hugo -- gohugo.io</generator>
    
	  <atom:link href="/docs/voice_intelligence/index.xml" rel="self" type="application/rss+xml" />
    
    
      
        
      
    
    
    <item>
      <title>Docs: Privacy Screen</title>
      <link>/docs/voice_intelligence/privacy_screen/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/docs/voice_intelligence/privacy_screen/</guid>
      <description>
        
        
        &lt;p&gt;&lt;a href=&#34;https://demo.cobaltspeech.com/privacyscreen/&#34;&gt;Demo&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Cobalt&amp;rsquo;s Privacy Screen engine can redact various categories of sensitive
information automatically from text and audio. Every business that collects or
deals with personal data should redact sensitive information in order to protect
customer privacy, comply with laws and regulations, and discover new business
opportunities.&lt;/p&gt;
&lt;p&gt;Privacy Screen makes real-time audio and text redaction possible by
combining our low-latency, accurate speech recognition engine,
&lt;a href=&#34;../transcribe&#34;&gt;Transcribe&lt;/a&gt;, with a robust redaction backend that
identifies several categories of sensitive or confidential information:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Personally Identifiable Information (PII), such as names, addresses, and phone numbers&lt;/li&gt;
&lt;li&gt;Protected Health Information (PHI), such as medical conditions, injuries, and names of medications&lt;/li&gt;
&lt;li&gt;Payment Card Industry (PCI) data, such as credit card and bank details&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A detailed list of all the categories that are identified by Privacy Screen can
be found &lt;a href=&#34;./redaction_categories&#34;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;how-does-redaction-work&#34;&gt;How does redaction work?&lt;/h2&gt;
&lt;p&gt;Sensitive information redaction typically works as a two-step process. First, a machine learning model detects and classifies the desired entities in the text. Then, this classification is used to determine whether each entity needs to be redacted; if it does, the entity is replaced with an entity label in the redacted transcript. Currently, Cobalt uses a state-of-the-art deep neural network (DNN) model for PII, PHI, and PCI redaction.&lt;/p&gt;
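The two-step process above can be sketched with a toy rule-based detector standing in for the DNN model. The patterns, labels, and function names here are illustrative only, not Cobalt's actual classifier or API:

```python
# Minimal sketch of two-step redaction. A real deployment uses a DNN
# classifier; this rule-based detector is a hypothetical stand-in.
import re

# Toy detectors for two entity classes (illustrative, not Cobalt's).
PATTERNS = {
    "SSN": re.compile(r"\b\d{9}\b"),
    "CREDIT_CARD": re.compile(r"\b\d{16}\b"),
}

def detect(text):
    """Step 1: detect and classify entities; returns (start, end, label) spans."""
    spans = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), label))
    return sorted(spans)

def redact(text, enabled_classes):
    """Step 2: replace each detected entity of an enabled class with its label."""
    out, last = [], 0
    for start, end, label in detect(text):
        if label in enabled_classes:
            out.append(text[last:start])
            out.append("[" + label + "]")
            last = end
    out.append(text[last:])
    return "".join(out)
```

Passing the set of enabled classes mirrors the config-file selection of redaction classes described below.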
&lt;p&gt;There are three ways to use Cobalt&amp;rsquo;s redaction solution:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Redact PII from a text transcript&lt;/li&gt;
&lt;li&gt;Redact PII from an audio file&lt;/li&gt;
&lt;li&gt;Redact PII from an audio file with a text transcript&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each of these services can be used in two operating modes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Streaming mode:&lt;/strong&gt; Redaction runs utterance by utterance, and each result is streamed out as soon as it is ready.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Batch mode:&lt;/strong&gt; All input audio/transcript is processed and redacted in one batch, and the output is available at the end of the process.&lt;/li&gt;
&lt;/ul&gt;
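The two operating modes can be contrasted with a small sketch. The `redact_utterance` helper is hypothetical, standing in for whichever redaction service is in use:

```python
# Hypothetical per-utterance redaction step (stand-in for the real service).
def redact_utterance(utterance):
    return utterance.replace("999999999", "[SSN]")

def redact_streaming(utterances):
    # Streaming mode: yield each redacted utterance as soon as it is ready.
    for utterance in utterances:
        yield redact_utterance(utterance)

def redact_batch(utterances):
    # Batch mode: process the whole input, return all results at the end.
    return [redact_utterance(u) for u in utterances]
```

A consumer of the streaming generator can act on each utterance while later audio is still being processed, which is what enables the real-time behavior described above.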
&lt;h3 id=&#34;redact-pii-from-a-text-transcript&#34;&gt;Redact PII from a text transcript&lt;/h3&gt;
&lt;p&gt;In this use case, you can identify and redact sensitive PII from an input text transcript. Detected PII entities are replaced with an appropriate PII token in the redacted text transcript. Both the input and redacted transcripts are specified as JSON with a list of utterances. Each utterance has a list of words, and each word has:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Text&lt;/li&gt;
&lt;li&gt;Redaction class&lt;/li&gt;
&lt;li&gt;Redaction confidence score&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can specify the redaction classes applicable to your use case in the config file.&lt;/p&gt;
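A minimal input transcript in this shape might look as follows. The exact field names are illustrative, not Cobalt's published schema:

```json
{
  "utterances": [
    {
      "words": [
        {"text": "my"},
        {"text": "name"},
        {"text": "is"},
        {"text": "Robert"}
      ]
    }
  ]
}
```

In the redacted output, each word would additionally carry its redaction class and a redaction confidence score, as listed above.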
&lt;h3 id=&#34;redact-pii-from-an-audio-file&#34;&gt;Redact PII from an audio file&lt;/h3&gt;
&lt;p&gt;In this use case, the input audio file is first transcribed using Cobalt&amp;rsquo;s Transcribe API, and text redaction is then applied to the ASR-generated transcript. Detected PII entities are replaced with an appropriate PII token in the redacted text transcript. The output can include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Redacted text transcript&lt;/li&gt;
&lt;li&gt;Unredacted text transcript&lt;/li&gt;
&lt;li&gt;Redacted audio file where the PII has been masked with a beep sound&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The redacted text transcript contains a redaction confidence score, an ASR confidence score, and associated start and end timestamps for each utterance and/or word.&lt;/p&gt;
&lt;h3 id=&#34;redact-pii-from-both-an-audio-file-with-a-text-transcript&#34;&gt;Redact PII from both an audio file and a text transcript&lt;/h3&gt;
&lt;p&gt;In this use case, an audio file and its associated transcript are given as input, and the redacted transcript and redacted audio file are returned as output. The input transcript should be specified as JSON with a list of utterances:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Each utterance has:
&lt;ul&gt;
&lt;li&gt;The audio channel in the audio file, indexed from 0&lt;/li&gt;
&lt;li&gt;A list of words. Each word has:
&lt;ul&gt;
&lt;li&gt;Text&lt;/li&gt;
&lt;li&gt;Timestamp in the audio file where this word starts (in milliseconds)&lt;/li&gt;
&lt;li&gt;Duration of this word in the audio file (in milliseconds)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The output transcript has the same format as the input, except that each word has extra fields such as &amp;ldquo;redaction_class&amp;rdquo;, &amp;ldquo;redaction_confidence&amp;rdquo;, and &amp;ldquo;is_redacted&amp;rdquo;.&lt;/p&gt;
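An output word entry might therefore look like the fragment below. The `redaction_class`, `redaction_confidence`, and `is_redacted` fields come from the description above; the channel and timing field names are illustrative assumptions:

```json
{
  "channel": 0,
  "words": [
    {
      "text": "Robert",
      "start_ms": 1200,
      "duration_ms": 400,
      "redaction_class": "NAME",
      "redaction_confidence": 0.97,
      "is_redacted": true
    }
  ]
}
```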
&lt;h2 id=&#34;text-redaction&#34;&gt;Text Redaction&lt;/h2&gt;
&lt;p&gt;Here is an example of text redaction:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Raw text&lt;/th&gt;
&lt;th&gt;Redacted text&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Good morning, everybody. My name is Robert, and today I am going to share some personal information with you. I live at 123 Park Ave Apt 123 New York City, NY 10002. My Social Security number is 999999999, credit card number is 6666666666666666, and CVV code is 777. I love cats.&lt;/td&gt;
&lt;td&gt;Good morning, everybody. My name is [NAME], and today I am going to share some personal information with you. I live at [LOCATION_ADDRESS] [LOCATION_CITY], [LOCATION_ZIP]. My Social Security number is [SSN], credit card number is [CREDIT_CARD], and CVV code is [CVV]. I love cats.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2 id=&#34;system-requirements&#34;&gt;System requirements&lt;/h2&gt;
&lt;h3 id=&#34;minimum-requirements&#34;&gt;Minimum requirements&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Minimum&lt;/th&gt;
&lt;th&gt;Recommended (Text only)&lt;/th&gt;
&lt;th&gt;Recommended (All Features)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;CPU&lt;/td&gt;
&lt;td&gt;Any x86 (Intel or AMD) processor with 6GB RAM and 50GB disk volume&lt;/td&gt;
&lt;td&gt;Intel Sapphire Rapids or newer CPUs supporting AMX with 16GB RAM and 50GB disk volume&lt;/td&gt;
&lt;td&gt;Intel Sapphire Rapids or newer CPUs supporting AMX with 64GB RAM and 100GB disk volume&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPU&lt;/td&gt;
&lt;td&gt;Any x86 (Intel or AMD) processor with 28GB RAM. Nvidia GPU with compute capability 7.0 or higher (Volta or newer) and at least 16GB VRAM. 100GB disk volume&lt;/td&gt;
&lt;td&gt;Any x86 (Intel or AMD) processor with 32GB RAM and Nvidia Tesla T4 GPU. 100GB disk volume&lt;/td&gt;
&lt;td&gt;Any x86 (Intel or AMD) processor with 64GB RAM and Nvidia Tesla T4 GPU. 100GB disk volume&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&#34;recommended-requirements-for-cpu-container&#34;&gt;Recommended requirements for CPU container&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Recommended Instance Type (Text only)&lt;/th&gt;
&lt;th&gt;Recommended Instance Type (All Features)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Azure&lt;/td&gt;
&lt;td&gt;Standard_E2_v5 (2 vCPUs, 16GB RAM)&lt;/td&gt;
&lt;td&gt;Standard_E8_v5 (8 vCPUs, 64GB RAM)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS&lt;/td&gt;
&lt;td&gt;m7i.large (2 vCPUs, 8GB RAM)&lt;/td&gt;
&lt;td&gt;m7i.4xlarge (16 vCPUs, 64GB RAM)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GCP&lt;/td&gt;
&lt;td&gt;n2-standard-2 (2 vCPUs, 8GB RAM)&lt;/td&gt;
&lt;td&gt;n2-standard-16 (16 vCPUs, 64GB RAM)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h3 id=&#34;recommended-requirements-for-gpu-container&#34;&gt;Recommended requirements for GPU container&lt;/h3&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Recommended Instance Type (Text only)&lt;/th&gt;
&lt;th&gt;Recommended Instance Type (All Features)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Azure&lt;/td&gt;
&lt;td&gt;Standard_NC8as_T4_v3&lt;/td&gt;
&lt;td&gt;Standard_NC8as_T4_v3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS&lt;/td&gt;
&lt;td&gt;g4dn.2xlarge&lt;/td&gt;
&lt;td&gt;g4dn.4xlarge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GCP&lt;/td&gt;
&lt;td&gt;n1-standard-8 + Tesla T4&lt;/td&gt;
&lt;td&gt;n1-standard-16 + Tesla T4&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

      </description>
    </item>
    
    <item>
      <title>Docs: VoiceBio</title>
      <link>/docs/voice_intelligence/voicebio/</link>
      <pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate>
      
      <guid>/docs/voice_intelligence/voicebio/</guid>
      <description>
        
        
        
      </description>
    </item>
    
  </channel>
</rss>
