Protecting Your Most Sensitive Data


I admit it. I cannot get redaction off my mind. My company’s speech analytics clients are practically consumed by the topic, so I guess it shouldn’t be a surprise that I’m always thinking about it.

I’ve been in the speech analytics business for 16 years and my team of analysts hears about redaction requirements all the time. If only I could temporarily erase parts of my brain when I come home each night and then unlock certain portions at my discretion only when the time was appropriate.

Oops, there I go again. See what I mean? I’m a bit consumed.

But that’s where our industry is right now. Granted, we’ve come a long way since call centers went into their own version of military-style, full Defcon 1 lockdown mode when breaches became increasingly problematic over the last decade.

And I fully recognize the reason why companies needed to err on the side of caution. Take the Equifax breach in 2017, which exposed sensitive data associated with nearly 146 million Americans.

That breach was caused by one employee’s mistake. CEO Richard M. Smith repeatedly referred to an individual who failed to act on security warnings.

It’s scary that one careless person could cause so much damage, but the Equifax case proves it can and does happen. Putting the legal costs and non-compliance fines aside, the consequences are significant. The following survey data from KPMG tells the story:

  • 86% of respondents said they feel a growing concern about data privacy.
  • 43% of breaches were unauthorized access in the U.S. alone.
  • 78% expressed fears about the amount of data being collected.
  • 40% don’t trust companies to use their data ethically.


But complete lockdowns? I call it “over-redaction caused by over-reaction.” True, contact centers are governed by a set of regulations unique to the nature of their business model and subject to thousands of different state and federal regulations. And yes, they need to be especially cautious given most compliance costs are associated with consumer protections.

As a result, contact centers tend to lock down all information because the data is likely to contain sensitive information that may not be suitable to share, such as names, email addresses, financial information (account numbers, balances, etc.), and credit card information.

It’s a shame because there’s a treasure trove of insights within agent-customer interactions that can be analyzed by Marketing, Product, Security, Operations, and Sales to drive business strategy, improve processes, and generate more revenue and retention.

Contact center management may have good intentions, but to me, out of fear, it’s almost as though contact centers are holding vital data hostage. Result? Some business intelligence tools such as Power BI and Tableau have big gaps for voice data.

As a workaround, most companies employ a combination of methods to diminish the risk of exposing sensitive data such as:

  • Pause-and-resume
  • Speech analytics numeric redaction

These methods might mask some of the sensitive data, but each option is ripe for leakage and leaves an organization vulnerable.

Let’s take the common but flawed pause-and-resume technique. If an agent forgets to pause the recording, sensitive data may be inadvertently captured, putting the call recording back in scope for compliance purposes and leaving the information vulnerable should there be a data breach.

Automated pause-and-resume can be susceptible to a data leak when an agent or customer goes off script. If an agent forgets to resume the call recording, vital information may be excluded that is required to solve a transaction dispute or support quality control.

And if a recorded call is incomplete, a company may not be able to demonstrate compliance with industry or government regulations.

Accurate Transcriptions Key

It’s difficult to estimate the number of companies that have inquired about better techniques to mask sensitive data.

In a perfect world, organizations would be able to leverage automated technology to mask sensitive data and maintain the integrity of the audio and transcriptions for analytics.

By doing so, they could analyze customer interactions safely and share key data across the enterprise while:

  • Lowering the risk of data breaches.
  • Escaping penalties and fines.
  • Retaining credit card privileges.
  • Maintaining customer trust.

The foundation of any redaction solution is accurate transcriptions, starting with numbers. This was a breakthrough concept 10 years ago, when organizations realized their QA teams should not see or hear Social Security numbers, credit card numbers, and other sensitive information about callers.

But consider how a transcript might appear if speech-to-text software couldn’t accurately distinguish homophones:

“The Packers 1 the big game.”

“All 4 1 and 1 for all.”

“2 be or not 2 be.”

“She 8 the entire thing.”

My team has worked with more than 20 different speech technologies ranging from poor to excellent quality, but we’ve seen great improvement over the years thanks to some industry leaders that have been working hard to solve this issue.

  • Not only are transcriptions now more accurate, but some programs allow users to mask numerals only if they appear in a predetermined string, as one would find in Social Security numbers, account numbers, and phone numbers, or if they appear near associated keywords.
  • Some technologies take it a step further by allowing users to redact the “social” and the credit card information but retain the phone numbers for cohort research and the account numbers, perhaps for shipment tracking.
  • Wilmac, a New York-based company that develops and delivers solutions to access voice data, for example, helps companies unlock voice recordings and convert them to a usable format, and has added numeric and keyword redaction capabilities to its offering.
  • More advanced technologies such as Neuraswitch allow users to redact keywords tailored to their business such as email addresses, passwords, physical addresses, and medical conditions commonly associated with PII, PHI, and HIPAA.

Neuraswitch is a “conversational redaction” tool, not just a numeric text redaction tool. The technology operates in the background using natural language processing (NLP) to efficiently detect, redact, and mask all sensitive data from call recordings, transcripts, and conversations.

Users can label the “redaction type” and choose various masking capabilities within the audio and transcriptions. Labeling the masking type is important to allow analytics to continue with the anonymized data. Traditional redaction is useless for analytics without understanding what data was redacted.
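To make the idea of labeled, pattern-based masking concrete, here is a minimal Python sketch. It is not how any vendor named in this article works; the patterns, the `PATTERNS` table, and the `redact` function are my own illustrative assumptions. It masks numerals only when they appear in a predetermined string format, writes a label in place of the value, and lets the user choose which types to retain.

```python
import re

# Hypothetical patterns for illustration only; a real deployment would
# tune these to its own data and transcription output.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),          # 3-2-4 digits
    "CARD": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),    # 16 digits in groups of 4
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),        # 3-3-4 digits
}

def redact(text, labels=("SSN", "CARD")):
    """Replace each match of the chosen types with a labeled placeholder.

    Types left out of `labels` (phone numbers by default) are retained.
    """
    for label in labels:
        text = PATTERNS[label].sub(f"({label})", text)
    return text

sample = "My social is 123-45-6789, card 4111 1111 1111 1111, call me at 555-867-5309."
print(redact(sample))
# → My social is (SSN), card (CARD), call me at 555-867-5309.
```

Because the placeholder names the type of data that was removed, an analyst can still count how often callers read out a card number without ever seeing one.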

Rules-Based Solutions

And then there are rules-based solutions, which redact or retain the data that the client requests.

ASR (automated speech recognition/transcription) providers such as AWS, Google, and Microsoft currently offer the “tools” necessary to customize redaction programs based on specific requirements.

These tools can redact either the agent’s or the customer’s side of the call but lack dynamic conversational redaction, for now. It won’t be long before the big tech companies make conversational redaction a possibility. CallMiner is another company that can offer a rules-based redaction solution.
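A rules-based setup of this kind can be pictured as a small table of rules applied turn by turn. The Python sketch below is only an illustration under my own assumptions; the `Rule` fields, the `RULES` list, and `apply_rules` are hypothetical and do not reflect the API of AWS, Google, Microsoft, or CallMiner.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    label: str       # placeholder written into the transcript
    pattern: str     # regular expression to look for
    speakers: tuple  # speakers the rule applies to
    action: str      # "redact" or "retain"

# Hypothetical client request: redact card numbers spoken by the customer,
# retain phone numbers from either speaker.
RULES = [
    Rule("CARD", r"\b(?:\d{4}[- ]?){3}\d{4}\b", ("customer",), "redact"),
    Rule("PHONE", r"\b\d{3}-\d{3}-\d{4}\b", ("agent", "customer"), "retain"),
]

def apply_rules(turns, rules=RULES):
    """Apply every 'redact' rule to the turns of the speakers it covers."""
    out = []
    for speaker, text in turns:
        for rule in rules:
            if rule.action == "redact" and speaker in rule.speakers:
                text = re.sub(rule.pattern, f"({rule.label})", text)
        out.append((speaker, text))
    return out

call = [
    ("agent", "Can I have the card number?"),
    ("customer", "Sure, 4111-1111-1111-1111, and my cell is 555-867-5309."),
    ("agent", "Reading that back: 4111-1111-1111-1111."),
]
for speaker, text in apply_rules(call):
    print(f"{speaker}: {text}")
```

Note what happens on the last turn: the agent reads the card number back, and because the rule covers only the customer, the number stays in the transcript. That is the kind of unprompted, out-of-sequence utterance that fixed rules tend to miss.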

Conversational redaction from Voci Technologies, Wilmac, and Call Journey can dynamically redact unprompted utterances, in part because their solutions (powered by Neuraswitch) can consider speaker sequence dynamically and redact key portions of conversations from both audio and transcriptions in real time. They also offer an analytics platform for search and play. All without storing any form of sensitive data.

Here are examples of targeted and conversational redaction:

  • My social is 999-777-5555 and um, wait a second, I mean 999-777-4444 and my account number is 16409.
    • Redaction results: My social is (SSN) and my account number is 16409.
  • My email address is [email protected].
    • Redaction results: My email address is (EMAIL).
  • I’m trying to get an appointment set up for my 15-year-old daughter Rebecca, who sounds like she has strep throat.
    • Redaction results: I’m trying to get an appointment set up for my 15-year-old daughter (MINOR) who sounds like she has (SYMPTOMS).
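
The first two results above can be approximated with plain pattern matching, which helps show where patterns stop and NLP has to take over. The Python sketch below is a simple heuristic of my own, not the method used by Neuraswitch or any vendor mentioned here. It assumes the 3-3-4 digit grouping used in the sample “social,” a fixed “social is” prompt, and a hypothetical email address.

```python
import re

# Assumption: the sample "social" above uses a 3-3-4 digit grouping.
NUMBER = r"\d{3}-\d{3}-\d{4}"

# A number prompted by "social is", optionally followed by short filler
# (no digits, no sentence end) and a corrected number. The whole span,
# filler included, collapses into a single placeholder.
SSN_SPAN = re.compile(rf"(?<=social is ){NUMBER}(?:[^.?!\d]{{0,40}}{NUMBER})*")

# Deliberately loose email pattern, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def redact_utterance(text):
    """Mask a self-corrected SSN and any email address with labels."""
    text = SSN_SPAN.sub("(SSN)", text)
    return EMAIL.sub("(EMAIL)", text)

print(redact_utterance(
    "My social is 999-777-5555 and um, wait a second, I mean 999-777-4444 "
    "and my account number is 16409."
))
# → My social is (SSN) and my account number is 16409.
```

The third example is a different matter. Nothing about “Rebecca” or “strep throat” follows a fixed format, so a pattern like this has no way to find them; recognizing a minor’s name or a symptom depends on understanding the conversation.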

Wayne Ramprashad, Chief Product Officer of Voci Technologies, is seeing the changes in demand.

“We’ve been able to support clients with redacting numbers and simple words for years. This performs well as our transcription accuracy is high, but now based upon demand we are in the process of offering conversational redaction that considers what both speakers are saying and detecting sensitive information that might not be prompted or uttered in normal speaker sequence.”

Ramprashad goes on to say, “This is not easy to do, yet so important to diminish risk. And then you add on the growing demand of real-time transcription, and this is a breakthrough in the world of data security within contact centers.”

Wilmac helps companies unlock their voice recordings to be either stored, cleansed, or pushed to other analytics tools.

Wilmac Vice President Steve McDonnell Jr. stated, “Our ability to redact beyond numbers and detect sensitive information dynamically is becoming a box our clients want to check.”

McDonnell mentioned that Wilmac has a niche in fintech and that getting recordings unlocked is a significant service for the company, adding, “The ultimate deliverable is providing customer interactions from accurate transcriptions that are masked for sensitive information and ready to be ingested into a variety of business intelligence tools.”

Ask The Expert Tip:

When thinking through potential improvements to your redaction capabilities, always keep in mind that transcriptions are often imperfect and can be difficult to read for context. That fact alone makes accurate transcription important, along with conversational redaction platforms that can detect hard-to-identify sensitive data.


For compliance, every agent and every call counts. Failure to comply with industry regulations and requirements can be devastating for any business.

The best way to prove compliance to relevant authorities and customers is to show that your company has adopted and follows its own internal corporate best practices, security policies, and any other written process mandates, in addition to any specific regulatory requirements within your industry or business vertical.

Conversational redaction services are the future, ensuring no call is overlooked. Pause-and-resume has been a commendable tool for compliance, until now. Employing automated call redaction allows call centers to rest easy knowing they’ve protected their consumers as well as themselves.

And I’ll rest easier, too.