The sound of your voice is becoming a new type of fingerprint.
Increasingly sophisticated technology that detects nuances in sound inaudible to humans is capturing clues about people’s likely locations, medical conditions and even physical features.
Law-enforcement agencies are turning to those clues from the human voice to help sketch the faces of suspects. Banks are using them to catch scammers trying to imitate their customers on the phone, and doctors are using such data to detect the onset of dementia or depression.
That has created new possibilities for health care, finance and criminal justice organizations while also raising fresh privacy concerns, as consumers’ biometric data is harnessed in novel ways.
“People have known that voice carries information for centuries,” said Rita Singh, a voice and machine-learning researcher at Carnegie Mellon University who receives funding from the Department of Homeland Security. “It’s not new, but there wasn’t a way to get it out,” she said, noting it is possible today because of artificial intelligence.
Ms. Singh measures dozens of voice-quality features—such as raspiness or tremor—that relate to the inside of a person’s vocal tract and how an individual voice is produced. She detects so-called microvolumes of air that help create the sound waves that make up the human voice. The way they resonate in the vocal tract, along with other voice characteristics, provides clues on a person’s skull structure, height, weight and physical surroundings, she said.
Her work points to a future of surveillance and investigation in which law-enforcement officials can rely on audio as well as video content. Some financial firms already use the human voice to catch fraudsters.
Pindrop, an information-security company based in Atlanta, studies 1,380 audio features that fall into three main buckets: the kinds of noise detected on a phone line, the frequency characteristics of the call and how much of the sound is lost through transmission.
Those factors throw off hints about a call’s likely origin and whether it was transmitted over the internet, a mobile phone or a landline. Calls from overseas, particularly from developing countries, are often less clear than those from developed countries, even if that difference is hard for the human ear to detect.
That information helps banks and finance firms verify whether callers are who they say they are.
Discover Financial Services Inc. receives so-called voiceprints of callers—not recordings of their voice—and flags known fraudsters. If a scammer is detected, a customer-service agent can ask the caller for a code sent to a device owned by the actual customer.
Losses from fraud, which are counted as operating expenses, have declined by 10% since Discover began using Pindrop’s voice-analytics system in 2015, said Daniel Capozzi, president of credit operations and decision management at Discover.
Scammers increasingly call banks and use Social Security numbers exposed through data breaches to determine who is wealthy. Bad actors then research customers with high account balances, learning about their lives, relatives and preferences before calling their bank back pretending to be them.
Some financial firms match audio recordings with other biometric and behavioral information they have about their customers to prevent fraud, because bad actors often answer security questions about a victim’s life faster than real customers do.
Nuance Communications Inc., a Burlington, Mass., software technology company whose customers include HSBC and Kennebunk Savings in Maine, examines factors like the pitch, rhythm and dialect of speech as well as vocabulary, grammar and sentence structure.
Nuance’s voice-biometric and recognition software is designed to detect the gender, age and linguistic background of callers and whether a voice is synthetic or recorded. It helped one bank determine that a single person was responsible for tens of millions of dollars of theft, or 18% of the fraud the firm encountered in a year, said Brett Beranek, general manager of Nuance’s security and biometrics business.
Audio data from customer-service calls is also combined with information on how consumers typically interact with mobile apps and devices, said Howard Edelstein, chairman of behavioral biometric company Biocatch. The company can detect the cadence and pressure of swipes and taps on a smartphone.
How a person holds a smartphone gives clues about their age, for example, allowing a financial firm to compare the age of the normal account user to the age of the caller.
How much consumers are told about the voice and behavioral data that financial firms collect varies widely. Some financial firms ask customers to consent to voice recordings, while others simply say all customer-service calls are recorded for quality or safety reasons.
Some states have passed biometric privacy laws and others are crafting legislation. One law in Illinois requires firms to obtain explicit written consent from customers to collect biometric information such as voiceprints or iris scans.
Privacy advocates say the collection of biometric information is invasive and could lead law-enforcement officials to reach unfair conclusions. If such data collected by a company were improperly sold or hacked, some fear recovering from identity theft could be even harder because physical features are innate and irreplaceable.
In medicine, measuring slight changes in voice is starting to help doctors detect the onset of diseases like Parkinson’s or more quickly measure the efficacy of treatments for illnesses like depression, researchers say.
Boston-based Sonde Health asked more than 4,000 people to download a smartphone app and answer prompts designed to make them generate many different sounds. From those audio samples researchers identified and grouped features like rhythm, melody and how precisely the person articulates words.
Slower speech, for example, could indicate fatigue or sorrow at one point in time, but over longer periods could signal something more severe, co-founder Jim Harper said.
That voice-based data isn’t yet robust enough to base medical decisions on alone, but is being used alongside clinical trials for drugs to treat depression, Mr. Harper said.
Toronto-based Winterlight Labs Inc. parses features in speech such as syntax, grammar, complexity of vocabulary, pitch and rate of speech to monitor mental health and dementia.
Winterlight works with Janssen Pharmaceuticals Inc. to try to detect Alzheimer’s in elderly patients. Some of those patients, for example, tend to use words they acquired earlier in life as their recent memories deteriorate.
Write to Sarah Krouse at email@example.com
Copyright ©2019 Dow Jones & Company, Inc. All Rights Reserved.