When Donald Trump flatly denied that he posed as public-relations man John Miller in a 1991 phone interview, we asked a speech scientist at Carnegie Mellon University to analyze the voice on the tape and to compare it to an interview of Trump from around the same time. Her conclusion: “John Miller” is Trump.
“Same person,” said Rita Singh, a researcher at Carnegie Mellon, who compared the 1991 phone interview of “John Miller” to a TV interview with Donald Trump by CBS’ Connie Chung. “Several micro-sections of his voice in the John Miller tape match the corresponding micro-sections in his Connie Chung interview from 1990.”
The Washington Post published a recording of a 1991 People magazine interview with “John Miller” in a May 13 article that ran under the headline “Donald Trump masqueraded as publicist to brag about himself.” The Post claimed Trump conducted interviews in the 1980s and ’90s while posing as “John Miller” or “John Barron” — “public-relations men who sound precisely like Trump himself — who indeed are Trump, masquerading as an unusually helpful and boastful advocate for himself, according to the journalists and several of Trump’s top aides.”
Moreover, the reporter who conducted the People interview says Trump admitted as much to her at the time, saying it was “a joke that went awry.” The reporter, Sue Carswell, reported the ruse in a July 1991 People story that ran under the headline “Trump Says Goodbye Marla, Hello Carla: And a mysterious PR man who sounds just like Donald calls to spread the story.”
You can read a transcript of the phone conversation here.
But when he was confronted about the 25-year-old recording on the “Today” show on May 13, Trump flatly denied that it was him.
“No, I don’t know anything about it,” Trump said. “You’re telling me about it for the first time and it doesn’t sound like my voice at all. I have many, many people that are trying to imitate my voice and you can imagine that. This sounds like one of these scams, one of the many scams. It doesn’t sound like me.”
Trump was asked if he regularly posed as his own PR man for phone interviews, as the Post alleged.
“No, and it was not me on the phone,” Trump said. “It was not me on the phone. And it doesn’t sound like me on the phone, I will tell you that. And it was not me on the phone.”
Jimmy Kimmel, host of ABC’s “Jimmy Kimmel Live,” also raised the issue of the recording during an interview on May 25.
“It didn’t sound like me, though, really,” Trump said. “You think that sounded like me?”
“Yeah,” Kimmel said emphatically, to laughter, and later added, “If it was you, I think it was a very funny thing to do, to call a guy and take him through the wringer like that.”
Trump admitted that he sometimes has used aliases when purchasing real estate, explaining that if he used his real name, “you had to pay more money for the land.” Trump said one of the names he used was “John Barron.” Using aliases in the real estate business is common, Trump said, to prevent sellers from inflating the asking price.
Indeed, it is well-documented that Walt Disney used fake names to buy up land in Florida that would become Walt Disney World.
We don’t take any position on Trump’s use of an alias, but the fact is that he denied having used one in the 1991 People magazine interview, even though there is evidence that it was indeed him.
We asked Rita Singh, a speech scientist at Carnegie Mellon University, to analyze the voice on the tape. She then compared it to a recording of an interview of Trump with Connie Chung in 1990.
Singh is part of a team at Carnegie Mellon using a cutting-edge voice profiling technique called micro-articulometry. At various conferences this year, Singh unveiled a series of peer-reviewed papers: two in Cyprus in March; one in Croatia in May; and one at the 2016 IEEE International Symposium on Technologies for Homeland Security in Waltham, Massachusetts, in May.
The short version is this: Speech is produced when air from the lungs passes through the vocal tract. Different speech sounds are produced by controlling the airflow, vibration of the vocal folds, and the position of articulators like the jaw, tongue and lips, Singh explained. As we speak, sounds follow one another rapidly. In going from one sound to the next, the size, shape and agility of all of these elements come into play.
For example, when a sound that does not require the vocal folds to vibrate is followed by a sound that does, there is a time lag (of the order of milliseconds) between the two states. “Much like the pickup on your car when you hit the accelerator,” Singh said.
Here’s how it is explained in a paper written by Singh and Eduard Hovy, both of Carnegie Mellon, and Joseph Keshet of Bar Ilan University in Israel: “The term articulometry itself conventionally refers to the measurement of the movements, dimensions and positions of the articulators in the human vocal tract during the process of speech production.”
“That is very characteristic of people,” said Singh, who for two years has been working with federal investigators at Homeland Security on voice-based crimes, such as hoax callers reporting fake threats.
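To make the idea of a voicing time lag concrete, here is a minimal sketch in Python. It is our own toy illustration, not Singh's method: the sample rate, frame length, thresholds, the synthetic test signal and the function names `tone`, `classify` and `voicing_lag_ms` are all assumptions made for the example. It labels short frames as silence, unvoiced (hiss-like, many zero crossings) or voiced (periodic, few zero crossings), then measures the gap between the last unvoiced frame and the first voiced one.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this toy example)
FRAME_MS = 5        # frame length in milliseconds (assumed); also the timing resolution


def tone(freq_hz, dur_ms, amp, phase=0.0):
    """Synthesize a sine tone; used here as a crude stand-in for real speech sounds."""
    n = SAMPLE_RATE * dur_ms // 1000
    return [amp * math.sin(2 * math.pi * freq_hz * k / SAMPLE_RATE + phase)
            for k in range(n)]


def classify(frame):
    """Label one frame as 'silence', 'unvoiced' or 'voiced' using two simple cues."""
    energy = sum(x * x for x in frame) / len(frame)
    if energy < 1e-4:  # hypothetical silence threshold
        return "silence"
    crossings = sum((a < 0) != (b < 0) for a, b in zip(frame, frame[1:]))
    zcr = crossings / (len(frame) - 1)  # zero-crossing rate per sample pair
    return "unvoiced" if zcr > 0.25 else "voiced"  # hypothetical threshold


def voicing_lag_ms(signal):
    """Gap, in milliseconds, between the last unvoiced frame and the first voiced frame."""
    size = SAMPLE_RATE * FRAME_MS // 1000
    labels = [classify(signal[i:i + size])
              for i in range(0, len(signal) - size + 1, size)]
    first_voiced = labels.index("voiced")
    last_unvoiced = max(i for i in range(first_voiced) if labels[i] == "unvoiced")
    return (first_voiced - last_unvoiced - 1) * FRAME_MS


# 40 ms of high-frequency "hiss", a 25 ms pause, then 100 ms of a 120 Hz "voiced" tone.
signal = tone(3000, 40, 0.3, phase=0.3) + [0.0] * 200 + tone(120, 100, 0.5)
print(voicing_lag_ms(signal))  # prints 25
```

Real speech is far messier than this synthetic signal, and a 5-millisecond frame is too coarse for serious measurement. The sketch is only meant to show that such a lag is something a computer can measure.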
Researchers at Carnegie Mellon fragment speech into tiny sections, on the order of fractions of a second. From these they measure many different micro-scale properties of sounds and of transitions between sounds. Singh likened these micro-properties to bar codes. “And if those bar codes match,” she said, “there is a very high probability that it’s the same person.”
“No matter how much you try to change or disguise your voice, many things about your voice are out of your voluntary control, you cannot change them,” Singh said. And so, “matching these helps us determine if someone is impersonating a speaker or not.”
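The "bar code" analogy can be sketched in a few lines of Python. This is a toy illustration under our own assumptions, not the Carnegie Mellon technique: the two features used (energy and zero-crossing rate), the section size, the number of quantization levels and the function names `micro_sections`, `barcode` and `match_fraction` are all hypothetical. It cuts two recordings into short sections, reduces each section to a coarse code, and reports what fraction of the codes line up.

```python
import math


def micro_sections(samples, size):
    """Split a list of samples into consecutive, non-overlapping sections."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def barcode(section, levels=8):
    """Reduce one section to a coarse code: (quantized energy, quantized zero-crossing rate).

    Assumes samples lie in [-1, 1], so both features fall between 0 and 1.
    """
    energy = sum(x * x for x in section) / len(section)
    crossings = sum((a < 0) != (b < 0) for a, b in zip(section, section[1:]))
    zcr = crossings / (len(section) - 1)
    return (min(int(energy * levels), levels - 1),
            min(int(zcr * levels), levels - 1))


def match_fraction(a, b, size=40):
    """Fraction of same-position sections in two recordings whose codes are identical."""
    codes_a = [barcode(s) for s in micro_sections(a, size)]
    codes_b = [barcode(s) for s in micro_sections(b, size)]
    n = min(len(codes_a), len(codes_b))
    if n == 0:
        raise ValueError("recordings are shorter than one section")
    return sum(x == y for x, y in zip(codes_a, codes_b)) / n


# Two synthetic "recordings" at an assumed 8,000 samples per second.
low = [0.5 * math.sin(2 * math.pi * 120 * k / 8000) for k in range(800)]
high = [0.3 * math.sin(2 * math.pi * 3000 * k / 8000 + 0.3) for k in range(800)]

print(match_fraction(low, low))   # prints 1.0
print(match_fraction(low, high))  # prints 0.0
```

A real comparison would have to line up corresponding speech sounds in the two recordings and would rely on much richer measurements than these two. The sketch shows only the general shape of the process: fragment, measure, compare.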
Comparing the speech of “John Miller” on a micro level to that of Trump in the Connie Chung interview showed four matching micro-properties. That’s enough to convince Singh it was Trump in both interviews.
James Baker, who founded Dragon Systems, a pioneer in speech recognition technology, said Singh’s technique is “scientifically valid” and her work is performed with proper scientific method.
“Absolutely it is a quality scientific technique,” Baker said.
Baker said he had also listened to the tape of “John Miller” and, based on voice quality and his 50 years of experience in the speech recognition field, was convinced it was Trump.
“Could you find someone to imitate his voice good enough to fool me? Yes,” Baker said. But you’d have to then assume the alleged PR person was trying to imitate Trump, and, he noted, “No one’s saying that it’s someone trying to be Donald Trump.”
Singh, using her advanced computer-enhanced techniques, would be able to detect such an imitation much more reliably, he said. “[Singh’s] work is absolutely very solid for that,” he said.
Still, he said, no current technology can be certified as 100 percent reliable. The probability of her being wrong is not absolute zero, he said, but “informally” he believes it is less than 1 percent.
Singh has worked in the field of speech and audio processing for 18 years, and has been at Carnegie Mellon since 2009. In the early 2000s, she worked for three years as a visiting scientist in the Spoken Language Systems Group at the Massachusetts Institute of Technology, and in 2007 was the first employee of the Human Language Technology Center of Excellence at Johns Hopkins University. She is the associate editor of the IEEE Signal Processing Letters, a monthly publication “designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing.”
Singh herself cautioned that audio signal processing has not been perfected yet and that the scientific confidence level is not 100 percent. But she said the data and her experience leave her with no doubt.
“I am convinced it is the same person,” Singh said. “There is no doubt in my mind about it.”