Frank: AI Ain’t So Smart

Assistant professor of information science Allison Koenecke, an author of a recent study that found hallucinations in a speech-to-text transcription tool, works in her office at Cornell University in Ithaca, N.Y., Friday, Feb. 2, 2024. The text preceded by “#Ground truth” shows what was actually said, while the sentences preceded by “text” show how the transcription program interpreted the words. (AP Photo/Seth Wenig)

Russell Frank


Threatened as we are by The Machine, we love it when we can say, “Hah! The Machine is dumber than we are!”

Case in point: I had some transcribing to do. I always have transcribing to do, because in my professional life I am often asking people questions and recording their answers. 

My adventures in interviewing and transcribing began when I studied folklore in graduate school. Part of what we learned in folklore school was the folklore of folklore – stories about our intrepid forebears who lugged reel-to-reel tape recorders as heavy as bowling balls around the southern mountains to record ballads first heard in the British Isles 200 years prior.

By the time I was tromping around with a recording device, reel-to-reel had given way to portable cassette as the fieldwork medium of choice. Many are the hours I then spent stopping and starting the tape while madly scribbling the words I was hearing.  

I considered paying someone to transcribe for me, but doing it myself put me on more intimate terms with the material – which meant that when I sat down to write I could readily recall the most quotable quotes and where to find them. More time on the front end meant less time on the back end. 

And then along came transcription software. Early versions were comically bad, especially if the speaker did not speak Americanese. A New Zealander friend and I took turns testing some long-ago transcribing app. It did fine with my voice. With the Kiwi, though, the app may as well have told us, “Fellas, I have no idea what that guy just said.”

Which brings us to the present moment. Now we record on our phones rather than on tape. And now we can drop audio files onto an online site and let The Machine spit out the transcript.

The first app I tried was dazzlingly fast. At first glance it looked accurate as well. Then I looked closer. Here are some of my favorite howlers from several interviews with Irvin Moore, the 79-year-old Penn State undergraduate who spent 52 years in prison:

  • Me: “So tell me about Rafi”: Rafi is Rockview, the recently closed state prison where Irvin spent the last 26 years of his incarceration. Elsewhere, Rockview appears as Rock Group, Rock Me and Raptor.
  • Dr. Joseph Magic Habits/ Dr. Joseph Magic Campus: Dr. Joseph F. Mazurkiewicz, Rockview’s superintendent from 1970 to 1997.
  • Shortly after Irvin got out of prison at age 75, he got his first driver’s license and his first car. He described the car thus: “It has a Nimby line sticker on this side and a Philadelphia eel stick on this side.” In other words, the car was decorated with a Nittany Lions sticker and a Philadelphia Eagles sticker.
  • In telling me about the brutal police response to a riot at Holmesburg Prison in Philadelphia in 1970, Irvin recalled a photo of Police Commissioner (and future mayor) Frank Rizzo coming to the prison from a formal dinner with “a nice stick” sheathed in his “cucumber bun.” That would be a nightstick tucked into his cummerbund.
  • Irvin: “I understand why terror is such a pope weapon.” Make that a potent weapon.
  • During his early years in prison, many of Irvin’s fellow inmates became followers of a philosophy “created by Denise of Islam.” Who, I wondered in reviewing the transcript, was Denise? Then I listened: It was the Nation of Islam.
  • The transcription software had the most trouble with Graterford, the prison where Irvin spent the first 26 years of his life sentence. Graterford was variously rendered as Greater Ford/ Greater Fork/ Greater For It/ Greta Ford/ Gradle for/ Gratitude/ Great Afford and my favorite, Cradifur.

And so on. To be fair to The Machine – not that one needs to be fair to a machine – many of these garbles are understandable. We humans do not always enunciate as clearly as we might; our words often run together. During an extended stay in France many decades ago, I struggled to hear where one word ended and another began. When I speak to groups of non-native speakers of English, I warn them I’m from New York, City of Fast Talkers, yaknowwhatImean?

Still, the most annoying thing about dealing with artificial intelligence may be its lack of humility. When manual transcribers can’t make out a word or a phrase, they bracket a question mark next to it or maybe an [inaudible]. Then they go back and listen again. And again. 

If that doesn’t work, similar interpolations in the published work will tell the reader that certain words were uncertain or indecipherable.

But no, The Machine just barrels ahead, hurling Magic Habits and Cradifur and all the rest at us like it knows exactly what it’s doing. 

The Machine can do a lot of things that we mere mortals cannot. But it doesn’t know what it doesn’t know, which means it may be artificially intelligent, but it isn’t artificially wise. 
