design decisions or racist lack thereof?

Original Post: iPhone Voice Command Is, Um, Racist!

Over on Difference Engines, Lilly wrote:

I don’t get much mileage out of the words racist and sexist because they get used so broadly and differently that they usually just get read as attack but don’t actually explain why the situation is messed up. That said, these kinds of technology examples are exactly the sort of thing that set me down the path of feminist technology obsession that brings you this blog today. In Margolis and Fisher’s “Unlocking the Clubhouse,” I learned about conference phone systems tuned and tested only on male voices, literally giving voices that tended higher the short shrift. Got other examples of this sort of thing?

I am pretty conflicted over (and pretty good at ranting about) these kinds of issues, especially on the gender side, as anyone who’s ever had a couple of beers with me knows. But… over my delicious lunch of Honey Nut Cheerios here in the office, I was pondering this blog post sober-ly…

At first glance, it seems so straightforward. Obviously Apple sucks (what else is new? Here’s a fun recent story from TC: Be warned, Apple has apparently trademarked those shiny chat bubbles). The people who created those unusable-by-women phone systems so many years ago also sucked.

IIRC, Margolis & Fisher use such examples to give an explanation as to why we should care about a lack of women in computing. That is, the systems were designed badly because no women were designing them; clearly if we had more women involved in computing, we’d have better technology (or at least more broadly-usable technology). But, any competent UX/UI designer could’ve seen the same mistake – woman or not. And as much as I’d like to love the M&F argument, the only legitimate reason I can see for caring about the lack of women in computing is that it is a reflection of greater social and cultural pressures that steer women away from computing – it is a reflection of much deeper stereotypes and inequalities that I don’t like. And that’s all I’m going to say here on that.

So, back to the voice recognition on the iPhone. If you didn’t click through, basically the iPhone fails at recognizing ‘Nguyen’ but recognizes more typical “white” names just fine. I wonder if this was a conscious decision and possibly, from a make-this-work-best-for-the-largest-number-of-people perspective, dare I say a good decision? I could be way off, and I’m not one to defend Apple, especially as of late. But… I have also spent some time this summer working as a contract employee at Tellme (a Microsoft subsidiary), and it’s been interesting to see (1) just how good voice recognition is these days (try calling 1-800-555-TELL for an example) and (2) just how far there is to go, and how much the system needs guidance. People are really good at figuring out speech that, without the contextual knowledge that only humans ‘get’, is really not very intelligible.

So, say you’re Apple. You’re designing a voice recognition system that you’re going to use to match what people are saying on iPhones distributed in the U.S. to entries in their address books. Maybe you bias that process towards more common American names, which are also probably white names? You have to bias it towards something or you probably don’t get good results. Maybe the numbers say white names are the way to go for your user base?
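To make the "bias it towards something" idea concrete, here is a toy sketch of what such a bias could look like. I have no idea how Apple's recognizer actually works; this just assumes each contact gets an acoustic match score and that the score is combined with a prior for how common the name is. The function name `pick_contact`, the `prior_weight` knob, and every number below are made up for illustration.

```python
import math

def pick_contact(acoustic_scores, name_priors, prior_weight=1.0):
    """Pick the contact with the best combined score.

    acoustic_scores: name -> how well the audio matched (hypothetical, in (0, 1])
    name_priors:     name -> assumed frequency of that name in the user base
    prior_weight:    0.0 ignores the prior; larger values trust it more
    """
    best_name, best_score = None, -math.inf
    for name, acoustic in acoustic_scores.items():
        prior = name_priors.get(name, 1e-6)  # tiny floor for unseen names
        score = math.log(acoustic) + prior_weight * math.log(prior)
        if score > best_score:
            best_name, best_score = name, score
    return best_name

# Made-up numbers: the audio slightly favors "Nguyen",
# but the assumed prior says "Newman" is five times more common.
acoustic_scores = {"Nguyen": 0.5, "Newman": 0.4}
name_priors = {"Nguyen": 0.002, "Newman": 0.01}

unbiased = pick_contact(acoustic_scores, name_priors, prior_weight=0.0)
biased = pick_contact(acoustic_scores, name_priors, prior_weight=1.0)
print(unbiased, biased)
```

With the prior switched off, the slightly better acoustic match wins; with it switched on, the more common name wins even though the audio pointed the other way. That is the trade-off in miniature: the same prior that rescues mumbled common names can sink a clearly spoken uncommon one.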

It’s also certainly possible that this failure is not by design, but by ignorance and/or poor programming. That is, maybe it’s possible to do a better job and Apple didn’t because they didn’t think about non-English names, which sucks way more than the purposeful case IMO. Maybe that means Apple needs to hire some more competent UX people, and to involve them at a deeper level of design (e.g. speech engine technical details) than just making shiny chat bubbles.

But, I guess my point was, I don’t know the way that this system works, and so not only am I not comfortable playing the race card here, but I might even consider arguing that even if the system is purposefully ‘racist,’ being racist is better for the majority of users?

I don’t know what trade-offs would be involved with recognizing Nguyen better, and in technology design, I’ve learned that there are always trade-offs. If you’re in the minority, you probably get shafted more frequently than the rest of us. No doubt, that sucks for you, but if you’re a company trying to please the largest number of people possible in order to turn the most profit with the least amount of effort/cost, individualized apps likely won’t happen, and you make decisions for all based on what’s best for the majority, morals & ethics aside. Companies put morals and ethics aside all the time in a capitalist economy. Because whatever value you have – e.g. equality – the monetary value is greater in a capitalist system.

Money and capitalism aside, as computing designers, we make decisions all the time – I’m going to shaft elderly people by making the text in my Twitter application really tiny because I think mostly younger people with better eyes are going to be using it. I’m going to shaft people with non-English names in my voice dialer because I think mostly people with English names are going to be using it. It’s not really as obnoxious as it seems.

And I wonder if the iPhone distributed in Japan recognizes a different set of names better and completely fails on things that my American English self thinks are totally easy to distinguish – Matt, Mark, Luke, John?

Hell, perhaps at a technical level (I do not know or pretend to know the math behind voice processing), Nguyen is simply *hard*? Maybe this is because the math that’s been developed to process such audio was done primarily by people named Chris, and it hasn’t even occurred to them that there are radically different equations they could use and filters they could design. (That’s back to the M&F argument, but as a counterexample, I’d point out off the top of my head that SpinVox, based in the UK, is the worst at recognizing UK speech. In theory, it’s because British accents are way harder than American accents to understand, although SpinVox is full of disgrace lately and might just have bad programmers… or no programmers.)

I don’t really know… but for all their hype, I’d rather have a voice-dialer that has me record each person’s name and then matches what I say to what I’ve previously recorded, rather than a futuristic voice-dialer that craps out if I have contacts with ‘abnormal’ names ;)
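For the curious, the record-and-match dialer can be sketched in a few lines. This is only an illustration under big assumptions: each recording is pretended to be a short list of numbers (a real system would use frames of audio features), and the comparison uses dynamic time warping, a classic technique for comparing sequences spoken at different speeds. The names `dtw_distance` and `match_recording` and all the data are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # step both
    return cost[n][m]

def match_recording(utterance, templates):
    """Return the contact whose stored recording is closest to the utterance."""
    return min(templates, key=lambda name: dtw_distance(utterance, templates[name]))

# Made-up "recordings": one stored template per contact.
templates = {
    "Nguyen": [1.0, 3.0, 4.0, 2.0],
    "Mark": [5.0, 5.0, 1.0, 0.0],
}
# The same shape as the "Nguyen" template, just said more slowly.
utterance = [1.0, 1.0, 3.0, 4.0, 4.0, 2.0]

best = match_recording(utterance, templates)
print(best)
```

The point of this design is that nothing in it knows or cares what language the name comes from: it only compares my voice to my own earlier recording of that contact.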

P.S. At first I titled this “racist design decisions or lack thereof” but, that was wrong. So, as a nice little summary:

Any design decision is inherently a *decision* among multiple options. In the case of language, such decisions are likely to appear ‘racist’ because they are *linguistic*. What would really bother me more is if nobody considered testing it with non-“American” names. That would be sad, and perhaps I’d be more comfortable with the baggage that comes with ‘racist’ being tied to such a lack of decision. But, if the speech recognition was designed for a certain user base and linguistic trade-offs had to be made, either Apple got it right or wrong. I’d imagine that for the demographics of iPhone users in the U.S., they might’ve gotten it right.

