Page 16 - Fortune-November 01, 2018

installed on nearly all new phones. Amazon, in contrast, needs to get consumers to install and then open the Alexa app on their iPhones or Android devices. “The extra step to open the Alexa voice app puts Amazon at a distinct disadvantage,” says Loup’s Munster, formerly a Wall Street analyst of computer companies. By contrast, all that’s required to activate Siri and the Google Assistant is to say their names.

That said, iOS and Android are open to third-party developers of all stripes, and Amazon is one of them, meaning that nothing is stopping developers on both platforms from writing Alexa programs. Bezos bragged in an earnings release earlier this year that “tens of thousands of developers across more than 150 countries” are building Alexa apps and incorporating them into non-Amazon devices. Indeed, partnerships are a key battleground for voice applications. Alexa is built into “soundbars” from Sonos, headphones from Jabra, and cars from BMW, Ford, and Toyota. Google boasts integrations with audio equipment makers Sony and Bang & Olufsen, August smart locks, and Philips LED lighting systems, and Apple has partnerships that allow its HomePod to work with First Alert Security systems and Honeywell smart thermostats. “The beauty of these partnerships,” says Google’s Fox, “is that they allow us to link voice into the whole smart-appliance ecosystem. I don’t have to open my phone and go to an app. I can just say to the device, ‘Show me who’s at my front door,’ and it will pop right up. It’s simplifying by unifying.”

NICK FOX, GOOGLE: “Every once in a while there is a tectonic shift in technology, and we think voice is one of those.”

Artificial intelligence has long been a staple of dystopian popular culture, notably from films such as The Terminator and The Matrix, where wickedly clever machines rise up and pose a threat to humankind. Thankfully, we’re not there yet, but advances in A.I. and the availability of cheap computing have made impressively futuristic applications a reality. Early voice-recognition programs were only as good as the programmers who wrote them. Now these apps keep getting better because they are connected through the Internet to data centers. These complex mathematical models sift through huge amounts of data that companies have spent years compiling and learn to recognize different speech patterns. They can recognize vocabulary, regional accents, colloquialisms, and the context of conversations by analyzing, for example, recordings of call-center agents talking with customers or interactions with a digital assistant.

Voice-recognition systems rely as much on physics as on computer science. Speech creates vibrations in the air, which voice engines pick up as analog sound waves and then translate into a digital format. Computers can then analyze that digital data for meaning. Artificial intelligence turbocharges the process by first figuring out whether the sound is directed toward its systems by detecting a customer-chosen “wake word” such as “Alexa.” Then they use machine-learning models trained by what millions of other customers have said to them before to make highly accurate guesses as to what was said. “A voice-recognition system first recognizes the sound, and then it puts the words in context,” explains Johan Schalkwyk, an engineering vice president for the Google Assistant. “If I say, ‘What’s the weather in…,’ the A.I. knows that the next word is a country or a city. We have a 5-million-word English vocabulary in our database, and to recognize one word out of 5 million without context is a super hard problem. If the A.I. knows you’re asking about a city, then it’s only a one-in-30,000 task, which is much easier to get right.”

Computing power allows the systems multiple opportunities to learn. In order to ask Alexa to turn on the microwave (a real example), the voice engine first needs to understand the command. That means learning to decipher thick Southern accents (“MAH-cruhwave”), high-pitched kids’ voices, non-native speakers, and so on, while at the same time filtering out background noise like song lyrics playing on the radio. It then has to understand the many ways people might ask to use the microwave: “Reheat my food,” “Turn on my microwave,” “Nuke the food for two minutes.” Alexa and other voice assistants match questions with similar commands in the database, thereby “learning” that “reheat my food” is how a particular user is likely to ask in the future.

ROHIT PRASAD, AMAZON: “We wanted to remove friction for our customers, and the most natural means was voice. It’s not merely a search engine with a bunch of results that says, ‘Choose one.’ It tells you the answer.”

The technology has taken off in part because it has gotten so proficient at translating human commands into action. Google’s Schalkwyk says his company’s voice engine now responds with 95% accuracy, up from only 80% in