Google opens access to its speech recognition API

Google is planning to compete head on with Nuance and other voice recognition companies by opening up its speech recognition API to third-party developers. To attract developers, the API will be free at launch, with pricing to be introduced at a later date.

We’d been hearing murmurs about this service for weeks. The company formally announced it today during its NEXT cloud user conference, where it also unveiled a raft of other machine learning developments and updates, most significantly a new machine learning platform.

The Google Cloud Speech API, which will cover more than 80 languages and work with any application in real-time streaming or batch mode, will offer a full set of APIs for applications to “see, hear and translate,” Google says. It is based on the same neural network technology that powers voice search in the Google app and voice typing in Google Keyboard, and it is designed to work even in noisy environments.
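To make the batch mode concrete, here is a minimal sketch of how a client might assemble a request body for the API. The field names (`encoding`, `sampleRateHertz`, `languageCode`) follow Google's usual REST conventions for speech recognition, but treat the exact parameters as assumptions rather than the official contract, which was not published in this article.

```javascript
// Hypothetical sketch: build a batch-mode request body for a cloud
// speech recognition call. Field names are assumptions modeled on
// Google's REST conventions, not a documented contract.
function buildRecognizeRequest(audioBase64, languageCode) {
  return {
    config: {
      encoding: 'LINEAR16',        // raw 16-bit PCM audio
      sampleRateHertz: 16000,      // sample rate of the submitted audio
      languageCode: languageCode,  // one of the 80+ supported languages
    },
    audio: { content: audioBase64 }, // base64-encoded audio payload
  };
}

// Example: build a request for English audio.
const body = buildRecognizeRequest('UklGRg==', 'en-US');
console.log(body.config.languageCode); // "en-US"
```

The same shape would be reused for every language the API covers; only `languageCode` changes, which is what makes an 80-plus-language service practical to expose behind one endpoint.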

Google’s move will have a large impact on the industry as a whole — and particularly on Nuance, the company long regarded as offering the best voice recognition capabilities in the business, and certainly the biggest provider of such services. A number of Nuance customers, including startups, could defect to Google’s technology, which not only promises an improved experience over current providers but will also be available at a lower cost.


To attract developer interest initially, the API will be completely free to use. Over time it will become a paid service, though we understand the pricing tiers will likely be low-cost. Google may choose to raise those prices once it becomes the dominant player in the industry.

Google has offered limited access to its voice technology in its products to date. Developers can make JavaScript calls to the Chrome API, for example, which are then routed to the speech recognition API. And Google announced a Voice Interaction API at Google I/O in 2015, which allows Android developers to add voice interactions to their apps. But Google had yet to open up access to the speech recognition API directly.

The introduction of the speech API won’t only impact Nuance and other speech recognition providers; it is also being seen as an attack on Apple, whose virtual assistant Siri has voice recognition capabilities that pale in comparison to Google’s. Apple has also yet to offer an API that lets developers use Siri’s technology in their own apps.

There have been hints that Google would put a greater emphasis on its voice technology and its many use cases. In February, for example, the company announced that it would let Google Docs users edit and format their documents by voice.

 
