Deep Learning: Training the Technology of the Future
A talk by Sayan Pathak, Principal ML Scientist at Microsoft, at the North Star AI conference, powered by Proekspert.
This year’s North Star AI conference in Tallinn invited industry rockstars to present the latest and most radical developments in the field. Among the featured speakers was Sayan Pathak, Principal Machine Learning Scientist at Microsoft, who gave a thorough overview of how the company is utilizing deep learning (DL). Pathak ended his speech by encouraging developers to take the initiative and join the effort to build this technology of the future.
Neural networks are not new, began Pathak. While the technology itself has been around for some time—and many have had big hopes for it—the fact is that we have not been able to train neural nets well enough to do the things that we would like them to do, let alone to have their own minds (as many AI enthusiasts might hope).
Pathak assured his audience that the process of training these systems is not simple and requires a lot of work. That said, in the last few years we have witnessed several innovations that have enabled neural networks to take their first baby steps toward meeting their greater potential. This is the era when we may actually see this technology come to fruition.
The Landscape of Machine Learning
To explain and explore the greater potential of deep learning, Pathak first gave a comprehensive overview of all the places where scalable DL is currently occurring in the industry. He also hinted at how this technology might increasingly infiltrate our daily lives as we move forward.
Examples of where Microsoft Deep Learning Technology is being used:
- Cortana – uses speech-related APIs for transcribing and translating speech in real time.
- Bing – uses deep learning to enhance its advertising platform and tools.
- HoloLens – while use cases for this product are still being formed, there are many ways for augmented reality (AR) to integrate and utilize DL technology.
- Office 2016 – uses deep learning to enhance popular programs such as PowerPoint.
Pathak explained that all of these powerful products have AI components already built into them and are constantly being further enhanced with deep learning capabilities. He also noted that Microsoft’s deep learning tools are being used beyond the Microsoft family of consumer-facing products.
For his first example, Pathak cited the Chinese startup Airdoc, which provides doctors with tools to help them detect and diagnose life-altering diseases such as diabetic retinopathy, a disease that is rampant in a country with over one hundred million diabetics. Using Microsoft’s Cognitive Toolkit together with Azure GPUs, Airdoc was able to cut down on training time for its own deep learning models and ensure data security.
Another example that Pathak pointed to was Microsoft’s speech breakthroughs. With new tools, such as a translator that can handle real-time conversations between people speaking two different languages, the company is revamping favorite products such as Skype and PowerPoint. The Chinese phone maker Huawei has also partnered with Microsoft to utilize these capabilities in newer versions of its phones.
These are just a few of the innovative ways that deep learning technology is being leveraged to radically transform useful tools and platforms. In his talk, Pathak noted that it is important for developers from all backgrounds to take part in further building this technology; there is a need for openness and democratization.
While machine learning has not always been accessible to coders with fewer resources—mainly due to the high energy cost and tremendous amounts of data involved in the training process—Microsoft is actively trying to remedy the issue. In fact, Pathak himself offers a free deep learning course online.
How You Can Contribute to the Industry
After providing background on where machine learning is being used today, Pathak turned his attention to some of the tools that Microsoft offers to developers. “We have a role to play,” he said with regard to the crucial development of neural nets.
For individuals and whole startup teams alike, Microsoft Cognitive Services is an invaluable set of APIs that developers can tap into to complete various tasks related to speech, text, and vision. Instead of building and training models from scratch, coders can take advantage of this resource and start building their own applications right away.
According to Pathak, the Microsoft Cognitive Toolkit (CNTK) provides low-level APIs which offer incredible flexibility. “The distributed framework is one of the easiest ones I have come across,” he said. “I’m biased maybe,” he added with a smirk.
CNTK allows developers to compose arbitrary neural networks into complex computational networks. The toolkit uses graphs to represent neural networks, explained Pathak. Most of these networks have automatic differentiation built into them, which enables the engine to optimize the workflow. With these tools developers can edit the models, clone them, and fine-tune them to fit their applications. A major plus of CNTK is that it offers Lego-like composability for different applications.
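To make the idea of a graph with built-in automatic differentiation more concrete, here is a minimal sketch in plain Python. It is not CNTK code and does not reflect CNTK's API; the `Node` class and `backward` function are hypothetical names invented for this illustration. It only shows how a network expressed as a graph of operations can compute its own gradients.

```python
class Node:
    """One node in a computation graph: a value plus links to its inputs."""

    def __init__(self, value, parents=()):
        self.value = value
        # Each entry is (parent_node, local gradient d(self)/d(parent)).
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))


def backward(output):
    """Reverse-mode differentiation: push gradients from the output to the inputs."""
    order, seen = [], set()

    def visit(node):
        # Post-order walk, so every node is listed after the nodes it depends on.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Walk from the output back toward the inputs, applying the chain rule.
    for node in reversed(order):
        for parent, local_grad in node.parents:
            parent.grad += node.grad * local_grad


# Build the graph for y = w * x + b and differentiate it.
w, x, b = Node(3.0), Node(2.0), Node(1.0)
y = w * x + b
backward(y)
print(y.value, w.grad, x.grad, b.grad)
```

Because every operation records its inputs, small pieces such as `w * x` can be combined into larger graphs without writing any gradient code by hand, which is roughly the kind of composability described above.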
One of the challenges of machine learning, said Pathak, is that a variety of toolkits have to operate on all framework/hardware combinations. Moreover, it is not easy to translate from one framework/platform combination to another. Even in a large company like Microsoft, this means increasing fragmentation across teams. The solution, Pathak argued, is an intermediate representation.
Enter ONNX – the Open Neural Network Exchange, an open format for representing deep learning models, supported by CNTK, PyTorch, Caffe2, and MXNet. Microsoft and Facebook teamed up to create this open ecosystem for AI model interoperability out of the need to homogenize the stack. What’s more, a wide range of partners are already working on this ecosystem.
The strategy, said Pathak, is to drive AI apps and models to be built and run on Microsoft platforms. “From Microsoft’s perspective, the key is to support any popular network. The idea is to provide this IP to the community so that they can build solutions.”
ONNX is an open source intermediate representation of a computation graph with common operators (ops) and semantics. “This is what is driving the AI field for common people who don’t have large access to resources, data, technologies, or software know-how,” said Pathak. As an open source technology, ONNX is opening up a whole realm of opportunities for the community.
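The value of an intermediate representation can be illustrated with a small sketch. The graph layout below is invented for this example and is not the real ONNX schema or API; the `graph` dictionary, the `OPS` table, and the `run` function are all hypothetical. The point is only that once two tools agree on a serialized graph and on what each operator means, a model written by one can be executed by the other.

```python
import json

# A hypothetical, framework-neutral description of y = relu(w * x + b).
# This layout is made up for illustration; it is NOT the real ONNX format.
graph = {
    "inputs": ["x"],
    "constants": {"w": 3.0, "b": -10.0},
    "nodes": [
        {"op": "Mul", "inputs": ["w", "x"], "output": "wx"},
        {"op": "Add", "inputs": ["wx", "b"], "output": "z"},
        {"op": "Relu", "inputs": ["z"], "output": "y"},
    ],
    "outputs": ["y"],
}

# The "exporting" side writes the graph out as plain text.
serialized = json.dumps(graph)

# The "importing" side only needs to agree on the operator semantics.
OPS = {
    "Mul": lambda a, b: a * b,
    "Add": lambda a, b: a + b,
    "Relu": lambda a: max(a, 0.0),
}


def run(serialized_graph, feeds):
    """Evaluate a serialized graph node by node, in the order listed."""
    model = json.loads(serialized_graph)
    values = dict(model["constants"])
    values.update(feeds)
    for node in model["nodes"]:
        args = [values[name] for name in node["inputs"]]
        values[node["output"]] = OPS[node["op"]](*args)
    return [values[name] for name in model["outputs"]]


print(run(serialized, {"x": 5.0}))
print(run(serialized, {"x": 2.0}))
```

Here the runtime knows nothing about how the model was built; it relies only on the shared operator names and their meaning, which is the role the article attributes to ONNX.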
“We are looking for contributions from the community. The more the community participates, the more the AI ecosystem will grow better and democratize AI resources and tools,” Pathak explained as his talk drew to a close. To learn more about these technologies, be sure to visit cntk.ai and onnx.ai.
If you want to listen to Sayan Pathak’s speech, watch the video: