5 Ways to Employ AI in Your Enterprise Web and Mobile App Strategy Today
Artificial Intelligence (AI) is changing the way users engage with digital products and the way businesses deliver innovative experiences. Getting started with AI can seem daunting for product teams that want to use it in their enterprise applications. Fortunately, there are ways to employ AI in your digital product or service without a steep learning curve or the need to hire machine learning experts. In this article, I will examine the most prominent players in the AI space and provide examples of how product teams can easily integrate AI solutions into their web and mobile applications.
Popular AI applications continue to dominate the headlines: self-driving cars, voice assistants, face recognition, and task automation are some of the technologies receiving their fair share of attention. While these products lean heavily on AI capabilities, facets of the same underlying machine learning technology can also elevate everyday digital products. What makes this possible is the combination of cutting-edge AI capabilities with modern, best-practice APIs, which lets you integrate AI into your enterprise application without deploying your own AI engine or training your own models.
Leveraging Big Tech Company Infrastructure to Build Your Machine Learning Models
Before I jump into specific examples of leveraging AI APIs, I want to cover the offerings of big tech companies. Google, Amazon, Microsoft, and IBM, among others, are currently jockeying to become the primary providers for the world’s AI infrastructure of the future. The most significant power of these services is the ability to train your own machine learning (ML) models, and execute them on their specialized infrastructure. This entails preparing your own data sets and training ML models around them, from which the AI can then base its identification, diagnosis, and decisions. This results in artificial intelligence systems that are designed around your unique business data and able to solve your specific business problems. Once you’re ready to integrate deeper machine learning functionality into your digital product, you will likely start leveraging some of these offerings.
While custom machine learning models are the holy grail of AI implementations, not every business problem requires its own ML model for resolution. For example, if your application needs to separate bananas from oranges, someone else’s model trained to identify different types of fruit will likely suffice. Therefore, you could leverage a pre-trained model as opposed to training your own model. The same logic applies to models built around common data sets, such as voice recognition and sentiment analysis, image and document analysis, unstructured text analysis, and so on. Many companies offer pre-trained cloud-based AI API services that leverage their powerful systems infrastructure while remaining easy to work with.
Five API examples to easily get started with AI in your enterprise application
Here are five APIs that provide a starting point to leverage AI in your application, without the need to train your team on machine learning or bring in AI experts.
1) Image upscaling
If your application incorporates images from different sources, you may have encountered a situation where an image source doesn’t have the necessary resolution or quality to satisfy the needs of your application. This is where an AI upscaling service, such as Let’s Enhance, may be beneficial. Using ML models that have been trained on millions of images, their AI can predict the missing detail in your images and return a version of your image with eight times the resolution of the source. It also removes compression artifacts and can enhance the color of images. The results can feel almost like magic. Their API accepts images from AWS S3, Google Cloud Storage, or any HTTP source out of the box.
2) Content moderation
Any digital product that incorporates user-generated content is susceptible to inappropriate content being added. Traditionally, this would require administrative oversight, moderators, or community self-policing. Fortunately, AI tools, such as the Microsoft Azure Content Moderator, are now available to automate the review of text, images, and video content. The service can detect offensive, profane, and adult content across different content types in more than 100 languages. It includes its own human review tool for cases where the AI engine cannot classify content with strong confidence. These reviews help the AI learn and improve its prediction confidence over time. The Microsoft Azure Content Moderator can be integrated into your applications via Azure’s API or via their SDKs for Python, Node.js, and .NET.
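To make the hand-off between automated moderation and human review more concrete, here is a minimal sketch of how an application might route content based on a confidence score. It does not call Azure or reproduce its response format: the `ModerationResult` class, the `adult_score` field, and both thresholds are hypothetical and would need to be mapped to whatever your moderation service actually returns.

```python
from dataclasses import dataclass

# Hypothetical thresholds; tune them against your own moderation data.
AUTO_REJECT_AT = 0.90
AUTO_APPROVE_BELOW = 0.20


@dataclass
class ModerationResult:
    """Simplified stand-in for a moderation response (not a real service schema)."""
    content_id: str
    adult_score: float  # 0.0 = clearly fine, 1.0 = clearly inappropriate


def route(result: ModerationResult) -> str:
    """Decide what happens to a piece of user-generated content."""
    if result.adult_score >= AUTO_REJECT_AT:
        return "reject"        # the model is confident the content is inappropriate
    if result.adult_score < AUTO_APPROVE_BELOW:
        return "approve"       # the model is confident the content is fine
    return "human_review"      # uncertain: send to a human moderator
```

Only the uncertain middle band reaches a moderator, which is where most of the savings in manual review would likely come from.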
3) Product recognition
Image and facial recognition are among the best-known applications of AI. Fortunately, this is also an area where building your own ML model is relatively easy, which opens up use cases that previously weren’t easily solvable. For example, we’re currently working with an industrial manufacturing client who needs their mobile application to be able to identify their industrial devices. Traditionally, this would need to be done by scanning a barcode or QR code. Now it is possible for a trained AI engine to identify a product simply by pointing the camera at the device. Google’s Cloud AutoML Vision provides a complete toolset for non-AI experts to train ML models with custom image data that can subsequently be used in Google’s Vision API. Upload your product catalog, label images correctly, and AutoML handles the rest. AutoML Vision even allows exporting the models to edge formats, such as TensorFlow Lite and Core ML, which can run on mobile devices directly without needing to engage an API. Because data does not need to be sent back to Google’s servers, image recognition on mobile devices is substantially faster.
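Whichever model you use, the application still has to decide what to do with its predictions. The sketch below assumes the classifier's output has already been converted into a dictionary of label-to-score pairs. The function name, the 0.80 threshold, and the sample labels are all hypothetical, and no model is actually run here.

```python
from typing import Dict, Optional

MIN_CONFIDENCE = 0.80  # hypothetical cut-off; calibrate on your own product images


def identify_product(scores: Dict[str, float],
                     min_confidence: float = MIN_CONFIDENCE) -> Optional[str]:
    """Return the best-scoring product label, or None if the model is unsure."""
    if not scores:
        return None
    # Pick the label with the highest score.
    label, score = max(scores.items(), key=lambda item: item[1])
    return label if score >= min_confidence else None


# Example scores as they might come back from an image classifier (made-up values).
confident = {"pump-a100": 0.93, "valve-b20": 0.05, "sensor-c7": 0.02}
unsure = {"pump-a100": 0.41, "valve-b20": 0.38, "sensor-c7": 0.21}
```

When `identify_product` returns `None`, the app could fall back to the traditional barcode or QR code scan instead of guessing.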
4) Data extraction
Some digital products rely heavily on data collected from sources on the web, but those sources don’t always provide structured data that can be easily ingested. This can result in large manual efforts with “scraping engines” that constantly need to be modified as the content sources evolve. The better solution is to use an AI service that analyzes a website and processes it into structured content that can then be consumed via an API. Diffbot AI can analyze many types of web content and extract structured content. For example, if we needed to extract the post content from a user forum, Diffbot will return the posts on the page in cleanly structured JSON format as if we had just retrieved the content from the website’s database. Diffbot works on articles, discussions, image pages, product pages, and videos.
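As a rough illustration of what consuming such structured output could look like, the sketch below builds a request URL for Diffbot's discussion endpoint and pulls the posts out of a JSON response. The endpoint path and the `objects`, `posts`, `author`, and `text` field names reflect my understanding of Diffbot's v3 API and should be treated as assumptions to verify against their documentation. The sample response is made up, and no network request is sent.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode

# Assumed endpoint for Diffbot's discussion extraction; verify against their docs.
DIFFBOT_DISCUSSION_ENDPOINT = "https://api.diffbot.com/v3/discussion"


def build_request_url(token: str, page_url: str) -> str:
    """Build the GET URL for analyzing a forum page (token and url are query parameters)."""
    return f"{DIFFBOT_DISCUSSION_ENDPOINT}?{urlencode({'token': token, 'url': page_url})}"


def extract_posts(response_body: str) -> List[Dict[str, str]]:
    """Flatten the posts found in a response into simple author/text records."""
    data = json.loads(response_body)
    posts = []
    for obj in data.get("objects", []):
        for post in obj.get("posts", []):
            posts.append({"author": post.get("author"), "text": post.get("text")})
    return posts


# Made-up response in the assumed shape, used here instead of a live API call.
SAMPLE_RESPONSE = json.dumps({
    "objects": [{
        "type": "discussion",
        "posts": [
            {"author": "alice", "text": "First post"},
            {"author": "bob", "text": "A reply"},
        ],
    }]
})

request_url = build_request_url("YOUR_TOKEN", "https://forum.example.com/thread/42")
posts = extract_posts(SAMPLE_RESPONSE)
```

In a real integration you would fetch `request_url` with your HTTP client of choice and pass the response body to `extract_posts`.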
5) Voice control
AI voice assistants have become standard in many households, and users are becoming increasingly comfortable interacting with them on a daily basis. Voice is also becoming a more prevalent input on mobile devices. If you believe that integrating voice control in your digital product could benefit your users, leveraging a Natural Language Processing (NLP) API is the easiest way to get access to those capabilities. Wit.ai provides an NLP API that accepts streaming audio in over 130 languages and returns commands that are tailored to your application’s functionality. Training and improving intent detection are all handled through their user-friendly web interface, avoiding the need to build a complex custom ML model.
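Once an NLP service has turned speech into an intent, your application needs to map that intent to an action. The sketch below shows one way to do that, assuming a response that contains an `intents` list with `name` and `confidence` fields, which is how I understand Wit.ai's responses to be structured. The intent names, handlers, and 0.7 threshold are hypothetical, and the response is hard-coded rather than fetched from the API.

```python
from typing import Callable, Dict

FALLBACK_MESSAGE = "Sorry, I didn't catch that."


def dispatch(nlp_response: dict,
             handlers: Dict[str, Callable[[], str]],
             min_confidence: float = 0.7) -> str:
    """Run the handler for the most confident intent, or fall back."""
    intents = nlp_response.get("intents", [])
    if not intents:
        return FALLBACK_MESSAGE
    # Choose the intent the service is most confident about.
    best = max(intents, key=lambda intent: intent.get("confidence", 0.0))
    handler = handlers.get(best.get("name"))
    if handler is None or best.get("confidence", 0.0) < min_confidence:
        return FALLBACK_MESSAGE
    return handler()


# Hypothetical application commands keyed by intent name.
handlers = {
    "open_orders": lambda: "Opening your orders",
    "start_scan": lambda: "Starting the scanner",
}

# Hard-coded example response in the assumed shape.
sample_response = {
    "text": "show me my orders",
    "intents": [{"name": "open_orders", "confidence": 0.96}],
}

result = dispatch(sample_response, handlers)
```

Keeping the intent-to-handler mapping in one place means new voice commands can be added in the NLP service's web interface and wired up in the app with a single dictionary entry.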
The most popular AI APIs to help you scale your product
Looking beyond these five examples, the big tech companies are routinely launching new APIs that can be leveraged in your enterprise applications. Here are some of the pre-trained AI solutions that Amazon, Microsoft, Google, and IBM offer:
Amazon Web Services
Personalize experiences for your customers with the same recommendation technology used at Amazon.com.
Build accurate forecasting models based on the same machine learning forecasting technology used by Amazon.com.
- Image and Video Analysis
Add image and video analysis to your applications to catalog assets, automate media workflows, and extract meaning. This API can also provide content moderation.
- Advanced Text Analytics
Use natural language processing to extract insights and relationships from unstructured text. The service can identify critical elements in data, including references to language, people, and places, and can detect customer sentiment to improve customer experience in real time.
- Document Analysis
OCR meets AI. Automatically extract printed and hand-written text and data from millions of documents in just hours, reducing manual efforts.
Turn text into lifelike speech to give voice to your applications. You can even create a custom voice that’s unique to your service, or have your webpages automatically converted to audio.
- Conversational Agents
Easily build conversational agents (voice or text chatbots) to improve customer service and increase contact center efficiency.
Expand your reach through efficient and cost-effective translation to reach audiences in multiple languages.
Easily add high-quality speech-to-text capabilities to your applications and workflows, including medical speech.
Microsoft Azure Cognitive Services
- Computer Vision
Image classification, scene and activity recognition in images, celebrity and landmark recognition in images, optical character recognition (OCR) in images.
Face detection in images, person identification, emotion recognition, similar face recognition, and grouping.
- Azure Video Analyzer
Extract actionable insights from videos, whether stored or streaming. The service provides full analysis of a video’s visual and audio channels, including facial, object, and keyframe recognition, OCR, and transcription, plus advanced insights such as topic inference and brand and emotion detection.
- Speech Services
Automatic speech-to-text transcription with customizable models; natural text-to-speech with custom voice fonts; real-time speech translation.
- Text Analytics
Named entity recognition, key phrase extraction, text sentiment analysis.
- Translator Text
Automatic language detection and text translation.
- QnA Maker
QnA extraction from unstructured text, knowledge base creation from collections of Q&As, semantic matching for knowledge bases.
- Immersive Reader
Help users read and comprehend text, with features for readers of all abilities.
- Language Understanding
Contextual language understanding.
- Content Moderator
Detect potentially offensive and unwanted images, filter possible profanity and undesirable text, moderate adult and racy content in videos, and use the built-in review tool for best results.
Deliver rich personalized experiences in your apps; deploy anywhere, from the cloud to the edge; understand and easily manage the reinforcement learning loop.
- Anomaly Detector
Embed time-series anomaly detection capabilities into your apps to help users identify problems quickly. Detect spikes, dips, deviations from cyclic patterns, and trend changes through both univariate and multivariate APIs.
Google Cloud
- Vision AI
Analyze images in the cloud or at the edge. Pre-trained Vision API models quickly classify images into thousands of categories (such as “sailboat” or “Eiffel Tower”) and recognize individual objects, faces, and words.
- Video AI
Precise video analysis, down to the frame. These AI products make your video library more searchable and valuable. Video Intelligence API’s pre-trained models extract metadata, identify key nouns, and annotate video content.
- Natural Language
Multimedia and multi-language processing. Natural Language uses machine learning to reveal the structure and meaning of text. You can extract information about people, places, and events; better understand social media sentiment and call center conversations; and integrate analyzed text with your document archive on Google Cloud Storage. Natural Language API’s pre-trained models deliver language understanding features, including content classification and sentiment, entity, and syntax analysis.
Fast, dynamic translation tailored to your content. With Translation, you can quickly translate between languages, using the best model for your content needs. If you want your website and apps to be able to instantly translate texts, you can use Translation API’s pre-trained neural machine translation to deliver fast, dynamic results for more than one hundred languages.
- Cloud Speech-to-Text API
Speech recognition across 120 languages. Cloud Speech-to-Text enables developers to convert audio to text by applying neural network models in an easy-to-use API. The API recognizes 120 languages and variants to support your global user base. You can enable voice command-and-control, transcribe audio from call centers, and more. It can process real-time streaming or prerecorded audio, using Google’s machine learning technology.
- Cloud Text-to-Speech API
Lifelike text-to-speech interactions. Cloud Text-to-Speech applies DeepMind’s groundbreaking research in WaveNet and Google’s neural networks to enable developers to synthesize natural-sounding speech with 32 voices in multiple languages and variants, at the highest possible fidelity. With this easy-to-use API, you can create lifelike interactions with your users across many applications and devices. This service allows creating unique voices for your business.
Conversational experiences across devices and platforms. This end-to-end development suite lets you build conversational interfaces — chatbots — that use machine learning to conduct rich, natural interactions with your users on websites, mobile apps, messaging platforms, and IoT devices.
- Recommendations AI
Deliver highly personalized product recommendations at scale. Google has spent years delivering recommended content to properties like Google Search, Google Ads, and YouTube. Recommendations AI draws on that knowledge to deliver hyper-personalized recommendations that adapt in real time to customer behavior and product and pricing changes across all channels.
IBM Watson
Watson Assistant is an offering for building conversational interfaces into any application, device, or channel.
Watson Discovery is an AI search technology that eliminates data silos and retrieves information buried inside enterprise data.
Simplify and scale data science to predict and optimize your business outcomes.
- Language Translator
Expand to new markets by instantly translating your documents, apps, and webpages. Create multilingual chatbots to communicate with your customers on their terms.
- Natural Language Classifier
Text classification made easy. Use machine learning to analyze text, and label and organize data into custom categories.
- Natural Language Understanding
Take your understanding of unstructured data to a whole new level with a full suite of advanced text analytics features to extract entities, relationships, keywords, semantic roles and more.
- Speech to Text
Easily convert audio and voice into written text for quick understanding of content.
- Text to Speech
Convert written text into natural-sounding audio in a variety of languages and voices.
- Tone Analyzer
Analyze emotions and tones in what people write online, like tweets or reviews. Predict whether they are happy, sad, confident, and more.
Whether you are just dipping your toes into the world of AI or are ready to scale your digital product using everything these APIs have to offer, product managers now have more fully trained, ready-to-go AI computing power at their fingertips than ever before. If you are designing your product for longevity, also consider how to build a product platform for scale.
Updated May 21, 2021: Removed deprecated offerings, added new ones.