
How Machines Learn: The Top Four Approaches to ML in Business

Machine learning sits at the forefront of innovation across a growing number of industries in today’s business world. Still, it’s a mistake to think of machine learning as one monolithic business solution — there are many forms of machine learning and each is capable of solving different sets of problems. The most popular forms of ML used in business today are supervised, unsupervised, semi-supervised, and reinforcement learning. At Vidora, we’ve used these techniques to help Fortune 500 partners solve some of their most pressing problems in innovative ways. This article draws from our experiences to demystify these four common approaches to ML, introducing practical applications of each technique so that anyone in your organization can recognize how machine learning can enhance your business.

Machine Learning at a Glance

Machine learning is an approach to Artificial Intelligence that borrows principles from computer science and statistics to model relationships in data. Unlike AI systems that distill human knowledge into explicit rules (e.g. expert systems), ML has an algorithm learn for itself by analyzing data. As a rule of thumb, the more relevant data it processes, the more accurate the algorithm becomes.

Machine learning is not a new concept. Its theoretical foundation was laid in the 1950s when Alan Turing conceptualized a “learning machine”. That same decade, Frank Rosenblatt invented the “perceptron” to roughly simulate the learning process of the brain. More algorithms followed, but machine learning remained largely confined to academia until recently. With the explosion in data availability and computational power, it is finally possible for businesses to deploy machine learning at scale. Organizations have had success with each type of learning, but making the right choice for your business problem requires an understanding of the conditions under which each approach works best.

Supervised Learning

If you know which metric you’d like to predict and have examples labeled with that metric, supervised learning is the best approach. A supervised algorithm is shown the “right answer” for a set of sample data and finds a function which approximates the relationship between the inputs and outputs. This functional mapping takes the general form y = f(x) — specify your target output y, provide your inputs x, and the ML algorithm will learn the optimal f() by finding patterns in the data.

y = f(x)

Symbol  Description         Training phase  Live model
y       Output              Supplied        Predicted
x       Input               Supplied        Supplied
f()     Functional mapping  Learned         Used to generate predictions

Supervised learning outputs typically take one of two forms. Classification outputs are discrete categories, such as whether or not a customer will make a purchase. Regression outputs are real-valued numbers that exist in a continuous space. For instance, many of Vidora’s eCommerce customers want to forecast how much money each customer is likely to spend, so that high-value customers can be targeted with personalized promotional offers. A simple linear regression structures this problem through the familiar formula y = mx + b, where y is predicted expenditure and x is some attribute of each customer, say, number of site visits. During training, we supply labeled input-output pairs (that is, customers whose transaction history is already known) and the algorithm finds the parameters m and b that fit this relationship as closely as possible. In reality, Vidora’s regression model is likely to take hundreds of customer attributes as input, each with its own parameter, but the algorithm’s mechanism of action remains the same.
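As a rough illustration of the y = mx + b example above, the following Python sketch fits a line to a handful of labeled customers using ordinary least squares and then uses it to forecast spend for a new customer. The data values and the names fit_line and predict_spend are invented for illustration; this is not Vidora’s model, which the article says uses many more inputs.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b with a single input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    m = cov_xy / var_x
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - m * mean_x
    return m, b


def predict_spend(m, b, x):
    """Live model: x is supplied, y is predicted."""
    return m * x + b


# Training phase: hypothetical labeled examples (both x and y supplied).
site_visits = [1, 2, 3, 4, 5]
dollars_spent = [25.0, 45.0, 65.0, 85.0, 105.0]

m, b = fit_line(site_visits, dollars_spent)
forecast = predict_spend(m, b, 6)
print(f"m={m:.2f}, b={b:.2f}, forecast for 6 visits={forecast:.2f}")
```

Because the toy data lies exactly on a straight line, the fit recovers m = 20 and b = 5. Real customer data is noisy, so the learned line would only approximate the relationship.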

Popular supervised learning algorithms:

Regression:
  • Linear regression
  • Random forest
  • Multi-layer perceptron
  • Convolutional deep neural networks
Classification:
  • Logistic regression
  • Support vector machines
  • Convolutional deep neural networks
  • Naive Bayes
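
Classification, the second family in the list above, predicts a discrete label rather than a number. The sketch below is one minimal way to train a logistic regression with plain gradient descent on a single input. The data (site visits versus whether a purchase was made), the learning rate, and the function names are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Gradient descent on the average log loss for p = sigmoid(w*x + b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def classify(w, b, x, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if sigmoid(w * x + b) >= threshold else 0


# Hypothetical labeled data: site visits and whether the customer purchased.
visits = [1, 2, 3, 4, 6, 7, 8, 9]
purchased = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(visits, purchased)
print(classify(w, b, 2), classify(w, b, 8))
```

The output of a classifier like this is a category (purchase or no purchase), which is what distinguishes it from the regression example earlier in this section.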

Unsupervised Learning

Unsupervised learning is used when training data has no specific label for the algorithm to predict. Without “right answers” to train on, the job of an unsupervised algorithm becomes finding structure in the data, most commonly by clustering similar records together, in order to uncover new rules and patterns. Finding these inherent structures can yield important and practical insights, from detecting data anomalies that mark credit card fraud, to revealing what your best customers have in common.
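
To make the idea of clustering concrete, here is a small k-means sketch in Python. It groups hypothetical customers, described by monthly visits and average order value, into two clusters without ever seeing a label. The data and the starting centroids are assumptions. Practical implementations usually choose starting centroids randomly and rescale the features first.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def mean_point(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(mean_point(members) if members else old)
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Hypothetical customers: (visits per month, average order value in dollars).
customers = [(1, 10), (2, 12), (1, 14), (9, 80), (10, 85), (11, 78)]
start = [(1, 10), (9, 80)]  # assumed starting centroids

centroids, labels = kmeans(customers, start)
print(labels)
print(centroids)
```

No column of “right answers” was supplied. The two groups, occasional low spenders and frequent high spenders, emerge from the data alone.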

Popular unsupervised learning algorithms:

  • K-means clustering
  • Principal component analysis
  • Non-negative matrix factorization
  • Hidden Markov model
  • Hebbian learning
  • Autoencoders

Semi-supervised Learning

At Vidora, we’ve seen that collecting labeled data at scale is a challenge for many organizations, while unlabeled data is relatively abundant. Semi-supervised learning makes use of this plentiful unlabeled data to gain a better understanding of the population structure and distribution. For instance, a bank that offers home loans may wish to identify which of its customers own a house, but may have limited access to this information. Under the semi-supervised approach, an algorithm would first use information obtained from labeled data to predict homeownership for unlabeled data. Next, both the labeled and predicted data are passed through a supervised framework to learn a homeowner identification model. Although the estimated labels are never verified, they may improve the performance of the supervised model by providing a larger set of potential homeowners from which the algorithm can learn.
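
The two-step procedure described above is often called self-training or pseudo-labeling. The Python sketch below shows its shape using a deliberately simple classifier that stores the mean feature value of each class. The single feature (years at current address), the data, and the function names are hypothetical, and a real system would use a stronger model and usually keep only confident predictions.

```python
def fit_class_means(xs, ys):
    """Supervised step: remember the mean feature value of each class."""
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict_label(means, x):
    """Predict the class whose mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))


# Hypothetical feature: years at current address. Label 1 = homeowner.
labeled_x = [1.0, 2.0, 9.0, 10.0]
labeled_y = [0, 0, 1, 1]
unlabeled_x = [0.5, 1.5, 3.0, 7.0, 8.0, 11.0, 12.0]

# Step 1: learn from the small labeled set, then estimate the missing labels.
initial_model = fit_class_means(labeled_x, labeled_y)
pseudo_y = [predict_label(initial_model, x) for x in unlabeled_x]

# Step 2: retrain on the labeled and pseudo-labeled data together.
final_model = fit_class_means(labeled_x + unlabeled_x, labeled_y + pseudo_y)

print(pseudo_y)
print(final_model)
```

The final model is fitted to eleven customers rather than four, although seven of those labels are estimates. Any mistakes in the estimated labels are carried into the final model, which is the main risk of the approach.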

Popular semi-supervised learning algorithms:

  • PU classification
  • Transductive SVM
  • Co-training

Reinforcement Learning

Reinforcement learning is used in situations where the computer is an agent interacting with its environment in pursuit of a goal. Here, feedback is the key ingredient. Rather than being shown a “right answer”, the algorithm is provided a reward signal against which it evaluates and adjusts its methods. With experience, the algorithm learns which sequence of actions gives it the best chance of maximizing its reward and achieving its goal.
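
A toy example may help show how a reward signal replaces the “right answer”. In this Python sketch, an agent in a five-cell corridor learns by tabular Q-learning that walking right leads to the reward in the last cell. The environment, the reward of 1, the learning rate, and the discount factor are all assumptions made for illustration. The agent explores by acting randomly, which is a simplification.

```python
import random

N_STATES = 5           # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)     # index 0 = step left, index 1 = step right
ALPHA, GAMMA = 0.5, 0.9


def step(state, action):
    """Environment: move within the corridor; reward 1 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))  # explore with random actions
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward the reward plus
            # the discounted value of the best action in the next state.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-goal cells: pick the higher-valued action.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

The agent is never told that “right” is correct. It only observes which actions eventually lead to reward, and the learned values end up favoring the sequence of moves that reaches the goal.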

Reinforcement learning typically requires huge amounts of data, but doesn’t force your business to be highly specific about its goals. Some autonomous vehicles learn to drive through reinforcement. These cars are instructed to get from point A to point B under only two broad conditions: obey the rules of the road, and don’t crash. The rest is learned through trial and error. Google’s famed AlphaGo program also used reinforcement to master the ancient Chinese board game Go. After an initial phase of learning from records of human expert games, AlphaGo played against itself with only the game’s rules and a goal of winning, learning which moves tended to maximize its chance of success. Merely two years after making its first move, AlphaGo famously defeated Lee Sedol, one of the world’s strongest Go players, in 2016.

Popular reinforcement learning algorithms:

  • Q-learning
  • Temporal difference
  • Monte Carlo tree search
  • Sarsa

ML and Your Business

Supervised, unsupervised, semi-supervised, and reinforcement learning have each shown meaningful success in the business world. As the practical scope of machine learning broadens, fluency in its key concepts becomes an increasingly important business skill, even for those with no data science experience. Knowing which sorts of problems each ML approach is best equipped to solve helps business experts recognize where the technology can make its greatest contributions to key business outcomes.

Michael Firn is a Product Manager at Vidora, where he works closely with both Vidora’s engineering team and Vidora’s Fortune 500 partners such as News Corp, Walmart and Time to help develop and implement machine learning solutions to their business problems. 



from Gigaom https://gigaom.com/2018/03/23/955138/
