Data Analytics Vs AI Vs Machine & Deep Learning

Data Analytics – Data Science, Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) are closely interconnected. The Venn diagram shown below visualizes the overlapping AI-related terminology.

We will explore each of the following terms in detail, one by one:

Artificial Intelligence

Artificial intelligence, or AI for short, has been around since the mid-1950s, so it’s not necessarily new. But it became super popular recently because of the advancements in processing capabilities. For most of the 20th century, there just wasn’t the necessary computing power to realize AI. Today, we have some of the fastest computers the world has ever seen. And the algorithm implementations have improved so much that we can run them on commodity hardware, even the laptop or smartphone that you’re using to read this right now. And given the seemingly endless possibilities of AI, everybody wants a piece of it.

But what exactly is artificial intelligence? Artificial intelligence is the ability that can be imparted to computers which enables these machines to understand data, learn from the data, and make decisions based on patterns hidden in the data, or inferences that could otherwise be very difficult (to almost impossible) for humans to make manually. AI also enables machines to adjust their “knowledge” based on new inputs that were not part of the data used for training these machines.

Another way of defining AI is that it’s a collection of mathematical algorithms that make computers understand relationships between different types and pieces of data such that this knowledge of connections could be utilized to come to conclusions or make decisions that could be accurate to a very high degree.

But there’s one thing you need to make sure of: that you have enough data for AI to learn from. If you train your AI model on a very small data set, the accuracy of the prediction or decision could be low. The more data you have, the better the training of the AI model and the more accurate the outcome. Depending on the size of your training data, you can choose various algorithms for your model. This is where machine learning and deep learning start to show up.

In the early days of AI, neural networks were all the rage. There were multiple groups of people across the globe working on bettering their neural networks. But as I mentioned earlier in the post, the limitations of the computing hardware hindered the advancement of AI. From the late 1980s all the way up to the 2010s, machine learning took center stage. Every major tech company was investing heavily in machine learning. Companies such as Google, Amazon, IBM and Facebook were virtually dragging AI and ML PhDs straight from universities. But these days, even machine learning has taken a back seat. It’s all about deep learning now. There’s definitely been an evolution of AI in the last few decades, and it’s getting better with every passing year. You can visualize this evolution from the image below.

Artificial Neural Network

In information technology (IT), an artificial neural network (ANN) is a system of hardware and/or software patterned after the operation of neurons in the human brain. ANNs, also called simply neural networks, are a variety of deep learning technology, which also falls under the umbrella of artificial intelligence, or AI.

Commercial applications of these technologies generally focus on solving complex signal processing or pattern recognition problems. Examples of significant commercial applications since 2000 include handwriting recognition for check processing, speech-to-text transcription, oil-exploration data analysis, weather prediction and facial recognition.

How artificial neural networks work

An ANN usually involves a large number of processors operating in parallel and arranged in tiers. The first tier receives the raw input information, analogous to optic nerves in human visual processing. Each successive tier receives the output from the tier preceding it, rather than the raw input, in the same way neurons further from the optic nerve receive signals from those closer to it. The last tier produces the output of the system.

Each processing node has its own small sphere of knowledge, including what it has seen and any rules it was originally programmed with or developed for itself. The tiers are highly interconnected, which means each node in tier n will be connected to many nodes in tier n-1, which supply its inputs, and to many nodes in tier n+1, for which it provides input data. There may be one or multiple nodes in the output layer, from which the answer it produces can be read.

Artificial neural networks are notable for being adaptive, which means they modify themselves as they learn from initial training and as subsequent runs provide more information about the world. The most basic learning model is centered on weighting the input streams, which is how each node weights the importance of input data from each of its predecessors. Inputs that contribute to getting right answers are weighted higher.
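
As a rough illustration of the tiered, weighted structure described above, here is a minimal sketch of a forward pass in plain Python. The network shape, the weights, the biases and the choice of a sigmoid activation are all assumptions made for this example; they are not taken from any particular system.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, tiers):
    """Pass the inputs through each tier in turn.

    `tiers` is a list of tiers, each tier is a list of nodes, and each
    node is a (weights, bias) pair with one weight per incoming value.
    """
    activations = inputs
    for tier in tiers:
        # Every node weights the outputs of the previous tier and sums them.
        activations = [
            sigmoid(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in tier
        ]
    return activations

# Hypothetical network: 2 inputs, one hidden tier of 2 nodes, 1 output node.
network = [
    [([0.5, -0.4], 0.1), ([0.3, 0.8], -0.2)],  # hidden tier
    [([1.2, -0.7], 0.0)],                      # output tier
]
output = forward([1.0, 0.0], network)
print(output)
```

Each tier only ever sees the output of the tier before it, which mirrors the description of successive tiers above.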

How neural networks learn

Typically, an ANN is initially trained or fed large amounts of data. Training consists of providing input and telling the network what the output should be. For example, to build a network that identifies the faces of actors, the initial training might be a series of pictures, including actors, non-actors, masks, statuary and animal faces. Each input is accompanied by the matching identification, such as actors’ names, “not actor” or “not human” information. Providing the answers allows the model to adjust its internal weightings to learn how to do its job better.

For example, if nodes David, Dianne and Dakota tell node Ernie the current input image is a picture of Brad Pitt, but node Durango says it is Betty White, and the training program confirms it is Pitt, Ernie will decrease the weight it assigns to Durango’s input and increase the weight it gives to that of David, Dianne and Dakota.
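
The Ernie example can be sketched in a few lines of Python. This is a deliberately simplified rule written for illustration: the starting weights, the learning rate and the function name `update_weights` are assumptions, and real networks usually adjust weights with gradient-based methods rather than a fixed step.

```python
def update_weights(weights, votes, correct_label, learning_rate=0.1):
    """Raise the weight of each input that voted for the correct label
    and lower the weight of each input that voted for anything else."""
    updated = {}
    for node, weight in weights.items():
        if votes[node] == correct_label:
            updated[node] = weight + learning_rate
        else:
            updated[node] = weight - learning_rate
    return updated

# Ernie starts out trusting all four predecessor nodes equally (assumed values).
weights = {"David": 0.5, "Dianne": 0.5, "Dakota": 0.5, "Durango": 0.5}
votes = {
    "David": "Brad Pitt",
    "Dianne": "Brad Pitt",
    "Dakota": "Brad Pitt",
    "Durango": "Betty White",
}
new_weights = update_weights(weights, votes, correct_label="Brad Pitt")
print(new_weights)
```

After the update, Durango's weight is lower than before and the other three are higher, which is the adjustment the example describes.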

In defining the rules and making determinations (that is, the decision of each node on what to send to the next tier based on inputs from the previous tier), neural networks use several principles. These include gradient-based training, fuzzy logic, genetic algorithms and Bayesian methods. They may be given some basic rules about object relationships in the space being modeled.

For example, a facial recognition system might be instructed, “Eyebrows are found above eyes,” or, “Moustaches are below a nose. Moustaches are above and/or beside a mouth.” Preloading rules can make training faster and make the model more powerful sooner. But it also builds in assumptions about the nature of the problem space, which may prove to be either irrelevant and unhelpful or incorrect and counterproductive, making the decision about what, if any, rules to build in very important.

Further, the assumptions people make when training algorithms cause neural networks to amplify cultural biases. Biased data sets are an ongoing challenge in training systems that find answers on their own by recognizing patterns in data. If the data feeding the algorithm isn’t neutral, and almost no data is, the machine propagates bias.

Types of neural networks

Neural networks are sometimes described in terms of their depth, including how many layers they have between input and output, or the model’s so-called hidden layers. This is why the term neural network is used almost synonymously with deep learning. They can also be described by the number of hidden nodes the model has or in terms of how many inputs and outputs each node has. Variations on the classic neural network design allow various forms of forward and backward propagation of information among tiers.

Specific types of artificial neural networks include:

  • Feed-forward neural networks
  • Recurrent neural networks
  • Convolutional neural networks
  • Deconvolutional neural networks
  • Modular neural networks

Feed-forward neural networks are one of the simplest variants of neural networks. They pass information in one direction, starting from the input nodes, until it reaches the output node. The network may or may not have hidden node layers, and having few or none makes its functioning more interpretable. It is able to process large amounts of noise. This type of ANN computational model is used in technologies such as facial recognition and computer vision.

Recurrent neural networks (RNN) are more complex. They save the output of processing nodes and feed the result back into the model. This is how the model is said to learn to predict the outcome of a layer. Each node in the RNN model acts as a memory cell, continuing the computation and implementation of operations. This neural network starts with the same forward propagation as a feed-forward network, but then goes on to remember all processed information in order to reuse it in the future. If the network’s prediction is incorrect, then the system self-learns and continues working towards the correct prediction during backpropagation. This type of ANN is frequently used in text-to-speech conversions.
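
To make the "memory cell" idea concrete, here is a minimal sketch of a single recurrent unit in Python. The weights are hypothetical fixed values and no training happens here; the point is only to show the previous state being fed back in alongside each new input.

```python
import math

def rnn_step(x, h_prev, w_in, w_rec, bias):
    """One step of a single-unit recurrent cell: the new state mixes
    the current input with the state left over from the previous step."""
    return math.tanh(w_in * x + w_rec * h_prev + bias)

def run_rnn(sequence, w_in=0.8, w_rec=0.5, bias=0.0):
    """Feed a sequence through the cell and record the state after each step."""
    h = 0.0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_in, w_rec, bias)
        states.append(h)
    return states

# Only the first input is non-zero, yet later states are still non-zero.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

The second and third inputs are zero, but the state stays positive and fades gradually, because each step reuses what the cell remembered from the step before.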

Convolutional neural networks (CNN) are one of the most popular models used today. This neural network computational model uses a variation of multilayer perceptrons and contains one or more convolutional layers that can be either entirely connected or pooled. These convolutional layers create feature maps that record a region of the image, which is ultimately broken into rectangles and sent out for nonlinear processing. The CNN model is particularly popular in the realm of image recognition; it has been used in many of the most advanced applications of AI, including facial recognition, text digitization and natural language processing. Other uses include paraphrase detection, signal processing and image classification.
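
The feature map and pooling steps can be sketched in plain Python. The 5x5 image and the edge-detecting kernel below are made up for illustration, and a real CNN would learn its kernel values during training rather than have them written by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the
    feature map. Like most CNN libraries, this does not flip the kernel."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# Hypothetical image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright steps

feature_map = convolve2d(image, edge_kernel)
pooled = max_pool(feature_map)
print(feature_map)  # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)       # [[2, 0], [2, 0]]
```

The feature map lights up only in the column where the image changes from dark to bright, and pooling shrinks the map while keeping that signal.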

Deconvolutional neural networks utilize a reversed CNN model process. They aim to find lost features or signals that may have originally been considered unimportant to the CNN system’s task. This network model can be used in image synthesis and analysis.

Modular neural networks contain multiple neural networks working separately from one another. The networks do not communicate or interfere with each other’s activities during the computation process. Consequently, complex or big computational processes can be performed more efficiently.

Advantages of artificial neural networks

Advantages of artificial neural networks include:

  • Parallel processing abilities mean the network can perform more than one job at a time.
  • Information is stored on an entire network, not just a database.
  • The ability to learn and model nonlinear, complex relationships helps model the real life relationships between input and output.
  • Fault tolerance means the corruption of one or more cells of the ANN will not stop the generation of output.
  • Gradual corruption means the network will slowly degrade over time, instead of a problem destroying the network instantly.
  • The ability to produce output with incomplete knowledge with the loss of performance being based on how important the missing information is.
  • No restrictions are placed on the input variables, such as how they should be distributed.
  • Machine learning means the ANN can learn from events and make decisions based on the observations.
  • The ability to learn hidden relationships in the data without commanding any fixed relationship means an ANN can better model highly volatile data and non-constant variance.
  • The ability to generalize and infer unseen relationships on unseen data means ANNs can predict the output of unseen data.

Disadvantages of artificial neural networks

The disadvantages of ANNs include:

  • The lack of rules for determining the proper network structure means the appropriate artificial neural network architecture can only be found through trial and error and experience.
  • The requirement of processors with parallel processing abilities makes neural networks hardware dependent.
  • The network works with numerical information, therefore all problems must be translated into numerical values before they can be presented to the ANN.
  • The lack of explanation behind probing solutions is one of the biggest disadvantages in ANNs. The inability to explain the why or how behind the solution generates a lack of trust in the network.

Applications of artificial neural networks

Image recognition was one of the first areas to which neural networks were successfully applied, but the technology’s uses have expanded to many more areas, including:

  • Chatbots
  • Natural language processing, translation and language generation
  • Stock market prediction
  • Delivery driver route planning and optimization
  • Drug discovery and development

These are just a few specific areas to which neural networks are being applied today. Prime uses involve any process that operates according to strict rules or patterns and has large amounts of data. If the data involved is too large for a human to make sense of in a reasonable amount of time, the process is likely a prime candidate for automation through artificial neural networks.

History of neural networks

The history of artificial neural networks goes back to the early days of computing. In 1943, mathematicians Warren McCulloch and Walter Pitts built a circuitry system intended to approximate the functioning of the human brain that ran simple algorithms.

In 1957, Cornell University researcher Frank Rosenblatt developed the perceptron, an algorithm designed to perform advanced pattern recognition, ultimately building toward the ability for machines to recognize objects in images. But the perceptron failed to deliver on its promise, and during the 1960s, artificial neural network research fell off.

In 1969, MIT researchers Marvin Minsky and Seymour Papert published the book Perceptrons, which spelled out several issues with neural networks, including the fact that computers of the day were too limited in their computing power to process the data needed for neural networks to operate as intended. Many feel this book led to a prolonged “AI winter” in which research into neural networks stopped.

It wasn’t until around 2010 that research picked up again. The big data trend, where companies amass vast troves of data, and parallel computing gave data scientists the training data and computing resources needed to run complex artificial neural networks. In 2012, a deep neural network won the ImageNet image recognition competition by a wide margin over competing approaches. Since then, interest in artificial neural networks has soared and the technology continues to improve.

Machine Learning

Machine learning (ML) is a type of artificial intelligence (AI) that allows software applications to become more accurate at predicting outcomes without being explicitly programmed to do so. Machine learning algorithms use historical data as input to predict new output values.

Recommendation engines are a common use case for machine learning. Other popular uses include fraud detection, spam filtering, malware threat detection, business process automation (BPA) and predictive maintenance.

Types of machine learning

Classical machine learning is often categorized by how an algorithm learns to become more accurate in its predictions. There are four basic approaches: supervised learning, unsupervised learning, semi-supervised learning and reinforcement learning. The type of algorithm a data scientist chooses to use depends on what type of data they want to predict.

  • Supervised learning. In this type of machine learning, data scientists supply algorithms with labeled training data and define the variables they want the algorithm to assess for correlations. Both the input and the output of the algorithm are specified.
  • Unsupervised learning. This type of machine learning involves algorithms that train on unlabeled data. The algorithm scans through data sets looking for any meaningful connection. Both the data algorithms train on and the predictions or recommendations they output are predetermined.
  • Semi-supervised learning. This approach to machine learning involves a mix of the two preceding types. Data scientists may feed an algorithm a small amount of labeled training data alongside unlabeled data, and the model is free to explore the data on its own and develop its own understanding of the data set.
  • Reinforcement learning. Reinforcement learning is typically used to teach a machine to complete a multi-step process for which there are clearly defined rules. Data scientists program an algorithm to complete a task and give it positive or negative cues as it works out how to complete a task. But for the most part, the algorithm decides on its own what steps to take along the way.

How supervised machine learning works

Supervised machine learning requires the data scientist to train the algorithm with both labeled inputs and desired outputs. Supervised learning algorithms are good for the following tasks:

  • Binary classification. Dividing data into two categories.
  • Multi-class classification. Choosing between more than two types of answers.
  • Regression modeling. Predicting continuous values.
  • Ensembling. Combining the predictions of multiple machine learning models to produce an accurate prediction.
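
As a small illustration of binary classification with labeled inputs and desired outputs, here is a sketch of a nearest-centroid classifier in plain Python. The feature choice (links and exclamation marks per message) and the data are invented for this example, and nearest-centroid is just one simple supervised method among many.

```python
def train_centroids(samples, labels):
    """Average the labeled feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(centroids, features):
    """Return the label whose class average is closest to `features`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], features))

# Hypothetical labeled data: [links in message, exclamation marks] -> label
samples = [[0, 1], [1, 0], [7, 9], [8, 6]]
labels = ["ham", "ham", "spam", "spam"]

model = train_centroids(samples, labels)
print(predict(model, [6, 8]))  # spam
print(predict(model, [1, 1]))  # ham
```

The labels supplied during training are what make this supervised: the algorithm never has to discover on its own that two groups exist.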

How unsupervised machine learning works

Unsupervised machine learning algorithms do not require data to be labeled. They sift through unlabeled data to look for patterns that can be used to group data points into subsets. Some deep learning models, including certain kinds of neural networks, can also be trained as unsupervised algorithms. Unsupervised learning algorithms are good for the following tasks:

  • Clustering. Splitting the data set into groups based on similarity.
  • Anomaly detection. Identifying unusual data points in a data set.
  • Association mining. Identifying sets of items in a data set that frequently occur together.
  • Dimensionality reduction. Reducing the number of variables in a data set.
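
Anomaly detection is the easiest of these tasks to sketch without labels. The example below flags values that sit far from the mean, measured in standard deviations; the sensor readings and the threshold of 2.0 are assumptions chosen for illustration.

```python
import statistics

def find_anomalies(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations
    from the mean of the data set."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # every value is identical, so nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical sensor readings with one obvious outlier.
readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
anomalies = find_anomalies(readings)
print(anomalies)  # [50]
```

No reading was ever labeled "normal" or "unusual"; the outlier is identified purely from how the data is distributed.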

How semi-supervised learning works

Semi-supervised learning works by data scientists feeding a small amount of labeled training data to an algorithm. From this, the algorithm learns the dimensions of the data set, which it can then apply to new, unlabeled data. The performance of algorithms typically improves when they train on labeled data sets. But labeling data can be time-consuming and expensive. Semi-supervised learning strikes a middle ground between the performance of supervised learning and the efficiency of unsupervised learning. Some areas where semi-supervised learning is used include:

  • Machine translation. Teaching algorithms to translate language based on less than a full dictionary of words.
  • Fraud detection. Identifying cases of fraud when you only have a few positive examples.
  • Labeling data. Algorithms trained on small data sets can learn to apply data labels to larger sets automatically.
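
The "labeling data" use case can be sketched as a simple self-training loop: start from a couple of labeled points and repeatedly give the closest unlabeled point its nearest labeled neighbor's label. The one-dimensional data and the function name `self_train` are assumptions for this example, and practical systems use more careful confidence checks.

```python
def self_train(labeled, unlabeled):
    """Repeatedly label the unlabeled point that is nearest to any
    already-labeled point, copying that neighbor's label."""
    labeled = list(labeled)      # (value, label) pairs
    remaining = list(unlabeled)  # bare values
    while remaining:
        # Find the closest (unlabeled point, labeled point) pair.
        point, (_, label) = min(
            ((u, l) for u in remaining for l in labeled),
            key=lambda pair: abs(pair[0] - pair[1][0]),
        )
        labeled.append((point, label))
        remaining.remove(point)
    return dict(labeled)

# Two hand-labeled points and five unlabeled ones (hypothetical values).
seed_labels = [(1.0, "low"), (10.0, "high")]
unlabeled = [1.8, 3.0, 4.4, 9.1, 8.0]

result = self_train(seed_labels, unlabeled)
print(result)
```

Newly labeled points immediately help label their own neighbors, which is how a small labeled set can spread across a larger unlabeled one.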

How reinforcement learning works

Reinforcement learning works by programming an algorithm with a distinct goal and a prescribed set of rules for accomplishing that goal. Data scientists also program the algorithm to seek positive rewards — which it receives when it performs an action that is beneficial toward the ultimate goal — and avoid punishments — which it receives when it performs an action that gets it farther away from its ultimate goal. Reinforcement learning is often used in areas like:

  • Robotics. Robots can learn to perform tasks in the physical world using this technique.
  • Video gameplay. Reinforcement learning has been used to teach bots to play a number of video games.
  • Resource management. Given finite resources and a defined goal, reinforcement learning can help enterprises plan how to allocate resources.
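
The reward-and-punishment loop can be illustrated with tabular Q-learning, one common reinforcement learning technique. Everything in this sketch is an assumption made for the example: a five-cell corridor as the environment, a reward of +1 for reaching the last cell, a small penalty for every other move, and purely random exploration with a fixed seed.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)  # step left or step right

def step(state, action):
    """Move within the corridor. Reaching the goal earns +1;
    every other move is punished with a small cost of 0.01."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else -0.01
    return next_state, reward

def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn a table of action values from random exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(2)  # explore by picking a random action
            next_state, reward = step(state, ACTIONS[a])
            # Nudge the estimate toward reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)  # ['right', 'right', 'right', 'right']
```

Nobody tells the algorithm that walking right is the answer; it works that out from the rewards and penalties it collects along the way.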

Uses of machine learning

Today, machine learning is used in a wide range of applications. Perhaps one of the most well-known examples of machine learning in action is the recommendation engine that powers Facebook’s News Feed.

Facebook uses machine learning to personalize how each member’s feed is delivered. If a member frequently stops to read a particular group’s posts, the recommendation engine will start to show more of that group’s activity earlier in the feed.

Behind the scenes, the engine is attempting to reinforce known patterns in the member’s online behavior. Should the member change patterns and fail to read posts from that group in the coming weeks, the News Feed will adjust accordingly.

In addition to recommendation engines, other uses for machine learning include the following:

  • Customer relationship management. CRM software can use machine learning models to analyze email and prompt sales team members to respond to the most important messages first. More advanced systems can even recommend potentially effective responses.
  • Business intelligence. BI and analytics vendors use machine learning in their software to identify potentially important data points, patterns of data points and anomalies.
  • Human resource information systems. HRIS systems can use machine learning models to filter through applications and identify the best candidates for an open position.
  • Self-driving cars. Machine learning algorithms can even make it possible for a semi-autonomous car to recognize a partially visible object and alert the driver.
  • Virtual assistants. Smart assistants typically combine supervised and unsupervised machine learning models to interpret natural speech and supply context.

Advantages and disadvantages

Machine learning has seen powerful use cases ranging from predicting customer behavior to constituting the operating system for self-driving cars. But just because some industries have seen benefits doesn’t mean machine learning is without its downsides.

When it comes to advantages, machine learning can help enterprises understand their customers at a deeper level. By collecting customer data and correlating it with behaviors over time, machine learning algorithms can learn associations and help teams tailor product development and marketing initiatives to customer demand.

Some internet companies use machine learning as a primary driver in their business models. Uber, for example, uses algorithms to match drivers with riders. Google uses machine learning to surface the right advertisements in searches.

But machine learning comes with disadvantages. First and foremost, it can be expensive. Machine learning projects are typically driven by data scientists, who command high salaries. These projects also require software infrastructure that can be high-cost.

There is also the problem of machine learning bias. Algorithms trained on data sets that exclude certain populations or contain errors can lead to inaccurate models of the world that, at best, fail and, at worst, are discriminatory. When an enterprise bases core business processes on biased models, it can suffer regulatory and reputational harm.

Choosing the right machine learning model

The process of choosing the right machine learning model to solve a problem can be time-consuming if not approached strategically.  

Step 1: Align the problem with potential data inputs that should be considered for the solution. This step requires help from data scientists and experts who have a deep understanding of the problem.

Step 2: Collect data, format it and label the data if necessary. This step is typically led by data scientists, with help from data wranglers.

Step 3: Choose which algorithm(s) to use and test to see how well they perform. This step is usually carried out by data scientists.

Step 4: Continue to fine-tune outputs until they reach an acceptable level of accuracy. This step is usually carried out by data scientists with feedback from experts who have a deep understanding of the problem.

Importance of human-interpretable machine learning

Explaining how a specific ML model works can be challenging when the model is complex. There are some vertical industries where data scientists have to use simple machine learning models because it’s important for the business to explain how each and every decision was made. This is especially true in industries with heavy compliance burdens like banking and insurance.

Complex models can produce accurate predictions, but explaining to a layperson how an output was determined can be difficult.

The future of machine learning

While machine learning algorithms have been around for decades, they’ve attained new popularity as artificial intelligence (AI) has grown in prominence. Deep learning models, in particular, power today’s most advanced AI applications.

Machine learning platforms are among enterprise technology’s most competitive realms, with most major vendors, including Amazon, Google, Microsoft, IBM and others, racing to sign customers up for platform services that cover the spectrum of machine learning activities, including data collection, data preparation, data classification, model building, training and application deployment.

As machine learning continues to increase in importance to business operations and AI becomes ever more practical in enterprise settings, the machine learning platform wars will only intensify.

Continued research into deep learning and AI is increasingly focused on developing more general applications. Today’s AI models require extensive training in order to produce an algorithm that is highly optimized to perform one task. But some researchers are exploring ways to make models more flexible and are seeking techniques that allow a machine to apply context learned from one task to future, different tasks.

History of machine learning

1642 – Blaise Pascal invents a mechanical machine that can add, subtract, multiply and divide.

1679 – Gottfried Wilhelm Leibniz devises the system of binary code.

1834 – Charles Babbage conceives the idea for a general all-purpose device that could be programmed with punched cards.

1842 – Ada Lovelace describes a sequence of operations for solving mathematical problems using Charles Babbage’s theoretical punch-card machine and becomes the first programmer.

1847 – George Boole creates Boolean logic, a form of algebra in which all values can be reduced to the binary values of true or false.

1936 – English logician and cryptanalyst Alan Turing proposes a universal machine that could decipher and execute a set of instructions. His published proof is considered the basis of computer science.

1952 – Arthur Samuel creates a program to help an IBM computer get better at checkers the more it plays.

1959 – MADALINE becomes the first artificial neural network applied to a real-world problem: removing echoes from phone lines.

1985 – Terry Sejnowski and Charles Rosenberg’s artificial neural network taught itself how to correctly pronounce 20,000 words in one week.

1997 – IBM’s Deep Blue beat chess grandmaster Garry Kasparov.

1999 – A CAD prototype intelligent workstation reviewed 22,000 mammograms and detected cancer 52% more accurately than radiologists did.

2006 – Computer scientist Geoffrey Hinton invents the term deep learning to describe neural net research.

2012 – An unsupervised neural network created by Google learned to recognize cats in YouTube videos with 74.8% accuracy.

2014 – A chatbot passes the Turing Test by convincing 33% of human judges that it was a Ukrainian teen named Eugene Goostman.

2016 – Google’s AlphaGo defeats the human champion in Go, the most difficult board game in the world.

2016 – LipNet, DeepMind’s artificial-intelligence system, identifies lip-read words in video with an accuracy of 93.4%.

2019 – Amazon controls 70% of the market share for virtual assistants in the U.S.

Types of Machine Learning Algorithms

Model development is not a one-size-fits-all affair; there are different types of machine learning algorithms for different business goals and data sets. For example, the relatively straightforward linear regression algorithm is easier to train and implement than other machine learning algorithms, but it may fail to add value to a model requiring complex predictions.

The nine machine learning algorithms that follow are among the most popular and commonly used to train enterprise models. The models each support different goals, range in user friendliness and use one or more of the following machine learning approaches: supervised learning, unsupervised learning, semi-supervised learning or reinforcement learning.

Supervised machine learning algorithms

Supervised learning models require data scientists to provide the algorithm with data sets for input and parameters for output, as well as feedback on accuracy during the training process. They are task-based, and test on labeled data sets.

Linear regression

The most popular type of machine learning algorithm is arguably linear regression. Linear regression algorithms map simple correlations between two variables in a set of data. A set of inputs and their corresponding outputs are examined and quantified to show a relationship, including how a change in one variable affects the other. Linear regressions are plotted via a line on a graph.
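
For a single input variable, fitting the line comes down to two formulas: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the two means. The sketch below applies ordinary least squares to made-up spend and sales figures chosen so the relationship is exactly linear.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend (x) against units sold (y).
spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]  # constructed so that y = 2x + 1 exactly

slope, intercept = fit_line(spend, sales)
forecast = slope * 6 + intercept
print(slope, intercept, forecast)  # 2.0 1.0 13.0
```

The slope states how much the output changes for each unit of change in the input, which is why the model is easy to explain.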

Linear regression’s popularity is due to its simplicity: The algorithm is easily explainable, relatively transparent and requires little to no parameter tuning. Linear regression is frequently used in sales forecasting and risk assessment for enterprises that seek to make long-term business decisions.

Linear regression is best for when “you are looking at predicting your value or predicting a class,” said Shekhar Vemuri, CTO of technology service company Clairvoyant, based in Chandler, Ariz.

Support vector machines

A support vector machine, or SVM, is a machine learning algorithm that separates data into classes. During model training, SVM finds a line that separates data in a given set into specific classes and maximizes the margin between that line and each class. After learning these classification lines, the model can then apply them to future data.
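
One way to find such a separating line is to minimize hinge loss with a small penalty on the size of the weights. The sketch below does that with plain sub-gradient descent on a tiny, clearly separable data set. The data, learning rate, penalty strength and epoch count are all assumptions for illustration; production SVM libraries use more sophisticated solvers.

```python
def train_linear_svm(points, labels, epochs=500, learning_rate=0.01, reg=0.01):
    """Minimize hinge loss plus an L2 penalty with sub-gradient descent.
    Labels must be +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Inside the margin or misclassified: move the line away.
                w = [wi + learning_rate * (y * xi - reg * wi) for wi, xi in zip(w, x)]
                b += learning_rate * y
            else:
                # Safely classified: only shrink w, which favors a wider margin.
                w = [wi - learning_rate * reg * wi for wi in w]
    return w, b

def classify(w, b, x):
    """Report which side of the learned line the point falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical two-class data that a straight line can separate.
points = [[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
print(classify(w, b, [0, 0]), classify(w, b, [7, 7]))
```

Once `w` and `b` are learned, classifying a new point only requires checking which side of the line it lands on.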

This algorithm works best for training data that can clearly be separated by a line, also referred to as a hyperplane. Nonlinear data can be handled by a variant of the algorithm called nonlinear SVM. But with training data that’s hyper-complex (faces, personality traits, genomes and genetic material), the class systems become smaller and harder to identify and require a bit more human assistance.

SVMs are used heavily in the financial sector, as they offer high accuracy on both current and future data sets. The algorithms can be used to compare relative financial performance, value and investment gains virtually.

Companies with nonlinear data and different kinds of data sets often use SVM, Vemuri said.

Decision tree

A decision tree algorithm takes data and graphs it out in branches to show the possible outcomes of a variety of decisions. Decision trees classify response variables and predict response variables based on past decisions.
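
The core step in growing a decision tree is choosing where to branch. The sketch below finds the best single split on one numeric feature by minimizing Gini impurity, a common purity measure; a full tree would repeat this step on each branch. The borrower incomes and decisions are invented for the example.

```python
def gini(labels):
    """Gini impurity: 0 when every label in the group is the same."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / total) ** 2 for c in set(labels))

def best_split(values, labels):
    """Try a threshold between each pair of neighboring values and
    keep the one that leaves the two branches purest."""
    best_threshold, best_score = None, float("inf")
    ordered = sorted(set(values))
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        # Weight each branch's impurity by how many samples it holds.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical borrowers: annual income in thousands -> past loan decision
incomes = [20, 25, 30, 60, 70, 80]
decisions = ["deny", "deny", "deny", "approve", "approve", "approve"]

threshold, impurity = best_split(incomes, decisions)
print(threshold, impurity)  # 45.0 0.0
```

The resulting rule, "income above 45 leads to approval", is the kind of branch a person can read and check directly.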

Decision trees are a visual method of mapping out decisions. Their results are easy to explain and can be accessible to citizen data scientists. A decision tree algorithm maps out various decisions and their likely impact on an end result and can even be used with incomplete data sets.

Decision trees, due to their long-tail visuals, work best for small data sets, low-stakes decisions and concrete variables. Because of this, common decision tree use cases involve augmenting option pricing — from mortgage lenders classifying borrowers to product management teams quantifying the shift in market that would occur if they changed a major ingredient.

Decision trees remain popular because they can outline multiple outcomes and tests without requiring data scientists to deploy multiple algorithms, said Jeff Fried, director of product management for InterSystems, a software company based in Cambridge, Mass.

Unsupervised machine learning algorithms

Unsupervised machine learning algorithms are not trained on labeled examples prepared by data scientists. Instead, they identify patterns in data by combing through sets of unlabeled training data and observing correlations. Unsupervised learning models receive no information about what to look for in the data or which data features to examine.


Apriori

The Apriori algorithm, based on the Apriori principle, is most commonly used in market basket analysis to mine frequent item sets and generate association rules. The algorithm checks how often items occur together in a data set to determine whether there’s a positive or negative association between them.

The Apriori algorithm is primed for sales teams that seek to notice which products customers are more likely to buy in combination with other products. If a high percentage of customers who purchase bread also purchase butter, the algorithm can conclude that purchase of A (bread) will often lead to purchase of B (butter). This can be cross-referenced in data sets, data points and purchase ratios.

Apriori algorithms can also determine that purchase of A (bread) is only 10% likely to lead to the purchase of C (corn). Marketing teams can use this information to inform things like product placement strategies. Besides sales functions, Apriori algorithms are favored by e-commerce giants, like Amazon and Alibaba, and are also used by sites like Bing and Google to understand searcher intent and predict searches by correlating associated words.
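The bread, butter and corn example can be made concrete with a small sketch. It counts support (how often an item set appears) and confidence (how often B appears given A), and it applies the Apriori principle in its simplest form: only items that are frequent on their own are combined into candidate pairs. The transactions and the 50% support threshold are invented for illustration.

```python
from itertools import combinations


def support_count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_pairs(transactions, min_support):
    """Pairs of items whose joint support meets min_support."""
    n = len(transactions)
    items = {item for t in transactions for item in t}
    # Apriori principle: a pair can only be frequent if both items are.
    frequent_items = sorted(
        i for i in items if support_count(transactions, {i}) / n >= min_support
    )
    pairs = {}
    for a, b in combinations(frequent_items, 2):
        s = support_count(transactions, {a, b}) / n
        if s >= min_support:
            pairs[(a, b)] = s
    return pairs


def confidence(transactions, antecedent, consequent):
    """Share of transactions with the antecedent that also hold the consequent."""
    both = support_count(transactions, antecedent | consequent)
    return both / support_count(transactions, antecedent)


# Hypothetical baskets: 10 contain bread, 8 of those also contain butter.
transactions = (
    [{"bread", "butter"}] * 8 + [{"bread", "corn"}] + [{"bread"}] + [{"milk"}] * 2
)
pairs = frequent_pairs(transactions, min_support=0.5)
bread_to_butter = confidence(transactions, {"bread"}, {"butter"})
bread_to_corn = confidence(transactions, {"bread"}, {"corn"})
```

Here the rule bread to butter has a confidence of 80%, while bread to corn has only 10%, mirroring the figures in the text. A complete Apriori implementation would continue from pairs to larger item sets in the same way.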

K-means clustering 

The K-means algorithm is an iterative method of sorting data points into groups based on similar characteristics. For example, a K-means cluster algorithm would sort web results for the word civic into groups relating to Honda Civic and civic as in municipal or civil.
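The iterative sorting can be sketched in a few lines. The version below is the standard two-step loop (assign each point to its nearest centroid, then move each centroid to the mean of its points) on two-dimensional toy data. The starting centroids are passed in by hand to keep the example deterministic; that choice, like the data, is an assumption for illustration.

```python
def kmeans(points, initial_centroids, max_iters=100):
    """Cluster 2-D points; returns (labels, centroids)."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centroids


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centroids = kmeans(points, initial_centroids=[(1, 1), (8, 8)])
```

The first three points end up in one group and the last three in another. In practice the starting centroids are usually chosen at random or by a seeding heuristic, and the result can depend on that choice.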

K-means clustering has a reputation for accurate, streamlined groupings processed in a relatively short period of time, compared to other algorithms. K-means clustering is popular among search engines looking to produce relevant information and among enterprises looking to group user behaviors by connotative meaning or to monitor IT performance.

Semi-supervised machine learning algorithms

Semi-supervised learning teaches an algorithm through a mix of labeled and unlabeled data. The algorithm learns certain information through a set of labeled categories, suggestions and examples. Semi-supervised algorithms then create their own labels by exploring the data set or virtual world on their own, following a rough outline or some data scientist feedback.

Generative Adversarial Networks

GANs are deep generative models that have gained popularity. GANs have the ability to imitate data in order to model and predict. They work by essentially pitting two models against each other in a competition to develop the best solution to a problem. One neural network, the generator, creates new data while another, the discriminator, tries to tell the generated data apart from real data, which pushes the generator to improve. After many iterations of this, the generated data becomes more and more lifelike and realistic. Popular media uses GANs to do things like face creation and audio manipulation. GANs are also impactful for creating large data sets using limited training points, optimizing models and improving manufacturing processes.
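The competition between the two networks is usually written as a two-player game. In the standard formulation, D(x) is the discriminator's estimate that a sample x is real, G(z) is the generator's output for a random input z, and the two networks pull a shared value function in opposite directions. The symbols below follow that common convention and are not taken from the article itself.

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to make V large by scoring real samples high and generated samples low, while the generator tries to make V small by producing samples the discriminator scores as real.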

Self-trained Naïve Bayes classifier

Self-trained algorithms are all examples of semi-supervised learning. Developers can add to these models a Naïve Bayes classifier, which allows self-trained algorithms to perform classification tasks simply and easily. When developing a self-trained model, researchers train the algorithm to recognize object classes on a labeled training set. Then the researchers have the model classify unlabeled data. Once that cycle is finished, researchers add the model’s most confident self-assigned labels to the training data and retrain. Self-trained models are popular in natural language processing (NLP) and among organizations with limited labeled data sets.
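The train, classify and retrain cycle can be sketched as a loop. To keep the example short, the sketch swaps the Naïve Bayes classifier for a much simpler nearest-centroid rule on a single numeric feature, and it uses the gap between the two nearest class centers as a crude confidence score. Those substitutions, the min_margin threshold and the data are all assumptions for illustration.

```python
def centroids_by_class(labeled):
    """Mean feature value per class; labeled is a list of (value, label)."""
    sums = {}
    for value, label in labeled:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + value, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def self_train(labeled, unlabeled, min_margin=2.0):
    """Grow the labeled set with confident predictions (needs 2+ classes)."""
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        centers = centroids_by_class(labeled)  # retrain on the current labels
        confident, remaining = [], []
        for value in pool:
            # Rank classes by distance to their center; the gap between the
            # two nearest classes serves as the confidence score.
            ranked = sorted(centers, key=lambda c: abs(value - centers[c]))
            margin = abs(value - centers[ranked[1]]) - abs(value - centers[ranked[0]])
            if margin >= min_margin:
                confident.append((value, ranked[0]))
            else:
                remaining.append(value)
        if not confident:
            break  # nothing left that the model is sure about
        labeled.extend(confident)
        pool = remaining
    return labeled, pool


seed_labeled = [(1.0, "a"), (9.0, "b")]
unlabeled_values = [2.0, 8.0, 4.0, 6.0, 5.0]
final_labeled, leftover = self_train(seed_labeled, unlabeled_values)
```

Starting from two labeled points, the loop labels four more and leaves the ambiguous midpoint unlabeled rather than guessing.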

Reinforcement learning

Reinforcement learning algorithms are based on a system of rewards and punishments learned through trial and error. The model is given a goal and seeks maximum reward for getting closer to that goal based on limited information and learns from its previous actions. Reinforcement learning algorithms can be model-free — creating interpretations of data through constant trial and error — or model-based — adhering more closely to a set of predefined steps with minimal trial and error.


Q-learning

Q-learning algorithms are model-free, which means they seek to find the best method of achieving a defined goal by trying many actions and learning which ones yield the maximum reward. Q-learning is often paired with deep learning models in research projects, including those at Google’s DeepMind. Related techniques that build on Q-learning include deep deterministic policy gradient (DDPG) and hindsight experience replay (HER).
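A tabular version of Q-learning fits in a few lines. The sketch below assumes a made-up environment, a row of four cells with a reward for reaching the last one, and it explores by picking actions uniformly at random, which is acceptable here because Q-learning learns about the best action regardless of how actions were chosen. The environment, the constants and the function names are illustrative.

```python
import random

N_STATES = 4  # cells 0..3 in a row; cell 3 is the goal
ACTIONS = ("left", "right")


def step(state, action):
    """Move one cell; the reward is 1.0 only when the goal is reached."""
    nxt = max(state - 1, 0) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def q_learning(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # pure random exploration
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            # Nudge the estimate toward reward plus discounted future value.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = nxt
    return q


q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
```

After training, the greedy policy read off the table moves right from every cell, which is the shortest path to the reward.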

Model-based value estimation

Unlike model-free approaches like Q-learning, model-based algorithms have limited freedom to create potential states and actions and are statistically more efficient. Such algorithms, like the popular MBVE, are fitted with a specific data set and base action using supervised learning. Designers of MBVE note that “model-based methods can quickly arrive at near-optimal control with learned models under fairly restricted dynamics classes.” Model-based methods are designed for specific use cases.

Automated Machine Learning tools pave the way to AI

Automated machine learning is one of the most popular areas of enterprise AI software right now. With vendors offering everything from individual automated machine learning tools to cloud-based, full-service programs, autoML is quickly helping enterprises streamline business processes and dive into AI.

In light of the rise of autoML, analysts and experts are encouraging enterprises to evaluate their specific needs alongside the intended purpose of the tools — to augment data scientists’ work — instead of trying to use autoML without a larger AI framework.

Whether your enterprise has a flourishing data science team, citizen data science team or relies heavily on outsourcing data science work, autoML can provide value if you choose tools and use cases wisely.

AutoML and data scientists

Enterprises are applying automated machine learning in a diverse range of use cases, from developing retail insights to training robots. Whatever the environment or the business process being automated, experts said the real promise of autoML is the ability to collaborate with data scientists. 

“Make sure that you’re using [autoML] for the right intended purpose, which is automate the grunt work that a data scientist typically has to do,” said Shekhar Vemuri, CTO of technology service company Clairvoyant, based in Chandler, Ariz.  

AutoML tools are being used to augment and speed up the modeling process, because data scientists spend most of their time on data engineering and data washing, said Evan Schnidman, CEO of natural language processing company Prattle, based in St. Louis.

“The first ranges of tools are all about how [to] streamline the data ingestion, data washing process. The next ranges of the tools are how [to] then streamline model development and model deployment. And then the third ranges are how [to] streamline model testing and validation,” he said.

Still, experts warned autoML users not to expect automated machine learning tools to replace data scientists.

AutoML and augmented analytics do not fully replace expert data scientists, said Carlie Idoine, senior director and analyst of data science and business analytics at Gartner.

“This is an extension of data science and machine learning capability, not a replacement,” she said. “We can automate some of the capabilities, but it’s still a good idea to have experts involved in processes that may be evaluating or validating the models.”

Intention equals value

If an enterprise intends to automate or augment a part of the data science process, it has a chance to succeed. If it intends to replace data science teams or expects results overnight, autoML technology will disappoint. Choosing the tool or program will depend heavily on the enterprise’s intention, goal and project.

“A key realization should be that we’re using autoML to essentially gain scale and try out more things than we could do manually or hand code,” Vemuri said.

Schnidman echoed the sentiment, calling autoML a support tool for data scientists. Businesses that have a mature data science team are poised to get the most net value, because the automated tools are an extension of data scientists’ capabilities.

“AutoML works for those who say, ‘We’ve done this manually and taken it as far as we can go. So, we want to use these augmented tools to do feature engineering, maybe take out some bias we have and see what it finds that we didn’t consider,’” Idoine said.

If enterprises intend for autoML to replace their data science team, or be their only point of AI development, the tools will give limited advantages. AutoML is only one step of many in an overall AI strategy — especially in enterprises that are heavily regulated and those affected by recent data protection laws.

“Regulated industries and verticals have all these other legal concerns that they need to keep in mind and stay on top of. Make sure that you’re able to ensure that your tool of choice is able to integrate into your overall AI workflow,” Vemuri said.

Limitations of tools

The biggest limitation of automated machine learning tools today is they work best on known types of problems using algorithms like regression and classification. Because autoML has to be programmed to follow steps, some algorithms and forms of AI are not compatible with automation.
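What it means for a tool to follow programmed steps on a regression problem can be shown with a toy search loop. The sketch tries every candidate model on training data, scores each on held-out validation data and keeps the best. It is only a stand-in for the idea: the two candidates (a constant mean and a least-squares line), the data and all names are assumptions, and commercial autoML tools do far more than this.

```python
def fit_mean(xs, ys):
    """Baseline candidate: always predict the average training target."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Candidate: least-squares straight line through the training data."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mse(model, xs, ys):
    """Mean squared error of a fitted model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def auto_select(candidates, train, valid):
    """Fit every candidate, score it on validation data, return the best."""
    scores = {}
    for name, fit in candidates.items():
        model = fit(*train)
        scores[name] = mse(model, *valid)
    best = min(scores, key=scores.get)
    return best, scores


candidates = {"mean": fit_mean, "line": fit_line}
train = ([0, 1, 2, 3], [1, 3, 5, 7])
valid = ([4, 5], [9, 11])
best_name, scores = auto_select(candidates, train, valid)
```

Because the loop only compares candidates it was given, it illustrates the limitation as well: a problem that needs a model outside the candidate list will not be solved by searching harder.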

“Some of the newer types of algorithms like deep neural nets aren’t really well suited for autoML; that type of analysis is much more sophisticated, and it’s not something that can be easily explained,” Idoine said.

AutoML is also wrapped up in the problem of black box algorithms and testing. If a process can’t be easily outlined — even if the automated machine learning tool can complete it — the process will be hard to explain. Black box functionality comes with a whole host of its own issues, including bias and struggles with incomplete data sets.

“We don’t want to encourage black boxes for people that aren’t experts in this type of work,” Idoine said.

Getting to machine learning in production takes focus

Data scientists who build AI models and data engineers who deploy machine learning in production work in two different realms. This makes it hard to efficiently bring a new predictive model into production.

But some enterprises are finding ways to work around this problem. At the Flink Forward conference in San Francisco, engineers at Comcast and Capital One described how they are using Apache Flink to help bridge this gap to speed the deployment of new AI algorithms.

Version everything

The tools used by data scientists and engineers can differ in subtle ways. That leads to problems replicating good AI models in production.

Comcast is experimenting with versioning all the artifacts that go into developing and deploying AI models. This includes machine learning data models, model features and code running machine learning predictions. All the components are stored in GitHub, which makes it possible to tie together models developed by data scientists and code deployed by engineers.

“This ensures that what we put into production and feature engineering are consistent,” said Dave Torok, senior enterprise architect at Comcast.

At the moment, this process is not fully automated. However, the goal is to move toward full lifecycle automation for Comcast’s machine learning development pipeline.

Bridging the language gap

Data scientists tend to like to use languages like Python, while production systems run Java. To bridge this gap, Comcast has been building a set of Jython components for its data scientists.

Jython is an implementation of Python for the Java platform, designed to enable data scientists to run Python apps natively on Java infrastructure. It was first released in 1997 and has grown in popularity among enterprises launching machine learning initiatives because Python is commonly used by data scientists to build machine learning models. Jython compiles Python code to Java bytecode, so it runs like native Java code.

One limitation of this approach is that it can’t take advantage of many of the features running on Flink. Java developers are required to implement bindings to take advantage of new Java methods introduced with tools like Flink.

“At some point, we want to look at doing more generation of Flink-native features,” Torok said. “But on the other hand, it gives us flexibility of deployment.”

Capital One ran into similar problems trying to connect Python for its data scientists and Java for its production environment to create better fraud detection algorithms. They did some work to build up a Jython library that acts as an adaptor.

“This lets us implement each feature as accessible in Python,” said Jeff Sharpe, senior software engineer at Capital One.

These applications run within Flink as if they were Java code. One of the benefits of this approach is that the features can run in parallel, which is not normally possible in Jython.

Need for fallback mechanisms

Comcast’s machine learning models make predictions by correlating multiple features. However, the data for some of these features is not always available at runtime, so fallback mechanisms must be implemented.

For example, Comcast has developed a set of predictive models to prioritize repair truck rolls based on a variety of features, including the number of prior calls in the last month, a measurement of degraded internet speeds and the behavior of consumer equipment. But some of this data may not be available to predict the severity of a customer problem in a timely manner, which can cause a time-out, triggering the use of a less accurate model that runs with the available data.
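A fallback of this kind can be sketched as a simple dispatch. The code below is not Comcast's implementation. It assumes that a feature which did not arrive in time shows up as a missing or None value, and the feature names, weights and both scoring functions are hypothetical.

```python
# Features the richer model needs; the names are hypothetical.
REQUIRED = ("prior_calls", "speed_degradation", "equipment_errors")


def full_model(features):
    """Hypothetical richer model that needs every feature."""
    return (
        0.5 * features["prior_calls"]
        + 0.25 * features["speed_degradation"]
        + 0.25 * features["equipment_errors"]
    )


def fallback_model(features):
    """Hypothetical simpler model that needs only the call history."""
    return 0.5 * features["prior_calls"]


def predict_severity(features):
    """Use the full model when every input is present, else fall back."""
    if all(features.get(name) is not None for name in REQUIRED):
        return full_model(features), "full"
    return fallback_model(features), "fallback"


complete = {"prior_calls": 4, "speed_degradation": 10, "equipment_errors": 6}
partial = {"prior_calls": 4, "speed_degradation": None}
full_result = predict_severity(complete)
fallback_result = predict_severity(partial)
```

Returning the name of the model alongside the score makes it possible to track later how often the less accurate path was taken.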

The initial models are created based on an assessment of historical data. However, Comcast’s AI infrastructure enables engineers to feed information about the performance of machine learning in production back into the model training process to improve performance over time. The key lies in correlating predictions of the models with factors like a technician’s observations.

Historical data still a challenge

Capital One is using Flink and microservices to make historical and recent data easier to use to both develop and deploy better fraud detection models.

Andrew Gao, software engineer at Capital One, said the bank’s previous algorithms did not have access to all of a customer’s activities. On the production side, these models needed to be able to return an answer in a reasonable amount of time.

“We want to catch fraud, but not create a poor customer experience,” Gao said.

The initial project started off as one monolithic Flink application. However, Capital One ran into problems merging data from historical data sources and current streaming data, so they broke this up into several smaller microservices that helped address the problem.

This points to one of the current limitations of using stream processing for building AI apps. Stephan Ewen, chief technology officer at Data Artisans and lead developer of Flink, said that the development of Flink tooling has traditionally focused on AI and machine learning in production.

“Engineers can do model training logic using Flink, but we have not pushed for that. This is coming up more and more,” he said.

Deep Learning

Deep learning is a type of machine learning (ML) and artificial intelligence (AI) that imitates the way humans gain certain types of knowledge. Deep learning is an important element of data science, which includes statistics and predictive modeling. It is extremely beneficial to data scientists who are tasked with collecting, analyzing and interpreting large amounts of data; deep learning makes this process faster and easier.

At its simplest, deep learning can be thought of as a way to automate predictive analytics. While many traditional machine learning algorithms learn a single, relatively simple mapping from input to output, deep learning algorithms are stacked in a hierarchy of increasing complexity and abstraction.

To understand deep learning, imagine a toddler whose first word is dog. The toddler learns what a dog is — and is not — by pointing to objects and saying the word dog. The parent says, “Yes, that is a dog,” or, “No, that is not a dog.” As the toddler continues to point to objects, he becomes more aware of the features that all dogs possess. What the toddler does, without knowing it, is clarify a complex abstraction — the concept of dog — by building a hierarchy in which each level of abstraction is created with knowledge that was gained from the preceding layer of the hierarchy.

How deep learning works

Computer programs that use deep learning go through much the same process as the toddler learning to identify the dog. Each algorithm in the hierarchy applies a nonlinear transformation to its input and uses what it learns to create a statistical model as output. Iterations continue until the output has reached an acceptable level of accuracy. The number of processing layers through which data must pass is what inspired the label deep.
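The stacking of nonlinear transformations can be shown with a tiny forward pass. In the sketch below each layer computes weighted sums of its inputs and then applies a nonlinearity (here ReLU, which replaces negative values with zero), and the output of one layer becomes the input of the next. The weights are fixed by hand for illustration; in a real network they would be learned during training.

```python
def relu(values):
    """Nonlinearity: negative values become zero, others pass through."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One layer: weighted sums of the inputs followed by a nonlinearity."""
    sums = [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]
    return relu(sums)


def forward(inputs, layers):
    """Pass the data through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


# Hand-picked weights for illustration only.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # hidden layer with 2 units
    ([[2.0, 2.0]], [-1.0]),                   # output layer with 1 unit
]
output = forward([1.0, 2.0], layers)
```

Adding more entries to the layers list makes the network deeper without changing any of the code, which is the sense in which depth refers to the number of processing layers.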

In traditional machine learning, the programmer has to be extremely specific when telling the computer what types of things it should be looking for to decide if an image contains a dog or does not contain a dog. This is a laborious process called feature extraction, and the computer’s success rate depends entirely upon the programmer’s ability to accurately define a feature set for “dog.” The advantage of deep learning is the program builds the feature set by itself, without hand-crafted features. This automatic feature learning is not only faster, but it is usually more accurate.

Initially, the computer program might be provided with training data — a set of images for which a human has labeled each image “dog” or “not dog” with meta tags. The program uses the information it receives from the training data to create a feature set for “dog” and build a predictive model. In this case, the model the computer first creates might predict that anything in an image that has four legs and a tail should be labeled “dog.” Of course, the program is not aware of the labels “four legs” or “tail.” It will simply look for patterns of pixels in the digital data. With each iteration, the predictive model becomes more complex and more accurate.

Unlike the toddler, who will take weeks or even months to understand the concept of “dog,” a computer program that uses deep learning algorithms can be shown a training set and sort through millions of images, accurately identifying which images have dogs in them within a few minutes.

To achieve an acceptable level of accuracy, deep learning programs require access to immense amounts of training data and processing power, neither of which was easily available to programmers until the era of big data and cloud computing. Because deep learning programming can create complex statistical models directly from its own iterative output, it is able to create accurate predictive models from large quantities of unlabeled, unstructured data. This is important as the internet of things (IoT) continues to become more pervasive, because most of the data humans and machines create is unstructured and is not labeled.

What are deep learning neural networks?

A type of advanced machine learning algorithm, known as artificial neural networks, underpins most deep learning models. As a result, deep learning may sometimes be referred to as deep neural learning or deep neural networking.

Neural networks come in several different forms, including recurrent neural networks, convolutional neural networks and feedforward neural networks, and each has benefits for specific use cases. However, they all function in somewhat similar ways, by feeding data in and letting the model figure out for itself whether it has made the right interpretation or decision about a given data element.

Neural networks involve a trial-and-error process, so they need massive amounts of data on which to train. It’s no coincidence neural networks became popular only after most enterprises embraced big data analytics and accumulated large stores of data. Because the model’s first few iterations involve somewhat-educated guesses on the contents of an image or parts of speech, the data used during this kind of training must be labeled so the model can see if its guess was accurate. This means, though many enterprises that use big data have large amounts of data, unlabeled data is less helpful for such training. Unlabeled data can be analyzed by the model once it has been trained and reaches an acceptable level of accuracy, but a model that learns from labeled examples can’t train on unlabeled data alone.

Deep learning methods 

Various methods can be used to create strong deep learning models. These techniques include learning rate decay, transfer learning, training from scratch and dropout.

Learning rate decay. The learning rate is a hyperparameter — a factor that defines the system or sets conditions for its operation prior to the learning process — that controls how much change the model experiences in response to the estimated error every time the model weights are altered. Learning rates that are too high may result in unstable training processes or the learning of a suboptimal set of weights. Learning rates that are too small may produce a lengthy training process that has the potential to get stuck.

The learning rate decay method — also called learning rate annealing or adaptive learning rates — is the process of adapting the learning rate to increase performance and reduce training time. The easiest and most common adaptations of learning rate during training include techniques to reduce the learning rate over time.
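One simple way to reduce the learning rate over time is a step schedule, which cuts the rate by a fixed factor every few epochs. The sketch below applies such a schedule to gradient descent on a one-variable function, f(w) = w squared. The schedule shape, the drop factor and the function being minimized are assumptions chosen for illustration.

```python
def step_decay(initial_lr, drop, epochs_per_drop, epoch):
    """Multiply the learning rate by drop every epochs_per_drop epochs."""
    return initial_lr * drop ** (epoch // epochs_per_drop)


def minimize_quadratic(start, initial_lr, epochs):
    """Gradient descent on f(w) = w**2 with a decaying learning rate."""
    w = start
    for epoch in range(epochs):
        lr = step_decay(initial_lr, drop=0.5, epochs_per_drop=10, epoch=epoch)
        w -= lr * 2 * w  # the gradient of w**2 is 2*w
    return w


rates = [step_decay(0.1, 0.5, 10, epoch) for epoch in (0, 10, 25)]
final_w = minimize_quadratic(start=1.0, initial_lr=0.4, epochs=30)
```

Large early steps move the weight quickly toward the minimum, and the smaller later steps let it settle without overshooting.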

Transfer learning. This process involves perfecting a previously trained model; it requires an interface to the internals of a preexisting network. First, users feed the existing network new data containing previously unknown classifications. Once adjustments are made to the network, new tasks can be performed with more specific categorizing abilities. This method has the advantage of requiring much less data than others, thus reducing computation time to minutes or hours.

Training from scratch. This method requires a developer to collect a large labeled data set and configure a network architecture that can learn the features and model. This technique is especially useful for new applications, as well as applications with a large number of output categories. However, overall, it is a less common approach, as it requires inordinate amounts of data, causing training to take days or weeks.

Dropout. This method attempts to solve the problem of overfitting in networks with large numbers of parameters by randomly dropping units and their connections from the neural network during training. It has been shown that the dropout method can improve the performance of neural networks on supervised learning tasks in areas such as speech recognition, document classification and computational biology.
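Randomly dropping units can be sketched on a single list of activations. The version below is the commonly used "inverted" variant, an assumption on our part: during training each unit is zeroed with probability p_drop and the survivors are scaled up so the expected total stays the same, and at evaluation time the activations pass through unchanged. The function name and parameters are illustrative.

```python
import random


def dropout(activations, p_drop, training, rng):
    """Inverted dropout: zero units at random and rescale the survivors."""
    if not training or p_drop == 0.0:
        return list(activations)  # no units are dropped outside training
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # seeded so the example is repeatable
hidden = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(hidden, p_drop=0.5, training=True, rng=rng)
eval_out = dropout(hidden, p_drop=0.5, training=False, rng=rng)
```

Because a different random subset of units is silenced on every training pass, no single unit can be relied on too heavily, which is how the method discourages overfitting.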

Examples of deep learning applications

Because deep learning models process information in ways similar to the human brain, they can be applied to many tasks people do. Deep learning is currently used in most common image recognition tools, natural language processing and speech recognition software. These tools are starting to appear in applications as diverse as self-driving cars and language translation services.

What is deep learning used for?

Use cases today for deep learning include all types of big data analytics applications, especially those focused on natural language processing, language translation, medical diagnosis, stock market trading signals, network security and image recognition.

Specific fields in which deep learning is currently being used include the following:

  • Customer experience. Deep learning models are already being used for chatbots. And, as it continues to mature, deep learning is expected to be implemented in various businesses to improve the customer experiences and increase customer satisfaction.
  • Text generation. Machines are being taught the grammar and style of a piece of text and are then using this model to automatically create a completely new text matching the proper spelling, grammar and style of the original text.
  • Aerospace and military. Deep learning is being used to detect objects from satellites that identify areas of interest, as well as safe or unsafe zones for troops.
  • Industrial automation. Deep learning is improving worker safety in environments like factories and warehouses by providing services that automatically detect when a worker or object is getting too close to a machine.
  • Adding color. Color can be added to black and white photos and videos using deep learning models. In the past, this was an extremely time-consuming, manual process.
  • Medical research. Cancer researchers have started implementing deep learning into their practice as a way to automatically detect cancer cells.
  • Computer vision. Deep learning has greatly enhanced computer vision, providing computers with extreme accuracy for object detection and image classification, restoration and segmentation.

Limitations and challenges

The biggest limitation of deep learning models is they learn through observations. This means they only know what was in the data on which they trained. If a user has a small amount of data or it comes from one specific source that is not necessarily representative of the broader functional area, the models will not learn in a way that is generalizable.

The issue of biases is also a major problem for deep learning models. If a model trains on data that contains biases, the model will reproduce those biases in its predictions. This has been a vexing problem for deep learning programmers, because models learn to differentiate based on subtle variations in data elements. Often, the factors it determines are important are not made explicitly clear to the programmer. This means, for example, a facial recognition model might make determinations about people’s characteristics based on things like race or gender without the programmer being aware.

The learning rate can also become a major challenge to deep learning models. If the rate is too high, then the model will converge too quickly, producing a less-than-optimal solution. If the rate is too low, then the process may get stuck, and it will be even harder to reach a solution.

The hardware requirements for deep learning models can also create limitations. Multicore high-performing graphics processing units (GPUs) and other similar processing units are required to ensure improved efficiency and decreased time consumption. However, these units are expensive and use large amounts of energy. Other hardware requirements include random access memory (RAM) and a hard disk drive (HDD) or solid-state drive (SSD).

Other limitations and challenges include the following:

  • Deep learning requires large amounts of data. Furthermore, the more powerful and accurate models will need more parameters, which, in turn, require more data.
  • Once trained, deep learning models become inflexible and cannot handle multitasking. They can deliver efficient and accurate solutions, but only to one specific problem. Even solving a similar problem would require retraining the system.
  • Any application that requires reasoning — such as programming or applying the scientific method — long-term planning and algorithmic-like data manipulation is completely beyond what current deep learning techniques can do, even with large data.

Deep learning vs. machine learning

Deep learning is a subset of machine learning that differentiates itself through the way it solves problems. Traditional machine learning requires a domain expert to identify most of the features used. On the other hand, deep learning learns features incrementally, thus reducing the need for domain expertise. This makes deep learning algorithms take much longer to train than machine learning algorithms, which only need a few seconds to a few hours. However, the reverse is true during testing. Deep learning algorithms take much less time to run tests than machine learning algorithms, whose test time increases along with the size of the data.

Furthermore, machine learning does not require the same costly, high-end machines and high-performing GPUs that deep learning does.

In the end, many data scientists choose traditional machine learning over deep learning due to its superior interpretability, or the ability to make sense of the solutions. Machine learning algorithms are also preferred when the data is small.

Instances where deep learning becomes preferable include situations where there is a large amount of data, a lack of domain understanding for feature introspection or complex problems, such as speech recognition and natural language processing.


History

Deep learning can trace its roots back to 1943 when Warren McCulloch and Walter Pitts created a computational model for neural networks using mathematics and algorithms. However, it was not until the mid-2000s that the term deep learning started to appear. It gained popularity following the publication of a paper by Geoffrey Hinton and Ruslan Salakhutdinov that showed how a neural network with many layers could be trained one layer at a time.

In 2012, Google made a huge impression on the field when its deep learning algorithm demonstrated the ability to recognize cats. Two years later, in 2014, Google bought DeepMind, an artificial intelligence startup from the U.K. Two years after that, in 2016, Google DeepMind’s algorithm, AlphaGo, mastered the complicated board game Go, beating professional player Lee Sedol at a tournament in Seoul.

Recently, deep learning models have generated the majority of advances in the field of artificial intelligence. Deep reinforcement learning has emerged as a way to integrate AI with complex applications, such as robotics, video games and self-driving cars. The primary difference between deep learning and deep reinforcement learning is that, while deep learning learns from a training set and then applies what is learned to a new data set, deep reinforcement learning learns dynamically by adjusting actions using continuous feedback in order to optimize the reward.

A reinforcement learning agent has the ability to provide fast and strong control of generative adversarial networks (GANs). The Adversarial Threshold Neural Computer (ATNC) combines deep reinforcement learning with GANs in order to design small organic molecules with a specific, desired set of pharmacological properties.

GANs are also being used to generate artificial training data for machine learning tasks, which can be used in situations with imbalanced data sets or when data contains sensitive information.

Here is a very simple illustration of how a deep learning program works. This video by the LuLu Art Group shows the output of a deep learning program after its initial training with raw motion capture data. This is what the program predicts the abstract concept of “dance” looks like.

With each iteration, the program’s predictive model became more complex and more accurate.

Deep learning powers a motion-tracking revolution

A surge in the development of artificial-intelligence technology is driving a new wave of open-source tools for analysing animal behaviour and posture.

As a postdoc, physiologist Valentina Di Santo spent a lot of time scrutinizing high-resolution films of fish.

Di Santo was investigating the motions involved when fish such as skates swim. She filmed individual fish in a tank and manually annotated their body parts frame by frame, an effort that required about a month of full-time work for 72 seconds of footage. Using an open-source application called DLTdv, developed in the computer language MATLAB, she then extracted the coordinates of body parts, the key information needed for her research. That analysis showed, among other things, that when little skates (Leucoraja erinacea) need to swim faster, they create an arch on their fin margin to stiffen its edge [1].

But as the focus of Di Santo’s research shifted from individual animals to schools of fish, it was clear a new approach would be required. “It would take me forever to analyse [those data] with the same detail,” says Di Santo, who is now at Stockholm University. So, she turned to DeepLabCut instead.

DeepLabCut is an open-source software package developed by Mackenzie Mathis, a neuroscientist at Harvard University in Cambridge, Massachusetts, and her colleagues, which allows users to train a computational model called a neural network to track animal postures in videos. The publicly available version didn’t have an easy way to track multiple animals over time, but Mathis’ team agreed to run an updated version using the fish data, which Di Santo annotated using a graphical user interface (GUI). The preliminary output looks promising, Di Santo says, although she is waiting to see how the tool performs on the full data set. But without DeepLabCut, she says, the study “would not be possible”.


Researchers have long been interested in tracking animal motion, Mathis says, because motion is “a very good read-out of intention within the brain”. But conventionally, that has involved spending hours recording behaviours by hand. The previous generation of animal-tracking tools mainly determined centre of mass and sometimes orientation, and the few tools that captured finer details were highly specialized for specific animals or subject to other constraints, says Talmo Pereira, a neuroscientist at Princeton University in New Jersey.

Over the past several years, deep learning — an artificial-intelligence method that uses neural networks to recognize subtle patterns in data — has empowered a new crop of tools. Open-source packages such as DeepLabCut, LEAP Estimates Animal Pose (LEAP) and DeepFly3D use deep learning to determine coordinates of animal body parts in videos. Complementary tools perform tasks such as identifying specific animals. These packages have aided research on everything from the study of motion in hunting cheetahs to collective zebrafish behaviour.

Each tool has limitations; some require specific experimental set-ups, or don’t work well when animals always crowd together. But methods will improve alongside advances in image capture and machine learning, says Sandeep Robert Datta, a neuroscientist at Harvard Medical School in Boston, Massachusetts. “What you’re looking at now is just the very beginning of what is certain to be a long-term transformation in the way neuroscientists study behaviour,” he says.

Strike a pose

DeepLabCut is based on software used to analyse human poses. Mathis’ team adapted its underlying neural network to work for other animals with relatively little training data. Between 50 and 200 manually annotated frames are generally sufficient for standard lab studies, although the amount needed depends on factors such as data quality and the consistency of the people doing the labelling, Mathis says. In addition to annotating body parts with a GUI, users can issue commands through a Jupyter Notebook, a computational document popular with data scientists. Scientists have used DeepLabCut to study both lab and wild animals, including mice, spiders, octopuses and cheetahs. Neuroscientist Wujie Zhang at the University of California, Berkeley, and his colleague used it to estimate the behavioural activity of Egyptian fruit bats (Rousettus aegyptiacus) in the lab [2].

LEAP, the deep-learning-based posture-tracking package developed by Pereira and his colleagues, requires 50–100 annotated frames for lab animals, Pereira says. More training data would be needed for wildlife footage, although his team has not yet conducted enough experiments to determine how much. The researchers plan to release another package called Social LEAP (SLEAP) this year to better handle footage of multiple, closely interacting animals.

Jake Graving, a behavioural scientist at the Max Planck Institute of Animal Behavior in Konstanz, Germany, and his colleagues compared the performance of a re-implementation of the DeepLabCut algorithm and LEAP on videos of Grevy’s zebras (Equus grevyi) [3]. They report that LEAP processed images about 10% faster, but the DeepLabCut algorithm was about three times as accurate.

Graving’s team has developed an alternative tool called DeepPoseKit, which it has used to study behaviours of desert locusts (Schistocerca gregaria), such as hitting and kicking. The researchers report that DeepPoseKit combines the accuracy of DeepLabCut with a batch-processing speed that surpasses LEAP. For instance, tracking one zebra in 1 hour of footage filmed at 60 frames per second takes about 3.6 minutes with DeepPoseKit, 6.4 minutes with LEAP and 7.1 minutes with his team’s implementation of the DeepLabCut algorithm, Graving says.
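To put those timings in perspective, they can be converted into frames processed per second. The short calculation below uses only the figures quoted above (1 hour of footage at 60 frames per second, and the three reported processing times); the variable and function names are chosen for this illustration.

```python
# 1 hour of footage at 60 frames per second.
TOTAL_FRAMES = 60 * 60 * 60  # 216,000 frames

# Processing times in minutes, as reported by Graving.
minutes_reported = {
    "DeepPoseKit": 3.6,
    "LEAP": 6.4,
    "DeepLabCut (re-implementation)": 7.1,
}


def frames_per_second(total_frames, minutes):
    """Average number of frames processed per second of computing time."""
    return total_frames / (minutes * 60)


throughput = {
    name: frames_per_second(TOTAL_FRAMES, minutes)
    for name, minutes in minutes_reported.items()
}

for name, fps in throughput.items():
    print(f"{name}: about {fps:.0f} frames per second")
```

On these numbers, DeepPoseKit handles roughly 1,000 frames per second, compared with roughly 560 for LEAP and 510 for the DeepLabCut re-implementation. LEAP's 6.4 minutes against 7.1 minutes is also about 10% less processing time, in line with the speed difference reported for the two tools.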

DeepPoseKit offers “very good innovations”, Pereira says. Mathis disputes the validity of the performance comparisons, but Graving says that “our results offer the most objective and fair comparison we could provide”. In an article posted in September on the arXiv preprint repository, Mathis’ team reported an accelerated version of DeepLabCut that can run on a mobile phone [4].

Biologists who want to test multiple software solutions can try Animal Part Tracker, developed by Kristin Branson, a computer scientist at the Howard Hughes Medical Institute’s Janelia Research Campus in Ashburn, Virginia, and her colleagues. Users can select any of several posture-tracking algorithms, including modified versions of those used in DeepLabCut and LEAP, as well as another algorithm from Branson’s lab. DeepPoseKit also offers the option to use alternative algorithms, as will SLEAP.

Other tools are designed for more specialized experimental set-ups. DeepFly3D, for instance, tracks 3D postures of single tethered lab animals, such as mice with implanted electrodes or fruit flies walking on a tiny ball that acts as a treadmill. Pavan Ramdya, a neuroengineer at the Swiss Federal Institute of Technology in Lausanne (EPFL), and his colleagues, who developed the software, are using DeepFly3D to help identify which neurons in fruit flies are active when they perform specific actions.

And DeepBehavior, developed by neuroscientist Ahmet Arac at the University of California, Los Angeles, and his colleagues, allows users to track 3D movement trajectories and calculate parameters such as velocities and joint angles in mice and humans. Arac’s team is using this package to assess the recovery of people who have had a stroke and to study the links between brain-network activity and behaviour in mice.
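As a rough illustration of what "calculating parameters such as velocities and joint angles" involves, the sketch below derives both from 3D coordinates. It is not DeepBehavior's code or API: the function names, the sample coordinates and the frame rate are all assumptions made up for this example.

```python
import math


def speeds(points, fps):
    """Speed between consecutive frames, in coordinate units per second."""
    return [math.dist(p, q) * fps for p, q in zip(points, points[1:])]


def joint_angle(a, b, c):
    """Angle in degrees at joint b, between the segments b->a and b->c."""
    v1 = [ai - bi for ai, bi in zip(a, b)]
    v2 = [ci - bi for ci, bi in zip(c, b)]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("joint angle is undefined for a zero-length segment")
    dot = sum(x * y for x, y in zip(v1, v2))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


# Hypothetical tracked positions of one body part over three frames.
paw_track = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
paw_speeds = speeds(paw_track, fps=10)

# Hypothetical shoulder, elbow and wrist positions in a single frame.
elbow_angle = joint_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))

print(paw_speeds, elbow_angle)
```

Once a tracking tool has produced coordinates for each body part in each frame, measurements like these are plain geometry: distance travelled per unit time gives speed, and the angle between two limb segments gives the joint angle.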

Making sense of movement

Scientists who want to study multiple animals often need to track which animal is which. To address this challenge, Gonzalo de Polavieja, a neuroscientist at Champalimaud Research, the research arm of the private Champalimaud Foundation in Lisbon, and his colleagues developed a neural-network-based tool that identifies individual animals without manually annotated training data. The software can handle videos of up to about 100 fish and 80 flies, and its output can be fed into DeepLabCut or LEAP, de Polavieja says. His team has used the tool to probe, among other things, how zebrafish decide where to move in a group [5]. However, the tool is intended only for lab videos rather than wildlife footage and requires animals to separate from one another, at least briefly.

Other software packages can help biologists to make sense of animals’ motions. For instance, researchers might want to translate posture coordinates into behaviours such as grooming, Mathis says. If scientists know which behaviour they’re interested in, they can use the Janelia Automatic Animal Behavior Annotator (JAABA), a supervised machine-learning tool developed by Branson’s team, to annotate examples and automatically identify more instances in videos.

An alternative approach is unsupervised machine learning, which does not require behaviours to be defined beforehand. This strategy might suit researchers who want to capture the full repertoire of an animal’s actions, says Gordon Berman, a theoretical biophysicist at Emory University in Atlanta, Georgia. His team developed the MATLAB tool MotionMapper to identify often repeated movements. Motion Sequencing (MoSeq), a Python-based tool from Datta’s team, finds actions such as walking, turning or rearing.
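The idea of unsupervised learning can be shown with a very small example. The sketch below groups frames by two made-up features (speed and body angle) using k-means clustering. This is a generic textbook technique swapped in for illustration; it does not reproduce the methods used by MotionMapper or MoSeq, and the data and function name are invented.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means: returns a cluster label per point and the cluster centres."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centre.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical per-frame features: (speed, body angle in degrees).
resting = [(0.1, 5.0), (0.2, 4.0), (0.0, 6.0)]
walking = [(5.0, 40.0), (5.5, 42.0), (4.8, 39.0)]
frames = resting + walking

labels, centroids = kmeans(frames, k=2)
print(labels)
```

No behaviour is named in advance. The algorithm only sees the numbers and separates the frames into two groups, which a researcher can then inspect and interpret, for example as "resting" and "walking". In practice, features on different scales would normally be standardized before clustering.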

By mixing and matching these tools, researchers can extract new meaning from animal imagery. “It gives you the full kit of being able to do whatever you want,” Pereira says.