Analytical modeling is both science and art

Advanced analytics won’t produce an ounce of business insight without models, the statistical and machine learning algorithms that tease patterns and relationships from data and express them as mathematical equations. The algorithms tend to be immensely complex, so mathematicians and statisticians (think data scientists) are needed to create them and then tweak the models to better fit changing business needs and conditions.

But analytical modeling is not a wholly quantitative, left-brain endeavor. It’s a science, certainly, but it’s an art, too.

The art of modeling involves selecting the right data sets, algorithms and variables and the right techniques to format data for a particular business problem. But there’s more to it than model-building mechanics. No model will do any good if the business doesn’t understand its results. Communicating the results to executives so they understand what the model discovered and how it can benefit the business is critical but challenging; it’s the “last mile” in the whole analytical modeling process and often the most treacherous. Without that understanding, though, business managers might be loath to use the analytical findings to make critical business decisions.

An analytical model estimates or classifies data values by essentially drawing a line through data points. When applied to new data or records, a model can predict outcomes based on historical patterns. But not all models are transparent, and some are downright opaque. That’s a problem for execs, who often don’t trust models until they see positive results from decisions based on modeling-generated insights; for example, operating costs go down or revenues go up. For analytics to work, modelers need to build models that reflect business managers’ perceptions of business realities, and they need to make those connections clear.
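To make the “line through data points” idea concrete, here is a minimal sketch using scikit-learn; the numbers and the ad-spend-versus-revenue framing are purely illustrative, not drawn from any real business.

```python
# Minimal sketch: a model "draws a line" through historical points and
# then scores new records. Data and column meanings are illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression

# Historical observations: ad spend (in $k) vs. monthly revenue (in $k)
ad_spend = np.array([[10], [20], [30], [40], [50]])
revenue = np.array([120, 150, 210, 240, 300])

model = LinearRegression().fit(ad_spend, revenue)   # fit the line
print(model.coef_[0], model.intercept_)             # slope and intercept

# Apply the fitted line to new records to predict outcomes
new_spend = np.array([[25], [60]])
print(model.predict(new_spend))
```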

Modelers should also be realistic about the likely fruits of their scientific and artistic labors. Though some models make fresh observations about business data, most don’t; they extract relationships or patterns that people already know about but might overlook or ignore otherwise.

For example, a crime model predicts that the number of criminal incidents will increase in a particular neighborhood on a particular summer night. A grizzled police sergeant might cavalierly dismiss the model’s output, saying he was aware that would happen because an auto race takes place that day at the local speedway, which always spawns spillover crime in the adjacent neighborhood. “Tell me something I don’t already know,” he grumbles. But that doesn’t mean the modeling work was for naught: In this case, the model reinforces the policeman’s implicit knowledge, bringing it to the forefront of his consciousness, so he can act on it.

Occasionally, models do uncover breakthrough insights. An example comes from the credit card industry, which uses analytical models to detect and prevent fraud. Several years ago, analysts using a combination of algorithms and data sets uncovered a new racket: Perpetrators were using automatic number generators to guess credit card numbers on e-commerce websites. The models found that huge volumes of transactions and card declines were coming from nearly identical credit card numbers, thus uncovering a new pattern. The companies quickly figured out what was going on and implemented safeguards, avoiding millions of dollars in fraudulent transactions.

Many people think that to excel at analytics their companies need only hire a bunch of statisticians who understand the nuances of sophisticated algorithms and give them high-powered tools to crunch data. But that only gets you so far. The art of analytical modeling is a skill that requires intimate knowledge of an organization’s processes and data as well as the ability to communicate with business executives in business terms. Like fine furniture makers, analytics professionals who master these skills can build high-quality models with lasting value and reap huge rewards for their organizations in the process.

Algorithm

An algorithm (pronounced AL-go-rith-um) is a procedure or formula for solving a problem, based on conducting a sequence of specified actions. A computer program can be viewed as an elaborate algorithm. In mathematics and computer science, an algorithm usually means a small procedure that solves a recurrent problem.
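As a concrete example of a small procedure that solves a recurrent problem, here is Euclid’s algorithm for the greatest common divisor, sketched in Python:

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: repeatedly replace (a, b) with (b, a mod b)
    until the remainder is zero; the last nonzero value is the GCD."""
    while b:
        a, b = b, a % b
    return a

print(gcd(48, 36))  # 12
```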

Algorithms are widely used throughout all areas of IT (information technology). A search engine algorithm, for example, takes search strings of keywords and operators as input, searches its associated database for relevant web pages, and returns results.

An encryption algorithm transforms data according to specified actions to protect it. A secret key algorithm such as the U.S. Department of Defense’s Data Encryption Standard (DES), for example, uses the same key to encrypt and decrypt data. As long as the algorithm is sufficiently sophisticated, no one lacking the key can decrypt the data.
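To make the “same key encrypts and decrypts” idea concrete, here is a toy XOR cipher in Python. It only illustrates the secret-key principle; it is not DES and offers no real security.

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy secret-key cipher: XOR each byte with the repeating key.
    Running it again with the same key reverses the transformation.
    Illustrative only; this is NOT DES and is not secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = b"secret"
ciphertext = xor_cipher(b"pay $100 to alice", key)
plaintext = xor_cipher(ciphertext, key)  # the same key decrypts
print(plaintext)  # b'pay $100 to alice'
```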

The word algorithm derives from the name of the mathematician, Mohammed ibn-Musa al-Khwarizmi, who was part of the royal court in Baghdad and who lived from about 780 to 850. Al-Khwarizmi’s work is the likely source for the word algebra as well.

Business Analytics

Business analytics (BA) is the iterative, methodical exploration of an organization’s data, with an emphasis on statistical analysis. Business analytics is used by companies that are committed to making data-driven decisions. Data-driven companies treat their data as a corporate asset and actively look for ways to turn it into a competitive advantage. Successful business analytics depends on data quality, skilled analysts who understand the technologies and the business, and an organizational commitment to using data to gain insights that inform business decisions.

Specific types of business analytics include:

  • Descriptive analytics, which tracks key performance indicators (KPIs) to understand the present state of a business;
  • Predictive analytics, which analyzes trend data to assess the likelihood of future outcomes; and
  • Prescriptive analytics, which uses past performance to generate recommendations about how to handle similar situations in the future.

How business analytics works

Once the business goal of the analysis is determined, an analysis methodology is selected and data is acquired to support the analysis. Data acquisition often involves extraction from one or more business systems, cleansing and integration into a single repository such as a data warehouse or data mart. 

Initial analysis is typically performed against a smaller sample set of data. Analytic tools range from spreadsheets with statistical functions to complex data mining and predictive modeling applications. As patterns and relationships in the data are uncovered, new questions are asked and the analytic process iterates until the business goal is met.
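A minimal sketch of that initial pass, assuming the extracted data has already been loaded into a pandas DataFrame; the file name and column names below are hypothetical.

```python
import pandas as pd

# Hypothetical order records extracted from the data warehouse
orders = pd.read_csv("orders_extract.csv")          # assumed columns:
                                                     # region, channel, revenue
sample = orders.sample(frac=0.1, random_state=42)   # start with a smaller sample

# First exploratory questions: summary statistics and a group comparison
print(sample["revenue"].describe())
print(sample.groupby(["region", "channel"])["revenue"].mean())

# Patterns spotted here typically prompt new questions and another pass,
# eventually against the full data set.
```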

Deployment of predictive models involves scoring data records — typically in a database — and using the scores to optimize real-time decisions within applications and business processes. BA also supports tactical decision-making in response to unforeseen events. And, in many cases, the decision-making is automated to support real-time responses.
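A hedged sketch of the scoring step: a previously trained model (here, a hypothetical churn classifier saved with joblib) is applied to customer records, and the resulting scores drive a downstream decision. File and column names are assumptions.

```python
import pandas as pd
from joblib import load

# Load a previously trained classifier and the records to score
# (file and column names are hypothetical).
model = load("churn_model.joblib")
records = pd.read_csv("customers_to_score.csv")
features = records[["tenure_months", "monthly_spend", "support_tickets"]]

# Score each record with the probability of churn
records["churn_score"] = model.predict_proba(features)[:, 1]

# Applications and business processes can then act on the scores,
# e.g., flag high-risk accounts for a retention offer.
at_risk = records[records["churn_score"] > 0.8]
print(at_risk[["customer_id", "churn_score"]].head())
```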

Business analytics vs. business intelligence

While the terms business intelligence and business analytics are often used interchangeably, there are some key differences: business intelligence focuses mainly on descriptive reporting and querying that shows what has happened in the business, while business analytics emphasizes statistical methods and predictive models that help explain why it happened and what is likely to happen next.

Business analytics vs. data science

The more advanced areas of business analytics can start to resemble data science, but there is also a distinction between these two terms. Even when advanced statistical algorithms are applied to data sets, it doesn’t necessarily mean data science is involved. That’s because true data science involves more custom coding and exploring answers to open-ended questions.

Data scientists generally don’t set out to solve a specific question, as most business analysts do. Rather, they will explore data using advanced statistical methods and allow the features in the data to guide their analysis. There are a host of business analytics tools that can perform these kinds of functions automatically, requiring few of the special skills involved in data science.

Business analytics applications

Business analytics tools come in several different varieties:

  • Data visualization tools
  • Business intelligence reporting software
  • Self-service analytics platforms
  • Statistical analysis tools
  • Big data platforms

Self-service has become a major trend among business analytics tools. Users now demand software that is easy to use and doesn’t require specialized training. This has led to the rise of simple-to-use tools from companies such as Tableau and Qlik, among others. These tools can be installed on a single computer for small applications or in server environments for enterprise-wide deployments. Once they are up and running, business analysts and others with less specialized training can use them to generate reports, charts and web portals that track specific metrics in data sets.

Edge Analytics

Edge analytics is an approach to data collection and analysis in which an automated analytical computation is performed on data at a sensor, network switch or other device instead of waiting for the data to be sent back to a centralized data store.

Edge analytics has gained attention as the internet of things (IoT) model of connected devices has become more prevalent. In many organizations, streaming data from manufacturing machines, industrial equipment, pipelines and other remote devices connected to the IoT creates a massive glut of operational data, which can be difficult — and expensive — to manage. By running the data through an analytics algorithm as it’s created, at the edge of a corporate network, companies can set parameters on what information is worth sending to a cloud or on-premises data store for later use — and what isn’t.

Analyzing data as it’s generated can also decrease latency in the decision-making process on connected devices. For example, if sensor data from a manufacturing system points to the likely failure of a specific part, business rules built into the analytics algorithm interpreting the data at the network edge can automatically shut down the machine and send an alert to plant managers so the part can be replaced. That can save time compared to transmitting the data to a central location for processing and analysis, potentially enabling organizations to reduce or avoid unplanned equipment downtime.
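A minimal sketch of that kind of edge rule; the threshold, sensor readings and the shutdown and alert functions are hypothetical placeholders for real integrations.

```python
# Hypothetical edge rule: watch vibration readings from a machine sensor
# and act locally instead of waiting on a round trip to a central store.
VIBRATION_LIMIT = 7.5   # assumed threshold indicating a likely part failure

def shut_down_machine(machine_id: str) -> None:   # placeholder integration point
    print(f"STOP signal sent to {machine_id}")

def send_alert(message: str) -> None:             # placeholder integration point
    print("ALERT:", message)

def on_sensor_reading(machine_id: str, vibration: float) -> None:
    if vibration > VIBRATION_LIMIT:
        shut_down_machine(machine_id)             # act immediately at the edge
        send_alert(f"Machine {machine_id}: vibration {vibration:.1f} "
                   f"exceeds {VIBRATION_LIMIT}; replace the part.")
    # Normal readings need not be forwarded to the cloud at all.

on_sensor_reading("press-12", 8.2)
```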

Another primary benefit of edge analytics is scalability. Pushing analytics algorithms to sensors and network devices alleviates the processing strain on enterprise data management and analytics systems, even as the number of connected devices deployed by organizations, and the amount of data being generated and collected, increases.

How is edge analytics used?

One of the most common use cases for edge analytics is monitoring edge devices. This is particularly true for IoT devices. A data analytics platform might be deployed to monitor a large collection of devices and make sure they are functioning normally. If a problem does occur, an edge analytics platform might be able to take corrective action automatically. If automatic remediation isn’t possible, the platform might instead provide the IT staff with actionable insights that help them fix the problem.

Benefits of edge analytics

Edge analytics delivers several compelling benefits:

  • Near real-time analysis of data. Because analysis is performed near the data — often on board the device itself — the data can be analyzed in near real time. This would simply not be the case if the device had to transmit the data to a back-end server in the cloud or in a remote data center for processing.
  • Scalability. Edge analytics is by its very nature scalable. Because each device analyzes its own data, the computational workload is distributed across devices.
  • Possible reduction of costs. Significant costs are associated with traditional big data analytics. Regardless of whether the data is processed in a public cloud or in an organization’s own data center, there are costs tied to data storage, data processing and bandwidth consumption. Some of the edge analytics platforms for IoT devices use the IoT device’s hardware to perform the data analytics, thereby eliminating the need for back-end processing.
  • Improved security. If data is analyzed on board the device that created it, then it’s not necessary to transmit the full data set across the wire. This can help improve security because the raw data never leaves the device that created it.

Limitations of edge analytics

Like any other technology, edge analytics has its limits. Those limitations include:

  • Not all hardware supports it. Simply put, not every IoT device has the memory, CPU and storage hardware required to perform deep analytics onboard the device.
  • You might have to develop your own edge analytics platform. Edge analytics is still a relatively new technology. Although off-the-shelf analytical platforms do exist, it’s entirely possible that an organization might have to develop its own edge analytics platform based on the devices that it wants to analyze.

Applications of edge analytics

Edge analytics tends to be most useful in industrial environments that use many IoT sensors. In such environments, edge analytics can deliver benefits such as:

  • Improved uptime. If an edge analytics platform can monitor a sensor array, it might be able to take corrective action when problems occur. Even if the resolution isn’t automated, simply alerting an operator to a problem can help improve overall uptime.
  • Lower maintenance costs. By performing in-depth analysis of IoT devices, it might be possible to gain deep insight into device health and longevity. Depending on the environment, this might help the organization to reduce its maintenance costs by performing maintenance when it’s necessary rather than blindly following a maintenance schedule.
  • Predict failures. An in-depth analysis of IoT hardware might make it possible to accurately predict hardware failures in advance. This can enable organizations to take proactive steps to head off a failure.

Edge analytics vs. edge computing

Edge computing is based on the idea that data collection and data processing can be performed near the location where the data is either being created or consumed. Edge analytics uses these same devices and the data that they have already produced. An analytics model performs a deeper analysis of the data than what was initially performed. These analytics capabilities enable the creation of actionable insights, often directly on the device.

Cloud analytics vs. edge analytics

Both cloud analytics and edge analytics are techniques for gathering relevant data and then using that data to perform data analysis. The key difference between the two is that cloud analytics requires raw data to be transmitted to the cloud for analysis.

Although cloud analytics has its place, edge analytics has two main advantages. First, edge analytics incurs far lower latency than cloud analytics because data is analyzed on site — often within the device itself, in real time, as the data is created. The second advantage is that edge analytics doesn’t require network connectivity to the cloud. This means that edge analytics can be used in bandwidth-constrained environments, or in locations where cloud connectivity simply isn’t available.

Inductive Reasoning

Inductive reasoning is a logical process in which multiple premises, all believed true or found true most of the time, are combined to obtain a specific conclusion.

Inductive reasoning is often used in applications that involve prediction, forecasting, or behavior. Here is an example:

  • Every tornado I have ever seen in the United States rotated counterclockwise, and I have seen dozens of them.
  • We see a tornado in the distance, and we are in the United States.
  • I conclude that the tornado we see right now must be rotating counterclockwise.

A meteorologist will tell you that in the United States (which lies in the northern hemisphere), most tornadoes rotate counterclockwise, but not all of them do. Therefore, the conclusion is probably true, but not necessarily true. Inductive reasoning, unlike deductive reasoning, is not logically rigorous: imperfect or inaccurate conclusions can occur, however rarely, whereas in deductive reasoning the conclusions are mathematically certain.

Inductive reasoning is sometimes confused with mathematical induction, an entirely different process. Mathematical induction is a form of deductive reasoning, in which logical certainties are “daisy chained” to derive a general conclusion about an infinite number of objects or situations.
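For contrast, here is a standard textbook example of mathematical induction, sketched in LaTeX, in which each logically certain step is chained to the next:

```latex
\textbf{Claim.} For every positive integer $n$, \quad $1 + 2 + \cdots + n = \dfrac{n(n+1)}{2}$.

\textbf{Base case.} For $n = 1$: \; $1 = \dfrac{1(1+1)}{2}$.

\textbf{Inductive step.} Assume the claim holds for $n = k$. Then
\[
  1 + 2 + \cdots + k + (k+1) = \frac{k(k+1)}{2} + (k+1) = \frac{(k+1)(k+2)}{2},
\]
so the claim holds for $n = k+1$. By induction, it holds for every positive integer $n$.
```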

Supply Chain Analytics

Supply chain analytics refers to the processes organizations use to gain insight and extract value from the large amounts of data associated with the procurement, processing and distribution of goods. Supply chain analytics is an essential element of supply chain management (SCM).

The discipline of supply chain analytics has existed for over 100 years, but the mathematical models, data infrastructure, and applications underpinning these analytics have evolved significantly. Mathematical models have improved with better statistical techniques, predictive modeling and machine learning. Data infrastructure has changed with cloud infrastructure, complex event processing (CEP) and the internet of things. Applications have grown to provide insight across traditional application silos such as ERP, warehouse management, logistics and enterprise asset management.

An important goal of choosing supply chain analytics software is to improve forecasting and efficiency and be more responsive to customer needs. For example, predictive analytics on point-of-sale terminal data stored in a demand signal repository can help a business anticipate consumer demand, which in turn can lead to cost-saving adjustments to inventory and faster delivery.
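A minimal sketch of that idea, treating a moving average over hypothetical daily point-of-sale data as the demand signal; production demand planning uses far richer models.

```python
import pandas as pd

# Hypothetical daily point-of-sale units for one product
sales = pd.Series(
    [120, 135, 128, 160, 155, 170, 165, 180, 175, 190],
    index=pd.date_range("2024-01-01", periods=10, freq="D"),
)

# A 7-day moving average as a crude demand signal; its latest value serves
# as a naive forecast that inventory and delivery planning could react to.
demand_signal = sales.rolling(window=7).mean()
naive_forecast = demand_signal.iloc[-1]
print(f"Forecast demand for tomorrow: about {naive_forecast:.0f} units")
```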

Achieving end-to-end supply chain analytics requires bringing together information from across the supply chain, starting with the procurement of raw materials and extending through production, distribution and aftermarket services. This depends on effective integration between the many SCM and supply chain execution platforms that make up a typical company’s supply chain. The goal of such integration is supply chain visibility: the ability to view data on goods at every step in the supply chain.

Supply chain analytics software

Supply chain analytics software is generally available in two forms: embedded in supply chain software, or in a separate, dedicated business intelligence and analytics tool that has access to supply chain data. Most ERP vendors offer supply chain analytics features, as do vendors of specialized SCM software. Some IT consultancies develop software models that can be customized and integrated into a company’s business processes.

Some ERP and SCM vendors have begun applying CEP to their platforms for real-time supply chain analytics. Most ERP and SCM vendors have one-to-one integrations, but there is no standard. However, the Supply Chain Operations Reference (SCOR) model provides standard metrics for comparing supply chain performance to industry benchmarks.

Ideally, supply chain analytics software would be applied to the entire chain, but in practice it is often focused on key operational subcomponents, such as demand planning, manufacturing production, inventory management or transportation management. For example, supply chain finance analytics can help identify increased capital costs or opportunities to boost working capital; procure-to-pay analytics can help identify the best suppliers and provide early warning of budget overruns in certain expense categories; and transportation analytics software can predict the impact of weather on shipments.

How supply chain analytics works

Supply chain analytics brings together data from across different applications, infrastructure, third-party sources and emerging technologies such as IoT to improve decision-making across the strategic, tactical and operational processes that make up supply chain management. Supply chain analytics helps synchronize supply chain planning and execution by improving real-time visibility into these processes and their impact on customers and the bottom line. Increased visibility can also increase flexibility in the supply chain network by helping decision-makers to better evaluate tradeoffs between cost and customer service.

The process of creating supply chain analytics typically starts with data scientists who understand a particular aspect of the business, such as the factors that relate to cash flow, inventory, waste and service levels. These experts look for potential correlations between different data elements to build a predictive model that optimizes the output of the supply chain. They test out variations until they have a robust model.
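A hedged sketch of that exploration step: checking correlations between hypothetical supply chain variables and fitting a first predictive model of service level. The file and column names are assumptions.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical weekly supply chain metrics assembled for the analysis
df = pd.read_csv("supply_chain_weekly.csv")   # assumed columns below
features = ["inventory_days", "supplier_lead_time", "forecast_error"]
target = "service_level"

# Look for candidate relationships first...
print(df[features + [target]].corr()[target])

# ...then fit a first predictive model and inspect how each factor weighs in.
model = LinearRegression().fit(df[features], df[target])
for name, coef in zip(features, model.coef_):
    print(f"{name}: {coef:+.3f}")
```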

Supply chain analytics models that reach a certain threshold of success are deployed into production by data engineers with an eye toward scalability and performance. Data scientists, data engineers and business users work together to refine the way these data analytics are presented and operationalized in practice. Supply chain models are improved over time by correlating the performance of data analysis models in production with the business value they deliver.

Features of supply chain analytics

Supply chain analytics software usually includes most of the following features:

  • Data visualization. The ability to slice and dice data from different angles to improve insight and understanding.
  • Stream processing. Deriving insight from multiple data streams generated by, for example, the IoT, applications, weather reports and third-party data.
  • Social media integration. Using sentiment data from social feeds to improve demand planning.
  • Natural language processing. Extracting and organizing unstructured data buried in documents, news sources and data feeds.
  • Location intelligence. Deriving insight from location data to understand and optimize distribution.
  • Digital twin of the supply chain. Organizing data into a comprehensive model of the supply chain that is shared across different types of users to improve predictive and prescriptive analytics.
  • Graph databases. Organizing information into linked elements that make it easier to find connections, identify patterns and improve traceability of products, suppliers and facilities.

Types of supply chain analytics

A common lens used to delineate the main types of supply chain analytics is based on Gartner’s model of the four capabilities of analytics: descriptive, diagnostic, predictive and prescriptive.

  • Descriptive supply chain analytics uses dashboards and reports to help interpret what has happened. It often involves using a variety of statistical methods to search through, summarize and organize information about operations in the supply chain. This can be useful in answering questions like, “How have inventory levels changed over the last month?” or “What is the return on invested capital?”
  • Diagnostic supply chain analytics are used to figure out why something happened or is not working as well as it should. For example, “Why are shipments being delayed or lost?” or “Why is our company not achieving the same number of inventory turns as a competitor?”
  • Predictive supply chain analytics helps to foresee what is likely to happen in the future based on current data. For example, “How will new trade regulations or a pandemic lockdown affect the availability and cost of raw materials or goods?”
  • Prescriptive supply chain analytics helps prescribe or automate the best course of action using optimization or embedded decision logic. This can help improve decisions about when to launch a product, whether or not to build a factory or the best shipment strategy for each retail location.

Another way of breaking down types of supply chain analytics is by their form and function. Advisory firm Supply Chain Insights, for example, breaks down the types of supply chain analytics into the following functions:

  • Workflow
  • Decision support
  • Collaboration
  • Unstructured text mining
  • Structured data management

In this model, the different types of analytics feed into each other as part of an end-to-end ongoing process for improving supply chain management.

For example, a company could use unstructured text mining to turn raw data from contracts, social media feeds and news reports into structured data that is relevant to the supply chain. This improved, more structured data could then help automate and improve workflows, such as procure-to-pay processes. The data in digitized workflows is much easier to capture than data from manual workflows, thus increasing the data available for decision support systems. Better decision support could in turn enhance collaboration across different departments like procurement and warehouse management or between supply chain partners.

Other technologies are emerging as ways to improve the predictive models generated by supply chain analytics. For example, organizations are starting to use process mining to analyze how they execute business processes. This type of process analytics can be used to create a digital twin of the organization that can help identify supply chain opportunities for automation across procurement, production, logistics and finance. Augmented analytics can help business users ask questions about the business in plain language, with responses delivered in brief summaries. Graph analytics can shed light on the relationships between entities in the supply chain, such as how changes in a tier 3 supplier might affect tier 1 suppliers.

Supply chain analytics uses

Sales and operations planning uses supply chain analytics to match a manufacturer’s supply with demand by generating plans that align daily operations with corporate strategy. Supply chain analytics is also used to do the following:

  • improve risk management by identifying known risks and predicting future risks based on patterns and trends throughout the supply chain;
  • increase planning accuracy by analyzing customer data to identify factors that increase or decrease demand;
  • improve order management by consolidating data sources to assess inventory levels, predict demand and identify fulfillment issues;
  • streamline procurement by organizing and analyzing spending across departments to improve contract negotiations and identify opportunities for discounts or alternative sources; and
  • increase working capital by improving models for determining the inventory levels required to ensure service goals with minimal capital investment.

History of supply chain analytics

Supply chain analytics has its roots in the work of Frederick Taylor, whose 1911 publication, The Principles of Scientific Management, laid the groundwork for the modern fields of industrial engineering and supply chain management. Henry Ford adopted Taylor’s techniques in the creation of the modern assembly line and a supply chain that supported more efficient means of production.

The advent of mainframe computers gave rise to the data processing work done by IBM researcher Hans Peter Luhn, who some credit for coining the term business intelligence in his 1958 paper, “A Business Intelligence System.” His work helped build the foundation for the different types of data analytics used in supply chain analytics.

In 1963, Bud Lalonde, a professor at Ohio State University, proposed that physical distribution management should be combined with materials management, procurement and manufacturing into what he called business logistics. Around this time, management consultant Stafford Beer and others began exploring new ideas like the viable systems model for organizing business information into a structured hierarchy to improve business planning and execution. By the early 1980s, the burgeoning field was known as supply chain management.

As the internet became a force in the 1990s, people looked at how it could be applied in supply chain management. A pioneer in this area was the British technologist Kevin Ashton. As a young product manager tasked with solving the problem of keeping a popular lipstick on store shelves, Ashton hit upon radio frequency identification sensors as a way to automatically capture data about the movement of products across the supply chain. Ashton, who would go on to co-found the Massachusetts Institute of Technology’s Auto-ID Center that perfected RFID technology and sensors, coined the term internet of things to explain this revolutionary new feature of supply chain management.

The 1990s also saw the development of CEP by researchers such as the team headed by Stanford University’s David Luckham and others. CEP’s ability to capture incoming data from real-time events helped supply chain managers correlate low-level data related to factory operations, the physical movements of products, and weather into events that could then be analyzed by supply chain analytics tools. For example, data about production processes could be abstracted to factory performance, which in turn could be abstracted into business events related to things like inventory levels.

Another turning point in the field of supply chain analytics was the advent of cloud computing, a new vehicle for delivering IT infrastructure, software and platforms as a service. By providing a foundation for orchestrating data across multiple sources, the cloud has driven improvements in many types of analytics, including supply chain analytics. The emergence of data lake platforms such as Hadoop allowed enterprises to capture data from different sources on a common platform, further refining supply chain analytics by enabling companies to correlate more types of data. Data lakes also made it easier to implement advanced analytics that operated on a variety of structured and unstructured data from different applications, event streams and the IoT.

In recent years, robotic process automation — software that automates rote computer tasks previously performed by humans — has become a powerful tool in improving business automation and the ability to integrate data into analytics.

In addition, the artificial intelligence technique known as deep learning is increasingly being used to improve supply chain analytics. Deep learning techniques are driving advances in machine vision (used to improve inventory tracking), natural language understanding (used to automate contract management), and improvements in routing models.

Future trends of supply chain analytics

Supply chain analytics will continue to evolve in tandem with the evolution of analytics models, data structures and infrastructure, and the ability to integrate data across application silos. In the long run, advanced analytics will lead to more autonomous supply chains that can manage and respond to changes, much like self-driving cars are starting to do today. In addition, improvements in IoT, CEP and streaming architectures will enable enterprises to derive insight more quickly from a larger variety of data sources. AI techniques will continue to improve people’s ability to generate more accurate and useful predictive insights that can be embedded into workflows.

Other technologies expected to play a big role in supply chain analytics and management include the following:

Blockchain. Blockchain infrastructure and technologies promise to improve visibility and traceability across more layers of the supply chain. These same building blocks could drive companies to use smart contracts to automate, control and execute transactions.

Graph analytics. Predicted to power more than half of all enterprise applications within a decade, graph analytics will help supply chain managers better analyze the links of various entities in the supply chain.

Hyperautomation. The technologies underpinning hyperautomation will accelerate supply chain automation by using process mining analytics to identify automation candidates, generate the automations and manage these automated processes.

Statistical Analysis

Statistical analysis is the collection and interpretation of data in order to uncover patterns and trends. It is a component of data analytics. Statistical analysis can be used in situations like gathering research interpretations, statistical modeling or designing surveys and studies. It can also be useful for business intelligence organizations that have to work with large data volumes.

In the context of business intelligence (BI), statistical analysis involves collecting and scrutinizing every data sample in a set of items from which samples can be drawn. A sample, in statistics, is a representative selection drawn from a total population. 

The goal of statistical analysis is to identify trends. A retail business, for example, might use statistical analysis to find patterns in unstructured and semi-structured customer data that can be used to create a more positive customer experience and increase sales. 

Steps of statistical analysis

Statistical analysis can be broken down into five discrete steps, as follows:

  • Describe the nature of the data to be analyzed.
  • Explore the relation of the data to the underlying population.
  • Create a model to summarize an understanding of how the data relates to the underlying population.
  • Prove (or disprove) the validity of the model.
  • Employ predictive analytics to run scenarios that will help guide future actions.
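A compact sketch of the five steps above on a small, hypothetical sample of order values, using SciPy; a real analysis would involve far more scrutiny at each step.

```python
import numpy as np
from scipy import stats

# Hypothetical sample: order values (in dollars) drawn from a larger population
sample = np.array([52, 61, 48, 75, 66, 58, 70, 63, 55, 69])

# 1-2. Describe the data and how the sample relates to the population
print(sample.mean(), sample.std(ddof=1))

# 3. Model: assume order values are roughly normally distributed
mu, sigma = stats.norm.fit(sample)

# 4. Check the model: a normality test (a small p-value would reject it)
print(stats.shapiro(sample))

# 5. Use the model predictively, e.g., estimate the share of orders above $80
print(1 - stats.norm.cdf(80, mu, sigma))
```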

Statistical analysis software

Software for statistical analysis typically allows users to do more complex analyses by including additional tools for organizing and interpreting data sets, as well as for presenting that data. IBM SPSS Statistics, JMP and Stata are some examples of statistical analysis software. IBM SPSS Statistics, for example, covers much of the analytical process, from data preparation and data management to analysis and reporting. The software includes a customizable interface; although it can be daunting for newcomers, it is relatively easy to use for those experienced in how it works.

Analytic Database

An analytic database is a read-only system that stores historical data on business metrics such as sales performance and inventory levels. Business analysts, corporate executives and other workers can run queries and reports against an analytic database. The information is updated on a regular basis to incorporate recent transaction data from an organization’s operational systems.

An analytic database is specifically designed to support business intelligence (BI) and analytic applications, typically as part of a data warehouse or data mart. This differentiates it from an operational, transactional or OLTP database, which is used for transaction processing, i.e., order entry and other “run the business” applications. Databases that do transaction processing can also be used to support data warehouses and BI applications, but analytic database vendors claim that their products offer performance and scalability advantages over conventional relational database software.

There currently are five main types of analytic databases on the market:

Columnar databases, which organize data by columns instead of rows, thus reducing the number of data elements that typically have to be read by the database engine while processing queries.

Data warehouse appliances, which combine the database with hardware and BI tools in an integrated platform that’s tuned for analytical workloads and designed to be easy to install and operate.

In-memory databases, which load the source data into system memory in a compressed, non-relational format in an attempt to streamline the work involved in processing queries.

Massively parallel processing (MPP) databases, which spread data across a cluster of servers, enabling the systems to share the query processing workload.

Online analytical processing (OLAP) databases, which store multidimensional “cubes” of aggregated data for analyzing information based on multiple data attributes.
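A toy sketch of why the columnar layout in the first category helps: an aggregate query that touches only one column reads far less data than a row-by-row scan. The data below is illustrative.

```python
# Row-oriented layout: every record is stored (and read) as a whole
rows = [
    {"order_id": 1, "region": "EMEA", "revenue": 120.0},
    {"order_id": 2, "region": "APAC", "revenue": 95.0},
    {"order_id": 3, "region": "EMEA", "revenue": 210.0},
]
total = sum(r["revenue"] for r in rows)   # touches every field of every row

# Column-oriented layout: each column is stored contiguously, so an
# aggregate over one column needs to read only that column
columns = {
    "order_id": [1, 2, 3],
    "region":   ["EMEA", "APAC", "EMEA"],
    "revenue":  [120.0, 95.0, 210.0],
}
total = sum(columns["revenue"])           # touches a single column
print(total)
```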

Real-Time Analytics

Real-time analytics is the use of data and related resources for analysis as soon as it enters the system. The adjective real-time refers to a level of computer responsiveness that a user senses as immediate or nearly immediate. The term is often associated with streaming data architectures and real-time operational decisions that can be made automatically through robotic process automation and policy enforcement.

Whereas historical data analysis uses a set of historical data for batch analysis, real-time analytics visualizes and analyzes the data as it appears in the computer system. This enables data scientists to use real-time analytics for purposes such as:

  • Forming operational decisions and applying them to production activities including business processes and transactions on an ongoing basis.
  • Viewing dashboard displays in real time with constantly updated transactional data sets.
  • Utilizing existing prescriptive and predictive analytics.
  • Reporting historical and current data simultaneously.

Real-time analytics software has three basic components:

  • an aggregator that gathers data event streams (and perhaps batch files) from a variety of data sources;
  • a broker that makes data available for consumption; and
  • an analytics engine that analyzes the data, correlates values and blends streams together.

The system that receives and sends data streams and executes the application and real-time analytics logic is called the stream processor.
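A minimal sketch of those pieces working together: a simulated aggregator feeds events to an analytics step that maintains a running average per sensor. In practice a broker such as a message queue and a dedicated stream processor would sit in between; everything here is illustrative.

```python
from collections import defaultdict

# Simulated event stream gathered by the aggregator; a broker would normally
# hand these events to the analytics engine as they arrive.
events = [
    {"sensor": "temp-1", "value": 21.5},
    {"sensor": "temp-2", "value": 19.8},
    {"sensor": "temp-1", "value": 22.1},
    {"sensor": "temp-1", "value": 22.4},
]

running_sum = defaultdict(float)
running_count = defaultdict(int)

for event in events:                  # the "analytics engine" loop
    key = event["sensor"]
    running_sum[key] += event["value"]
    running_count[key] += 1
    avg = running_sum[key] / running_count[key]
    # The correlated value is available as soon as the data enters the system.
    print(f"{key}: running average {avg:.2f}")
```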

How real-time analytics works

Real-time analytics often takes place at the edge of the network to ensure that data analysis is done as close to the data’s origin as possible. In addition to edge computing, other technologies that support real-time analytics include:

  • Processing in memory — a chip architecture in which the processor is integrated into a memory chip to reduce latency. 
  • In-database analytics — a technology that allows data processing to be conducted within the database by building analytic logic into the database itself. 
  • Data warehouse appliances — a combination of hardware and software products designed specifically for analytical processing. An appliance allows the purchaser to deploy a high-performance data warehouse right out of the box. 
  • In-memory analytics — an approach to querying data when it resides in random access memory, as opposed to querying data that is stored on physical disks.
  • Massively parallel processing — the coordinated processing of a program by multiple processors that work on different parts of the program, with each processor using its own operating system and memory.

In order for the real-time data to be useful, the real-time analytics applications being used should have high availability and low response times. These applications should also be able to manage large amounts of data, up to terabytes, while still returning answers to queries within seconds.

The term real-time also includes managing changing data sources — something that may arise as market and business factors change within a company. As a result, the real-time analytics applications should be able to handle big data. The adoption of real-time big data analytics can maximize business returns, reduce operational costs and introduce an era where machines can interact over the internet of things using real-time information to make decisions on their own.

Different technologies have been designed to meet these demands, including the growing quantities and diversity of data. Some of these technologies are based on specialized appliances that combine hardware and software into a single system. Others utilize a special processor and memory chip combination, or a database with analytics capabilities embedded in its design.

Benefits of real-time analytics

Real-time analytics enables businesses to react without delay, quickly detect and respond to patterns in user behavior, take advantage of opportunities that could otherwise be missed and prevent problems before they arise.

Businesses that utilize real-time analytics greatly reduce risk throughout their company, since the system uses data to predict outcomes and suggest alternatives rather than relying on speculation based on past events or recent scans, as is the case with historical data analytics. Real-time analytics provides insights into what is going on in the moment.

Other benefits of real-time analytics include:

  • Data visualization. Real-time data can be visualized and reflects occurrences throughout the company as they occur, whereas historical data can only be placed into a chart in order to communicate an overall idea.
  • Improved competitiveness. Businesses that use real-time analytics can identify trends and benchmarks faster than their competitors who are still using historical data. Real-time analytics also allows businesses to evaluate their partners’ and competitors’ performance reports instantaneously.
  • Precise information. Real-time analytics focuses on instant analyses that are consistently useful in the creation of focused outcomes, helping ensure time is not wasted on the collection of useless data.
  • Lower costs. While real-time technologies can be expensive, their multiple and constant benefits make them more profitable when used long term. Furthermore, the technologies help avoid delays in using resources or receiving information.
  • Faster results. The ability to instantly classify raw data allows queries to more efficiently collect the appropriate data and sort through it quickly. This, in turn, allows for faster and more efficient trend prediction and decision making.

Challenges

One major challenge faced in real-time analytics is the vague definition of real time and the inconsistent requirements that result from the various interpretations of the term. As a result, businesses must invest a significant amount of time and effort to collect specific and detailed requirements from all stakeholders in order to agree on a specific definition of real time, what is needed for it and what data sources should be used.

Once the company has unanimously decided on what real time means, it faces the challenge of creating an architecture with the ability to process data at high speeds. Unfortunately, data sources and applications can cause processing-speed requirements to vary from milliseconds to minutes, making creation of a capable architecture difficult. Furthermore, the architecture must also be capable of handling quick changes in data volume and should be able to scale up as the data grows.

The implementation of a real-time analytics system can also present a challenge to a business’s internal processes. The technical tasks required to set up real-time analytics — such as creation of the architecture — often cause businesses to ignore changes that should be made to internal processes. Enterprises should view real-time analytics as a tool and starting point for improving internal processes rather than as the ultimate goal of the business.

Finally, companies may find that their employees are resistant to the change when implementing real-time analytics. Therefore, businesses should focus on preparing their staff by providing appropriate training and fully communicating the reasons for the change to real-time analytics.

Use cases for real-time analytics in customer experience management

In customer relations management and customer experience management, real-time analytics can provide up-to-the-minute information about an enterprise’s customers and present it so that better and quicker business decisions can be made — perhaps even within the time span of a customer interaction. 

Here are some examples of how enterprises are tapping into real-time analytics:

  • Fine-tuning features for customer-facing apps. Real-time analytics adds a level of sophistication to software rollouts and supports data-driven decisions for core feature management. 
  • Managing location data. Real-time analytics can be used to determine what data sets are relevant to a particular geographic location and signal the appropriate updates.
  • Detecting anomalies and frauds. Real-time analytics can be used to identify statistical outliers caused by security breaches, network outages or machine failures. 
  • Empowering advertising and marketing campaigns. Data gathered from ad inventory, web visits, demographics and customer behavior can be analyzed in real time to uncover insights that hopefully will improve audience targeting, pricing strategies and conversion rates.
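Tying back to the anomaly detection use case above, here is a minimal sketch that flags statistical outliers in a stream using a running z-score; the threshold and values are illustrative only.

```python
import math

def make_outlier_detector(threshold: float = 3.0):
    """Return a checker that flags values more than `threshold` standard
    deviations from the running mean, updated online (Welford's method)."""
    n, mean, m2 = 0, 0.0, 0.0

    def check(x: float) -> bool:
        nonlocal n, mean, m2
        is_outlier = False
        if n >= 2:
            std = math.sqrt(m2 / (n - 1))
            is_outlier = std > 0 and abs(x - mean) / std > threshold
        # Update the running statistics with the new observation.
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        return is_outlier

    return check

check = make_outlier_detector()
for value in [100, 102, 99, 101, 100, 250, 98]:   # e.g., transaction amounts
    if check(value):
        print("anomaly:", value)                  # flags 250
```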

Examples

Examples of real-time analytics include:

  • Real-time credit scoring. Instant updates of individuals’ credit scores allow financial institutions to immediately decide whether or not to extend the customer’s credit.
  • Financial trading. Real-time big data analytics is being used to support decision-making in financial trading. Institutions use financial databases, satellite weather stations and social media to instantaneously inform buying and selling decisions.
  • Targeting promotions. Businesses can use real-time analytics to deliver promotions and incentives to customers while they are in the store and surrounded by the merchandise to increase the chances of a sale.
  • Healthcare services. Real-time analytics is used in wearable devices — such as smartwatches — and has already proven to save lives through the ability to monitor statistics, such as heart rate, in real time.
  • Emergency and humanitarian services. By attaching real-time analytical engines to edge devices — such as drones — incident responders can combine powerful information, including traffic, weather and geospatial data, to make better informed and more efficient decisions that can improve their abilities to respond to emergencies and other events.

Future

The future of pharmaceutical marketing and sales is being greatly impacted by the use of real-time analytics. It is expected that more pharmaceutical companies will begin using emerging technologies and implementing real-time analytics instead of relying on traditional methods to gain deeper insights into customer behavior and the market landscape. This has the potential to reduce costs through accurate predictions while also increasing sales and profit by optimizing marketing.

Higher education is also changing with the use of real-time analytics. Organizations can start marketing to prospective students who are best fit for their institution based on factors such as test scores, academic records and financial standing. Real-time, predictive analytics can help educational organizations gauge the probability of the student graduating and using their degree for gainful employment as well as predict a class’ debt load and earnings after graduation.

Unfortunately, the ever-increasing number of machines and technical devices in the world, and the expanding amount of information they capture, makes it harder and harder to gain valuable insights from the data. One solution is the open source Elastic Stack, a collection of products that centralizes, stores, analyzes and displays any desired log and machine data in real time. Open source is believed to be the future of computer programs, especially in data-driven fields like business intelligence.