Data Analytics

Data analytics (DA) is the process of examining data sets in order to find trends and draw conclusions about the information they contain. Increasingly, data analytics is done with the aid of specialized systems and software. Data analytics technologies and techniques are widely used in commercial industries to enable organizations to make more-informed business decisions. They are also used by scientists and researchers to verify or disprove scientific models, theories and hypotheses.

As a term, data analytics predominantly refers to an assortment of applications, from basic business intelligence (BI), reporting and online analytical processing (OLAP) to various forms of advanced analytics. In that sense, it’s similar in nature to business analytics, another umbrella term for approaches to analyzing data. The difference is that the latter is oriented to business uses, while data analytics has a broader focus. The expansive view of the term isn’t universal, though: In some cases, people use data analytics specifically to mean advanced analytics, treating BI as a separate category.

Data analytics initiatives can help businesses increase revenues, improve operational efficiency, and optimize marketing campaigns and customer service efforts. They can also help businesses respond quickly to emerging market trends and gain a competitive edge over rivals. The ultimate goal of data analytics, however, is boosting business performance. Depending on the particular application, the data that’s analyzed can consist of either historical records or new information that has been processed for real-time analytics. In addition, it can come from a mix of internal systems and external data sources.

Types of data analytics applications

At a high level, data analytics methodologies include exploratory data analysis (EDA) and confirmatory data analysis (CDA). EDA aims to find patterns and relationships in data, while CDA applies statistical techniques to determine whether hypotheses about a data set are true or false. EDA is often compared to detective work, while CDA is akin to the work of a judge or jury during a court trial, a distinction first drawn by statistician John W. Tukey in 1977.
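The contrast between the two approaches can be sketched in a few lines of Python. This is a minimal illustration only: the ad spend and sales figures are made up, and the choice of a correlation coefficient for the exploratory step and a permutation test for the confirmatory step is an assumption for the example, not a prescribed method.

```python
import random
from statistics import mean

# Hypothetical sample data: weekly ad spend and sales (illustrative only).
ad_spend = [10, 12, 15, 17, 20, 22, 25, 27, 30, 32]
sales = [40, 43, 50, 49, 58, 60, 66, 65, 74, 78]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# EDA: look for a pattern in the data without a prior hypothesis.
observed_r = pearson(ad_spend, sales)

def permutation_p_value(xs, ys, trials=2000, seed=0):
    """CDA: estimate how often shuffled data shows a correlation
    at least as strong as the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    return hits / trials

p_value = permutation_p_value(ad_spend, sales)
print(f"correlation = {observed_r:.3f}, permutation p-value = {p_value:.4f}")
```

A small p-value suggests the pattern found during exploration is unlikely to be a product of chance, which is the kind of verdict CDA is meant to deliver.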

Data analytics can also be separated into quantitative data analysis and qualitative data analysis. The former involves the analysis of numerical data with quantifiable variables that can be compared or measured statistically. The qualitative approach is more interpretive: it focuses on understanding the content of non-numerical data like text, images, audio and video, including common phrases, themes and points of view.

At the application level, BI and reporting provide business executives and corporate workers with actionable information about key performance indicators, business operations, customers and more. In the past, data queries and reports typically were created for end users by BI developers who worked in IT. Now, more organizations use self-service BI tools that let executives, business analysts and operational workers run their own ad hoc queries and build reports themselves.

More advanced types of data analytics include data mining, which involves sorting through large data sets to identify trends, patterns and relationships. Another is predictive analytics, which seeks to predict customer behavior, equipment failures and other future events. Machine learning can also be used for data analytics, with automated algorithms churning through data sets more quickly than data scientists can via conventional analytical modeling. Big data analytics applies data mining, predictive analytics and machine learning tools. Text mining provides a means of analyzing documents, emails and other text-based content.

Data analytics initiatives support a wide variety of business uses. For example, banks and credit card companies analyze withdrawal and spending patterns to prevent fraud and identity theft. E-commerce companies and marketing services providers use clickstream analysis to identify website visitors who are likely to buy a particular product or service based on navigation and page-viewing patterns. Healthcare organizations mine patient data to evaluate the effectiveness of treatments for cancer and other diseases. Mobile network operators examine customer data to forecast churn, which allows them to take steps to prevent defections to business rivals. To boost customer relationship management efforts, other companies engage in CRM analytics to segment customers for marketing campaigns and equip call center workers with up-to-date information about callers.

Inside the data analytics process

Data analytics applications involve more than just analyzing data, particularly on advanced analytics projects. Much of the required work takes place upfront, in collecting, integrating and preparing data and then developing, testing and revising analytical models to ensure that they produce accurate results. In addition to data scientists and other data analysts, analytics teams often include data engineers, whose job is to help get data sets ready for analysis.

The analytics process starts with data collection. Data scientists identify the information they need for a particular analytics application, and then work on their own or with data engineers and IT staff to assemble it for use. Data from different source systems may need to be combined via data integration routines, transformed into a common format and loaded into an analytics system, such as a Hadoop cluster, NoSQL database or data warehouse.
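As a rough sketch of that integration step, the Python below maps records from two source systems into a common format and loads them into a single collection. The source systems, field names and sample records are all hypothetical, and a plain list stands in for the analytics system.

```python
from datetime import datetime

# Hypothetical records from two source systems with different layouts.
crm_rows = [
    {"CustomerID": "17", "FullName": "Ada Lovelace", "SignupDate": "03/15/2021"},
    {"CustomerID": "42", "FullName": "Alan Turing", "SignupDate": "11/02/2020"},
]
billing_rows = [
    {"cust_id": 99, "name": "grace hopper", "created": "2019-07-30"},
]

def from_crm(row):
    """Transform a CRM record (MM/DD/YYYY dates) into the common format."""
    return {
        "customer_id": int(row["CustomerID"]),
        "name": row["FullName"].strip().title(),
        "signup_date": datetime.strptime(row["SignupDate"], "%m/%d/%Y").date(),
        "source": "crm",
    }

def from_billing(row):
    """Transform a billing record (ISO dates) into the common format."""
    return {
        "customer_id": int(row["cust_id"]),
        "name": row["name"].strip().title(),
        "signup_date": datetime.strptime(row["created"], "%Y-%m-%d").date(),
        "source": "billing",
    }

# "Load" step: a plain list stands in for the analytics system.
warehouse = [from_crm(r) for r in crm_rows] + [from_billing(r) for r in billing_rows]
warehouse.sort(key=lambda r: r["signup_date"])
```

Keeping one small transform function per source makes it easier to add another system later without touching the existing mappings.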

In other cases, the collection process may consist of pulling a relevant subset out of a stream of data that flows into, for example, Hadoop. This data is then moved to a separate partition in the system so it can be analyzed without affecting the overall data set.
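The idea of pulling a subset out of a stream can be sketched as follows. The event structure and the `event_stream` and `select` helpers are assumptions made for illustration; a real deployment would read from the platform's own streaming interface rather than an in-memory list.

```python
def event_stream():
    """Hypothetical stream of incoming events (stands in for data flowing into a cluster)."""
    events = [
        {"type": "page_view", "user": "u1", "page": "/home"},
        {"type": "purchase", "user": "u2", "amount": 30.0},
        {"type": "page_view", "user": "u2", "page": "/cart"},
        {"type": "purchase", "user": "u1", "amount": 12.5},
    ]
    yield from events

def select(stream, predicate):
    """Lazily yield only the events that match the predicate."""
    return (event for event in stream if predicate(event))

# Copy the relevant subset into its own "partition" so that analysis
# does not touch the overall data set.
purchases_partition = list(select(event_stream(), lambda e: e["type"] == "purchase"))
total_revenue = sum(e["amount"] for e in purchases_partition)
```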

Once the data that’s needed is in place, the next step is to find and fix data quality problems that could affect the accuracy of analytics applications. That includes running data profiling and data cleansing tasks to ensure the information in a data set is consistent and that errors and duplicate entries are eliminated. Additional data preparation work is then done to manipulate and organize the data for the planned analytics use. Data governance policies are then applied to ensure that the data follows corporate standards and is being used properly.
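A minimal sketch of profiling and cleansing, assuming a handful of made-up customer records: the profiling function counts empty values per field, and the cleansing function normalizes values, drops rows with no email address and removes duplicates. The field names and rules are illustrative assumptions, not a standard.

```python
# Hypothetical raw records with typical quality problems.
raw = [
    {"email": "Ann@Example.com ", "country": "US", "age": "34"},
    {"email": "ann@example.com", "country": "us", "age": "34"},  # duplicate once normalized
    {"email": "bob@example.com", "country": "", "age": "n/a"},   # missing and invalid values
    {"email": "", "country": "DE", "age": "51"},                 # missing key field
]

def profile(rows):
    """Data profiling: count empty values per field."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            if not value.strip():
                counts[field] = counts.get(field, 0) + 1
    return counts

def cleanse(rows):
    """Data cleansing: normalize values, drop rows without an email, remove duplicates."""
    seen, clean = set(), []
    for row in rows:
        email = row["email"].strip().lower()
        if not email or email in seen:
            continue
        seen.add(email)
        age = row["age"].strip()
        clean.append({
            "email": email,
            "country": row["country"].strip().upper() or None,
            "age": int(age) if age.isdigit() else None,
        })
    return clean

missing = profile(raw)
cleaned = cleanse(raw)
```

Running the profile first shows where the problems are concentrated, which helps decide which cleansing rules are worth writing.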

From here, a data scientist builds an analytical model, using predictive modeling tools or other analytics software and programming languages such as Python, Scala, R and SQL. The model is initially run against a partial data set to test its accuracy. Typically, it’s then revised and tested again, a process known as “training” the model that continues until it functions as intended. Finally, the model is run in production mode against the full data set, something that can be done once to address a specific information need or on an ongoing basis as the data is updated.
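One common way to test a model against a partial data set is a hold-out split, sketched below with a simple least-squares line. The data (machine age versus maintenance hours) is invented, and the choice of linear regression and of mean absolute error as the accuracy measure are assumptions for this example.

```python
# Hypothetical data: (machine age in years, maintenance hours per month).
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 8.1), (5, 9.8),
        (6, 12.2), (7, 13.9), (8, 16.1), (9, 18.0), (10, 20.1)]

def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, points):
    """Average absolute gap between predicted and actual values."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)

# Fit on a partial data set, then check accuracy on rows the model has not seen.
train, holdout = data[::2], data[1::2]
model = fit_line(train)
holdout_error = mean_abs_error(model, holdout)

# Once the error is acceptable, run the model against the full data set.
predictions = [model[0] * x + model[1] for x, _ in data]
```

If the hold-out error were too high, the model would be revised and the check repeated before moving to the full data set.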

In some cases, analytics applications can be set to automatically trigger business actions, such as stock trades by a financial services firm. Otherwise, the last step in the data analytics process is communicating the results generated by analytical models to business executives and other end users. Charts and other infographics can be designed to make findings easier to understand. Data visualizations often are incorporated into BI dashboard applications that display data on a single screen and can be updated in real time as new information becomes available.

Data analytics vs. data science

As automation grows, data scientists will focus more on business needs, strategic oversight and deep learning. Data analysts who work in business intelligence will focus more on model creation and other routine tasks. In general, data scientists concentrate efforts on producing broad insights, while data analysts focus on answering specific questions. In terms of technical skills, future data scientists will need to focus more on the machine learning operations process, also called MLOps.

Speaking at our Information Builders’ Summit, IDC Group vice president Dan Vesset estimated that knowledge workers spend less than 20% of their time on data analysis. The rest of their time is taken up with finding, preparing and managing data. “An organisation plagued by the lack of relevant data, technology and processes, employing 1000 knowledge workers, wastes over $5.7 million annually searching for, but not finding information,” warned Vesset.

Vesset’s comments underline the fact that data must be business-ready before it can generate value through advanced analytics, predictive analytics, IoT, or artificial intelligence (AI).

As we’ve seen from numerous enterprise case studies, co-ordination of data and analytics strategies and resources is the key to generating return on analytics investments.

Building the case for aligning data and analytics strategies

As data sources become more abundant, it’s important for organisations to develop a clear data strategy, which lays out how data will be acquired, stored, cleansed, managed, secured, used and analysed, and the business impact of each stage in the data lifecycle.

Equally, organisations need a clear analytics strategy which clarifies the desired business outcomes.

Analytics strategy often follows four clear stages: starting with descriptive analytics; moving to diagnostic analytics; advancing to predictive analytics and ultimately to prescriptive analytics.
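The four stages can be illustrated on a single small data set. The sketch below uses invented regional sales figures; the naive trend forecast and the 60-unit threshold in the prescriptive step are assumptions chosen only to keep the example short.

```python
# Hypothetical monthly unit sales by region (illustrative figures only).
sales = {
    "north": [100, 110, 120, 130],
    "south": [90, 85, 70, 60],
}

# Descriptive: what happened? Total units sold per month.
monthly_totals = [sum(month) for month in zip(*sales.values())]

# Diagnostic: why? Change per region between the first and last month.
change_by_region = {region: units[-1] - units[0] for region, units in sales.items()}
weakest_region = min(change_by_region, key=change_by_region.get)

# Predictive: what is likely to happen? Naive forecast that extends
# the average month-over-month change by one period.
def forecast_next(units):
    steps = [b - a for a, b in zip(units, units[1:])]
    return units[-1] + sum(steps) / len(steps)

forecasts = {region: forecast_next(units) for region, units in sales.items()}

# Prescriptive: what should be done? A simple rule (an assumption for
# this sketch): flag any region forecast to fall below 60 units.
actions = {
    region: "investigate and promote" if value < 60 else "maintain"
    for region, value in forecasts.items()
}
```

Each stage builds on the output of the one before it, which is why the data needed for the later stages has to be planned for from the start.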

These two strategies must be aligned because the type of analytics required by the organisation will have a direct impact on data management aspects such as storage and latency requirements. For example, operational analytics and decision support will place a different load on the infrastructure from customer portal analytics, which must be able to scale to meet sudden spikes in demand.

If operational analytics and IoT are central to your analytics strategy, then the integration of new data formats and real-time streaming will need to be covered in your data strategy.

Similarly, if your organisation’s analytics strategy is to deliver insights directly to customers, then data quality will be a critical factor in your data strategy.

When the analytics workload is considered, the impact on the data strategy becomes clear. While a data lake project will serve your data scientists and back office analysts, your customers and supply chain managers may be left in the dark.

Putting business outcomes first

Over the past four decades, we have seen the majority of enterprise efforts devoted to back-office analytics and data science in order to deliver data-based insights to management teams.

However, the most effective analytics strategy is to deliver insights to the people who can use them to generate the biggest business benefits.

We typically observe faster time to value where the analytics strategy focuses on delivering insights directly to operational workers to support their decision-making; or to add value to the services provided to partners and customers.

How to align data and analytics strategies

One proven approach is to look at business use cases for each stage in the analytics strategy. This might include descriptive management scorecards and dashboards; diagnostic back-office analytics and data science; operational analytics and decision support; M2M and IoT; AI; or portal analytics created to enhance the customer experience.

Identify all the goals and policies that must be included in your strategies. Create a framework to avoid gaps in data management so that the right data will be captured, harmonised and stored to allow it to be used effectively within the analytics strategy.

Look at how your organisation enables access to and integration of diverse data sources. Consider how it uses software, batch or real-time processing and data streams from all internal systems.

By looking at goals and policies, the organisation can accommodate any changes to support a strong combined data and analytics strategy.

Focus on data quality

Once you have defined your data and analytics strategies, it’s critical to address data quality. Mastering data ensures that your people can trust the analytic insights derived from it. Taking this first step will greatly simplify your organisation’s subsequent analytics initiatives.

As data is the fuel of the analytics engine, performance will depend on data refinement.

The reality for many data professionals is that they struggle to gain organisation-wide support for a data strategy. Business managers are more inclined to invest in tangibles, such as dashboards. Identifying the financial benefits of investing in a data quality programme or a master data management initiative is a challenge, unless something has previously gone wrong that has convinced the management team that valuable analytics outputs are directly tied to quality data inputs.

To gain their support for a data strategy, consider involving line of business managers by asking them what the overall goals and outputs are for their analytics initiatives. An understanding of the desired outputs of data will then guide the design of the data infrastructure.

Pulling together

Often we see data management, analytics and business intelligence being handled by different teams, using different approaches, within the same organisation. This can create a disconnection between what the business wants to achieve from data assets and what is possible. Data and analytics strategies need to be aligned so that there is a clear link between the way the organisation manages its data and how it gains business insights.

  • Include people from different departments who possess a cross section of skills: business, finance, marketing, customer service, IT, business intelligence, data science and statistics. Understand how these colleagues interact and what is important to them in terms of data outputs.
  • Take into account how data interconnects with your organisation’s daily business processes. This will help answer questions about the required data sources, connections, latency and inputs to your analytics strategy. Ensuring that they work together connects data to business value.
  • Finally, consider the technology components that are required. This entails looking at different platforms that deliver the required data access, data integration, data cleansing, storage and latency, to support your required business outcomes.

Measuring the benefits

The following organisations aligned their data and analytics strategies to deliver clear business outcomes:

  • Food for the Poor used high-quality data and analytics to reach its fundraising target more quickly, reducing the time taken to raise $10 million from six months to six days, so that it could more quickly help people in dire need.
  • Lipari Foods integrated IoT, logistics and geolocation data, enabling it to analyse supply chain operations so that it uses warehouse space more efficiently, allowing it to run an agile operation with a small team of people.
  • St Luke’s University Health Network mastered its data as part of its strategy to target specific households to make them aware of specialised medications, reaching 98 per cent uptake in one of its campaigns focused on thirty households. “Rather than getting mired in lengthy data integration and master data management (MDM) processes without any short-term benefits, stakeholders decided to focus on time-to-value by letting business priorities drive program deliverables,” explains Dan Foltz, program manager for the EDW and analytics implementation at St. Luke’s. “We simultaneously proceeded with data integration, data governance, and BI development to achieve our business objectives as part of a continuous flow. The business had new BI assets to meet their needs in a timely fashion, while the MDM initiative improved those assets and enabled progressively better analysis,” he adds. This approach allowed the St. Luke’s team to deliver value throughout the implementation.

These are just a few examples of organisations whose cohesive data and analytics strategies have enabled them to generate better value from diverse and complex data sets.

Gaining better value from data

While analytics initiatives often begin with one or two clear business cases, it’s important to ensure that the overall data analytics strategy is bigger than any single initiative. Organisations that focus on individual projects may find that they have overlooked key data infrastructure requirements once they try to scale.

As Grace Auh, Business Intelligence and Decision Support manager at Markham Stouffville Hospital, observed during Information Builders’ Summit, “Are you connecting the dots? Or are you just collecting them?”

Capturing data in silos to serve tactical requirements diminishes the visibility and value that it can deliver to the whole organisation. The ultimate path to creating value is to align your data and analytics strategies to each other and, most importantly, to the overall strategy and execution of your organisation.