Definition
Online Analytical Processing (OLAP) is a method for creating queries from multidimensional data, primarily for delivering insights for Business Intelligence. OLAP involves three core operations: aggregation / consolidation (roll-up), drill-down (from summary to detail), and slicing and dicing (snapshots across multiple dimensions). OLAP contrasts with the other method for data processing, OLTP or Online Transaction Processing, which focuses on processing individual transactions in real time. For example, OLTP processes (captures, stores, updates) sales transactions in real time whereas OLAP analyzes (aggregates, distills, investigates) the trend in sales transactions over time.
Purpose
The purpose of OLAP is to analyze data across multiple dimensions in order to provide business insights. The operative word for OLAP is analytical: focusing on creating insights about performance, trends and patterns over time.
What are the Primary Uses for OLAP?
OLAP is used to create insights (e.g. business intelligence, descriptive analytics) from data based on queries performed on multidimensional data that has been modeled and typically stored in a relational database. There are three (3) types of OLAP, with each type distinguished by the type of database it uses:
- Relational OLAP (ROLAP) – Queries made to data stored in relational databases (data stored in individual tables that can be combined in a query)
- Multidimensional OLAP (MOLAP) – Queries made to data store in a multidimensional cube (data already combined and stored with combinations readily available for queries)
- Hybrid OLAP (HOLAP) – Queries made to a data store that combines relational with multidimensional.
The main tradeoff in use between these methods is between query speed and database size / data storage. ROLAP minimizes data storage, but requires each query to create multidimensional combinations, whereas ROLAP already combines the data, making queries faster as long as the data size does not grow too large – and if many multidimensional data combinations are not queried, then the cost of storing that data can be inefficient.
Business Benefits Delivered by OLAP
The main benefit of OLAP is to deliver insights from data viewed across multiple dimensions – particularly over time – for reporting and analysis of performance, trends, comparisons / patterns and outliers / anomalies.
Common Roles and Responsibilities Associated with OLAP
Roles important to the use of OLAP are as follows:
- BI Engineer – The BI engineer is responsible for delivering business insights using OLAP methods and tools. The BI engineer works with the business and technical teams to ensure that the data is available and modeled appropriately for OLAP queries, and then builds those queries, including designing the outputs (reports, visuals, dashboards) typically using BI tools. In some cases, the BI engineer also models the data.
- Business Owner – There needs to be a business owner who understands the business needs for data and subsequent reporting and analysis. This to ensure accountability, actionability as well as ownership for data quality and data utility based on the data model. The business owner and project sponsor are responsible for reviewing and approving the data model as well as the reports and analysis that OLAP will generate. For larger, enterprise-wide insights creation and performance measurement,,a governance structure should be considered to ensure cross-functional engagement and ownership for all aspects of data acquisition, modeling and usage: reporting, analysis.
- Data Modelers – Data Modelers are responsible for each type of data model: conceptual, logical and physical. Data Modelers may also be involved with defining specifications for data transformation and loading.
- Technical Architect – The technical architect is responsible for logical and physical technical infrastructure and tools. The technical architect works to ensure the data model is physically able to be accessed, queried and analyzed by the various OLAP tools.
- Data Analyst / Business Analyst – Often a business analyst or more recently, data analyst are responsible for defining the uses and use cases of the data, as well as providing design input to data structure, particularly metrics, business questions / queries and outputs (reports and analyses) intended to be performed and improved. Responsibilities also include owning the roadmap for how data is going to be enhanced to address additional business questions and existing insights gaps.
Common Business Processes Associated with OLAP
The process for developing and deploying OLAP-based reports and analyses are as follows:
- Business Requirements – Business requirements or features and user stories (in the agile context) are defined to represent how the data is going to be used, including the business questions to be addressed, data subjects to acquire and elements of the core dimensions necessary to effectively query as well as example output reflecting query presentation.
- Data Acquisition and Profiling – Data sources are identified, acquired and analyzed for relevance, accuracy, completeness and dimensionality.
- Data Modeling – Data models representing conceptual, logical and physical versions are created by a data modeler and reviewed jointly by business and technical teams.
- Extract, Transform, Load (ETL/ELT) – The ETL process transforms raw data to conform with the data model, including moving the data from source location to target destination, and loading it into a database for OLAP queries to be performed.
- Business Intelligence Applications – The last step is to develop reports and analyses in an application that queries the multidimensional data. This can be anything from a spreadsheet to an application such as PowerBI or Tableau or to an application for creating predictive analytics.
Common Technologies Associated with OLAP
Technologies involved with OLAP are as follows:
Data Modeling applications – These applications make it easier to create, present and instantiate conceptual, logical and physical data models.
Semantic Layer – Semantic layer applications enable the development of a logical and physical data model for use by OLAP-based business intelligence and analytics applications. The Semantic Layer ensures that the data consumed by these applications is modeled in a multidimensional form singularly (one instance), consistently and made accessible easily. Companies like AtScale also provide the capability to automate OLAP-based data query optimization via creating data aggregation tables within the data models based on query frequency and performance / response time.
Business Intelligence Tools – These tools automate the OLAP queries, making it easier for data analysts and business-oriented users to create reports and analyses without having to involve IT / technical resources.
Trends / Outlook for OLAP
Key trends in the OLAP arena are as follows:
- Semantic Layer – The semantic layer is a common, consistent representation of the data used for business intelligence used for reporting and analysis, as well as for analytics. The semantic layer is important, because it creates a common consistent way to define data in multidimensional form to ensure that queries made from and across multiple applications, including multiple business intelligence tools, can be done through one common definition, rather than having to create the data models and definitions within each tool, thus ensuring consistency and efficiency, including cost savings as well as the opportunity to improve query speed / performance.
- Visualization – Given the increased amount of data, most users prefer to visualize insights from the data vs typically using columnar reports. Visualizations have increased in their sophistication as well as availability, including through more applications and libraries.
- Automation – Increase emphasis is being placed by vendors on ease of use and automation to increase speed-to-insights. This includes offering “drag and drop” interfaces to execute data-related preparation activities and insights creation / queries without having to write code, including reusing activities and processes, both for repeating use as well as sharing.
- Self-service – As data grows, availability of qualified data technologists and analytics are very limited. To address this gap and increase productivity without having to lean 100% on IT resources to make data and analysis available, Self-service is increasingly available for data profiling, mining, preparation, reporting and analysis. In addition tools like the Semantic Layer offered by AtScale, Inc are also focused on enabling business users / data analysts to model data for business intelligence and analytics uses.
- Cognitive – Data means nothing if it is not actionable. Actionable means understandable. Increasing effort is being made to make data easier to understand, and this includes applying cognitive capabilities to improve analysis, synthesis, presentation as well as query and question / answer interaction.
- Transferable – Increased effort is also underway to make insights easier to consume, and this includes making data available for publishing easier, including using api’s and via objects that store elements of the insights.
- Observable – Recently, a host of new vendors are offering services referred to as “data observability”. Data observability is the practice of monitoring the data to understand how it is changing and being consumed. This trend, often called “dataops” closely mirrors the trend in software development called “devops” to track how applications are performing and being used to understand, anticipate and address performance gaps and improve areas proactively vs reactively.
AtScale and OLAP
AtScale’s semantic layer provides the ability for both business users, data analytics and data modelers to easily and rapidly create a multidimensional data model for consistent consumption by multiple applications that provide business intelligence and analytics. Further, AtScale also provides automated data aggregation to significantly improve data query speed.
Additional Resources
NEW BOOK