August 2, 2019
How to Data Model on a Cloud Data WarehouseUsing AtScale to establish a semantic layer on Amazon Redshift delivers several important benefits to modern data and analytics teams. As a single source of governed metrics, and dimensions, AtScale extends the value of Redshift for business intelligence and data science use cases. By ensuring high performance query with no ETL/ELT or extracts, AtScale on Redshift simplifies data pipelines and improves end user experience.Â
Summary of benefits of using AtScale on Amazon Redshift using the TPC-DS 10TB benchmark:
Query Performance12x faster | User Concurrency61x faster |
Compute Cost2.6x cheaper | SQL Queries Complexity76% less complex |
For a detailed overview of results and methodology of this benchmark study, please follow this link.
In this blog, we cover how to connect Amazon Redshift to AtScale, and by doing so, accelerate query times and reduce compute cost. To connect AtScale to a live Redshift database, there are just a few steps to follow:
- Add a new data warehouse connection in AtScale Design Center.
- Import a sample XML data model or create a new one from scratch.
- View the imported data model in AtScale Design Center.
How to Connect Amazon Redshift to AtScale
Create a NewData Warehouse
First, to connect Amazon Redshift to AtScale, in AtScale Design Center, navigate to the âSettingsâ tab and click on the âData Warehousesâ menu item. Then press the âCreate Data Warehouseâ button, which will trigger the following pop-up:
From here, select the desired data warehouse (eg. Redshift) from the dropdown list. Next, add a name for the data warehouse to make it easy to identify within AtScaleâs data warehouse library. For this example, letâs name our new warehouse âRedshift Data Warehouseâ.
Next, select the Redshift âDatabaseâ where we want AtScale to work and choose an âAggregate Schemaâ in that database where AtScale will store its aggregate tables. Additionally, you can optionally select an S3 bucket location for storing temporary files for loading sample data.Â
From there, you can click on the âTest S3 Connectionâ button to verify that AtScale can read and write to the S3 bucket. Then, click on the âSaveâ button below to save these settings.
Create a Connection
Now that we have created a data warehouse set up for Redshift, we can define one or more connections. AtScale data warehouse connections allow users to specify different resources for querying and creating aggregates.Â
In this example, weâll create a single data warehouse connection for both queries and aggregates. To create a connection, click on the arrow (â<â) on the right side of the screen to open the connections panel.
 Then, click on the âCreate Connectionâ button to define a new connection.
Then select a âNameâ for the connection (letâs just use âRedshiftâ), fill in the the Redshift server âHostâ (you can find this in your AWS Redshift console), the âPortâ (default is â5439â) and the âUserâ and âPasswordâ (or Access Key). By clicking on the âTest Connectionâ button, you can make sure that your credentials are good to go. Then click âSaveâ to save this connection.
Import the Sales Insights Sample Project & Data:
Now that weâve connected AtScale to the Redshift data warehouse, weâre ready to start data modeling. A good way to learn firsthand is to load the AtScale âSales Insightsâ sample project and model. To load this data and model, click on the AtScale icon in the upper left hand corner of AtScale Design Center to land on our home screen.
Then, click on the âUse Sample to Exploreâ tile to launch the sample import wizard.
If you havenât already downloaded the âSales Insightâ sample bundle, click on the âour sample data repositoryâ link in the wizard title bar to download the bundle.
Then click on the âDownloadâ button for the âSales Insights Application (AtScale versions >= 2019.2.x)â bundle and download it to your desktop and navigate back to the Design Center wizard screen.
From there, select the âData Warehouseâ (âRedshift Data Warehouseâ) name you just created, select the âSchemaâ to store your sample data, the âDestination Directoryâ for storing the temporary files (same as the âBucket Nameâ you defined in the âS3 Bucket Detailsâ for the Data Warehouse creation dialog. Then, select the âSales Insightsâ sample bundle you just downloaded and click the âNextâ button. Finally, choose the defaults for the remaining wizard pages and click the âFinishâ button to import the data and the model definition.
View Imported Model From Amazon Redshift
After you have imported the XML model and sample data into your Amazon Redshift data warehouse, you will see your new model in the AtScale Design Center canvas.
You can click on the âArrange Itemsâ button on the canvas to auto arrange your model visually.
If you doubleclick on the âFactinternetsalesâ Dataset title (the panels with the blue title ribbons), you should see the sample data in the Dataset composer: this will let you preview the raw data in this Redshift table and create calculations and transformations.
Summary:
Connecting AtScale to a cloud data warehouse like Amazon Redshift is an intuitive process with just a few steps. Following the steps above, weâre now ready to publish our model and consume it in a variety of BI tools (i.e. Excel, Power BI, Tableau, Looker) and data science platforms (i.e. DataRobot, SageMaker, Jupyter notebooks).Â
As you can see, the AtScale semantic layer establishes a single source of governed metrics with no need to extract or transform data from your Redshift data warehouse.
Power BI/Fabric Benchmarks