What Is the Data Warehouse?

The Data Warehouse is a PostgreSQL database into which we load all of our important company analytics data, making it easier to generate reports and metrics across the company.

How Does Data Get Into The Warehouse?

We use a combination of Fivetran, Apache Airflow, logical replication, and random scripts (sorry, just being honest) to get data into the data warehouse. When possible, we create an individual Postgres schema for each data source to keep tables organized.
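
As a rough illustration of the schema-per-source convention, the sketch below derives a schema name from a data source name and builds the matching CREATE SCHEMA statement. The helper names (schema_name_for, create_schema_sql) and the example source names are hypothetical and are not taken from our actual warehouse setup.

```python
import re


def schema_name_for(source: str) -> str:
    """Turn a data source name into a lowercase, Postgres-safe schema name."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    name = re.sub(r"[^a-z0-9]+", "_", source.strip().lower()).strip("_")
    if not name:
        raise ValueError(f"cannot derive a schema name from {source!r}")
    # Unquoted Postgres identifiers cannot start with a digit.
    if name[0].isdigit():
        name = f"src_{name}"
    return name


def create_schema_sql(source: str) -> str:
    """Build an idempotent CREATE SCHEMA statement for one data source."""
    return f"CREATE SCHEMA IF NOT EXISTS {schema_name_for(source)};"


if __name__ == "__main__":
    # Hypothetical source names, for illustration only.
    for source in ["Example CRM", "example-billing"]:
        print(create_schema_sql(source))
```

Keeping each source in its own schema means a table can always be referenced as schema.table, so two sources that both ship a table called users do not collide.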

How Can I Have Data from $MY_TOOL synced to the Warehouse?

  1. Check to see if the tool you’re using has a supported Fivetran connector.
  2. If your tool isn’t supported by Fivetran, the analytics team will have to build a custom ETL program for you using Airflow. Reach out in #axt so we can collect information and communicate the feasibility and timeline of getting your data into the warehouse.

What is Airflow?

Airflow is a workflow scheduler that we use to run Extract, Transform, Load (ETL) and other data manipulation tasks on a schedule. You can find our Airflow instance at https://1353e53c350a48bfa00ba5bd13c611a3-dot-us-central1.composer.googleusercontent.com/home
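
For readers new to the Extract, Transform, Load pattern, here is a minimal sketch of the three steps in plain Python. It deliberately does not use Airflow's API; in Airflow, each step would typically be a task inside a DAG that runs on a schedule. The function names, the sample rows, and the in-memory list standing in for a warehouse table are all assumptions made for illustration.

```python
from __future__ import annotations


def extract() -> list[dict]:
    """Pretend to pull raw rows from a source system (hard-coded here)."""
    return [
        {"email": " Ada@Example.com ", "plan": "gold"},
        {"email": "bob@example.com", "plan": "free"},
    ]


def transform(rows: list[dict]) -> list[dict]:
    """Normalize email addresses so they compare consistently."""
    return [
        {"email": row["email"].strip().lower(), "plan": row["plan"]}
        for row in rows
    ]


def load(rows: list[dict], target: list[dict]) -> int:
    """Append rows to the target (a list standing in for a warehouse table)."""
    target.extend(rows)
    return len(rows)


def run_pipeline(target: list[dict]) -> int:
    """Run extract, transform, and load in order; return the rows loaded."""
    return load(transform(extract()), target)


if __name__ == "__main__":
    warehouse_table: list[dict] = []
    loaded = run_pipeline(warehouse_table)
    print(f"loaded {loaded} rows")
```

Splitting the work into three small functions mirrors how a scheduler treats a pipeline: each step can be retried or inspected on its own when something fails.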

dbt Table of Contents

Individual Tables

dbt Naming Conventions and Folder Structure

Logical Replication