Over the years we have developed multiple frameworks that enable us to build dependable data systems. These frameworks focus either on back-end best practices or on the front-facing performance models required for a solid analytics setup. We track framework adoption as a percentage in Airtable, using a point-based check registry. The framework adoption checks for our demonstration assets can be found here, giving you an under-the-hood impression of how we track each data system's adherence to our norms.
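As a rough picture of how a point-based check registry rolls up into an adoption percentage, consider the sketch below; the check names, weights and scoring rule are illustrative assumptions, not our actual Airtable schema.

```python
from dataclasses import dataclass

@dataclass
class Check:
    name: str
    points: int   # weight of the check in the registry
    passed: bool  # current pass/fail status for a given data system

def adoption_percentage(checks: list[Check]) -> float:
    """Point-weighted adoption score: points earned / points available."""
    total = sum(c.points for c in checks)
    earned = sum(c.points for c in checks if c.passed)
    return 100.0 * earned / total if total else 0.0

registry = [
    Check("version-controlled transformations", 3, True),
    Check("staging layer tests on critical fields", 2, True),
    Check("documented metric definitions", 1, False),
]
print(f"Framework adoption: {adoption_percentage(registry):.1f}%")  # 83.3%
```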
We provide data engineers who specialise in building end-to-end data architectures, from the operational business data sources to the downstream data applications. We connect the business operating systems to a centralised reporting data warehouse with tools such as Airbyte, Hevo Data and Dataddo. For the data warehouse we prefer Google BigQuery, due to its low cost and fast querying speed. The data transformations inside the warehouse are done with dbt Cloud, which stores all logic in a repository such as GitHub. This means that we codify all your business logic, exceptions and definitions.

After this step we set up the dimensional model and combine the dimensional and fact models into several OBTs (One Big Tables). These tables are storage-heavy, but contain dozens of dimensions, which makes downstream analysis significantly more intuitive for end users. We visualise the data inside the warehouse through tools like Power BI or Looker Studio, to validate that all the code we put in place generates clean output data. As part of this step we embed data quality checks based on custom business logic requirements, to make sure there are no data quality errors in critical input fields.

With a clean and consistent staging and fact layer available, we define the business entities, dimensions, measures and metrics that fit the domain model of your industry and business model. We then aggregate the measures and metrics around the entities that represent the axes along which your company runs its operations. Depending on the business model, these entities are: accounts, projects, employees, departments, cost centers and teams.

The downstream applications are connected through the external layer defined in the data warehouse. Based on this external layer, we set up a connection with Cube, which provides the semantic layer that enables headless analytics applications like Steep, Preset and Klipfolio to deliver their distinct value. We also build the interface that allows near-realtime data feeds into the business spreadsheets used by sales, finance and operations for ad-hoc calculations and forecasting. All our domain models are based on our global best practices and the operational and analytics data standards found on MetricHQ.

The engineering of your data architecture has reached maturity when the SDA-score exceeds 95%. We base our decision to expand development beyond the basic setup on three core performance metric targets being met over the last 3 months: 1) the entry data quality test score is above 99.9%; 2) data freshness stays within the tier-level SLA (24 hrs, 1 hr or 15 min); 3) the issue-resolution duration is under 48 hours.
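To make the OBT step concrete, here is a minimal sketch in pandas that joins a fact table onto two dimension tables to produce one wide table; the table and column names are purely illustrative, not part of any specific client model.

```python
import pandas as pd

# Hypothetical fact and dimension tables from the dimensional model.
fct_invoices = pd.DataFrame({
    "invoice_id": [1, 2, 3],
    "account_id": ["A-1", "A-2", "A-1"],
    "employee_id": ["E-7", "E-7", "E-9"],
    "amount": [120.0, 80.5, 99.0],
})
dim_accounts = pd.DataFrame({
    "account_id": ["A-1", "A-2"],
    "account_name": ["Acme", "Globex"],
    "segment": ["SMB", "Enterprise"],
})
dim_employees = pd.DataFrame({
    "employee_id": ["E-7", "E-9"],
    "department": ["Sales", "Finance"],
})

# One Big Table: every dimension joined onto the fact grain.
obt_invoices = (
    fct_invoices
    .merge(dim_accounts, on="account_id", how="left")
    .merge(dim_employees, on="employee_id", how="left")
)
print(obt_invoices.columns.tolist())
# ['invoice_id', 'account_id', 'employee_id', 'amount',
#  'account_name', 'segment', 'department']
```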
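The data quality checks on critical input fields can be pictured along the same lines. A minimal sketch, assuming the staging table arrives as a pandas DataFrame; the column names are illustrative placeholders, and the 99.9% threshold mirrors the entry quality target above.

```python
import pandas as pd

def check_critical_fields(df: pd.DataFrame, critical: list[str]) -> dict[str, float]:
    """Return the share of non-null, non-empty values per critical input field."""
    scores = {}
    for col in critical:
        valid = df[col].notna() & (df[col].astype(str).str.strip() != "")
        scores[col] = valid.mean()
    return scores

staging = pd.DataFrame({
    "account_id": ["A-1", "A-2", None, "A-4"],
    "invoice_amount": [120.0, 80.5, 99.0, None],
})
scores = check_critical_fields(staging, ["account_id", "invoice_amount"])
failing = {c: s for c, s in scores.items() if s < 0.999}  # entry quality target
print(failing)  # {'account_id': 0.75, 'invoice_amount': 0.75}
```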
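The maturity gate at the end of this phase can be expressed as a simple rule over the last three months of measurements. Below is a sketch, under the assumption that each month yields one reading per metric; the tier names and field names are hypothetical.

```python
FRESHNESS_SLA_HOURS = {"tier-1": 24, "tier-2": 1, "tier-3": 0.25}  # hypothetical tiers

def ready_to_expand(months: list[dict], tier: str) -> bool:
    """True when all three targets are met in each of the last 3 months."""
    recent = months[-3:]
    return len(recent) == 3 and all(
        m["entry_quality_score"] > 0.999
        and m["freshness_hours"] <= FRESHNESS_SLA_HOURS[tier]
        and m["issue_resolve_hours"] < 48
        for m in recent
    )

history = [
    {"entry_quality_score": 0.9995, "freshness_hours": 20, "issue_resolve_hours": 30},
    {"entry_quality_score": 0.9992, "freshness_hours": 22, "issue_resolve_hours": 40},
    {"entry_quality_score": 0.9991, "freshness_hours": 18, "issue_resolve_hours": 24},
]
print(ready_to_expand(history, "tier-1"))  # True
```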
Once the data architecture is fully operational, the first layer on top of the data model can be built, which consists of modelling a set number of metrics per team. These metrics are then ordered hierarchically, with the top 4 metrics per team labelled as Key Metrics. The key metrics of each team are combined into a single reporting setup, which is part of a monthly update that can be sent to the entire company, so that your employees are on the same page about which metrics are tracked and which ones determine the success of the company.

For every metric tracked in the governance framework, the users define a target ratio and a minimum ratio. This enables C-suite executives, team leads and employees to understand which metrics require the most attention in the short and longer term, which metrics are deteriorating, and which are performing at a mediocre but stable level. Once the framework is fully set up, the company can observe tens up to hundreds of metrics and decide which ones require the most attention in their OKRs for that quarter.

This setup enables OKRs and goal-setting to become fully integrated into the company's data model: all key results are already present in the data model and identified as important for company performance, and when medium attention has not improved a metric sufficiently, it is bumped up to become part of the quarterly or annual key results.
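As an illustration of how the target and minimum ratios could translate into attention levels, consider the sketch below; the status labels, thresholds and example metrics are hypothetical, not part of the governance framework itself.

```python
def metric_status(value: float, target: float, minimum: float) -> str:
    """Classify a metric against its user-defined target and minimum ratios."""
    if value >= target:
        return "on-target"
    if value >= minimum:
        return "stable-but-mediocre"   # watch list: below target, above the floor
    return "needs-attention"           # candidate for a quarterly OKR key result

metrics = {
    "trial-to-paid conversion": (0.22, 0.25, 0.15),  # (current, target, minimum)
    "gross margin": (0.61, 0.60, 0.50),
    "onboarding completion": (0.40, 0.70, 0.55),
}
for name, (value, target, minimum) in metrics.items():
    print(f"{name}: {metric_status(value, target, minimum)}")
# trial-to-paid conversion: stable-but-mediocre
# gross margin: on-target
# onboarding completion: needs-attention
```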