dbt test: expect_table_aggregation_to_equal_other_table

USE CASE: Aggregation, Tables relationship

APPLIES TO: Model, Seed, Source

This page provides an overview of the expect_table_aggregation_to_equal_other_table test from the dbt-expectations package, which is designed to assert the equivalence of aggregated results between two tables. This test is essential for validating that summaries or computed results from different sources or stages within a dataset are consistent with each other.

How it Works

The expect_table_aggregation_to_equal_other_table test compares the results of an aggregation expression executed on one table to the results of the same or a different expression on another table. This is beneficial for ensuring consistency in data processing, especially when data is transferred or transformed across tables or models.

Steps and Conditions:

  1. Expression Setup:
    • Expression: Define an aggregation expression to evaluate on the primary model.
    • Compare Expression: Optionally, specify a different aggregation expression for the comparison model.
  2. Model Selection:
    • Primary Model: The initial table or model on which the primary expression is executed.
    • Comparison Model: The table or model against which the primary model's results are compared (specified by compare_model).
  3. Grouping and Conditions:
    • Group By: Optional grouping criteria for the primary expression.
    • Compare Group By: Optional grouping criteria for the comparison expression.
    • Row Conditions: Optional row_condition and compare_row_condition filters, applied to the primary and comparison queries, respectively.
  4. Tolerance Settings:
    • Tolerance: Optional absolute tolerance for differences between the results.
    • Tolerance Percent: Optional percentage tolerance, expressed as a decimal, to accommodate minor variations in the results.
  5. Outcome:
    • Pass: The test passes if the aggregated results match within the specified tolerances.
    • Fail: The test fails if the results differ beyond the acceptable tolerance levels, indicating a discrepancy that needs resolution.
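To make the parameters above concrete, here is a hedged sketch of a test configuration that exercises all of them. The model names (orders_summary, orders_raw), column names, and expressions are hypothetical placeholders, not part of the dbt-expectations package itself:

```yaml
models:
  - name: orders_summary    # hypothetical primary model
    tests:
      - dbt_expectations.expect_table_aggregation_to_equal_other_table:
          expression: count(*)                          # aggregation on the primary model
          compare_model: ref("orders_raw")              # hypothetical comparison model
          compare_expression: count(distinct order_id)  # optional: different aggregation for the comparison
          group_by: [status]                            # optional grouping on the primary model
          compare_group_by: [status]                    # optional grouping on the comparison model
          row_condition: "status is not null"           # optional filter on primary rows
          compare_row_condition: "status is not null"   # optional filter on comparison rows
          tolerance_percent: 0.05                       # optional: 5%, expressed as a decimal
```

Only expression and compare_model are required; the remaining keys can be omitted, in which case the test defaults to an exact comparison with no grouping or filtering.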

Example Usage: B2B SaaS

In the context of a B2B SaaS company, accurately tracking and reconciling revenue across different systems or data stages is crucial. Consider a scenario where revenue_monthly and revenue_archived are two models storing current and previously archived revenue data, respectively.


models:
  - name: revenue_monthly
    tests:
      - dbt_expectations.expect_table_aggregation_to_equal_other_table:
          expression: sum(revenue)
          compare_model: ref("revenue_archived")
          group_by: [account_id, month]
          compare_group_by: [account_id, month]
          tolerance_percent: 0.01

In this example, the test ensures that the total monthly revenue per account matches between the revenue_monthly table and the revenue_archived table within a tolerance of 1%. This helps to verify the integrity and consistency of revenue data as it moves from active records to archives, which is fundamental for accurate billing and financial reporting.
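The same pattern extends to reconciliations where the two tables store data at different grains, using compare_expression to aggregate each side differently. The following sketch assumes a hypothetical line-level invoice_lines model with quantity and unit_price columns; those names are illustrative, not part of the package:

```yaml
models:
  - name: revenue_monthly
    tests:
      - dbt_expectations.expect_table_aggregation_to_equal_other_table:
          expression: sum(revenue)
          compare_model: ref("invoice_lines")             # hypothetical line-level source
          compare_expression: sum(quantity * unit_price)  # equivalent total computed from line items
          row_condition: "revenue is not null"            # ignore rows with no revenue recorded
          tolerance_percent: 0.01                         # 1%, expressed as a decimal
```

Here the rolled-up revenue figure is checked against the total recomputed from raw line items, catching aggregation bugs introduced during the transformation.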
