dbt test: expect_table_row_count_to_equal_other_table_times_factor

SOURCE

github.com/calogica/dbt-expectations

USE CASE

Volume

Tables relationship

APPLIES TO

Model

Seed

Source

This page delves into the expect_table_row_count_to_equal_other_table_times_factor test from the dbt-expectations package. This test is essential for validating the relationship between the row counts of two tables, scaled by a predefined factor. It ensures that the data growth or reduction across tables maintains a predictable ratio, which is critical in many analytical contexts.

How it Works

The expect_table_row_count_to_equal_other_table_times_factor test compares the row count of one table to the row count of another table multiplied by a specified factor. This verification can highlight discrepancies that may suggest data issues such as duplication errors, missing entries, or improper integrations.

Steps and Conditions:

Table and Scale Factor Selection: Choose the primary table and another table to compare against, along with a multiplication factor by which the row count of the second table is scaled.
Conditional Parameters:
- Group By: Optionally, define columns for grouping data before counting rows in both the primary and comparison tables.
- Row Condition: Define conditions to filter the rows included in the count for both tables, which can refine and target the test more precisely.
Execution: The test performs the row count, applies any groupings or conditions, and then scales the row count of the comparison table by the specified factor. It checks if this scaled value matches the row count of the primary table.
Outcome:
- Pass: If the primary table's row count equals the scaled row count of the comparison table, the test passes, indicating consistency with the expected ratio.
- Fail: If there is a mismatch in counts, indicating an unexpected ratio of row counts between the two tables, the test fails. This may warrant a closer inspection of data handling practices.

Example Usage: Fintech

In a Fintech context, maintaining accurate transaction records across different systems or tables is crucial for financial reporting and analysis. Let's consider a scenario where a transactions table needs to be consistent with a processed_transactions table, but scaled by a factor due to processing rules or fees applied.


models:
  - name: transactions
    tests:
      - dbt_expectations.expect_table_row_count_to_equal_other_table_times_factor:
          compare_model: ref("processed_transactions")
          factor: 0.95
          group_by: ['transaction_type']
          compare_group_by: ['transaction_type']
          row_condition: "status = 'completed'"
          compare_row_condition: "status = 'processed'"

In this example:- The row count of processed_transactions is expected to be 95% of transactions due to a 5% transaction failure rate.- Both tables are grouped by transaction_type and include only transactions marked as completed or processed, respectively, ensuring that counts are compared within relevant subsets of data.

Detection & Coverage

Governance

Triage and Response

Enabling business users

Integrations

Cloud vs. OSS

Webinars

dbt Tests Hub

Customer stories

Blog

Cloud Demo Videos

About Us

Documentation

The Elementary Community

dbt test: expect_table_row_count_to_equal_other_table_times_factor

SOURCE

USE CASE

APPLIES TO

How it Works

Steps and Conditions:

Example Usage: Fintech

The only data observability platform built into your dbt code

Products

Resources

Company