dbt hub

dbt test: expect_column_distinct_count_to_equal_other_table

USE CASE

Distinct count
Tables relationship

APPLIES TO

Model
Column
Seed
Source

This page describes the expect_column_distinct_count_to_equal_other_table test from the dbt-expectations package. This test verifies that the number of distinct values in a column of one model matches the number of distinct values in a column of another model. It is particularly useful for ensuring consistency across related datasets.

How it Works

The test compares the count of distinct values in one specified column against the count of distinct values in another column from a different model. It is designed to assure that the relationships between datasets are maintained and consistent.

Steps and Conditions:

  1. Column and Model Selection: Choose the column and the model you want to test. Also, determine the target column and model against which the distinct count will be compared.
  2. Configure Comparison: Define the comparison model and the column within that model. You can also specify conditions to filter which rows should be considered for both the source and comparison columns.
  3. Row and Compare Row Conditions _(Optional)_: Apply any specified row conditions to filter the datasets before performing the count.
  4. Execution: The test counts the distinct values in both the source and comparison columns, taking into account any row conditions.
  5. Outcome:
    • Pass: The test passes if the number of distinct values in both columns is equal.
    • Fail: The test fails if the number of distinct values differs between the two columns.

Example Usage: Fintech Company

In a Fintech company, ensuring that transaction types recorded in separate systems or tables are consistent can be critical for accurate reporting and compliance.

Consider a scenario where a Fintech company maintains two models: transactions and audited_transactions. Both models have a column named transaction_type. The transactions table contains raw transaction data, and audited_transactions contains verified transaction records for compliance purposes.


models:
  - name: transactions
    columns:
      - name: transaction_type
        tests:
          - dbt_expectations.expect_column_distinct_count_to_equal_other_table:
              compare_model: ref("audited_transactions")
              compare_column_name: transaction_type

This setup ensures that the number of distinct transaction types in the transactions table matches the number of distinct transaction types in the audited_transactions table. This alignment is crucial for validating the integrity and consistency of transaction data across different data recordings in the Fintech operational workflows.

The only data observability platform built into your dbt code

  • Get monitors on your production tables out-of-the-box with zero configuration
  • Add tests to your code in bulk with a simple UI
  • Track test results over time
  • Set owners and create meaningful alerts
  • Triage incidents faster using our end-to-end column-level lineage graph