dbt test: expect_column_unique_value_count_to_be_between

SOURCE

USE CASE

Values distribution

Uniqueness

APPLIES TO

Column

This page outlines the expect_column_unique_value_count_to_be_between test from the dbt-expectations package. This test ensures that the number of unique values in a specified column falls within a defined range, offering flexibility in validating data uniqueness across different scenarios.

How it Works

The expect_column_unique_value_count_to_be_between test assesses whether the count of unique values in a selected column is between a minimum and maximum value. This test is useful for ensuring that columns maintain a desired level of uniqueness, which could vary based on the data's nature and the specific requirements of your database schema.

Steps and Conditions:

Column Selection: Choose the column for which you need to verify the unique value count.
Define Range: Set the minimum and maximum numbers of unique values you expect in the column using min_value and max_value arguments. These limits can include the values themselves depending on the strictly parameter.
Optional Configuration:
- Group By: If specified using the group_by argument, the test will assess uniqueness within the defined groups, allowing for segmented validation.
- Row Condition: Using the row_condition argument, define any specific conditions for the rows that should be included in the test, such as excluding rows with null IDs.
- Strict Comparison: The strictly flag determines whether the boundary values are included (false) or strictly excluded (true) in the range check.
Execution: Apply the configured conditions and groupings, then count the unique values in the selected column. Compare this count against the specified range.
Outcome:
- Pass: If the number of distinct values falls within the inclusive or exclusive range based on the strictly flag, the test passes.
- Fail: If the count of unique values lies outside the established range, the test fails.

Example Usage: E-commerce

For an E-commerce company, maintaining a precise range of unique product categories can be vital for managing inventory efficiently and streamlining product search functionalities.

Consider the products table where each row represents a different product, and the category_id column specifies the category of each product:


models:
  - name: products
    tests:
      - dbt_expectations.expect_column_unique_value_count_to_be_between:
          column: category_id
          min_value: 10
          max_value: 50
          row_condition: "active = true"

In this example, the expect_column_unique_value_count_to_be_between test checks that the number of unique, active product categories ranges from 10 to 50. This setup helps ensure that the E-commerce platform offers a diversified but manageable array of product categories, enhancing user experience and operational efficacy.

AI Agents

Detection & Coverage

Governance

Triage and Response

Enabling business users

Integrations

Cloud vs. OSS

Webinars

dbt Tests Hub

Customer stories

Blog

Cloud Demo Videos

About Us

Documentation

The Elementary Community

dbt test: expect_column_unique_value_count_to_be_between

SOURCE

USE CASE

APPLIES TO

How it Works

Steps and Conditions:

Example Usage: E-commerce

The only data observability platform built into your dbt code

Product

Resources

Company

AI Agents Demo