dbt test: expect_select_column_values_to_be_unique_within_record

SOURCE

github.com/calogica/dbt-expectations

USE CASE

Uniqueness

Columns relationship

APPLIES TO

Model

Seed

Source

This page describes the expect_select_column_values_to_be_unique_within_record test in dbt (data build tool). This test is designed to ensure that the values across specified columns within a single record are unique, helping to prevent potential data integrity issues from duplicate values in the same record.

How it Works

The expect_select_column_values_to_be_unique_within_record test validates that for each record, the values in the selected columns are unique. It is critical in scenarios where multiple fields in a record should not have repeating values, ensuring data accuracy and uniqueness within single records.

Steps and Conditions:

Column Selection: Choose multiple columns that need to be checked for unique values within each record.
Row Exclusion (Optional): Define conditions to exclude certain rows from the test based on presence of null values or specific identifiers. The default behavior excludes rows if _all_ specified columns are null.
Configurations (Optional):
- Ignore Row If: Decide whether to ignore rows where any or all values are missing. Default is "all_values_are_missing".
- Quote Columns: Specify whether the column names should be quoted. Default is false.
- Row Condition: Apply additional filters to specify which rows should be included based on a condition.
Validation: For each included row, the test checks if all selected columns contain unique values.
Outcome:
- Pass: If all checked records have unique values across selected columns, the test passes.
- Fail: If any record has duplicate values in the selected columns, the test fails.

Example Usage: Fintech

In a Fintech company, ensuring that transaction records are unique and correctly recorded is crucial, especially when each transaction involves multiple identifiers or references that should not repeat within the same record.

Consider a scenario where the transactions table stores information about user transactions, including two fields: transaction_id and reference_code.


models:
  - name: transactions
    tests:
      - dbt_expectations.expect_select_column_values_to_be_unique_within_record:
          column_list: ["transaction_id", "reference_code"]
          ignore_row_if: "any_value_is_missing"

In this example, the expect_select_column_values_to_be_unique_within_record test checks that the transaction_id and reference_code within each transaction record are unique to that record. This prevents issues such as duplication of transaction IDs or reference codes, which could lead to financial discrepancies and challenges in transaction verification.

AI Agents

Detection & Coverage

Governance

Triage and Response

Enabling business users

Integrations

Cloud vs. OSS

Webinars

dbt Tests Hub

Customer stories

Blog

Cloud Demo Videos

About Us

Documentation

The Elementary Community

dbt test: expect_select_column_values_to_be_unique_within_record

SOURCE

USE CASE

APPLIES TO

How it Works

Steps and Conditions:

Example Usage: Fintech

The only data observability platform built into your dbt code

Product

Resources

Company

Live Demo