This page describes the expect_select_column_values_to_be_unique_within_record
test in dbt (data build tool). This test is designed to ensure that the values across specified columns within a single record are unique, helping to prevent potential data integrity issues from duplicate values in the same record.
How it Works
The expect_select_column_values_to_be_unique_within_record
test validates that for each record, the values in the selected columns are unique. It is critical in scenarios where multiple fields in a record should not have repeating values, ensuring data accuracy and uniqueness within single records.
Steps and Conditions:
- Column Selection: Choose multiple columns that need to be checked for unique values within each record.
- Row Exclusion (Optional): Define conditions to exclude certain rows from the test based on presence of null values or specific identifiers. The default behavior excludes rows if _all_ specified columns are null.
- Configurations (Optional):
- Ignore Row If: Decide whether to ignore rows where any or all values are missing. Default is "all_values_are_missing".
- Quote Columns: Specify whether the column names should be quoted. Default is
false
. - Row Condition: Apply additional filters to specify which rows should be included based on a condition.
- Validation: For each included row, the test checks if all selected columns contain unique values.
- Outcome:
- Pass: If all checked records have unique values across selected columns, the test passes.
- Fail: If any record has duplicate values in the selected columns, the test fails.
Example Usage: Fintech
In a Fintech company, ensuring that transaction records are unique and correctly recorded is crucial, especially when each transaction involves multiple identifiers or references that should not repeat within the same record.
Consider a scenario where the transactions
table stores information about user transactions, including two fields: transaction_id
and reference_code
.
In this example, the expect_select_column_values_to_be_unique_within_record
test checks that the transaction_id
and reference_code
within each transaction record are unique to that record. This prevents issues such as duplication of transaction IDs or reference codes, which could lead to financial discrepancies and challenges in transaction verification.