dbt hub

dbt test: expect_table_columns_to_not_contain_set

USE CASE

Schema

APPLIES TO

Model
Seed
Source

This page describes the expect_table_columns_to_not_contain_set test from the dbt-expectations package. This test is designed to ensure certain values are absent in specific columns of a model, seed, or source, ensuring data cleanliness and integrity by verifying that prohibited values do not appear in specified columns.

How it Works

The expect_table_columns_to_not_contain_set test checks that none of the values in the specified list appear in the selected columns. This is crucial for maintaining the integrity of data by ensuring that specific unwanted or invalid values are excluded from data columns.

Steps and Conditions:

  1. Column Selection: Select one or multiple columns to apply the test to.
  2. Define Prohibited Values: List the values that should not be present in the specified columns.
  3. Optional Transformation:
    • Transform: Apply a transformation like upper or lower to the data in the columns before performing the check. This ensures uniformity in data comparison, especially if case sensitivity could affect the test results.
  4. Execution: The test examines each value in the specified columns, applying any defined transformations, and verifies that none of the values listed as prohibited are present.
  5. Outcome:
    • Pass: If none of the prohibited values are found in the selected columns, the test passes.
    • Fail: If any prohibited values are detected in the columns, the test fails, indicating the presence of unwanted data.

Example Usage: Fintech

For a Fintech company, it's essential to maintain strict control over financial transaction categories to ensure they comply with regulations. Using expect_table_columns_to_not_contain_set, a Fintech firm can enforce restrictions on the categories of transactions processed.

Consider a scenario where the transactions table contains a category column that classifies each transaction. Regulatory guidelines mandate that certain categories, like 'gambling' and 'bribery', should not be processed.


models:
  - name: transactions
    tests:
      - dbt_expectations.expect_table_columns_to_not_contain_set:
          column_list: ["category"]
          values_list: ["gambling", "bribery"]

In this example, the expect_table_columns_to_not_contain_set is applied to ensure that prohibited transaction categories such as 'gambling' and 'bribery' are absent from the category column in the transactions table. This test verifies compliance with regulatory standards avoiding potential legal and reputational risks.

The only data observability platform built into your dbt code

  • Get monitors on your production tables out-of-the-box with zero configuration
  • Add tests to your code in bulk with a simple UI
  • Track test results over time
  • Set owners and create meaningful alerts
  • Triage incidents faster using our end-to-end column-level lineage graph