dbt hub

dbt test: expect_column_values_to_be_null

USE CASE

Missing values

APPLIES TO

Column

This page outlines the expect_column_values_to_be_null test from the dbt-expectations package. This test helps validate scenarios where a column's values are expected to be null, ensuring that data entries meet specific conditions or represent states where no data should be present.

How it Works

The expect_column_values_to_be_null test checks if the values in a selected column are all null. This is particularly useful in cases where null values are expected due to business rules or data requirements.

Steps and Conditions:

  1. Column Selection: Choose the column you intend to test for null values.
  2. Optional Configuration:
    • Row Conditions: Define any specific conditions for selecting rows. For example, you might only want to check for null values in rows where another column is not null.
  3. Execution: The test scans each value in the selected column under the specified conditions.
  4. Outcome:
    • Pass: If every checked value in the column is null, the test passes, confirming compliance with the expected data condition.
    • Fail: If any non-null value is found in the column, the test fails, indicating an unexpected or erroneous data entry that requires attention.

Example Usage: Fintech

For a Fintech company, maintaining accurate and conditional data is essential, especially concerning financial records or historical data. Consider a particular scenario involving a transactions table, where there is a need to ensure that the cancelled_date column should only contain null values for transactions that are not cancelled.


models:
  - name: transactions
    columns:
        - name: cancelled_date
          tests:
            - dbt_expectations.expect_column_values_to_be_null:
                row_condition: "status != 'cancelled'"

In this example, the expect_column_values_to_be_null test is applied to the cancelled_date column of the transactions model. The condition applied ensures that for transactions where the status is not 'cancelled', the cancelled_date should be null. This setup helps verify that transactions are being tracked and recorded correctly, aligning with business rules that dictate that only cancelled transactions should record a cancellation date.

The only data observability platform built into your dbt code

  • Get monitors on your production tables out-of-the-box with zero configuration
  • Add tests to your code in bulk with a simple UI
  • Track test results over time
  • Set owners and create meaningful alerts
  • Triage incidents faster using our end-to-end column-level lineage graph