This page outlines the expect_column_values_to_be_null
test from the dbt-expectations package. This test helps validate scenarios where a column's values are expected to be null, ensuring that data entries meet specific conditions or represent states where no data should be present.
How it Works
The expect_column_values_to_be_null
test checks if the values in a selected column are all null. This is particularly useful in cases where null values are expected due to business rules or data requirements.
Steps and Conditions:
- Column Selection: Choose the column you intend to test for null values.
- Optional Configuration:
- Row Conditions: Define any specific conditions for selecting rows. For example, you might only want to check for null values in rows where another column is not null.
- Execution: The test scans each value in the selected column under the specified conditions.
- Outcome:
- Pass: If every checked value in the column is null, the test passes, confirming compliance with the expected data condition.
- Fail: If any non-null value is found in the column, the test fails, indicating an unexpected or erroneous data entry that requires attention.
Example Usage: Fintech
For a Fintech company, maintaining accurate and conditional data is essential, especially concerning financial records or historical data. Consider a particular scenario involving a transactions
table, where there is a need to ensure that the cancelled_date
column should only contain null values for transactions that are not cancelled.
In this example, the expect_column_values_to_be_null
test is applied to the cancelled_date
column of the transactions
model. The condition applied ensures that for transactions where the status is not 'cancelled', the cancelled_date
should be null. This setup helps verify that transactions are being tracked and recorded correctly, aligning with business rules that dictate that only cancelled transactions should record a cancellation date.