This page covers the expect_column_values_to_not_be_null
test from the dbt-expectations package. This test verifies that there are no null values in a specified column, ensuring that all entries in that column contain valid data. It is fundamental for maintaining the completeness and reliability of the dataset.
How it Works
The expect_column_values_to_not_be_null
test targets a column and checks each entry to confirm that it is not null. This validation is crucial in datasets where every record needs to store meaningful information in specific fields.
Steps and Conditions:
- Column Selection: Select the column that needs to be validated for non-null values.
- Optional Configuration:
- Row Condition: Use the
row_condition
argument to include only certain rows based on a specific condition. For example, you might test only rows where IDs are not null.
- Row Condition: Use the
- Execution: The test scans each entry in the chosen column, ensuring no data is missing where it is expected.
- Outcome:
- Pass: If there are no null values in the selected columns under the conditions set, the test passes.
- Fail: If any null values are found in the column, the test fails, indicating incomplete or potentially corrupted data entries that need addressing.
Example Usage: Fintech
In a Fintech company, accurate and complete financial records are crucial. Consider a transactions
table with a payment_amount
column that should record the amount for each transaction.
In this example, the expect_column_values_to_not_be_null
test is applied to the payment_amount
column of the transactions
model, with an additional condition to only consider transactions from January 1, 2023, onwards. By ensuring there are no null values in this critical financial column, the fintech company can safeguard against processing errors and ensure accurate financial reporting and analysis.