data_validation_framework.util
Util functions.
Functions
apply_to_df | Apply a function to df rows using tqdm.
check_missing_columns | Return a list of missing columns in a pandas.DataFrame.
message_worker | Write a message without interfering with the progress bar using the message Queue.
report_missing_columns | Check that required columns exist in a pandas.DataFrame.
tqdm_worker | Update progress bar using the Queue.
try_operation | Try to apply a function on a pandas.Series, and record exception.
Classes
StreamToQueue | Fake file-like stream object that redirects all prints to a Queue.
- class data_validation_framework.util.StreamToQueue(*args, message_queue=None, **kwargs)
Bases: DummyTqdmFile
Fake file-like stream object that redirects all prints to a Queue.
- write(buf)
Redirect write calls to the Queue.
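The redirection idea can be illustrated with a minimal, standard-library-only sketch. `QueueStream` and `capture_prints` are hypothetical names used for illustration; they are not part of data_validation_framework, and the real `StreamToQueue` derives from `DummyTqdmFile` rather than being written from scratch like this.

```python
import contextlib
import queue


class QueueStream:
    """Hypothetical stand-in: a file-like object that forwards writes to a queue."""

    def __init__(self, message_queue):
        self.message_queue = message_queue

    def write(self, buf):
        # Skip empty writes; forward everything else unchanged.
        if buf:
            self.message_queue.put(buf)
        return len(buf)

    def flush(self):
        # Nothing is buffered locally, so there is nothing to flush.
        pass


def capture_prints(func):
    """Run func() and return everything it printed, as one string."""
    messages = queue.Queue()
    with contextlib.redirect_stdout(QueueStream(messages)):
        func()
    chunks = []
    while not messages.empty():
        chunks.append(messages.get())
    return "".join(chunks)


captured = capture_prints(lambda: print("row 3 is invalid"))
```

In this sketch the queue is drained in the same thread after the call returns; a consumer running alongside a progress bar would typically read from the queue in a separate worker instead.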
- data_validation_framework.util.apply_to_df(df, func, *args, nb_processes=None, redirect_stdout=None, **kwargs)
Apply a function to df rows using tqdm.
- data_validation_framework.util.check_missing_columns(df, required_columns)
Return a list of missing columns in a pandas.DataFrame.
Parameters:
df (pandas.DataFrame) – The DataFrame to check.
required_columns (list) – The list of required column names. Each entry can be a str for a one-level column, or either a list(tuple(str)) or a dict(list(str)) for a two-level column.
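To make the three accepted shapes of `required_columns` concrete, here is a rough sketch of such a check written directly against pandas. `find_missing_columns` is a hypothetical helper, not the library function, and its return format (a flat list of names and tuples) is an assumption made for illustration only.

```python
import pandas as pd


def find_missing_columns(df, required_columns):
    """Hypothetical helper: list the required columns that df does not contain."""
    missing = []
    for col in required_columns:
        if isinstance(col, dict):
            # Two-level columns given as {first_level: [second_level, ...]}.
            candidates = [(top, sub) for top, subs in col.items() for sub in subs]
        elif isinstance(col, list):
            # Two-level columns given as a list of (first_level, second_level) tuples.
            candidates = list(col)
        else:
            # One-level column given as a plain string.
            candidates = [col]
        missing.extend(c for c in candidates if c not in df.columns)
    return missing


flat = pd.DataFrame({"a": [1], "b": [2]})
nested = pd.DataFrame({("x", "u"): [1], ("x", "v"): [2]})  # MultiIndex columns

flat_missing = find_missing_columns(flat, ["a", "c"])
nested_missing = find_missing_columns(nested, [{"x": ["u", "w"]}, [("y", "u")]])
```

Here `flat_missing` holds the one-level name that is absent, and `nested_missing` holds the absent two-level columns as tuples.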
- data_validation_framework.util.message_worker(progress_bar, message_queue)
Write a message without interfering with the progress bar using the message Queue.
- data_validation_framework.util.report_missing_columns(df, required_columns)
Check that required columns exist in a pandas.DataFrame.
Parameters:
df (pandas.DataFrame) – The DataFrame to check.
required_columns (list) – The list of required column names. Each entry can be a str for a one-level column, or either a list(tuple(str)) or a dict(list(str)) for a two-level column.
- data_validation_framework.util.tqdm_worker(progress_bar, tqdm_queue)
Update progress bar using the Queue.
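The general pattern of driving a progress bar from a queue can be sketched with the standard library alone. Everything below is illustrative: `CountingBar` is a hypothetical stand-in for a progress bar that exposes `update(n)`, `progress_worker` is not the library's `tqdm_worker`, and the use of `None` as a stop signal is an assumption, not documented behaviour.

```python
import queue
import threading


class CountingBar:
    """Hypothetical stand-in for a progress bar exposing update(n)."""

    def __init__(self):
        self.n = 0

    def update(self, n=1):
        self.n += n


def progress_worker(progress_bar, progress_queue):
    """Apply queued increments to the bar until a None sentinel arrives."""
    while True:
        increment = progress_queue.get()
        if increment is None:  # Assumed stop signal, not taken from the library.
            break
        progress_bar.update(increment)


bar = CountingBar()
updates = queue.Queue()
worker = threading.Thread(target=progress_worker, args=(bar, updates))
worker.start()

for _ in range(5):
    updates.put(1)  # One unit of work finished.
updates.put(None)  # Ask the worker to stop.
worker.join()
```

Routing all updates through a single consumer keeps the bar from being touched by several producers at once, which is presumably why a queue is used here.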
- data_validation_framework.util.try_operation(row, func, *args, **kwargs)
Try to apply a function on a pandas.Series, and record exception.
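A "try, then record the exception" wrapper might look like the following sketch. `safe_apply` and `ratio` are hypothetical names, the row is a plain dict rather than a pandas.Series to keep the example dependency-free, and the keys of the returned dict are assumptions that do not reflect what `try_operation` actually returns.

```python
import traceback


def safe_apply(row, func, *args, **kwargs):
    """Hypothetical wrapper: call func on a row and record any exception."""
    try:
        result = func(row, *args, **kwargs)
    except Exception as exc:
        # Keep both a short summary and the full traceback for later reporting.
        return {
            "ok": False,
            "result": None,
            "error": repr(exc),
            "traceback": traceback.format_exc(),
        }
    return {"ok": True, "result": result, "error": None, "traceback": None}


def ratio(row, scale=1):
    # Example row function; fails when the denominator is zero.
    return scale * row["num"] / row["den"]


good = safe_apply({"num": 6, "den": 3}, ratio, scale=2)
bad = safe_apply({"num": 1, "den": 0}, ratio)
```

Because the failure is captured as data instead of being raised, one bad row does not abort processing of the remaining rows.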