This module lets you run data quality rules against your project data to check for discrepancies.
Listed below are some pre-defined data rules that you can run.
You may also create your own rules or edit, delete, or reorder the rules you have already created.
To find discrepancies for a given rule, click the Execute button next to it, or click the Execute All Rules
button to run all the rules at once. The module reports the total number of discrepancies found for each rule, and you can
view the details of those discrepancies by clicking the View link next to each. Read more detailed instructions.
The pre-defined rules listed in red text cannot be modified, reordered, or removed. They are there if you wish to use them.
You may also build and execute your own rules at the bottom of the table below.
Rules can be set up using a literal logic format (e.g., [age] > 65) that is evaluated as a boolean value (true or false) after
an existing record's value for that field is substituted (e.g., if a record's value for 'age' is 23, then 23 > 65 evaluates as false).
The logic is applied to all existing records in the project, and any record for which the logic evaluates as true is
returned as a discrepancy for that rule. As with branching logic and calculated fields, REDCap variable/field names may be used in the
rule logic by placing the variable name inside square brackets [ ]. For longitudinal projects, you may also reference a field on one
specific event by prepending the variable name in the logic with the unique event name in square brackets.
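The substitution and evaluation described above can be sketched in Python. This is only an illustration, not REDCap's own evaluator: the function name find_discrepancies, the record layout, the event names, and the choice to skip blank values are all assumptions, and the sketch handles only a single comparison of the form "[field] <op> <number>".

```python
import operator
import re

# Hypothetical mini-evaluator: handles only "[field] <op> <number>",
# optionally prefixed with "[event_name]" for longitudinal projects.
RULE_PATTERN = re.compile(
    r"^\s*(?:\[(?P<event>\w+)\])?\[(?P<field>\w+)\]\s*"
    r"(?P<op>>=|<=|<>|!=|>|<|=)\s*(?P<value>-?\d+(?:\.\d+)?)\s*$"
)
OPERATORS = {
    ">": operator.gt, "<": operator.lt, ">=": operator.ge,
    "<=": operator.le, "=": operator.eq, "<>": operator.ne, "!=": operator.ne,
}


def find_discrepancies(rule, records):
    """Return the record IDs for which the rule logic evaluates as true."""
    match = RULE_PATTERN.match(rule)
    if match is None:
        raise ValueError(f"unsupported rule logic: {rule!r}")
    compare = OPERATORS[match["op"]]
    threshold = float(match["value"])
    flagged = []
    for record in records:
        # Skip rows from other events when the rule names a specific event.
        if match["event"] and record.get("event") != match["event"]:
            continue
        raw = record.get(match["field"], "")
        if raw == "":
            continue  # assumption: a blank value never satisfies a comparison
        if compare(float(raw), threshold):
            flagged.append(record["record_id"])
    return flagged


# Made-up sample data for illustration only.
records = [
    {"record_id": "1", "event": "visit_1_arm_1", "age": "23"},
    {"record_id": "2", "event": "visit_1_arm_1", "age": "70"},
    {"record_id": "3", "event": "visit_2_arm_1", "age": "81"},
]
print(find_discrepancies("[age] > 65", records))
print(find_discrepancies("[visit_1_arm_1][age] > 65", records))
```

Record 1 (age 23) is not flagged because 23 > 65 evaluates as false; with the event prefix, only rows from that one event are considered.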
Checking the 'real-time execution' checkbox for any custom data quality rule enables the rule to be executed invisibly
on data entry forms whenever a user clicks the Save button to create or modify a record. After the user clicks Save,
all relevant data quality rules are executed invisibly (i.e., behind the scenes). If any of the rules have been violated,
a warning pop-up message lists the violated data quality rules along with the fields involved and their data values.
If no rules were violated, the record is saved as usual and no pop-up message is displayed. Just like the results
that are returned when executing rules on the Data Quality page itself, results displayed on data entry forms
for 'real-time execution' can be excluded (if desired) so that they will not be displayed again if they are still in violation in the future.
Special functions may also be used within the logic (similar
to functions in calculated fields), all of which are listed on the Help & FAQ
page. If Data Access Groups exist for this project, then discrepancies will also be stratified according to their group
(assuming the user viewing this page is not in a group). Any user within a Data Access Group will only be able to see the discrepancies
for their own group. Also, if users do not have user privileges to view or edit data on specific data entry forms, then they will not be able
to view data from those forms if displayed in any results on this page as a data quality discrepancy.
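As a rough illustration of the group stratification described above, the sketch below counts discrepancies per Data Access Group and restricts a grouped user to their own group. The function name stratify_by_group, the "group" key, and the "(no group)" label are hypothetical and do not reflect how REDCap stores this information.

```python
from collections import defaultdict


def stratify_by_group(discrepancies, viewer_group=None):
    """Count discrepancies per Data Access Group.

    viewer_group=None models a user who is not in any group and therefore
    sees every group; otherwise only the viewer's own group is returned.
    """
    counts = defaultdict(int)
    for item in discrepancies:
        group = item.get("group") or "(no group)"
        if viewer_group is not None and group != viewer_group:
            continue
        counts[group] += 1
    return dict(counts)


# Made-up sample data for illustration only.
discrepancies = [
    {"record_id": "1", "group": "site_a"},
    {"record_id": "2", "group": "site_b"},
    {"record_id": "3", "group": "site_a"},
]
print(stratify_by_group(discrepancies))
print(stratify_by_group(discrepancies, viewer_group="site_b"))
```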
If a discrepancy has been found for a given rule, any individual discrepancy in the list of results may be excluded
from those results in the future. Excluding a result merely prevents it from being included in the count of discrepancies
if the rule is executed again in the future. Excluded results can be accessed again by clicking the 'view' link at the
top of the results table for that rule, after which they can be un-excluded, if desired.
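The effect of excluding a result can be sketched as a simple filter. This is a minimal illustration under assumed names: apply_exclusions, rule_id, and record_id are hypothetical and are not REDCap identifiers.

```python
def apply_exclusions(results, excluded):
    """Split results into (counted, hidden) based on an exclusion set.

    `excluded` holds (rule_id, record_id) pairs. Hidden results are kept
    rather than discarded, so they can be viewed and un-excluded later.
    """
    counted, hidden = [], []
    for result in results:
        key = (result["rule_id"], result["record_id"])
        (hidden if key in excluded else counted).append(result)
    return counted, hidden


# Made-up sample data for illustration only.
results = [
    {"rule_id": "custom_1", "record_id": "2"},
    {"rule_id": "custom_1", "record_id": "3"},
]
excluded = {("custom_1", "3")}

counted, hidden = apply_exclusions(results, excluded)
print(len(counted), "discrepancy counted;", len(hidden), "excluded")

# Un-excluding a result restores it to the count on the next execution.
excluded.discard(("custom_1", "3"))
counted, hidden = apply_exclusions(results, excluded)
print(len(counted), "discrepancies counted;", len(hidden), "excluded")
```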
Uploading a CSV file with a new Data Quality Rules configuration can only add new Data Quality Rules to the project. The format for the CSV upload file can be obtained by exporting the CSV file of your existing Data Quality Rules.
Note: You cannot edit or delete existing Data Quality Rules using the CSV file but can only do that by clicking the edit/delete icon next to a given Rule.
Select your CSV file of Data Quality Rules to be added:
Displayed below is a preview of all new data quality rules you are about to commit. Please look over the additions, and then approve them
by clicking the Upload button.
Note: These new data quality rules will be added to the existing set of data quality rules.
Data Quality Rules
Execute rules:
* The Blank Values rules above automatically exclude fields hidden by branching logic
when searching for blank values. If a field is hidden by branching logic on a data entry form or survey, then it is expected that
such a field would not have a value. Thus for these cases, the values for those hidden fields will not be classified as missing.
Additionally, checkbox fields are also excluded since an unchecked checkbox is itself often considered to be a real value. Note: This rule will not return any fields that are blank but also have Missing Data Codes. If you wish to find fields with Missing Data Codes,
it is recommended to execute Rule I instead.
** The term 'outlier' refers to a value that is more than two standard deviations from the mean.
*** The term 'hidden fields' refers to any fields on a survey or data entry form that are not being displayed
because branching logic is hiding them, which assumes that the field's value should be blank/null.
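The notes above can be sketched in Python as follows. This is a hedged illustration, not REDCap's implementation: the function names, the field dictionaries with an is_visible callable, and the sample Missing Data Code are all hypothetical, and the use of the population standard deviation is an assumption because the text does not say which variant is used.

```python
from statistics import mean, pstdev


def blank_value_fields(record, fields, missing_codes=frozenset()):
    """List the fields counted as blank for one record."""
    blanks = []
    for field in fields:
        if field["type"] == "checkbox":
            continue  # an unchecked checkbox is treated as a real value
        if not field["is_visible"](record):
            continue  # hidden by branching logic, so a blank is expected
        value = record.get(field["name"], "")
        if value in missing_codes:
            continue  # assumption: a Missing Data Code counts as a value here
        if value == "":
            blanks.append(field["name"])
    return blanks


def outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    center = mean(values)
    spread = pstdev(values)  # assumption: population standard deviation
    if spread == 0:
        return []
    return [v for v in values if abs(v - center) > threshold * spread]


# Made-up field definitions and record for illustration only.
fields = [
    {"name": "sex", "type": "radio", "is_visible": lambda r: True},
    {"name": "pregnant", "type": "radio", "is_visible": lambda r: r.get("sex") == "2"},
    {"name": "symptoms", "type": "checkbox", "is_visible": lambda r: True},
    {"name": "weight", "type": "text", "is_visible": lambda r: True},
]
record = {"sex": "1", "pregnant": "", "symptoms": "", "weight": ""}

print(blank_value_fields(record, fields))
print(outliers([10, 11, 9, 10, 12, 10, 11, 9, 10, 50]))
```

In this sample, 'pregnant' is skipped because its branching logic hides it, 'symptoms' is skipped because it is a checkbox, and only 'weight' is reported as blank.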
Enter a new comment below and click the Add button.
You may hide certain results from displaying again and again by excluding them. Simply click the 'exclude' link
for a result in the table of discrepancies for a rule, and that result will not be counted next time in the number of
discrepancies for that rule, nor will it be displayed in the table of results. Results that have been excluded can be viewed again
by clicking the 'view' link at the top of the results table for that rule, which also displays the number of excluded results, if any
exist. An excluded result's exclusion status can be removed by clicking the 'remove exclusion' link in the results table
for that result.
Data issues found with the Data Quality module can be resolved using the Data Resolution Workflow. After executing any given
data quality rule and viewing the results in the pop-up window, you will see a button in the 'Resolve issue' column that allows you and other
users in the project to leave comments and/or complete a formal data resolution process for documenting details of the data issue,
including the origin of the issue, who resolved the issue, and how it was resolved (if applicable). Once the data resolution process has
been opened for an item, it will then appear under 'Resolve Issues', where you can view all open items that need to be resolved.
Enabling the 'real-time execution' functionality for a custom rule is a great way to add
more data validation on a data entry form to ensure that data are getting entered correctly *at the moment* they are entered,
as opposed to checking the quality of the data retroactively by executing the rules here on this page. Using the
'real-time execution' feature is an excellent way to be proactive about maintaining the quality and integrity of your data.
NOTE: Only custom rules can have 'real-time execution'
enabled; the pre-defined rules cannot. Also, the 'real-time execution' functionality does not work on survey pages, nor is it
executed when performing data imports (either via the Data Import Tool or via the API). Thus, it only works on data entry forms.