This section briefly describes the main concepts and terms used in the application. Understanding them will make the process flow much easier to learn.
C3 Integrity is a flexible tool that lets users load a variety of data into a system from a variety of sources. Data is uploaded into data sets by users who are members of access groups. Each upload uses a pre-defined file format. Data sets are made up of attributes, which may be subject to business rules called validations. A validation might be as simple as a data type, or it might refer to a reference data set (a set of values to pick from). An attribute with a reference data set validation may also be a qualifier, which means that certain users can only upload the subset of data that has a particular value for that attribute.
Data set
A data set is exactly what the name suggests: a set of data that is the destination for rows of new data that users need to upload. Each data set corresponds to a table in the staging database, and also to one or more file formats that can upload to it. Some data sets simply store reference values for other data sets; these are called reference data sets. Data sets consist of one or more data set attributes.
Data set attribute
A data set consists of one or more attributes. These correspond to columns in the upload file. Each attribute can hold a different type of information and may be subject to business rules called attribute validations.
Attribute validation
An attribute validation is a way of making sure that data uploaded to an attribute conforms to certain rules. A simple validation would be ‘date’ or ‘integer’. You could also set up a validation that limits the acceptable values for an attribute. For example, ‘Country’ could have a validation such that only the values ‘Australia’, ‘UK’ and ‘USA’ are acceptable. These values would be stored in their own data set, referred to as a reference data set.
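The validation types described above can be illustrated with a short Python sketch. This is not how C3 Integrity implements validations; the attribute names (`start_date`, `headcount`, `country`), the date format and every function name are assumptions made purely for illustration.

```python
from datetime import datetime

# Hypothetical reference values; in Integrity these would live in a reference data set.
COUNTRIES = {"Australia", "UK", "USA"}


def is_integer(value):
    """Return True if the text parses as a whole number."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def is_date(value, fmt="%Y-%m-%d"):
    """Return True if the text parses as a date in the assumed format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


# Hypothetical mapping of attribute name -> validation check.
VALIDATIONS = {
    "start_date": is_date,
    "headcount": is_integer,
    "country": lambda value: value in COUNTRIES,
}


def validate_row(row):
    """Return the names of the attributes whose values fail validation."""
    return [name for name, check in VALIDATIONS.items()
            if not check(row.get(name, ""))]
```

Calling `validate_row` on an uploaded row returns an empty list when every attribute passes, and otherwise lists the attributes that would need correcting.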
Reference data set
A reference data set is a normal data set, except that it is only used as a list of reference values for another data set. For example, a ‘countries’ data set could contain just ‘Australia’, ‘UK’ and ‘USA’. This data set could then be used by an attribute validation on another data set (e.g. ‘employee details’), and also as a qualifier to limit users to uploading data for certain countries only. Reference data sets can also be used to substitute values. For example, if an upload contained the value ‘AUS’, the reference data set could provide the lookup to populate the data set with ‘Australia’ instead.
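The value substitution described above amounts to a simple lookup. The following Python sketch shows the idea under stated assumptions: the table contents and the `resolve_country` helper are hypothetical and do not represent the product's internal behaviour.

```python
from typing import Optional

# Hypothetical reference data set: value found in the upload -> value to store.
COUNTRY_LOOKUP = {
    "AUS": "Australia",
    "Australia": "Australia",
    "UK": "UK",
    "USA": "USA",
}


def resolve_country(raw: str) -> Optional[str]:
    """Return the stored value for an uploaded value, or None if it is not acceptable."""
    return COUNTRY_LOOKUP.get(raw.strip())
```

Here an uploaded ‘AUS’ resolves to ‘Australia’, while a value that is absent from the reference data set resolves to `None` and would fail validation.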
File format
A file format is a pre-defined description of how a file (e.g. a CSV file) is uploaded into a particular data set. It specifies how the file is laid out (e.g. how many header rows there are) and maps columns in the file to attributes in the data set.
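As a rough illustration of what a file format captures, the Python sketch below models the two pieces mentioned above: a header row count and a column-to-attribute mapping. The `FileFormat` class, its field names and the `read_upload` function are assumptions for this example, not part of C3 Integrity.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class FileFormat:
    header_rows: int   # number of leading rows to skip
    column_map: dict   # column index in the file -> attribute name


def read_upload(text, fmt):
    """Parse CSV text into a list of {attribute: value} rows using the file format."""
    rows = []
    for line_no, record in enumerate(csv.reader(io.StringIO(text))):
        # Skip the declared header rows and any blank lines.
        if line_no < fmt.header_rows or not record:
            continue
        rows.append({attr: record[idx] for idx, attr in fmt.column_map.items()})
    return rows
```

For example, a format with two header rows and the mapping `{0: "name", 1: "country"}` would turn the line `Ada,UK` into `{"name": "Ada", "country": "UK"}`.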
User
All actions in Integrity are recorded in an audit trail against a logged-in user. A user is identified by their name and email address, and must authenticate using a password or their organisational directory service (such as LDAP).
Access group
Authorisation to upload to data sets is granted through membership of an access group. An access group brings together a number of users who have the same level of access to one or more data sets and qualifiers. Each access group has rights to particular data sets and, for each data set, may upload using ‘replace all’, ‘append and replace’, or both. Access groups also carry the rights to upload data according to qualifiers.
Qualifier
By applying a qualifier to an attribute, it is possible to allow certain users (as members of certain access groups) to upload all data while limiting others to uploading just a subset of data. For example, if the attribute ‘Country’ had a qualifier, it would be possible to restrict certain users to uploading only rows with the country value ‘Australia’.
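The interaction between access groups and qualifiers can be sketched in Python as follows. This is a simplified model built on assumptions: the `AccessGroup` class, its fields, the fixed `country` attribute and the `rejected_rows` function are all hypothetical and only illustrate the rule described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessGroup:
    name: str
    members: set       # user names belonging to the group
    data_sets: set     # data sets the group may upload to
    allowed_countries: Optional[set] = None  # None means no qualifier restriction


def rejected_rows(user, data_set, rows, groups):
    """Return the rows that the user is not permitted to upload to the data set."""
    allowed = set()
    for group in groups:
        if user in group.members and data_set in group.data_sets:
            if group.allowed_countries is None:
                return []  # an unrestricted group lets every row through
            allowed |= group.allowed_countries
    # Users without a matching group end up with an empty allowed set.
    return [row for row in rows if row["country"] not in allowed]
```

In this model a user whose only group is limited to ‘Australia’ would have any row for another country rejected, while a member of an unrestricted group could upload every row.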
Data uploader
Most users of C3 Integrity will have the ‘Data Uploader’ role, which means they upload data into the staging database.
Group administrator
The Group Administrator role has slightly more responsibility than the Data Uploader role, but not as much responsibility as the Data Set Modeller role. Group Administrators manage user rights and access groups.
Data set modeller
The Data Set Modeller role has significantly more responsibility than the Data Uploader and Group Administrator roles. The Data Set Modeller performs crucial tasks such as setting system preferences, reviewing existing data sets, creating new data sets, adding attributes to a data set, and setting up business rules. Performing these tasks requires a detailed understanding of the target database tables, SQL, business rules and organisational requirements.