Friday, 20 April 2018

Data Standards


1.    Data Standard 1: Data collection purpose

The purpose of collecting the data / information[1] and how it is expected to be utilised must be clearly stated.

1.1.  Examine the purpose of collecting the data / information, and how it is expected to be utilised.  A typical question to ask: which of these (one or more) will you want to do?
·          Count how often things appear or happen;
·          Extract data into a report (and do further calculations in the report file);
·          Track the progress of a process;
·          Keep a register of something;
·          Use mail merge (inside a letter or report, or print mailing labels);
·          Print out (or view on-screen) lists of phone numbers to call;
·          Copy and paste groups of e-mail addresses into e-mail messages;
·          Extract information from a “knowledge base”, to help answer enquiries; and / or
·          Do calculations for invoicing.

1.2.  Write the purpose statement for collecting the information.
1.3.  Name the folder or system according to the purpose of the information.

2.    Data Standard 2: Data file structure

File structures should contain each field only once – there must be no duplicate information.

Note: If a new system is being designed in order to collect the data, the activities concerning the development, implementation and maintenance of that system are outside the scope of the data standards.  However, the system must facilitate compliance with the data standards:

2.1.  Set up a file structure that ensures that – wherever possible – each item is entered only once, and the single entry is referred to, over and over.
2.2.  Ensure that the validations and data descriptions facilitate compliance with the data standards.  Functionality such as the following assists with compliance:
·          MS Excel: drop-downs / lookup tables and VLookup;
·          MS Access and other database packages: related tables (i.e. relational database).
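The "enter once, refer to it over and over" idea in 2.1 can be sketched in Python as follows. This is only an illustration: the provider codes, names, phone numbers and the helper functions are hypothetical, and a spreadsheet lookup or a relational database would achieve the same thing.

```python
# Hypothetical lookup table: each provider's details are entered once, keyed by a code.
providers = {
    "P001": {"name": "Example College", "phone": "011 555 0100"},
    "P002": {"name": "Sample Institute", "phone": "021 555 0199"},
}

# Hypothetical records that refer to a provider by code instead of retyping its details.
enrolments = [
    {"learner": "A. Learner", "provider_code": "P001"},
    {"learner": "B. Student", "provider_code": "P001"},
    {"learner": "C. Pupil", "provider_code": "P002"},
]


def provider_name(code):
    """Look up the single stored entry for a provider code (similar in spirit to VLookup)."""
    return providers[code]["name"]


def enrolment_report(rows):
    """Build (learner, provider name) pairs by following the reference in each row."""
    return [(row["learner"], provider_name(row["provider_code"])) for row in rows]
```

Because the provider name is stored in one place only, correcting it there corrects it in every report that refers to it.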


3.    Data Standard 3: Data integrity and validation

Data and information should be of the highest integrity.

·       Free text should be used as little as possible.
·       Lookup tables and predetermined lists should be a standard feature of databases.
·       Information should be typed once; thereafter, copy and paste should be used rather than retyping.

When capturing, editing and checking data:


3.1.  Use free text as little as possible, and lookup tables or predefined lists as much as possible.  Wherever a national set of values exists, use it.  (Examples: Stats SA’s “Health and Functioning” definitions; SAQA’s NQF Levels.)
3.2.  Wherever possible, use copy-and-paste instead of retyping.
3.3.  Format the inputs uniformly (e.g. phone numbers) – an “input mask” should be used to ensure this.
3.4.  Do as little as possible manually, and as much as possible via the tools and utilities available to you.  For example, use formulas inside documents (including MS Word) rather than typing in what has been found using a calculator or by counting; use pivot tables and graphs.
3.5.  Use data validations as much as possible.  (Spelling and grammar checks in MS Word are also validations.)
3.6.  Look for patterns in the data, such as values or formats that keep recurring.
3.7.  Look for items that are the same as each other, so that they can be entered once and referred to.
3.8.  Look for where there are risks of inaccuracies, and find ways to prevent these inaccuracies.
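A minimal sketch of points 3.1, 3.3 and 3.5, assuming a hypothetical list of allowed status values and an assumed phone number mask of "000 000 0000". Neither the field names nor the mask come from the standards themselves; they only show how a predefined list and an input mask can be checked automatically.

```python
import re

# Assumed list of allowed values; in practice this would come from a lookup table.
ALLOWED_STATUSES = {"Registered", "Pending", "Withdrawn"}

# Assumed input mask: three digits, space, three digits, space, four digits.
PHONE_MASK = re.compile(r"\d{3} \d{3} \d{4}")


def validate_record(record):
    """Return a list of problems found in one record (an empty list means it passed)."""
    problems = []
    if record.get("status") not in ALLOWED_STATUSES:
        problems.append("status is not in the allowed list")
    if not PHONE_MASK.fullmatch(str(record.get("phone", ""))):
        problems.append("phone does not match the mask 000 000 0000")
    return problems
```

A record such as `{"status": "Registered", "phone": "011 555 0100"}` passes both checks, while a misspelt status or an unspaced phone number is reported instead of being silently accepted.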

4.    Data Standard 4: File naming

Files should be consistently named.

4.1.  Format file names uniformly (e.g. “Qualifications offered as at yyyy mm dd.xlsx”: each time there is an update to the file, the name always starts with “Qualifications offered as at ” and the date, which is always part of the file name, is formatted as yyyy mm dd). 
4.2.  Where there are several versions of the same file, add “v{number}”, e.g. “v2”, before the date.
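The naming convention in 4.1 and 4.2 can be generated rather than typed, which keeps it uniform. The sketch below is one possible way to do this; the function name and its parameters are hypothetical.

```python
from datetime import date


def standard_file_name(prefix, as_at, version=None, extension="xlsx"):
    """Build '<prefix> [v<number>] yyyy mm dd.<extension>' from its parts."""
    parts = [prefix]
    if version is not None:
        # Per 4.2, the version marker goes before the date.
        parts.append(f"v{version}")
    parts.append(as_at.strftime("%Y %m %d"))
    return " ".join(parts) + "." + extension


print(standard_file_name("Qualifications offered as at", date(2018, 4, 20)))
# prints: Qualifications offered as at 2018 04 20.xlsx
print(standard_file_name("Qualifications offered as at", date(2018, 4, 20), version=2))
# prints: Qualifications offered as at v2 2018 04 20.xlsx
```

Writing the date as yyyy mm dd also means that files sorted by name appear in date order.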

5.    Data Standard 5: Procedures

Procedures for handling data and information must be written and maintained, and adherence to the procedures must be monitored.

5.1.  Develop, maintain and adhere to procedures for handling data and information.
5.2.  Ensure that the procedures are included in the organisation’s system for cataloguing procedures (such as a Quality Management System or a Business Continuity System).
5.3.  Monitor that the data standards exist and are being met:
·          Data standards exist if a list of allowed values can be demonstrated.

Depending on the nature and use of the data, the parameters can be set narrowly or broadly.

·          Data validations are in use if:

o   The system has been programmed to use validations when people try to enter or edit data; and / or
o   Data or information is manually checked, and corrected if it does not meet the data standards.

In some circumstances, it is not possible to account for everything via validations, for instance if one does not wish to block everything from entering an information system.  In such cases, exception reports can be produced and acted upon, after capturing or loading the data.

·          Data standards are being met if:

o   The system contains only allowed values;
o   Aggregations and analyses make sense; and
o   There is consistency in the format and values of those data elements that have more flexibility allowed.

·          Sample table that can be used to assess whether data standards exist and are being met:

System | Data Standards Exist (Y/N) | Data Validations are in use (Y/N) | Data Standards are being met (Rating as %)
       |                            |                                   |
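One way to arrive at the "Rating as %" and an exception report for a single field is sketched below. The standards do not prescribe how the rating is calculated; the approach here (the share of captured values that appear in the allowed list) and the function name are assumptions made for illustration.

```python
def assess_field(values, allowed_values):
    """Return (percentage of values that are allowed, list of exceptions).

    Each exception is a (row number, value) pair, with rows counted from 1.
    """
    exceptions = [
        (row, value)
        for row, value in enumerate(values, start=1)
        if value not in allowed_values
    ]
    if not values:
        # Nothing captured yet, so no rating can be given.
        return None, exceptions
    rating = 100 * (len(values) - len(exceptions)) / len(values)
    return rating, exceptions


# Hypothetical data: one of four captured values is misspelt.
captured = ["Registered", "Pending", "registerd", "Withdrawn"]
rating, exceptions = assess_field(captured, {"Registered", "Pending", "Withdrawn"})
print(rating)      # prints: 75.0
print(exceptions)  # prints: [(3, 'registerd')]
```

The exception list is what would be acted upon after capturing or loading the data, and the rating is what would be recorded in the last column of the table.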

[1] Data are raw facts, such as one person’s test score.  Information is the product after data have been organised (aggregated or analysed).  Information assists with understanding or deciding something, such as the class average of the test scores assisting with decisions concerning the moderation of results.
