ANNEXURE 1: QUALITY CONTROL METHODS
B VALIDITY OF DATA
You need to determine whether the attributes in the datasets are accurate so that errors can
be corrected. This is done by checking whether:
❏ The attributes are of the right type (i.e. text/string, categorical or numeric). For example, the
unique identifier should be a text variable (e.g. Clinic153) while the capacity of the service
point should be a numeric variable (e.g. 348).
❏ The spatial features have been classified correctly and whether they have the right range of
values. Classification refers to the type of service point, for example, a primary or secondary
school. An example of values that are out of range is when a school has 5 000 classrooms
– this exceeds the number of classrooms normally associated with schools.
❏ Duplicate spatial features (e.g. service points with exactly the same coordinates) exist. If
duplicates are found in the spatial information, the incorrect spatial features need to be deleted
or their spatial location corrected. If spatial features of the same type do share the same
geographic location (and so have duplicate geographic coordinates), these features need to
be combined into one and their attributes aggregated.
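The last check above, combining features of the same type that share a location, can be sketched in Python as follows. This is only an illustration under stated assumptions: the field names (`id`, `type`, `lat`, `lon`, `capacity`), the sample coordinates and the choice to aggregate capacity by summing are all hypothetical, and the appropriate aggregation rule will depend on the attribute.

```python
from collections import defaultdict

def merge_duplicates(points):
    """Combine service points of the same type that share identical coordinates."""
    groups = defaultdict(list)
    for p in points:
        # Features count as duplicates only if type AND coordinates match exactly.
        groups[(p["type"], p["lat"], p["lon"])].append(p)

    merged = []
    for (ptype, lat, lon), members in groups.items():
        merged.append({
            "id": members[0]["id"],  # keep the first identifier (assumption)
            "type": ptype,
            "lat": lat,
            "lon": lon,
            # Aggregate by summing capacity (assumption; other attributes may need other rules).
            "capacity": sum(m["capacity"] for m in members),
            "merged_ids": [m["id"] for m in members],  # audit trail of what was combined
        })
    return merged

# Hypothetical sample records, for illustration only.
points = [
    {"id": "Clinic153", "type": "clinic", "lat": -25.7461, "lon": 28.1881, "capacity": 348},
    {"id": "Clinic154", "type": "clinic", "lat": -25.7461, "lon": 28.1881, "capacity": 52},
    {"id": "School020", "type": "primary school", "lat": -26.2041, "lon": 28.0473, "capacity": 600},
]
result = merge_duplicates(points)
```

Keeping the list of merged identifiers makes it easier to record how many duplicates were found and fixed.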
The classification of spatial features, range of values and extent of duplicates can be checked
using the Filter option in Microsoft Excel. With the Filter option, it is possible to select incorrectly
classified service point types, and records where values associated with service points are found
to be out of range, so that they can be corrected.
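For larger datasets, the same filtering logic can be scripted instead of applied by hand. The sketch below is one possible approach and is not part of the guideline's method: the list of allowed service point types, the classroom threshold and the field names are hypothetical and would need to be set for each dataset.

```python
# Hypothetical classification list and range limit; adjust to the dataset at hand.
ALLOWED_TYPES = {"primary school", "secondary school", "clinic"}
MAX_CLASSROOMS = 100

def check_record(rec):
    """Return a list of validity problems found in one service point record."""
    problems = []
    # The unique identifier should be text (e.g. "Clinic153").
    if not isinstance(rec.get("id"), str):
        problems.append("id is not text")
    # Capacity should be numeric (e.g. 348); bool is excluded explicitly.
    cap = rec.get("capacity")
    if isinstance(cap, bool) or not isinstance(cap, (int, float)):
        problems.append("capacity is not numeric")
    # The service point type should be one of the known classes.
    if rec.get("type") not in ALLOWED_TYPES:
        problems.append("unknown classification")
    # Classroom counts, where present, should fall in a plausible range.
    rooms = rec.get("classrooms")
    if rooms is not None and (not isinstance(rooms, int) or not 0 < rooms <= MAX_CLASSROOMS):
        problems.append("classrooms out of range")
    return problems

# Hypothetical sample records, for illustration only.
records = [
    {"id": "Clinic153", "type": "clinic", "capacity": 348},
    {"id": 154, "type": "clinic", "capacity": "large"},
    {"id": "School001", "type": "primry school", "capacity": 900, "classrooms": 5000},
]

# Map row index -> problems, keeping only the rows that need correction.
flagged = {}
for i, rec in enumerate(records):
    problems = check_record(rec)
    if problems:
        flagged[i] = problems
```

Here `flagged` plays the role of the filtered view in Excel: it lists only the records that need attention, together with the reason for each.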
Target population
By viewing the attribute tables associated with the spatial features in a GIS, and creating thematic
maps of variables of the target population dataset, it is possible to check whether the attributes
are of the right type, whether the spatial features are classified correctly and whether there are
duplicates.
Service points
The easiest way to check the validity of attributes associated with government service points is to
view the table in Microsoft Excel or in a desktop GIS to see if the variables are correctly defined.
Road dataset
For the road dataset, the validity check is best done in a full GIS so that the variable types,
classification of spatial features, attributes and duplicates can be checked at the same time. A GIS
allows duplicate road segments to be detected quickly and cleaned. The connectivity
of road segments to one another can also be checked within a GIS. A connectivity analysis also
needs to be done within the accessibility modelling software to check that all the road segments
connect to one another. During the validity check, notes should also be made on the
extent of errors detected and fixed, to provide an understanding of the accuracy of the dataset.
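To illustrate what a connectivity check does, the sketch below treats each road segment as a link between two end nodes and groups the nodes into connected components. It is a simplified illustration, not the procedure used by any particular GIS or accessibility modelling package. It assumes that segments which meet share exactly the same node label, whereas real datasets often need a snapping tolerance. It also treats two segments with the same pair of end nodes as duplicates, which may flag genuinely separate roads. All segment and node names are hypothetical.

```python
from collections import defaultdict

def duplicate_segments(segments):
    """Return pairs of segment ids that join the same two end nodes."""
    seen, dups = {}, []
    for seg_id, a, b in segments:
        key = frozenset((a, b))  # direction of the segment is ignored
        if key in seen:
            dups.append((seen[key], seg_id))
        else:
            seen[key] = seg_id
    return dups

def connected_components(segments):
    """Group end nodes into sets that are reachable from one another."""
    adjacency = defaultdict(set)
    for _, a, b in segments:
        adjacency[a].add(b)
        adjacency[b].add(a)

    unvisited, components = set(adjacency), []
    while unvisited:
        start = unvisited.pop()
        component, stack = {start}, [start]
        while stack:  # depth-first walk from the start node
            node = stack.pop()
            for neighbour in adjacency[node] - component:
                component.add(neighbour)
                stack.append(neighbour)
        unvisited -= component
        components.append(component)
    return components

# Hypothetical segments: (segment id, start node, end node).
segments = [
    ("r1", "A", "B"),
    ("r2", "B", "C"),
    ("r3", "C", "B"),  # same end nodes as r2, so flagged as a duplicate
    ("r4", "D", "E"),  # not joined to the rest of the network
]
dups = duplicate_segments(segments)
components = connected_components(segments)
```

More than one component means that some road segments are cut off from the rest of the network and need to be inspected. The counts of duplicates and disconnected components can go into the notes on errors detected and fixed.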