Validation And Verification

A common problem with computer systems is that it is very easy to put incorrect data into them.

If you put incorrect data into a computer system then you will get incorrect results out of it. Processing incorrect inputs will produce incorrect outputs. This leads to the acronym :

GIGO : Garbage In Garbage Out

Sometimes incorrect data can actually cause a computer system to stop working temporarily. This is a particular problem in batch processing systems when data may be processed overnight. If incorrect data stops a batch processing system from working then a whole night's processing time may be lost.

People who develop computer systems go to a lot of trouble to make it difficult for incorrect data to be entered. The two main techniques used for this purpose are :

(1) Verification

(2) Validation

These techniques are described in detail below.

Verification

A verification check ensures that data is correctly transferred into a computer from the medium that it was originally stored on. Verification checks are usually used to check that information written on a data collection form has been correctly typed into a computer by a data entry worker.

Methods of Verification

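The two most common methods of verification are :

(1) Dual Input : the data is typed in twice and the computer compares the two copies. Any differences are flagged so that they can be checked against the original form.

(2) On-Screen Prompts : after the data has been typed in it is displayed on the screen and the user is asked to confirm that it is correct before it is accepted.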
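The Dual Input method mentioned in the Questions can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and ignoring leading and trailing spaces is an assumption rather than part of the method.

```python
def dual_input_matches(first_entry: str, second_entry: str) -> bool:
    """Return True when two separately typed copies of a value agree."""
    # Assumption: stray spaces at either end are not treated as a mismatch.
    return first_entry.strip() == second_entry.strip()


# The same form is keyed in twice; a mismatch means one copy was mistyped.
print(dual_input_matches("W12 6BD", "W12 6BD"))  # True
print(dual_input_matches("W12 6BD", "W12 6DB"))  # False
```

A mismatch only shows that one of the two copies is wrong. It does not show which one, so the original form still has to be consulted.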

Validation

A validation check is an automatic check made by a computer to ensure that any data entered into the computer is sensible. A validation check does not make sure that data has been entered correctly; it only ensures that the data is sensible. For this reason validation checks are not usually as effective as verification checks. However, they can be carried out automatically by a computer and so require less work by computer operators, which makes them cheaper to use.

Methods of Validation

There are many different methods of validation. The most appropriate method(s) to use will depend upon what data is being entered. The most common methods, each described below, are type checks, length checks, range checks, format checks, check digits and parity checks.

Validation checks can be performed by any piece of software. However you are most likely to encounter them when creating a new database. Sophisticated database packages will let you implement validation checks using validation rules. You can provide different validation rules for each different field in the database. Below are some examples of how the common validation methods can be used.

Type Check

Databases perform type checks automatically on all entered data. When a database is created each field in the database is given a type. Whenever data is entered into a field the database will check that it is of the correct type, e.g. alphabetic or numeric. If it is not then an error message will be displayed and the data will have to be re-entered. Here are some example field names and appropriate types.

Field Name | Type | Valid Data | Invalid Data
Date of Birth | Date | 11/03/96 | 30/02/76, fred
Sex | Alphabetic | Male, Female, Albert | 123, WA2
Shoe Size | Numeric | 12, 2.3, 123236 | G, house
Postcode | Alphanumeric | W12 6BD |

Notice that a type check is not a very good validation check. Many of the entries in the Valid Data column in the table pass the type check but are clearly incorrect. For example, Albert is not a sex and 123236 is not a sensible shoe size.
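As a rough illustration, type checks like the ones in the table could be written in Python as follows. The function names are hypothetical, and the date format (day/month/two-digit year) is an assumption based on the example 11/03/96.

```python
from datetime import datetime


def is_numeric(text: str) -> bool:
    """Type check for a numeric field: the text must convert to a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def is_alphabetic(text: str) -> bool:
    """Type check for an alphabetic field: letters only."""
    return text.isalpha()


def is_date(text: str) -> bool:
    """Type check for a date field, assuming the format dd/mm/yy."""
    try:
        datetime.strptime(text, "%d/%m/%y")
        return True
    except ValueError:
        return False


print(is_date("11/03/96"))      # True
print(is_date("30/02/76"))      # False: there is no 30th of February
print(is_alphabetic("Albert"))  # True, even though it is not a sensible value
print(is_numeric("house"))      # False
```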

Length Check

As with type checks, most databases will automatically perform length checks on any entered data. The length check ensures that the data entered is no longer than a specified maximum number of characters. This is particularly important if a fixed length field is being used to store the data, because any characters typed beyond the space available would be lost. Here are some example field names and appropriate maximum lengths :

Field Name | Maximum Length | Valid Data | Invalid Data
Title | 6 | mr, Mrs, George | The Duke Of, Sixteen
Surname | 15 | Smith, Jones | Smethurst-Whately
County | 15 | England, Car | The Former Yugoslav Republic of Macedonia

Length checks are usually only performed on alphabetic or alphanumeric data. A similar test can be performed on numeric and date data by using a range check.
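A length check is simple to express in code. The sketch below uses a hypothetical function name and takes its maximum lengths from the table above.

```python
def passes_length_check(text: str, max_length: int) -> bool:
    """Return True when the text fits within the maximum number of characters."""
    return len(text) <= max_length


print(passes_length_check("George", 6))              # True
print(passes_length_check("Sixteen", 6))             # False: seven characters
print(passes_length_check("Smethurst-Whately", 15))  # False: seventeen characters
```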

Range Check

Range checks are used on data made up of numbers or dates which must fall into a particular range. A lower and upper boundary for sensible values is specified. Any values which fall outside of this range will be rejected. Most sophisticated databases will let you set valid ranges for each field.

Field Name | Lower Boundary | Upper Boundary
Age | 0 | 130
Car Engine Size (litres) | 0.5 | 8.0
Month | 1 | 12
Temperature in UK (°C) | -20 | 40

Sometimes there is only one boundary required for a particular field. For example the minimum volume of a cube would be zero cubic centimetres, but there is no maximum volume. When there is only one boundary to check the type of check used is known as a limit check rather than a range check.
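Range checks and limit checks can share one small function. In this sketch the function name is hypothetical and the boundaries are assumed to be inclusive, so that 1 and 12 are both valid months. Leaving out one boundary turns the range check into a limit check.

```python
from typing import Optional


def in_range(value: float,
             lower: Optional[float] = None,
             upper: Optional[float] = None) -> bool:
    """Range check with inclusive boundaries; omit one boundary for a limit check."""
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True


print(in_range(12, 1, 12))     # True: month 12 is on the upper boundary
print(in_range(13, 1, 12))     # False: there is no month 13
print(in_range(-5, lower=0))   # False: limit check, a volume cannot be negative
```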

Format Check

Format checks can be performed on data entered into a database field. The format that the data must follow is specified using an input mask. The input mask is made up of special characters which indicate what characters may be typed.

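In a particular database the following special characters can be used to define an input mask :

0 : a digit must be entered
9 : a digit may be entered (optional)
L : a letter must be entered
l : a letter may be entered (optional)

Any other character, such as a space, must be typed exactly as it appears in the mask.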

Here are some input masks that could be used to validate three letter codes, car registration numbers and postcodes.

Input Mask | Purpose | Valid Data | Invalid Data
LLL | Three Letter Code | ABC, AND, OLD | AB, B2H, ABCD
L990LLL | Car Registration Number | N912CHG, A3HAM, P22AUL | N1234HG, N212BT, 123EAF
Ll90 0LL | Postcode | WA14 9JD, M90 4SJ, BL9 0HN | WAM4 9PM, WA6 13H, M12 9Q
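One way to experiment with input masks is to translate each mask into a regular expression. The sketch below is not how any particular database works. The meanings given to 0, 9, L and l are assumptions inferred from the valid and invalid examples in the table, as is the decision to accept lower case letters, and the function names are hypothetical.

```python
import re

# Assumed meaning of each mask character, inferred from the examples above.
MASK_RULES = {
    "0": r"[0-9]",      # a digit must be entered
    "9": r"[0-9]?",     # a digit may be entered
    "L": r"[A-Za-z]",   # a letter must be entered
    "l": r"[A-Za-z]?",  # a letter may be entered
}


def mask_to_regex(mask: str):
    """Build a compiled regular expression from an input mask."""
    # Any character that is not a mask symbol (e.g. a space) must match literally.
    parts = [MASK_RULES.get(ch, re.escape(ch)) for ch in mask]
    return re.compile("".join(parts))


def matches_mask(mask: str, text: str) -> bool:
    """Format check: the whole of the text must fit the mask."""
    return mask_to_regex(mask).fullmatch(text) is not None


print(matches_mask("L990LLL", "A3HAM"))      # True
print(matches_mask("L990LLL", "N1234HG"))    # False: too many digits
print(matches_mask("Ll90 0LL", "WA14 9JD"))  # True
print(matches_mask("Ll90 0LL", "WA6 13H"))   # False
```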

Check Digits

The check digit is a particularly important method of validation. It is used to ensure that code numbers that are originally produced by a computer are re-entered into another computer correctly. The check digit is a single digit added onto the end of a code number by the computer. The check digit is calculated from the other digits in the number. Check digits are included in bar code numbers.

Producing a Check Digit

This procedure is used to generate a check digit to add to the end of a number. It uses the Modulo-11 weighted check digit calculation. This calculation is used for ten-digit ISBN numbers on books.

1) Start with the original product number e.g. 185813415.

2) Weight each digit by its position in the string and add up the results :

Digit | 1 | 8 | 5 | 8 | 1 | 3 | 4 | 1 | 5 | Total
Weighting | *10 | *9 | *8 | *7 | *6 | *5 | *4 | *3 | *2 |
Result | 10 | 72 | 40 | 56 | 6 | 15 | 16 | 3 | 10 | 228

3) Divide the total by 11 and then subtract the remainder from 11. The check digit is the result of this operation :

228 / 11 = 20 remainder 8 => Check digit is 11-8 = 3.

If the result of the subtraction is 10 then the check digit is set to X. If it is 11 (because the remainder was 0) then the check digit is 0.

4) Add the check digit to the end of the original number to get the complete product number : 1858134153
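The four steps above can be sketched in Python. The function name is hypothetical, and the code assumes the input contains only digits.

```python
def modulo11_check_digit(digits: str) -> str:
    """Return the Modulo-11 weighted check digit for a string of digits."""
    # Weights count down to 2; for nine digits they are 10, 9, ..., 2.
    weights = range(len(digits) + 1, 1, -1)
    total = sum(int(d) * w for d, w in zip(digits, weights))
    result = 11 - (total % 11)
    if result == 10:
        return "X"   # ten cannot be written as a single digit
    if result == 11:
        return "0"   # the remainder was 0
    return str(result)


number = "185813415"
print(number + modulo11_check_digit(number))  # 1858134153
```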

Validating a Number Including a Check Digit

The procedure to check if a number with a check digit in it has been inputted correctly is similar to that used to generate the check digit :

1) Input the number including the check digit. e.g. 1858134153.

2) Weight each digit by its position in the string and add up the results :

Digit | 1 | 8 | 5 | 8 | 1 | 3 | 4 | 1 | 5 | 3 | Total
Weighting | *10 | *9 | *8 | *7 | *6 | *5 | *4 | *3 | *2 | *1 |
Result | 10 | 72 | 40 | 56 | 6 | 15 | 16 | 3 | 10 | 3 | 231

3) Divide the total by 11.

231 / 11 = 21 remainder 0

4) If the remainder is 0 then the number has passed the validation check and so it is likely that it has been inputted correctly.

It is important that each digit is weighted before the numbers are added up. If this was not done then a check digit would not detect transposition errors (where two digits are swapped around). This is a particularly common form of error when numbers are typed.
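The validation procedure can be sketched in the same way. The function name is hypothetical, and a final X is assumed to stand for the value 10. The last line shows a transposition error being caught because of the weightings.

```python
DIGITS = "0123456789"


def is_valid_modulo11(number: str) -> bool:
    """Return True when the weighted sum of all characters divides exactly by 11."""
    if len(number) < 2:
        return False
    body, last = number[:-1], number[-1]
    # Only the final character may be X (assumed to stand for ten).
    if any(ch not in DIGITS for ch in body) or (last not in DIGITS and last != "X"):
        return False
    values = [int(ch) for ch in body] + [10 if last == "X" else int(last)]
    # Weights count down to 1; for ten characters they are 10, 9, ..., 1.
    weights = range(len(values), 0, -1)
    total = sum(v * w for v, w in zip(values, weights))
    return total % 11 == 0


print(is_valid_modulo11("1858134153"))  # True
print(is_valid_modulo11("1858134513"))  # False: the digits 1 and 5 were swapped
```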

Parity Checks

Parity checks are used during transmission of data to detect errors that have been caused by interference or noise. All data is transmitted as a sequence of 1s and 0s. A common type of error that occurs during data transmission is that a bit is swapped from a 0 to a 1 or a 1 to a 0 by electrical interference. Parity checks detect this type of error. A parity check works like this :

Transmission

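1) When data is transmitted each character is encoded as a 7-bit binary number. e.g. in ASCII the letter ‘B’ has the code 1000010.

2) An eighth bit is added to make a byte. This bit is called a Parity Bit.

3) A system can use either even or odd parity :

Even parity : the parity bit is set so that the total number of 1s in the byte is even.
Odd parity : the parity bit is set so that the total number of 1s in the byte is odd.

For example in an even parity system the code for B already contains an even number of 1s (two), so a parity bit of 0 would be added and it would be transmitted as 01000010.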

Reception

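1) When a character is received the number of 1s in the byte is counted :

In an even parity system the number of 1s should be even.
In an odd parity system the number of 1s should be odd.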

2) If this is not the case then an error must have occurred. A request will be sent to the transmitter to ask it to send the byte again.

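A parity check cannot detect an error in which an even number of bits in a byte are changed, because the count of 1s is still even (or still odd) afterwards. This means that parity checks are not very good at detecting burst errors where more than one bit in a byte is changed.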
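The transmission and reception steps can be tried out with a short Python sketch. The function names are hypothetical, the parity bit is assumed to be the leftmost bit of the byte, and 1000011 is used simply as a sample 7-bit pattern.

```python
def count_ones(value: int) -> int:
    """Count the 1 bits in a non-negative integer."""
    return bin(value).count("1")


def add_parity_bit(code7: int, even: bool = True) -> int:
    """Return the 8-bit byte: parity bit on the left, then the 7-bit code."""
    odd_count = count_ones(code7) % 2 == 1
    # Even parity sets the bit when the count of 1s is odd; odd parity does the opposite.
    parity_bit = int(odd_count) if even else int(not odd_count)
    return (parity_bit << 7) | code7


def parity_ok(byte: int, even: bool = True) -> bool:
    """Reception check: is the number of 1s even (or odd) as the system expects?"""
    ones_even = count_ones(byte) % 2 == 0
    return ones_even if even else not ones_even


byte = add_parity_bit(0b1000011)     # sample 7-bit pattern, even parity
print(format(byte, "08b"))           # 11000011
print(parity_ok(byte))               # True
print(parity_ok(byte ^ 0b00000100))  # False: one changed bit is detected
print(parity_ok(byte ^ 0b00000110))  # True: two changed bits go unnoticed
```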

Questions

(1) Give definitions of (a) a verification check and (b) a validation check.
(2) If some data passes a validation check does this mean that the data is correct ?
(3) Do you think that the Dual Input or On-Screen Prompt method of verification is best ? Why ?
(4) Why do you think that the Dual Input verification method is not used very often ?
(5) If some data passes the Dual Input verification test does that mean that it has definitely been entered correctly ? Why ?
(6) Some data is being entered into a database. Suggest the most appropriate validation checks that could be used for each of these fields :

a) Month of the Year
b) Gender
c) Postcode

(7) Suggest the name of a field on which a range check could be used. What range would be valid for this field ?
(8) A format check is going to be performed using an input mask. The input mask, built using the rules in this article, is :

99990.00

Using this input mask, which of these items of data are valid ?

a) 2.38
b) 4.A8
c) 98564.53
d) 878979.93
e) 473.8

(9) Calculate the check digit to complete the ISBN number 185029553 (show your working).
(10) Is the ISBN number 045050638X valid (show your working) ?
(11) A computer system uses even parity. Which of these bytes have been received correctly ?

a) 10010110
b) 01001010

(12) Give an example of an error that could occur when the letter B is transmitted in an even parity system that would not be detected by a parity check. [ Hint : You may find it easier to think of it as two errors. ]

Practical Exercises

(1) Set up a spreadsheet which will (a) calculate and (b) check the validity of modulo-11 weighted check digits.

(2) Set up a spreadsheet which will (a) calculate and (b) check the parity of bytes for both even and odd parity systems.

(C) P. Meakin 1998