A common problem with computer systems is that it is very easy to put incorrect data into them. For example :
If you put incorrect data into a computer system then you will get incorrect results out of it. Processing incorrect inputs will produce incorrect outputs. This leads to the acronym :
GIGO : Garbage In Garbage Out
Sometimes incorrect data can actually cause a computer system to stop working temporarily. This is a particular problem in batch processing systems, where data may be processed overnight. If incorrect data stops a batch processing system from working then a whole night's processing time may be lost.
People who develop computer systems go to a lot of trouble to make it difficult for incorrect data to be entered. The two main techniques used for this purpose are verification and validation.
These techniques are described in detail below.
A verification check ensures that data is correctly transferred into a computer from the medium that it was originally stored on. Verification checks are usually used to check that information written on a data collection form has been correctly typed into a computer by a data entry worker.
The two most common methods of verification are Dual Input and On-Screen Prompts.
A validation check is an automatic check made by a computer to ensure that any data entered into the computer is sensible. A validation check does not make sure that data has been entered correctly; it only ensures that the data is sensible. For this reason validation checks are not usually as effective as verification checks. They can, however, be carried out automatically by a computer, so they require less work from computer operators and are cheaper to use.
There are many different methods of validation. The most appropriate method(s) to use will depend upon what data is being entered. The most common methods are listed here.
Validation checks can be performed by any piece of software. However you are most likely to encounter them when creating a new database. Sophisticated database packages will let you implement validation checks using validation rules. You can provide different validation rules for each different field in the database. Below are some examples of how the common validation methods can be used.
Databases perform type checks automatically on all entered data. When a database is created each field in the database is given a type. Whenever data is entered into a field the database will check that it is of the correct type, e.g. alphabetic or numeric. If it is not then an error message will be displayed and the data will have to be re-entered. Here are some example field names and appropriate types.
|Field Name||Type||Valid Data||Invalid Data|
|Date of Birth||Date||11/03/96||30/02/76, fred|
|Sex||Alphabetic||Male, Female, Albert||123, WA2|
|Shoe Size||Numeric||12, 2.3, 12323||6G, house|
Notice that a type check is not a very good validation check. Many of the entries in the Valid Data column in the table pass the type check but are clearly incorrect.
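A type check of this kind can be sketched in a few lines of Python. This is only an illustration, not how any particular database package does it; the function names `is_numeric`, `is_alphabetic` and `is_date` are made up for the example, and the day/month/year date format is an assumption based on the table.

```python
from datetime import datetime


def is_numeric(text):
    """Return True if text can be read as a number, e.g. '12' or '2.3'.

    Note: float() also accepts forms such as '1e3' and 'inf'.
    """
    try:
        float(text)
        return True
    except ValueError:
        return False


def is_alphabetic(text):
    """Return True if text is non-empty and contains only letters."""
    return text.isalpha()


def is_date(text, fmt="%d/%m/%y"):
    """Return True if text is a real calendar date in the assumed format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


print(is_date("11/03/96"))      # True
print(is_date("30/02/76"))      # False, there is no 30th of February
print(is_alphabetic("Albert"))  # True, although it is clearly not a sex
print(is_numeric("6G"))         # False
```

As in the table, "Albert" passes the alphabetic check even though it is obviously wrong, which is why a type check is usually combined with other checks.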
As with type checks, most databases will automatically perform length checks on any entered data. The length check ensures that the data entered is no longer than a specified maximum number of characters. This is particularly important if a fixed length field is being used to store the data. If this is the case then any extra characters typed that made the data longer than the space available to store it would be lost. Here are some example field names and appropriate maximum lengths :
|Field Name||Maximum Length||Valid Data||Invalid Data|
|Title||6||mr, Mrs, George||The Duke Of, Sixteen|
|Country||15||England, Car||The Former Yugoslav Republic of Macedonia|
Length checks are usually only performed on alphabetic or alphanumeric data. A similar test can be performed on numeric and date data by using a range check.
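As a rough sketch, a length check only needs to compare the number of characters typed with the maximum allowed. The function name `length_check` is hypothetical.

```python
def length_check(text, max_length):
    """Return True if text is no longer than max_length characters."""
    return len(text) <= max_length


print(length_check("George", 6))    # True  (6 characters)
print(length_check("Sixteen", 6))   # False (7 characters)
```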
Range checks are used on data made up of numbers or dates which must fall into a particular range. A lower and upper boundary for sensible values is specified. Any values which fall outside of this range will be rejected. Most sophisticated databases will let you set valid ranges for each field.
|Field Name||Lower Boundary||Upper Boundary|
|Car Engine Size (L)||0.5||8.0|
|Temperature in UK (C)||-20||40|
Sometimes there is only one boundary required for a particular field. For example the minimum volume of a cube would be zero cubic centimetres, but there is no maximum volume. When there is only one boundary to check the type of check used is known as a limit check rather than a range check.
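Range checks and limit checks can both be sketched with one small function. This is an illustration only: the name `range_check` is made up, and it assumes that a value exactly equal to a boundary is accepted. Leaving out one boundary turns the range check into a limit check.

```python
def range_check(value, lower=None, upper=None):
    """Return True if value lies within the given boundaries (inclusive).

    If only one boundary is supplied this acts as a limit check.
    """
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True


print(range_check(2.0, lower=0.5, upper=8.0))   # True, a sensible engine size
print(range_check(-25, lower=-20, upper=40))    # False, too cold for the UK range
print(range_check(1000000, lower=0))            # True, limit check with no maximum
```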
Format checks can be performed on data entered into a database field. The format that the data must follow is specified using an input mask. The input mask is made up of special characters which indicate what characters may be typed in each position.
In a particular database the following special characters can be used to define an input mask :
Here are some input masks that could be used to validate three letter codes, car registration numbers and postcodes.
|Input Mask||Purpose||Valid Data||Invalid Data|
|LLL||Three Letter Code||ABC|
|L990LLL||Car Registration Number||N912CHG|
|Ll90 0LL||Postcode||WA14 9JD|
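One way to experiment with input masks is to translate a mask into a regular expression. The sketch below is only an illustration. The meanings given to the mask characters are an assumption inferred from the examples in the table (L = a letter that must be present, l = a letter that may be left out, 0 = a digit that must be present, 9 = a digit that may be left out); a real database package may define them differently. The names `MASK_PATTERNS`, `mask_to_regex` and `format_check` are made up for the example.

```python
import re

# Assumed meaning of each mask character (not taken from any real package).
MASK_PATTERNS = {
    "L": "[A-Za-z]",    # a letter that must be present
    "l": "[A-Za-z]?",   # a letter that may be left out
    "0": "[0-9]",       # a digit that must be present
    "9": "[0-9]?",      # a digit that may be left out
}


def mask_to_regex(mask):
    """Build a compiled regular expression from an input mask."""
    # Any other character (such as a space) must be typed exactly as shown.
    parts = [MASK_PATTERNS.get(ch, re.escape(ch)) for ch in mask]
    return re.compile("".join(parts))


def format_check(text, mask):
    """Return True if the whole of text fits the input mask."""
    return mask_to_regex(mask).fullmatch(text) is not None


print(format_check("ABC", "LLL"))            # True
print(format_check("AB1", "LLL"))            # False
print(format_check("N912CHG", "L990LLL"))    # True
print(format_check("WA14 9JD", "Ll90 0LL"))  # True
```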
The check digit is a particularly important method of validation. It is used to ensure that code numbers that are originally produced by a computer are re-entered into another computer correctly. The check digit is a single digit added onto the end of a code number by the computer. The check digit is calculated from the other digits in the number. Check digits are included in bar code numbers.
Producing a Check Digit
This procedure is used to generate a check digit to add to the end of a number. It uses the Modulo-11 weighted check digit calculation. This calculation is used for 10-digit ISBNs on books.
1) Start with the original product number e.g. 185813415.
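2) Weight each digit by its position in the string and add up the results. The weights run from 10 for the first digit down to 2 for the last digit :
|Digit||1||8||5||8||1||3||4||1||5|
|Weight||10||9||8||7||6||5||4||3||2|
|Digit x Weight||10||72||40||56||6||15||16||3||10|
Total = 10 + 72 + 40 + 56 + 6 + 15 + 16 + 3 + 10 = 228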
3) Divide the total by 11 and then subtract the remainder from 11. The check digit is the result of this operation :
228 / 11 = 20 remainder 8 => Check digit is 11-8 = 3.
If the result of the subtraction is 10 then the check digit is set to X. If it is 11 (that is, the remainder was 0) then the check digit is 0.
4) Add the check digit to the end of the original number to get the complete product number : 1858134153
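The four steps above can be sketched in Python as follows. This is a minimal illustration of the modulo-11 calculation described here, with a made-up function name `check_digit`; it assumes the number is supplied as a string of digits.

```python
def check_digit(number):
    """Return the modulo-11 check digit for a string of digits.

    The first digit gets the highest weight and the last digit gets weight 2,
    leaving weight 1 for the check digit itself.
    """
    weights = range(len(number) + 1, 1, -1)   # e.g. 10, 9, ..., 2 for 9 digits
    total = sum(int(digit) * weight for digit, weight in zip(number, weights))
    result = 11 - total % 11
    if result == 10:
        return "X"
    if result == 11:
        return "0"
    return str(result)


product_number = "185813415"
print(product_number + check_digit(product_number))   # 1858134153
```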
Validating a Number Including a Check Digit
The procedure to check whether a number that includes a check digit has been entered correctly is similar to the one used to generate the check digit :
1) Input the number including the check digit. e.g. 1858134153.
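2) Weight each digit by its position in the string and add up the results. This time the check digit is included and has a weight of 1 :
|Digit||1||8||5||8||1||3||4||1||5||3|
|Weight||10||9||8||7||6||5||4||3||2||1|
|Digit x Weight||10||72||40||56||6||15||16||3||10||3|
Total = 10 + 72 + 40 + 56 + 6 + 15 + 16 + 3 + 10 + 3 = 231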
3) Divide the total by 11.
231 / 11 = 21 remainder 0
4) If the remainder is 0 then the number has passed the validation check and so it is likely that it has been entered correctly.
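The validation steps can be sketched in the same way. This is an illustrative sketch with a made-up function name `passes_check_digit`; it assumes the number is a string of digits whose last character may be X, standing for 10.

```python
def passes_check_digit(number):
    """Return True if a number ending in a modulo-11 check digit is valid."""
    *body, last = number
    values = [int(ch) for ch in body]
    # An X is only allowed in the check digit position and counts as 10.
    values.append(10 if last in "Xx" else int(last))
    weights = range(len(values), 0, -1)       # e.g. 10, 9, ..., 1 for 10 characters
    total = sum(value * weight for value, weight in zip(values, weights))
    return total % 11 == 0


print(passes_check_digit("1858134153"))   # True,  total is 231
print(passes_check_digit("1858134143"))   # False, one digit has been mistyped
```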
It is important that each digit is weighted before the numbers are added up. If this were not done then a check digit would not detect transposition errors (where two digits are swapped around). This is a particularly common form of error when numbers are typed.
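The effect of the weighting can be seen by comparing a plain sum of the digits with the weighted sum. The sketch below uses the product number from the worked example and a hypothetical mistyped version in which two neighbouring digits have been swapped; the function names are made up for the illustration.

```python
def plain_sum(number):
    """Add up the digits without any weighting."""
    return sum(int(digit) for digit in number)


def weighted_sum(number):
    """Add up the digits, weighting the first digit highest and the last by 1."""
    weights = range(len(number), 0, -1)
    return sum(int(digit) * weight for digit, weight in zip(number, weights))


correct = "1858134153"
swapped = "1858134513"   # the 1 and the 5 near the end have been swapped

print(plain_sum(correct), plain_sum(swapped))         # 39 39, no difference
print(weighted_sum(correct), weighted_sum(swapped))   # 231 235
print(weighted_sum(swapped) % 11)                     # 4, so the error is detected
```

The plain sum is the same for both numbers, so it could not reveal the swap. The weighted sum changes, and the remainder is no longer 0.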
Parity checks are used during transmission of data to detect errors that have been caused by interference or noise. All data is transmitted as a sequence of 1s and 0s. A common type of error that occurs during data transmission is that a bit is swapped from a 0 to a 1 or a 1 to a 0 by electrical interference. Parity checks detect this type of error. A parity check works like this :
1) When data is transmitted each character is encoded as a 7-bit binary number. e.g. the letter B has the code 1000011.
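2) An eighth bit is added to make a byte. This bit is called a Parity Bit.
3) A system can use either even or odd parity. In an even parity system the parity bit is chosen so that the byte contains an even number of 1s. In an odd parity system it is chosen so that the byte contains an odd number of 1s.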
For example in an even parity system a parity bit of 1 would be added to the code for B and it would be transmitted as 11000011.
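Adding a parity bit can be sketched as follows. This is only an illustration: the function name `add_parity_bit` is made up, the 7-bit code is handled as a string of 1s and 0s, and the parity bit is placed at the front of the byte as in the example above.

```python
def add_parity_bit(code, even=True):
    """Return the 7-bit code with a parity bit added at the front."""
    ones = code.count("1")
    if even:
        parity = ones % 2        # 1 only if the count of 1s is currently odd
    else:
        parity = 1 - ones % 2    # 1 only if the count of 1s is currently even
    return str(parity) + code


print(add_parity_bit("1000011"))              # 11000011 (even parity)
print(add_parity_bit("1000011", even=False))  # 01000011 (odd parity)
```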
1) When a character is received the number of 1s in the byte is counted. In an even parity system the count should be even; in an odd parity system it should be odd.
2) If this is not the case then an error must have occurred. A request will be sent to the transmitter to ask it to send the byte again.
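The check made by the receiver can be sketched like this. It is a minimal illustration with a made-up function name `parity_ok`, and it treats the received byte as a string of eight 1s and 0s.

```python
def parity_ok(byte, even=True):
    """Return True if the number of 1s in the byte matches the parity in use."""
    ones = byte.count("1")
    if even:
        return ones % 2 == 0
    return ones % 2 == 1


print(parity_ok("11000011"))   # True, four 1s in an even parity system
print(parity_ok("11000001"))   # False, one bit has been changed in transit
```

When the function returns False the receiver would ask for the byte to be sent again.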
Parity checks cannot detect every error. If an even number of bits in a byte are changed, as can happen in a burst error, the parity still appears correct and the error goes undetected.
|(1)||Give definitions of (a) a verification check and (b) a validation check.|
|(2)||If some data passes a validation check does this mean that the data is correct ?|
|(3)||Do you think that the Dual Input or On-Screen Prompt method of verification is best ? Why ?|
|(4)||Why do you think that the Dual Input verification method is not used very often ?|
|(5)||If some data passes the Dual Input verification test does that mean that it has definitely been entered correctly ? Why ?|
|(6)||Some data is being entered into a database. Suggest the most appropriate validation checks that could be used for each of these fields :|
a) Month of the Year
|(7)||Suggest the name of a field on which a range check could be used. What range would be valid for this field ?|
|(8)||A format check is going to be performed using an input mask. The input mask, built using the rules in this article is :|
Using this input mask, which of these items of data are valid ?
|(9)||Calculate the check digit to complete the ISBN number 185029553 (show your working).|
|(10)||Is the ISBN number 045050638X valid (show your working) ?|
|(11)||A computer system uses even parity. Which of these bytes have been received correctly ? |
|(12)||Give an example of an error that could occur when the letter B was transmitted in an even parity system that would not be detected by a parity check [ Hint : You may find it easier to think of it as two errors. ]|
(1) Set up a spreadsheet which will (a) calculate and (b) check the validity of modulo-11 weighted check digits.
(2) Set up a spreadsheet which will (a) calculate and (b) check the parity of bytes for both even and odd parity systems.
(C) P. Meakin 1998