Evaluation of home pregnancy test kits for reliability
Pregnancy kits for home use are used widely in the UK and the Purchasing and Supplies Agency of the Department of Health commissioned an evaluation of all the products available.
VOL: 103, ISSUE: 27, PAGE NO: 30-31
Michael J Wheeler is consultant clinical scientist; Susan Lamph is evaluator; Stephen Halloran is consultant clinical scientist; all at Guildford Medical Device Evaluation Centre, Postgraduate Medical School, Guildford, Surrey.
Michael J Wheeler, PhD, MSc, BSc, FRCPath; Susan Lamph, AIBMS; Stephen Halloran, MSc, BSc, DipCB, FRCPath
This study identified 30 different home pregnancy kits available in the UK, 27 of which were evaluated.
All the devices were capable of giving a positive result for pregnancy at the time of the first missed period. Areas of confusion and potential error have been identified. Previous studies have shown that, even when a device is reliable, a high proportion of women report a wrong result because they use the device incorrectly. It is therefore important that women who present to clinic with a result from a home pregnancy kit have the result confirmed by a trained professional.
Over-the-counter (OTC) pregnancy kits (devices) have been available in the UK since the late 1970s. Early tests could detect levels of 200-500IU/L human chorionic gonadotropin (hCG), so testing was best delayed until two weeks after the last missed period. Sensitivities of 10-50IU/L hCG became possible with the availability of monoclonal antibodies and refined techniques. hCG is detected in blood and urine after a foetus has implanted in the uterus.
Valanis and Perlman (1982) found that 33% of women, testing less than nine days after the last missed period, had false negative results.
Bastian et al (1998) reported that most false negative results are due to users not using the kits properly.
However, there have been very few studies on the reliability of OTC pregnancy testing devices. A recent study in Germany by Siekmeier and Lutz (2007) suggests that pregnancy testing kits are very reliable. They looked at all the failure notifications for in vitro devices (which include OTC pregnancy testing devices) reported to the Federal Institute for Drugs and Medical Devices in Germany between 1999 and 2006. There were 207 notifications relating to OTC devices, of which 25 (12.1%) concerned pregnancy testing devices. Notifications rely on doctors and users reporting failures, so this could be a gross underestimate of the true incidence of failures. Products evaluated in 1999 all had sensitivity levels of 25-50IU/L hCG (Wheeler, 1999) and used an immunochromatography technique. This is essentially a wick technique in which applied urine diffuses along an absorbent material, reacting with different antibodies along the way.
The products evaluated in 2006 and reported here use exactly the same technique but have sensitivities of 10-50IU/L. As the urine progresses along the wick, any hCG in the urine combines with a monoclonal antibody (raised in a mouse) to the β-subunit of hCG (hCG is made of two subunits, an α- and a β-subunit). The bound hCG continues along the wick to the test reading area, where it is ‘captured’ by another monoclonal antibody directed to the α-subunit of hCG. Excess β-subunit antibody continues along the wick to the control area, where it is captured by an antimouse immunoglobulin G antibody. Devices are either a flat plastic cassette or pallet onto which a few drops of collected urine are added to the sample well (dipstick/pallet), or a stick design that can be placed either in the urine stream or into a beaker of collected urine (midstream dipstick). The latter is the simpler device. Those that can be placed into the urine stream have a plastic casing with a small amount of wick exposed at one end for placing into the urine stream. A plastic cap is provided to cover the absorbent tip after use.
This evaluation was carried out for the Centre for Evidence-based Purchasing (CEP), which is part of the Policy and Innovation Directorate of the NHS Purchasing and Supplies Agency. The study was commissioned by CEP to evaluate the reliability of OTC pregnancy testing devices available in the UK. Thirty different OTC pregnancy test kits available from pharmacies, supermarkets or the internet were identified.
Method of evaluation
Companies were notified that an evaluation of OTC pregnancy tests was taking place. They were sent the test protocol and invited to submit their products for evaluation. Of 20 companies, 15 agreed for their devices to be evaluated. Chefaro Ltd was redesigning its device (Predictor), while Boots, LPC Medical (UK) (Easistix P), Home Health UK (Fastest) and Lloyds decided not to submit their devices for evaluation. Previous versions of the Boots, Chefaro Ltd and Lloyds products evaluated by Wheeler (1999) were all found to be reliable and convenient to use; however, these are not the products currently available. All the tests were carried out in a well-lit laboratory and results were confirmed by two operators. The lowest concentration of hCG detected by each kit was examined in two ways:
- using six urine samples with known concentrations of about 100IU/L hCG from women who were pregnant;
- by adding the current hCG international reference standard to a urine specimen from a woman who was not pregnant.
The six pregnancy urines were diluted with a non-pregnancy urine with a concentration <2IU/L hCG to give dilutions down to about 5IU/L. The second group of samples was prepared as follows. A non-pregnancy urine (concentration <2IU/L hCG) had the Fourth International Standard for Chorionic Gonadotrophins (National Institute for Biological Standards and Control, UK 2004) added to give concentrations of 5, 10, 25, 50, 100, 250 and 500IU/L hCG.
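The dilution arithmetic behind the first sample set can be sketched as follows. This is only an illustration: the function name, the 10 mL aliquot volume and the intermediate target concentrations are hypothetical, and the diluent is treated as containing no hCG, although the article reports it only as <2IU/L.

```python
def dilution_volumes(stock_iu_per_l, target_iu_per_l, final_volume_ml):
    """Return (stock_ml, diluent_ml) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2 and treats the diluent as hCG-free, which is an
    approximation (the non-pregnancy urine is reported only as <2 IU/L).
    """
    if not 0 < target_iu_per_l <= stock_iu_per_l:
        raise ValueError("target must be positive and no higher than the stock")
    stock_ml = final_volume_ml * target_iu_per_l / stock_iu_per_l
    return stock_ml, final_volume_ml - stock_ml


# Hypothetical example: a 100 IU/L pregnancy urine diluted into 10 mL aliquots.
for target in (50, 25, 10, 5):
    stock_ml, diluent_ml = dilution_volumes(100, target, 10.0)
    print(f"{target:>3} IU/L: {stock_ml:.2f} mL sample + {diluent_ml:.2f} mL diluent")
```

Because the diluent may itself contain up to 2IU/L hCG, the true concentration of the lowest dilutions could be slightly higher than this calculation suggests.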
All devices were tested with both sets of sample dilutions to determine sensitivity. Results were recorded as negative, very weakly positive, weakly positive or positive. As the very weakly positive results were only faintly visible, it was judged that they could easily be missed in a less well-lit environment with only one operator. No details are available regarding how each manufacturer determines the sensitivity of its kits, so it was decided that the limit of detection (sensitivity) of each kit was the concentration at which more than 50% of results from the above specimens were at least weak positives. This provided a standard approach to sensitivity testing for all kits. Urine samples from six menopausal women were also tested. The hCG concentration in women of reproductive age who are not pregnant is <5IU/L but this rises after the menopause and has been reported to be as high as 10IU/L. Some kits now report sensitivity as low as 10IU/L, so there is a risk that a sample from a woman with premature menopause being screened for pregnancy could test positive. The presence of a ‘hook effect’ was also investigated. Very high concentrations of hCG could swamp the binding sites of both antibodies used in these kits so that little or no hCG is captured in the test area. A test would therefore appear negative despite high concentrations of hCG being present. This phenomenon or ‘hook effect’ was investigated by adding 10,000, 50,000, 100,000 and 500,000IU/L hCG to a urine sample from a woman who was not pregnant. Ease of use was examined by 10 women who were not based in a laboratory.
Full details of the 2006 evaluation, including kits tested and the individual results obtained, may be found in Lamph et al (2006a, 2006b). Table 1 lists the manufacturers, the devices they supply and the type of device. The kits came in three formats: dipstick, midstream dipstick and pallet. Dipstick devices are only suitable for placing into a vessel of collected urine, midstream dipsticks may be placed either into a urine stream or into a vessel of collected urine, while a pallet device requires collection of a urine specimen before putting a few drops of the urine onto the test device. A group of women were asked to state their preference for each type of device. The preferred device was the midstream dipstick as it required no urine collection if placed in midstream and was very simple to use.
The users found pallet devices less intuitive to use and some had difficulty using the droppers to add the correct number of drops to the sample well. Although the dipstick device was easier to use, users were concerned about dipping the device too far into the urine and found reading results was more difficult than when using either the pallet or midstream dipstick devices. These devices also required urine to be collected first. Users thought it useful for a collection vessel to be supplied with the pallet and dipstick devices. Collection devices are provided with the EARLY BIRD, RapidselfTest, Reveal, TRUELINE and Unitest but these are of variable quality. Some users found the collection vessels supplied with the RapidselfTest and Reveal kits too shallow and none of the users liked the collapsible vessel supplied with the Unitest pregnancy cassette test. Although midstream dipsticks were easier and more convenient to use, some users found that with some tests excess urine dripped from the absorbent tip before the tip cover could be replaced and this was perceived to be messy. The Accuclear Compact pregnancy test and Mediply Midstream pregnancy test were the only devices not to have this problem. Leaving the dipstick in the urine stream for 10 seconds as recommended with the ASDA home pregnancy test, QUIK-CHECK midstream and TESCO home pregnancy test was felt to be too long by some users. Reading results was not a problem although some users found the vertical negative line in the Clearblue pregnancy test confusing and would have preferred a clear window for a negative result.
The sensitivity reported for the devices ranged from 10 to 50IU/L hCG. This evaluation agreed with the manufacturers’ stated sensitivity in 14 cases. One device was found to be more sensitive and the remainder less sensitive, usually by one dilution factor. The TRUELINE pregnancy test was two dilutions less sensitive than quoted when read at three minutes but only one dilution less sensitive when read at 10 minutes. Results for the sensitivity testing are given in Table 2. Differences between the result of the evaluation and the manufacturers’ quoted data could be due to a number of reasons:
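The standardised sensitivity rule described above can be expressed as a short sketch. It assumes the limit of detection is taken as the lowest tested concentration at which more than 50% of readings are at least weakly positive; the function name and the example readings are hypothetical and are not study data.

```python
# Grades ordered from weakest to strongest, as recorded in the evaluation.
GRADES = ["negative", "very weakly positive", "weakly positive", "positive"]


def detection_limit(results_by_concentration):
    """Return the lowest concentration (IU/L) at which more than 50% of
    readings are at least weakly positive, or None if none qualifies."""
    threshold = GRADES.index("weakly positive")
    for concentration in sorted(results_by_concentration):
        readings = results_by_concentration[concentration]
        if not readings:
            continue  # no readings at this concentration
        hits = sum(GRADES.index(r) >= threshold for r in readings)
        if hits / len(readings) > 0.5:
            return concentration
    return None


# Hypothetical readings for one device (IU/L hCG -> recorded grades).
example = {
    5: ["negative"] * 7,
    10: ["negative"] * 4 + ["very weakly positive"] * 3,
    25: ["very weakly positive"] * 3 + ["weakly positive"] * 4,
    50: ["weakly positive"] * 2 + ["positive"] * 5,
}
print(detection_limit(example))
```

Very weakly positive readings deliberately do not count towards the threshold, reflecting the concern that a single user in poorer light could miss them.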
- Not all the devices have been calibrated using the same international standard as used in the evaluation;
- Manufacturers may have used very weak positive results as their cut-off value, whereas we used only weak positive results;
- Some devices give a range of times within which the result may be read. Usually the device was more sensitive when read at the longer time but it is not known what time each manufacturer used to determine the sensitivity of its device.
Three QUIK-CHECK devices had a claimed sensitivity of 10IU/L. It is possible that at this sensitivity some urine samples from women going through the menopause may have tested positive. For such urine samples the EARLY BIRD gave two positive results: one weak and the other very weak. Negative results were obtained with all the other devices. In the examination for a hook effect, eight devices had a weak positive result at 500,000IU/L hCG but a clear positive test at ≤100,000IU/L. The devices showing this effect were Accuclear, ASDA, Mediply midstream, both QUIK-CHECK pallets, Reveal, TESCO and TRUELINE. Higher concentrations of hCG would therefore be expected to give an even weaker result with these devices. The TRUELINE device could be read from three minutes to 10 minutes. At the shortest recommended read time a sensitivity of 100IU/L hCG was found compared with the reported sensitivity of 25IU/L. When the device was read using the longest recommended read time a sensitivity of 50IU/L was found.
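The TRUELINE figures can be checked against the spiked concentration series given in the method. The sketch below assumes that a "dilution" means one step in that series; the function name is hypothetical.

```python
# Concentrations of the spiked series described in the method (IU/L hCG).
SERIES = [5, 10, 25, 50, 100, 250, 500]


def dilution_steps(claimed, found):
    """Number of series steps between the claimed and the measured limit.

    A positive value means the measured limit is higher, so the device is
    less sensitive than claimed.
    """
    return SERIES.index(found) - SERIES.index(claimed)


print(dilution_steps(25, 100))  # TRUELINE read at three minutes
print(dilution_steps(25, 50))   # TRUELINE read at 10 minutes
```

On this reading, the device is two steps less sensitive than claimed at three minutes and one step less sensitive at 10 minutes.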
When the OTC pregnancy test kits were last evaluated in the UK for the MHRA (Wheeler, 1999) there were only nine different devices. There are now 30 different devices available, including supermarkets’ own-brands as well as devices from additional companies. Therefore the choice is wider and potentially even more confusing for users than ever before.
This evaluation examined the sensitivity, reliability and ease of use of 27 different pregnancy testing devices from 15 companies. Five companies declined the offer to have their devices included in the evaluation. The reasons were not stated except for Chefaro Ltd, which was redesigning its Predictor device. There are reports, many anecdotal, that false positive and false negative results are obtained occasionally with pregnancy testing devices. Studies looking into this problem have shown that in many cases it is due to the user not carrying out the tests correctly (Bastian et al, 1998). Daviaud et al (1993) sent out 478 positive urine samples to 638 laywomen for testing. False negative results were obtained for 230 of these specimens. Valanis and Perlman (1982) also reported high error rates and Doshi (1986) drew attention to the false positive results that were also obtained. For every 10 urine samples not containing hCG one was reported as positive. It is important that instructions are clear, easy to read and come with clear diagrams. Wheeler (1999) noted that some instructions had very small print while others had instructions that ran over two sides of an instruction sheet in such a way that the full instructions could be missed. In this evaluation instructions were easy to follow for all but the Unitest devices. The Unitest instruction leaflets had small print that was difficult to read and basic instructions were included on the foil test wrapper. The necessary sample application volume for the pallet device differed between the foil wrapper and the printed instructions. As well as false results due to users not following or understanding instructions, there are methodological problems that can lead to false results. These include poor sensitivity of the device, hypersensitivity of the device, the hook effect and cross-reaction with LH (luteinizing hormone).
In the last case all antibodies used in these devices are reported to have very low cross-reaction with LH. It has not been possible to establish how each company determines the sensitivity of its devices. Some may include very weak positives as their limit of detection and may consider the concentration at which only positive tests are seen or a certain percentage of positive results are seen. In addition there are three different international standards used by the companies to calibrate their testing kits. We standardised our sensitivity calculation to provide direct comparison of the devices. It is not surprising that there are differences between our sensitivity figure and that reported by a company but in only one case was there a difference of more than one dilution. This was with the TRUELINE device that could be read from three minutes to 10 minutes. All kits give a time range over which a result may be read. Some kits showed greater sensitivity at the longer time. It was especially confusing when kits gave alternative read times - for example, instructions stated that the DISCOVER Today device should be read between one and two minutes but also stated that the kit should not be read after 10 minutes. CHECKMATE-EZE suggested results could be read up to 15 minutes after the device made contact with the urine sample. We recorded the results at the earliest and the latest recommended times. The sensitivity calculated at the two times did not change for 16 devices but five devices (ASDA, CHECKMATE-EZE, Clearblue DIGITAL, EARLY BIRD and TESCO) were more sensitive at the longer time. We are concerned about providing a time window for devices that are used by untrained people. A negative result may occur at the shortest read time but a weak positive at the longest read time. One can imagine a user being left uncertain as to whether or not she is pregnant.
In addition we found that the EARLY BIRD and Unitest pallet devices gave unreliable results at the shortest recommended time due to background colour obscuring the test window. We recommend that manufacturers provide a single read time. It has been reported that hCG concentrations in women going through the menopause can rise to between 5IU/L and 10IU/L (Borkowski and Muquardt, 1979). Three QUIK-CHECK devices have a reported sensitivity of 10IU/L. It is possible that a device with such a low detection limit (that is, a high sensitivity) could give a false positive result in women going through the menopause. This occurred with the EARLY BIRD device. The hook effect in immunometric sandwich assays, as employed in these devices, causes falsely low results. This is due to both the signal and capture antibodies being swamped with the antigen - in this case hCG. Manufacturers have quoted hCG concentrations of >250,000IU/L at 8-10 weeks. Multiple pregnancies and hCG-secreting tumours can produce even higher concentrations of hCG but this is a rare occurrence. Although most urine samples screened for pregnancy are collected a short time after the last missed period, women will occasionally wait several weeks before testing for pregnancy - when the hCG concentrations will be very high. We tested the reliability of the devices up to 500,000IU/L. Eight devices gave a weak positive at this concentration although a strong positive at 100,000IU/L. This indicated that these devices suffered from some hook effect. At this high concentration all results were seen as positive and therefore indicate pregnancy, but the signal would be even weaker or absent at higher concentrations.
All the devices we tested were capable of detecting pregnancy at the time of the first missed period. Therefore if a woman reads and follows the instructions carefully she should achieve an accurate result at this time. However, it should be noted that previous studies have found that a high proportion of women do not achieve the correct result. This evaluation has demonstrated some of the variability between devices and the areas, especially correct reading times, where a woman may be confused when reading a result. The false positive tests for urine from women going through the menopause with the EARLY BIRD device are of particular concern. Our users found that not all devices were easy to use and this could lead to their making errors. As such, when a woman presents at clinic with a result from a home testing kit, the test should always be repeated by a professional operator with experience of pregnancy testing.
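The shape of the hook effect can be illustrated with a deliberately simplified toy model. It is not the behaviour of any device evaluated here: it assumes every binding step goes to completion, ignores kinetics and flow along the wick, and uses hypothetical antibody capacities chosen only to show how the test-line signal rises and then falls as hCG increases.

```python
def sandwich_signal(hcg, label_antibody, capture_sites):
    """Toy estimate of labelled hCG held at the test line (arbitrary units).

    Assumes complete binding at each step; all quantities share one unit.
    """
    if hcg <= 0:
        return 0.0
    # Once hCG exceeds the signal antibody, only a share of it carries label.
    labelled_fraction = min(1.0, label_antibody / hcg)
    # The test line can hold no more hCG than it has capture sites.
    captured = min(hcg, capture_sites)
    return captured * labelled_fraction


# Hypothetical capacities, not taken from any manufacturer's data.
for hcg in (10, 100, 1_000, 10_000, 100_000, 500_000):
    print(hcg, round(sandwich_signal(hcg, 50_000, 20_000), 1))
```

In this sketch the signal tracks hCG at low concentrations, plateaus once the capture sites are full, and then declines as unlabelled hCG competes for those sites, which is consistent with a weaker line at 500,000IU/L than at 100,000IU/L.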
Bastian, L.A. et al (1998) Diagnostic efficiency of home pregnancy test kits. Archives of Family Medicine; 7: 5, 465-469.
Borkowski, A., Muquardt, C. (1979) Human chorionic gonadotropin in the plasma of normal non-pregnant subjects. New England Journal of Medicine; 301: 6, 298-302.
Daviaud, J. et al (1993) Reliability and feasibility of pregnancy home-use tests: Laboratory validation and diagnostic evaluation by 638 volunteers. Clinical Chemistry; 39: 1, 53-59.
Doshi, M.L. (1986) Accuracy of consumer performed in-home tests for early pregnancy detection. American Journal of Public Health; 76: 5, 512-514.
Lamph, S. et al (2006a) Over the Counter Pregnancy Tests. Report 06051. Procurement and Purchasing Agency, NHS, UK. www.pasa.nhs.uk.
Lamph, S. et al (2006b) Over the Counter Pregnancy Tests: Technical Data Supplement. Report 06051-S. Procurement and Purchasing Agency, NHS, UK. www.pasa.nhs.uk.
National Institute for Biological Standards and Control, UK (2004) Fourth International Standard for Chorionic Gonadotrophins. NIBSC75/589 National Institute for Biological Standards and Control www.nibsc.ac.uk/documents/ifu/75-589.pdf
Siekmeier, R., Lutz, J. (2007) Experience with post-market surveillance of in-vitro diagnostic medical devices for lay use in Germany. Clinical Chemistry and Laboratory Medicine; 45: 3, 396-401.
Valanis, B., Perlman, C.S. (1982) Home pregnancy testing kits: prevalence of use, false-negative rates, and compliance with instructions. American Journal of Public Health; 72: 9, 1034-1036.
Wheeler, M.J. (1999) Home and laboratory pregnancy-testing kits. Professional Nurse; 14: 8, 571-576.