# The Detection of Defective Members of Large Populations

Recently your firm has made a series of costly errors in its group life insurance quotations. Your supervisor wishes to improve quality control. She has discovered a classic article in the field: Dorfman, Robert (1943), ‘The detection of defective members of large populations’, Annals of Mathematical Statistics, 14(3), pp. 436-440. Your supervisor has asked you to read this article, and then write a review of the article. She has specified that the review should be no longer than 1200 words.

During World War Two, in 1943, the economist and mathematical statistician Robert Dorfman, later a professor at Harvard, wrote an article that has become a landmark in statistics. The article, ‘The Detection of Defective Members of Large Populations’ (TDDMLP), appeared in the Annals of Mathematical Statistics, which is published by the Institute of Mathematical Statistics, and it reveals Dorfman’s intricate thinking. In it he sets out an “efficient method for eliminating all defective members of certain types of large populations”.

This idea is closely tied to quality control, which is vital in all types of financial institutions, from banks to insurance companies. Dorfman uses the testing of blood samples as one application of the methodology. Ultimately, TDDMLP shows indirectly that taking shortcuts need not reduce quality, provided that the shortcut is supported by sound analytical theory.

Dorfman’s main objective is to identify the defective individual members of a large population by a less “expensive and tedious process”, and he shows the reader that testing combined samples can be economically beneficial. His motivating example is the large-scale screening carried out by the United States Public Health Service and the Selective Service System, in which blood samples from men inducted into the armed forces were tested to detect which men carried a syphilitic antigen.

In TDDMLP Dorfman proposes, on statistical and probabilistic grounds, that the cost of eliminating defective members can be minimised by “increasing the efficiency of detection”. Pooling the samples into groups reveals the extent of the saving compared with individual testing. Dorfman sets out a methodical and practical procedure to demonstrate his idea: rather than testing each man’s blood sample individually, he first divides the N blood samples into pools of n members each and tests each pool as a whole.

Assuming that the tests are “sufficiently sensitive and specific”, a pool that contains no syphilitic antigen will test negative, which indicates that none of the people in that batch is infected with syphilis. On the other hand, if a syphilitic antigen is found in the pool, then at least one member of that pool is infected, so each member of the pool is retested separately to determine which of them is infected.

This analysis also determines the most efficient pool size, under the assumption that the “population is large enough” for the discrete binomial distribution to be applied. His findings further reveal the savings attainable through group pooling. In the article, Dorfman reports three important findings: as the prevalence rate increases, the cost of pooled testing relative to individual testing increases, while both the optimum number of people in each pool (n) and the amount of savings decrease.
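The reasoning behind these findings can be written out as a short calculation. The notation is an assumption made here for illustration: p denotes the prevalence rate and n the pool size, each sample is taken to be defective independently with probability p, and the test is assumed never to err.

```latex
\begin{align*}
\Pr(\text{a pool of } n \text{ tests negative}) &= (1-p)^{n}, \\
\mathbb{E}[\text{tests per pool}] &= 1 + n\left[1-(1-p)^{n}\right], \\
C(n,p) = \frac{\mathbb{E}[\text{tests per pool}]}{n} &= \frac{1}{n} + 1 - (1-p)^{n}.
\end{align*}
```

Here C(n,p) is the expected number of tests per person, that is, the cost relative to individual testing, so the saving is 1 − C(n,p). For a fixed n, C(n,p) rises as p rises, which is consistent with the first finding.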

One of the key deductions is that the savings attainable increase as the prevalence rate decreases. This can be examined numerically with reference to Table 1 in the article, which shows the “relative testing costs for selected prevalence rates to individual testing”. The table shows that as the prevalence rate increases among the members, the savings that can be made from pooling diminish; when the prevalence of defectives is low, “it is likely that a new pool formed from the untested samples will prove to be negative”.

So if the pooled blood sample tests negative, the testing for that pool is finished; otherwise the members are tested individually until a defective is detected. By following “this procedure until a negative pool is found”, the savings attainable would increase by an average of 5.5% with each one-percentage-point decrease in the prevalence rate. Dorfman’s findings also reveal that the maximum saving attainable is 80%, at a prevalence rate of 1%, whereas at a much higher prevalence rate of 30% the saving is only 1%. Hence the extent of savings attainable increases as the prevalence rate decreases.

In addition, Dorfman’s comparison of group testing with individual testing shows that the relative testing cost increases as the prevalence rate increases. The economic benefit that can be obtained therefore depends on both the pool size and the prevalence rate. Dorfman shows the optimal pool size (i.e. the value of n) for different prevalence rates both diagrammatically and numerically. Figure 1 in the article displays the “shape of the relative cost for prevalence rates” ranging from 1% to 15%.

Looking at the minimum points of the curves, “the optimum group size for a population with a known prevalence rate is the integral value of n” which “has the lowest corresponding value on the relative cost curve for that prevalence rate”. Dorfman found that the optimum group size is 11 people at the lowest prevalence rate considered, 1%, and 3 people at the highest, 30%. This shows that, at these prevalence rates, it is more economical to detect defectives by group pooling than by testing individually.
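The optimum group sizes quoted above can be checked with a small Python sketch. It assumes the standard expected-cost expression for two-stage pooling with a perfect test, 1/n + 1 − (1 − p)^n tests per person, where p is the prevalence rate; the function names and the search limit `max_n` are hypothetical choices made for this illustration.

```python
from typing import Tuple


def relative_cost(p: float, n: int) -> float:
    """Expected tests per person with pools of size n, relative to individual testing."""
    return 1.0 / n + 1.0 - (1.0 - p) ** n


def optimum_group_size(p: float, max_n: int = 50) -> Tuple[int, float]:
    """Integer pool size (2..max_n) with the lowest relative cost, and that cost."""
    best_n = min(range(2, max_n + 1), key=lambda n: relative_cost(p, n))
    return best_n, relative_cost(p, best_n)


for p in (0.01, 0.30):
    n, cost = optimum_group_size(p)
    print(f"prevalence {p:.0%}: n = {n}, relative cost {cost:.0%}, saving {1 - cost:.0%}")
# prevalence 1%: n = 11, relative cost 20%, saving 80%
# prevalence 30%: n = 3, relative cost 99%, saving 1%
```

Under these assumptions the sketch reproduces the figures cited in this review: pools of 11 with an 80% saving at 1% prevalence, and pools of 3 with a 1% saving at 30% prevalence.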

Although TDDMLP gives great insight into an “efficient method for eliminating all defective members of certain types of large populations” by finding group pooling more economical and less time-consuming than individual testing, Dorfman fails to take into account any “technical failure or operators’ error”. Given the possible impurity or imperfection of the samples, taking the shortcut of combined testing may lead to incorrect findings.

Yet Dorfman’s procedure follows a logical and coherent order, so the likelihood of faults should be low. This relates to quality control in firms, where improving quality control carries an opportunity cost that must be weighed against raising revenue.

Robert Dorfman’s notable article ‘The Detection of Defective Members of Large Populations’ is renowned for its statistical findings on the economic benefits of group pooling compared with individual testing in detecting defectives in a large population.

Dorfman connects the prevalence rate of syphilis to the pool size and the extent of savings attainable. His results show that as the prevalence rate increases, the cost of pooled testing relative to individual testing increases, while the number of people in each pool (n) and the amount of savings decrease. Hence the relative cost and the amount of savings achievable have an inverse relationship. This shows that quality control can at times become an afterthought when costs need to be reduced.

Bibliography

1. Dorfman, Robert (1943), ‘The Detection of Defective Members of Large Populations’, Annals of Mathematical Statistics, 14(3), pp. 436-440.
2. Sterrett, Andrew (1957), ‘On the Detection of Defective Members of Large Populations’, Annals of Mathematical Statistics, 28, pp. 1033.
3. Theobald, C., and A. Davie (February 9, 2007), ‘Group Testing, the Pooled Hypergeometric Distribution and Estimating Numbers of Defectives in Small Populations’, pp. 2-4.