By Thomas N. Herzog

ISBN-10: 0387695028

ISBN-13: 9780387695020

This book helps practitioners gain a deeper understanding, at an applied level, of the issues involved in improving data quality through editing, imputation, and record linkage. The first part of the book deals with methods and models. Here, we focus on the Fellegi-Holt edit-imputation model, the Little-Rubin multiple-imputation scheme, and the Fellegi-Sunter record linkage model. Brief examples are included to show how these techniques work.

In the second part of the book, the authors present real-world case studies in which one or more of these techniques are used. They cover a wide variety of application areas. These include mortgage guarantee insurance, medical, biomedical, highway safety, and social insurance, as well as the construction of list frames and administrative lists.

Readers will find this book a mixture of practical advice, mathematical rigor, management insight, and philosophy. The long list of references at the end of the book enables readers to delve more deeply into the subjects discussed here. The authors also discuss the software that has been developed to apply the techniques described in our text.

Example text

The bounds L and U can be established separately for each industry of interest. Within an industry, the bounds may be established by examining targeted subsets of companies such as the largest and smallest ones because the larger companies may have different characteristics (in terms of edits) than the smaller ones. For ongoing surveys, the bounds can also be established using survey data from the current time period.

4. Zero Control Test

Another relationship test, the zero control test, using several data elements is sometimes useful for control purposes.

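Definition: Let $\{A_i, i \in I\}$, where $I$ is an arbitrary index set, possibly infinite, be an arbitrary collection of events. The collection of events $\{A_i, i \in I\}$ is said to be independent if, for each finite set of distinct indices $i_1, \ldots, i_k \in I$, we have

$$P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1})\,P(A_{i_2}) \cdots P(A_{i_k}).$$

(This section is based heavily on pages 26–27 of Ash [1970].)

Example: Let two fair dice be tossed. Let each possible outcome have (an equal) probability of occurrence of $1/36$. Let

$$A = \{\text{first die} = 1, 2, \text{ or } 3\},$$
$$B = \{\text{first die} = 3, 4, \text{ or } 5\},$$
$$C = \{\text{the sum of the two faces is } 9\} = \{(3,6), (4,5), (5,4), (6,3)\}.$$

Hence,

$$A \cap B = \{(3,1), (3,2), (3,3), (3,4), (3,5), (3,6)\},$$
$$A \cap C = \{(3,6)\},$$
$$B \cap C = \{(3,6), (4,5), (5,4)\},$$
$$A \cap B \cap C = \{(3,6)\}.$$

So, it follows that

$$P(A \cap B) = \tfrac{1}{6} \neq P(A)\,P(B) = \tfrac{1}{2} \cdot \tfrac{1}{2} = \tfrac{1}{4},$$
$$P(A \cap C) = \tfrac{1}{36} \neq P(A)\,P(C) = \tfrac{1}{2} \cdot \tfrac{4}{36} = \tfrac{1}{18},$$
$$P(B \cap C) = \tfrac{1}{12} \neq P(B)\,P(C) = \tfrac{1}{2} \cdot \tfrac{4}{36} = \tfrac{1}{18}.$$

Consequently, even though

$$P(A \cap B \cap C) = \tfrac{1}{36} = P(A)\,P(B)\,P(C) = \tfrac{1}{2} \cdot \tfrac{1}{2} \cdot \tfrac{1}{9},$$

the events $\{A, B, C\}$ are not independent.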
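The two-dice example can be checked by brute-force enumeration of the 36 equally likely outcomes. The Python sketch below is only an illustration and is not taken from the book: the names `outcomes`, `prob`, `triple_ok`, and `pairwise_ok` are hypothetical. It uses exact fractions so that the equality tests are not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes (first die, second die), equally likely.
outcomes = set(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event (a set of outcomes) under equal weighting."""
    return Fraction(len(event), len(outcomes))

# The three events from the example.
A = {o for o in outcomes if o[0] in (1, 2, 3)}   # first die is 1, 2, or 3
B = {o for o in outcomes if o[0] in (3, 4, 5)}   # first die is 3, 4, or 5
C = {o for o in outcomes if sum(o) == 9}         # the two faces sum to 9

# The product rule holds for the triple intersection...
triple_ok = prob(A & B & C) == prob(A) * prob(B) * prob(C)

# ...but independence requires it for every finite subcollection, including pairs.
pairwise_ok = {
    "A,B": prob(A & B) == prob(A) * prob(B),
    "A,C": prob(A & C) == prob(A) * prob(C),
    "B,C": prob(B & C) == prob(B) * prob(C),
}

print("triple product rule holds:", triple_ok)
for pair, ok in pairwise_ok.items():
    print(f"product rule holds for {pair}:", ok)
```

The check mirrors the definition: a single factorization of the triple intersection is not enough, because the product rule must hold for each finite set of distinct indices.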

If most of the information in all of the components of the database is correct, then the company can effectively combine and use the information. But there are some instances in which quality can deteriorate. For example, the mail-order portion of the mailing has a listing of “Susan K. ” This listing was obtained from a mailing list purchased from another company. ” She is listed at a current address of “678 Maple Ave” because she has recently moved. In such situations, customers may be listed multiple times on the company’s customer list.

