Automatic error localisation for categorical, continuous and integer data.

English title Automatic error localisation for categorical, continuous and integer data.
Spanish title Localización automática de errores para datos categóricos, continuos y enteros.
Author(s) Waal, Ton de
Organization Statistics Netherlands, Voorburg, Netherlands
Journal (ISSN) 1696-2281
Publication 2005, 29 (1): 57-100, 49 refs.
Document type article
Language English
English abstract Data collected by statistical offices generally contain errors, which have to be corrected before reliable data can be published. This correction process is referred to as statistical data editing. At statistical offices, certain rules, so-called edits, are often used during the editing process to determine whether a record is consistent or not. Inconsistent records are considered to contain errors, while consistent records are considered error-free. In this article we focus on automatic error localisation based on the Fellegi-Holt paradigm, which says that the data should be made to satisfy all edits by changing the fewest possible fields. Adoption of this paradigm leads to a mathematical optimisation problem. We propose an algorithm to solve this optimisation problem for a mix of categorical, continuous and integer-valued data. We also propose a heuristic procedure based on the exact algorithm. For five realistic data sets involving only integer-valued variables we evaluate the performance of this heuristic procedure.
UNESCO classification 120900 ; 120707
Keywords Statistical data ; Error correction ; Mathematical programming ; Integer programming ; Optimization ; Heuristics ; Categorical data
MathReviews code MR2160537
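
The Fellegi-Holt paradigm mentioned in the abstract (make a record satisfy all edits by changing as few fields as possible) can be illustrated with a minimal brute-force sketch. This is not the algorithm proposed in the article, which is not reproduced in this record; it is an exhaustive search that assumes every field has a small finite domain. The function name `localise_errors`, the example fields, and the example edits are all hypothetical.

```python
from itertools import combinations, product


def localise_errors(record, domains, edits):
    """Brute-force illustration of the Fellegi-Holt principle (hypothetical helper).

    record  : dict mapping field name -> observed value
    domains : dict mapping field name -> finite iterable of admissible values
    edits   : list of predicates; each takes a record dict and returns True
              when the record satisfies that edit

    Returns (changed_fields, repaired_record) for a smallest set of fields
    whose values can be changed so that all edits hold, or None if no
    assignment within the given domains satisfies the edits.
    """
    fields = sorted(record)

    def consistent(candidate):
        return all(edit(candidate) for edit in edits)

    # Try sets of fields in order of increasing size, so the first
    # consistent repair found changes the fewest fields.
    for k in range(len(fields) + 1):
        for subset in combinations(fields, k):
            # Enumerate every combination of admissible values for the subset.
            for values in product(*(domains[f] for f in subset)):
                candidate = dict(record)
                candidate.update(zip(subset, values))
                if consistent(candidate):
                    return set(subset), candidate
    return None


# Assumed toy data: one categorical and two integer-valued fields.
record = {"age": 12, "marital_status": "married", "n_children": 0}
domains = {
    "age": range(0, 120),
    "marital_status": ["single", "married", "widowed"],
    "n_children": range(0, 15),
}
edits = [
    # Assumed edit: persons under 16 cannot be married.
    lambda r: not (r["age"] < 16 and r["marital_status"] == "married"),
    # Assumed edit: the number of children cannot be negative.
    lambda r: r["n_children"] >= 0,
]

changed, repaired = localise_errors(record, domains, edits)
```

In this toy record a single field has to change, but the minimal solution is not unique: altering either `age` or `marital_status` restores consistency, and the sketch simply returns the first one it encounters. The search is exponential in the number of fields, so it only serves to make the optimisation problem concrete.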