Espetvedt MN, Reksen O, Rintakoski S, Østerås O. Data quality in the Norwegian dairy herd recording system: agreement between the national database and disease recording on farm.
J Dairy Sci 2013;96:2271-2282. [PMID: 23462169 DOI: 10.3168/jds.2012-6143]
[Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 09/10/2012] [Accepted: 01/02/2013] [Indexed: 11/19/2022]
Abstract
The majority of herds in Norway participate in the national dairy herd recording system. For disease events, this involves transferring information registered on farm, using individual cow health cards (CHC), to the central cattle database (CCD). Before data from such a database are used, they should be validated to describe their quality, but this is rarely done. In this study, diagnostic events from CHC and CCD from 74 dairy herds were compared. Events in 2008 from female cattle with a minimum age of 1 yr were included (n=1,738). Discrepancies between the 2 data sources were assessed, and data quality was evaluated, by measuring agreement between events on CHC and in CCD, calculating completeness and correctness for the CCD, and fitting a multivariable regression model for agreement (1/0). The agreement evaluation described the concordance between the 2 data sources, whereas the calculations of completeness and correctness depended on a reference data source assumed to be more reliable. Completeness of the CCD was defined as the proportion of diagnostic events on the CHC that was recorded in the CCD. Correctness was defined as the proportion of the CCD events that was also recorded on the CHC with the same date and diagnostic code. The agreement was up to 87.5%, with the majority of disagreement caused by events on the CHC that were not reported to the CCD (between 10 and 12% of all events). Completeness of the CCD was regarded as high, between 0.87 and 0.88, and correctness as excellent, between 0.97 and 0.98. The multivariable regression model found 4 factors that increased the odds of diagnostic events being in agreement between CHC and CCD: the event occurring during the 305-d lactation period; the herd size being 75 cows or fewer; the event occurring during the spring, summer, or winter rather than autumn; and the diagnostic code for the disease event being preprinted on the CHC, requiring a simple check mark as opposed to writing a 3-digit code.
The model found a high degree of clustering within herd. In conclusion, disease data in the Norwegian national database for dairy cows are valid for use in epidemiologic research, with particularly excellent correctness, but it is of concern that at least 10% of data are missing. The proportion of unreported data should be taken into consideration whenever data from this database are used. Anyone working to improve data transfer from farm to central databases should be aware of the reasons for the discrepancies found.
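The definitions of completeness and correctness given in the abstract can be illustrated with a minimal sketch. This is not the authors' method or code. It assumes that an event is a (cow ID, date, diagnostic code) triple, that events match only when all three fields are identical, and that agreement is the share of matched events among all events found in either source; the abstract does not state the denominator used for agreement. The function name `quality_metrics`, the cow IDs, and the diagnostic codes are hypothetical.

```python
from datetime import date


def quality_metrics(chc_events, ccd_events):
    """Compare farm records (CHC) with database records (CCD).

    Each event is a (cow_id, event_date, diagnostic_code) tuple.
    Exact matching on all three fields is an assumption of this sketch.
    """
    chc, ccd = set(chc_events), set(ccd_events)
    if not chc or not ccd:
        raise ValueError("both sources must contain at least one event")
    matched = chc & ccd      # events present in both sources
    either = chc | ccd       # events present in at least one source
    return {
        # proportion of CHC events that were recorded in the CCD
        "completeness": len(matched) / len(chc),
        # proportion of CCD events also on the CHC (same date and code)
        "correctness": len(matched) / len(ccd),
        # assumed definition: matched events over all distinct events
        "agreement": len(matched) / len(either),
    }


# Hypothetical records for illustration only
chc = [
    (101, date(2008, 3, 4), "303"),
    (102, date(2008, 5, 17), "304"),
    (103, date(2008, 10, 2), "311"),
    (104, date(2008, 11, 20), "303"),  # on the card, never reported
]
ccd = [
    (101, date(2008, 3, 4), "303"),
    (102, date(2008, 5, 17), "304"),
    (103, date(2008, 10, 2), "311"),
    (105, date(2008, 6, 1), "320"),    # in the database, not on the card
]

metrics = quality_metrics(chc, ccd)
print(metrics)
```

With these made-up records, 3 of the 4 CHC events appear in the CCD and 3 of the 4 CCD events appear on the CHC, so completeness and correctness are both 0.75, and agreement under the assumed definition is 3 of 5 distinct events.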