author_facet Khare, Ritu
Utidjian, Levon
Ruth, Byron J
Kahn, Michael G
Burrows, Evanette
Marsolo, Keith
Patibandla, Nandan
Razzaghi, Hanieh
Colvin, Ryan
Ranade, Daksha
Kitzmiller, Melody
Eckrich, Daniel
Bailey, L Charles
author Khare, Ritu
Utidjian, Levon
Ruth, Byron J
Kahn, Michael G
Burrows, Evanette
Marsolo, Keith
Patibandla, Nandan
Razzaghi, Hanieh
Colvin, Ryan
Ranade, Daksha
Kitzmiller, Melody
Eckrich, Daniel
Bailey, L Charles
spellingShingle Khare, Ritu
Utidjian, Levon
Ruth, Byron J
Kahn, Michael G
Burrows, Evanette
Marsolo, Keith
Patibandla, Nandan
Razzaghi, Hanieh
Colvin, Ryan
Ranade, Daksha
Kitzmiller, Melody
Eckrich, Daniel
Bailey, L Charles
Journal of the American Medical Informatics Association
A longitudinal analysis of data quality in a large pediatric data research network
Health Informatics
author_sort khare, ritu
spelling Khare, Ritu Utidjian, Levon Ruth, Byron J Kahn, Michael G Burrows, Evanette Marsolo, Keith Patibandla, Nandan Razzaghi, Hanieh Colvin, Ryan Ranade, Daksha Kitzmiller, Melody Eckrich, Daniel Bailey, L Charles 1067-5027 1527-974X Oxford University Press (OUP) Health Informatics http://dx.doi.org/10.1093/jamia/ocx033 Abstract. Objective: PEDSnet is a clinical data research network (CDRN) that aggregates electronic health record data from multiple children’s hospitals to enable large-scale research. Assessing data quality to ensure suitability for conducting research is a key requirement in PEDSnet. This study presents a range of data quality issues identified over a period of 18 months and interprets them to evaluate the research capacity of PEDSnet. Materials and Methods: Results were generated by a semiautomated data quality assessment workflow. Two investigators reviewed programmatic data quality issues and conducted discussions with the data partners’ extract-transform-load analysts to determine the cause for each issue. Results: The results include a longitudinal summary of 2182 data quality issues identified across 9 data submission cycles. The metadata from the most recent cycle includes annotations for 850 issues: most frequent types, including missing data (>300) and outliers (>100); most complex domains, including medications (>160) and lab measurements (>140); and primary causes, including source data characteristics (83%) and extract-transform-load errors (9%). Discussion: The longitudinal findings demonstrate the network’s evolution from identifying difficulties with aligning the data to a common data model to learning norms in clinical pediatrics and determining research capability. Conclusion: While data quality is recognized as a critical aspect in establishing and utilizing a CDRN, the findings from data quality assessments are largely unpublished. This paper presents a real-world account of studying and interpreting data quality findings in a pediatric CDRN, and the lessons learned could be used by other CDRNs. A longitudinal analysis of data quality in a large pediatric data research network Journal of the American Medical Informatics Association
doi_str_mv 10.1093/jamia/ocx033
facet_avail Online
Free
finc_class_facet Medizin
Informatik
format ElectronicArticle
fullrecord blob:ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTA5My9qYW1pYS9vY3gwMzM
id ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTA5My9qYW1pYS9vY3gwMzM
institution DE-105
DE-14
DE-Ch1
DE-L229
DE-D275
DE-Bn3
DE-Brt1
DE-D161
DE-Zwi2
DE-Gla1
DE-Zi4
DE-15
DE-Pl11
DE-Rs1
imprint Oxford University Press (OUP), 2017
imprint_str_mv Oxford University Press (OUP), 2017
issn 1067-5027
1527-974X
issn_str_mv 1067-5027
1527-974X
language English
mega_collection Oxford University Press (OUP) (CrossRef)
match_str khare2017alongitudinalanalysisofdataqualityinalargepediatricdataresearchnetwork
publishDateSort 2017
publisher Oxford University Press (OUP)
recordtype ai
record_format ai
series Journal of the American Medical Informatics Association
source_id 49
title A longitudinal analysis of data quality in a large pediatric data research network
title_unstemmed A longitudinal analysis of data quality in a large pediatric data research network
title_full A longitudinal analysis of data quality in a large pediatric data research network
title_fullStr A longitudinal analysis of data quality in a large pediatric data research network
title_full_unstemmed A longitudinal analysis of data quality in a large pediatric data research network
title_short A longitudinal analysis of data quality in a large pediatric data research network
title_sort a longitudinal analysis of data quality in a large pediatric data research network
topic Health Informatics
url http://dx.doi.org/10.1093/jamia/ocx033
publishDate 2017
physical 1072-1079
description Abstract. Objective: PEDSnet is a clinical data research network (CDRN) that aggregates electronic health record data from multiple children’s hospitals to enable large-scale research. Assessing data quality to ensure suitability for conducting research is a key requirement in PEDSnet. This study presents a range of data quality issues identified over a period of 18 months and interprets them to evaluate the research capacity of PEDSnet. Materials and Methods: Results were generated by a semiautomated data quality assessment workflow. Two investigators reviewed programmatic data quality issues and conducted discussions with the data partners’ extract-transform-load analysts to determine the cause for each issue. Results: The results include a longitudinal summary of 2182 data quality issues identified across 9 data submission cycles. The metadata from the most recent cycle includes annotations for 850 issues: most frequent types, including missing data (>300) and outliers (>100); most complex domains, including medications (>160) and lab measurements (>140); and primary causes, including source data characteristics (83%) and extract-transform-load errors (9%). Discussion: The longitudinal findings demonstrate the network’s evolution from identifying difficulties with aligning the data to a common data model to learning norms in clinical pediatrics and determining research capability. Conclusion: While data quality is recognized as a critical aspect in establishing and utilizing a CDRN, the findings from data quality assessments are largely unpublished. This paper presents a real-world account of studying and interpreting data quality findings in a pediatric CDRN, and the lessons learned could be used by other CDRNs.
container_issue 6
container_start_page 1072
container_title Journal of the American Medical Informatics Association
container_volume 24
format_de105 Article, E-Article
format_de14 Article, E-Article
format_de15 Article, E-Article
format_de520 Article, E-Article
format_de540 Article, E-Article
format_dech1 Article, E-Article
format_ded117 Article, E-Article
format_degla1 E-Article
format_del152 Buch
format_del189 Article, E-Article
format_dezi4 Article
format_dezwi2 Article, E-Article
format_finc Article, E-Article
format_nrw Article, E-Article
_version_ 1792338817427963909
geogr_code not assigned
last_indexed 2024-03-01T15:37:47.167Z
geogr_code_person not assigned
openURL url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fvufind.svn.sourceforge.net%3Agenerator&rft.title=A+longitudinal+analysis+of+data+quality+in+a+large+pediatric+data+research+network&rft.date=2017-11-01&genre=article&issn=1527-974X&volume=24&issue=6&spage=1072&epage=1079&pages=1072-1079&jtitle=Journal+of+the+American+Medical+Informatics+Association&atitle=A+longitudinal+analysis+of+data+quality+in+a+large+pediatric+data+research+network&aulast=Bailey&aufirst=L+Charles&rft_id=info%3Adoi%2F10.1093%2Fjamia%2Focx033&rft.language%5B0%5D=eng
SOLR
_version_ 1792338817427963909
author Khare, Ritu, Utidjian, Levon, Ruth, Byron J, Kahn, Michael G, Burrows, Evanette, Marsolo, Keith, Patibandla, Nandan, Razzaghi, Hanieh, Colvin, Ryan, Ranade, Daksha, Kitzmiller, Melody, Eckrich, Daniel, Bailey, L Charles
author_facet Khare, Ritu, Utidjian, Levon, Ruth, Byron J, Kahn, Michael G, Burrows, Evanette, Marsolo, Keith, Patibandla, Nandan, Razzaghi, Hanieh, Colvin, Ryan, Ranade, Daksha, Kitzmiller, Melody, Eckrich, Daniel, Bailey, L Charles
author_sort khare, ritu
container_issue 6
container_start_page 1072
container_title Journal of the American Medical Informatics Association
container_volume 24
description Abstract. Objective: PEDSnet is a clinical data research network (CDRN) that aggregates electronic health record data from multiple children’s hospitals to enable large-scale research. Assessing data quality to ensure suitability for conducting research is a key requirement in PEDSnet. This study presents a range of data quality issues identified over a period of 18 months and interprets them to evaluate the research capacity of PEDSnet. Materials and Methods: Results were generated by a semiautomated data quality assessment workflow. Two investigators reviewed programmatic data quality issues and conducted discussions with the data partners’ extract-transform-load analysts to determine the cause for each issue. Results: The results include a longitudinal summary of 2182 data quality issues identified across 9 data submission cycles. The metadata from the most recent cycle includes annotations for 850 issues: most frequent types, including missing data (>300) and outliers (>100); most complex domains, including medications (>160) and lab measurements (>140); and primary causes, including source data characteristics (83%) and extract-transform-load errors (9%). Discussion: The longitudinal findings demonstrate the network’s evolution from identifying difficulties with aligning the data to a common data model to learning norms in clinical pediatrics and determining research capability. Conclusion: While data quality is recognized as a critical aspect in establishing and utilizing a CDRN, the findings from data quality assessments are largely unpublished. This paper presents a real-world account of studying and interpreting data quality findings in a pediatric CDRN, and the lessons learned could be used by other CDRNs.
doi_str_mv 10.1093/jamia/ocx033
facet_avail Online, Free
finc_class_facet Medizin, Informatik
format ElectronicArticle
format_de105 Article, E-Article
format_de14 Article, E-Article
format_de15 Article, E-Article
format_de520 Article, E-Article
format_de540 Article, E-Article
format_dech1 Article, E-Article
format_ded117 Article, E-Article
format_degla1 E-Article
format_del152 Buch
format_del189 Article, E-Article
format_dezi4 Article
format_dezwi2 Article, E-Article
format_finc Article, E-Article
format_nrw Article, E-Article
geogr_code not assigned
geogr_code_person not assigned
id ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTA5My9qYW1pYS9vY3gwMzM
imprint Oxford University Press (OUP), 2017
imprint_str_mv Oxford University Press (OUP), 2017
institution DE-105, DE-14, DE-Ch1, DE-L229, DE-D275, DE-Bn3, DE-Brt1, DE-D161, DE-Zwi2, DE-Gla1, DE-Zi4, DE-15, DE-Pl11, DE-Rs1
issn 1067-5027, 1527-974X
issn_str_mv 1067-5027, 1527-974X
language English
last_indexed 2024-03-01T15:37:47.167Z
match_str khare2017alongitudinalanalysisofdataqualityinalargepediatricdataresearchnetwork
mega_collection Oxford University Press (OUP) (CrossRef)
physical 1072-1079
publishDate 2017
publishDateSort 2017
publisher Oxford University Press (OUP)
record_format ai
recordtype ai
series Journal of the American Medical Informatics Association
source_id 49
spelling Khare, Ritu Utidjian, Levon Ruth, Byron J Kahn, Michael G Burrows, Evanette Marsolo, Keith Patibandla, Nandan Razzaghi, Hanieh Colvin, Ryan Ranade, Daksha Kitzmiller, Melody Eckrich, Daniel Bailey, L Charles 1067-5027 1527-974X Oxford University Press (OUP) Health Informatics http://dx.doi.org/10.1093/jamia/ocx033 Abstract. Objective: PEDSnet is a clinical data research network (CDRN) that aggregates electronic health record data from multiple children’s hospitals to enable large-scale research. Assessing data quality to ensure suitability for conducting research is a key requirement in PEDSnet. This study presents a range of data quality issues identified over a period of 18 months and interprets them to evaluate the research capacity of PEDSnet. Materials and Methods: Results were generated by a semiautomated data quality assessment workflow. Two investigators reviewed programmatic data quality issues and conducted discussions with the data partners’ extract-transform-load analysts to determine the cause for each issue. Results: The results include a longitudinal summary of 2182 data quality issues identified across 9 data submission cycles. The metadata from the most recent cycle includes annotations for 850 issues: most frequent types, including missing data (>300) and outliers (>100); most complex domains, including medications (>160) and lab measurements (>140); and primary causes, including source data characteristics (83%) and extract-transform-load errors (9%). Discussion: The longitudinal findings demonstrate the network’s evolution from identifying difficulties with aligning the data to a common data model to learning norms in clinical pediatrics and determining research capability. Conclusion: While data quality is recognized as a critical aspect in establishing and utilizing a CDRN, the findings from data quality assessments are largely unpublished. This paper presents a real-world account of studying and interpreting data quality findings in a pediatric CDRN, and the lessons learned could be used by other CDRNs. A longitudinal analysis of data quality in a large pediatric data research network Journal of the American Medical Informatics Association
spellingShingle Khare, Ritu, Utidjian, Levon, Ruth, Byron J, Kahn, Michael G, Burrows, Evanette, Marsolo, Keith, Patibandla, Nandan, Razzaghi, Hanieh, Colvin, Ryan, Ranade, Daksha, Kitzmiller, Melody, Eckrich, Daniel, Bailey, L Charles, Journal of the American Medical Informatics Association, A longitudinal analysis of data quality in a large pediatric data research network, Health Informatics
title A longitudinal analysis of data quality in a large pediatric data research network
title_full A longitudinal analysis of data quality in a large pediatric data research network
title_fullStr A longitudinal analysis of data quality in a large pediatric data research network
title_full_unstemmed A longitudinal analysis of data quality in a large pediatric data research network
title_short A longitudinal analysis of data quality in a large pediatric data research network
title_sort a longitudinal analysis of data quality in a large pediatric data research network
title_unstemmed A longitudinal analysis of data quality in a large pediatric data research network
topic Health Informatics
url http://dx.doi.org/10.1093/jamia/ocx033