dc.contributor.author: MCLACHLAN, S [en_US]
dc.contributor.author: Dube, K [en_US]
dc.contributor.author: Gallagher, T [en_US]
dc.contributor.author: DALEY, B [en_US]
dc.contributor.author: Walonoski, J [en_US]
dc.contributor.author: 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018) [en_US]
dc.date.accessioned: 2018-06-28T09:38:11Z
dc.date.available: 2017-11-14 [en_US]
dc.date.issued: 2018-01-27 [en_US]
dc.date.submitted: 2018-06-18T12:47:33.154Z
dc.identifier.uri: http://qmro.qmul.ac.uk/xmlui/handle/123456789/40823
dc.description.abstract: Realistic synthetic data are increasingly being recognized as solutions to lack of data or privacy concerns in healthcare and other domains, yet little effort has been expended in establishing a generic framework for characterizing, achieving and validating realism in Synthetic Data Generation (SDG). The objectives of this paper are to: (1) present a characterization of the concept of realism as it applies to synthetic data; and (2) present and demonstrate application of the generic ATEN Framework for achieving and validating realism for SDG. The characterization of realism is developed through insights obtained from analysis of the literature on SDG. The generic methods for achieving and validating realism for synthetic data were developed using knowledge discovery in databases (KDD) and data mining enhanced with concept analysis and identification of characteristic and classification rules. Application of this framework is demonstrated by using the synthetic Electronic Healthcare Record (EHR) for the domain of midwifery. The knowledge discovery process improves and expedites the generation process; having a more complex and complete understanding of the knowledge required to create the synthetic data significantly reduces the number of generation iterations. The validation process shows similar efficiencies through using the knowledge discovered as the elements for assessing the generated synthetic data. Successful validation supports claims of success and resolves whether the synthetic data is a sufficient replacement for real data. The ATEN Framework supports the researcher in identifying the knowledge elements that need to be synthesized, as well as supporting claims of sufficient realism through the use of that knowledge in a structured approach to validation. When used for SDG, the ATEN Framework enables a complete analysis of source data for knowledge necessary for correct generation. The ATEN Framework assures the researcher that the synthetic data being created is realistic enough for the replacement of real data for a given use-case. [en_US]
dc.subject: synthetic data [en_US]
dc.subject: synthetic health record [en_US]
dc.subject: knowledge discovery [en_US]
dc.subject: data mining [en_US]
dc.subject: electronic health records [en_US]
dc.title: The ATEN Framework for Creating the Realistic Synthetic Electronic Health Record [en_US]
dc.type: Conference Proceeding
dc.rights.holder: Copyright © 2018 by SCITEPRESS
dc.identifier.doi: 10.5220/0006677602200230 [en_US]
pubs.notes: Not known [en_US]
pubs.publication-status: Published [en_US]
dcterms.dateAccepted: 2017-11-14 [en_US]
qmul.funder: PAMBAYESIAN: PAtient Managed decision-support using Bayesian networks::Engineering and Physical Sciences Research Council [en_US]

