Data Lake benefits pediatrics research
Data in the HUS Data Lake can be leveraged for pediatric medical research, for leadership support and for international cooperation.
The HUS Data Lake is a service where data from patient registers and certain administrative registers are organized so that they can be efficiently and securely leveraged. The majority of the patient information systems in clinical use at HUS are integrated into the Data Lake.
Funding from the New Children’s Hospital Foundation has allowed the hiring of a four-person Data Lake team for three years at Children and Adolescents. This team responds to requests for information from the Data Lake submitted by pediatrics researchers and managers.
One of the purposes of the New Children’s Hospital Foundation is to promote and support pediatrics research. Data analytics development is the key to improving operations and management, to promoting research and to increasing the international attractiveness of the New Children’s Hospital.
“There is a lot of unused information in the Data Lake that is now being processed for research purposes. With a dedicated team, we will be better able to draw on existing information in pediatrics research and in knowledge-based management,” says Katariina Gehrmann, Head of Digital and Innovation Services.
From neonatal research to coronavirus research
The Data Lake team performs the register data searches that researchers request and helps scientists and managers discover the potential of the information in the Data Lake. The team also participates in EU projects.
Information has been retrieved from the Data Lake, for instance, for the Babyscreen screening and follow-up study, which HUS and the University of Helsinki are conducting to identify children with a genetic predisposition to type 1 diabetes or celiac disease.
“We can produce long-term monitoring data for researchers. An example of this would be reviewing the cases of children who have undergone ventriculoperitoneal shunt surgery in the past ten years. Research data retrieved from the Data Lake were similarly used in a study on the etiology of severe short and tall stature. There is also a study in the pipeline about Covid-19 in children,” says Information Analyst Mira Kuusinen, describing the information requests received by the team.
“How fast we can retrieve the data depends on the scope of the study and the content of the information request. In a typical case, it takes about two months before the data are available to the researcher. The Data Lake contains enormous masses of data,” explains Kuusinen.
Big data boosts competitiveness
Leveraging ‘big data’, i.e. the huge volume of data in the Data Lake, is more important than ever in terms of competitiveness as well. Big data enables development of national and international research cooperation with partners and allows retention and analysis of the organization’s own data.
“The information that can be gained from the Data Lake offers potential for increasing pediatric drug trials. Children have rare diseases that are difficult to diagnose, and big data and international cooperation open up new opportunities in this field. The Data Lake project is like a magnet for our partner hospitals and for EU projects,” says Gehrmann.
Text: Jonna Suometsä
Photo: Ville Männikkö