UK Biobank has secured funding to build a data analysis platform in the Amazon Web Services (AWS) cloud that it hopes will make it easier for researchers across the world to access the health data it holds.
In operation since 2006, UK Biobank is a long-term study focused on establishing how genetics and environmental factors contribute to the onset of human diseases, with its insights informed by health data collected from more than 500,000 UK volunteers.
Over the next five years, the project estimates that the amount of data stored within its database will grow to 15PB, which is likely to make accessing the information more difficult for the researchers who rely on it.
“The data is currently provided to approved researchers for download, which requires a substantial amount of expensive local storage space, as well as computing power,” said UK Biobank in a statement.
This, in turn, limits the range of researchers who can use the information to those who have the capability to store and process very large amounts of data locally, which is why a cloud-based alternative is now under development.
Rory Collins, principal investigator of UK Biobank, said the success of the entire initiative rests on the ability to make the data that the organisation holds accessible to as broad a range of research teams as possible.
“This new platform will democratise access, helping us to unleash the imaginations of the world’s best scientific minds – wherever they are – to make discoveries that improve human health,” he said.
Funding for the project is being supplied by pharmaceutical-focused research charity Wellcome, while the platform itself is being developed through a collaboration between AWS and US-based data analysis and management platform provider DNAnexus.
As well as providing the hosting and compute portion of the project, AWS has pledged to provide $1.5m in research credits to approved early-career researchers, and those living in low- and middle-income countries, to broaden the range of contributors to UK Biobank.
Once the platform is complete, these researchers will be able to analyse the data they need within the cloud-hosted platform, which UK Biobank claims will not only open up access to the data, but also make it faster and more cost-effective to process.
“UK Biobank’s platform will not only make the data more accessible to more researchers in principle, but also – as a consequence of the extraordinary generosity of AWS – it will be more accessible in practice,” said Dr Mark Effingham, deputy CEO of UK Biobank.
“Free computing for researchers working in resource-poor settings and for young scientists starting out on a career in research is a fantastic way of increasing the use of UK Biobank’s amazing resource.”
At the time of writing, it is hoped the platform will be live and ready for researchers to use by the summer of 2021, with its development and testing taking place throughout 2020 and into the first half of 2021.
Richard Daly, CEO at DNAnexus, said its work with UK Biobank is the latest in a long line of research projects that its technology has been used to underpin in recent years.
“Over the past 11 years, DNAnexus has supported the diverse scientific aims of researchers worldwide, accelerating digital transformation by simplifying complex data analysis, clinical data management, and insights at scale,” said Daly.
“We enthusiastically support the foundational UK Biobank project as it breaks new ground in the advancement of disease research through the integration of deep healthcare data with genomics and advanced tools.”