GARDIAN (the Global Agricultural Research Data Innovation & Acceleration Network)
Unlocking digital innovation in agricultural development requires the ability to aggregate data from a variety of sources and disciplines. GARDIAN enables users to do just that.
We cannot adequately address the challenges facing global food security without access to data and the ability to derive insights from the right combinations of data.
Medha Devare, author and architect of the Global Agricultural Research Data Innovation & Acceleration Network (GARDIAN) data discovery tool launched by the CGIAR Platform for Big Data in Agriculture, explained how the search tool will empower users seeking data-driven solutions to transform the agriculture sector.
“Unlocking digital innovation in agricultural development requires the ability to aggregate data from a variety of sources and disciplines,” said Devare.
The potential of big data capabilities to transform the agriculture sector more often than not relies on data from different sources that must be considered together or aggregated to provide insights. For example, a service that provides smallholder farmers with advice based on data collected from market trends, regional crop yields, and climate analysis and predictions would require data pulled together from a wide variety of sources.
Making data FAIR
GARDIAN, the first pan-CGIAR search engine for agricultural data, is an important step toward bringing together valuable scientific knowledge, generated across the CGIAR network and beyond, and making these resources FAIR (Findable, Accessible, Interoperable, and Reusable).
“For data to be truly valuable it has to be reusable. This is why we want to shift the conversation from discussing open data to why data also needs to be FAIR,” emphasized Devare.
“For example, you could have a PDF of an open dataset uploaded into a repository, but the question is what can you really do with that data? For that data to really power innovation, and to aggregate with other interesting or relevant datasets, you need it to be findable, accessible, interoperable and reusable. You need that data to be FAIR,” she explained.
For a resource to be findable, it needs persistent identifiers, rich metadata, and good documentation. To be accessible, it has to have a clear usage license – ideally not restrictive – and provide access to both the metadata and the physical file. To be interoperable, it should follow industry standards for metadata and data, which involves using controlled vocabularies and semantic standards. Such standards enable the use of the same ‘language’ across different information resources, allowing for interpretation and aggregation at both the human and the machine level. Once a resource meets all of these criteria, it can be considered reusable – and FAIR.
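As a rough illustration of the criteria described above, the following sketch checks a metadata record against each dimension. It is not how GARDIAN works internally: the Resource fields, the fair_report function, the sample vocabulary, and the placeholder identifier and URL are all hypothetical, and real FAIR assessments are considerably more nuanced than a handful of true/false checks.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical metadata record; the field names are illustrative only."""
    identifier: str = ""    # persistent identifier, e.g. a DOI or handle
    description: str = ""   # descriptive metadata and documentation
    license: str = ""       # usage license
    file_url: str = ""      # location of the physical file
    keywords: list = field(default_factory=list)


def fair_report(resource, vocabulary):
    """Return a simplified yes/no answer for each FAIR dimension.

    `vocabulary` is a set of lowercase terms standing in for a
    controlled vocabulary (an assumption made for this sketch).
    """
    # Findable: has a persistent identifier and descriptive metadata.
    findable = bool(resource.identifier and resource.description)
    # Accessible: has a usage license and a link to the file itself.
    accessible = bool(resource.license and resource.file_url)
    # Interoperable: has keywords, all drawn from the shared vocabulary.
    interoperable = bool(resource.keywords) and all(
        keyword.lower() in vocabulary for keyword in resource.keywords
    )
    # Reusable: the composite of the three dimensions above.
    reusable = findable and accessible and interoperable
    return {
        "findable": findable,
        "accessible": accessible,
        "interoperable": interoperable,
        "reusable": reusable,
    }


# Placeholder values, not a real dataset.
vocabulary = {"maize", "drought tolerance", "nutrition"}
dataset = Resource(
    identifier="doi:10.0000/example",
    description="Maize yield trials under drought stress",
    license="CC-BY-4.0",
    file_url="https://example.org/maize_trials.csv",
    keywords=["Maize", "Drought tolerance"],
)
print(fair_report(dataset, vocabulary))
```

A record like the open PDF in Devare's example, uploaded with no identifier, license, or standard keywords, would fail every check in this sketch even though the file itself is openly available.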
More than search and discovery
CGIAR Centers typically have two separate repositories: one for data and one for publications. Generally, the repositories are on different platforms that don’t speak to each other. GARDIAN allows users to quickly and easily find agricultural information across the 30-odd data and publications platforms of the 15 CGIAR Centers and 11 Genebanks. It currently points to approximately 100,000 publications and 3,000 datasets. The tool will soon enable the discovery of resources beyond CGIAR as well.
“We needed a way for people to search across CGIAR Centers and beyond, using single or multiple keywords, such as gender, nutrition, or drought-tolerant maize, to identify resources that exist for that topic,” explained Devare.
GARDIAN, however, is more than just a search and discovery tool. It employs algorithms that attempt to link related resources, presenting users with additional information relevant to their search.
“GARDIAN is an effort to empower users by enabling them to more quickly generate insights and actionable, data-driven options,” said Devare.
Looking forward, GARDIAN will soon provide prototype analytical pipelines, visualization capabilities, and the ability to easily aggregate data from multiple datasets. Hundreds more publications and datasets will be discoverable in the coming months, enabling GARDIAN users to accelerate data-driven innovation to help transform the agricultural research and development sector.