A generic data sharing infrastructure for AI at your fingertips

Published on: 24 November 2020

At the Dutch AI Coalition's participant event on 24 November, the session ‘Sharing data responsibly for AI’ presented the results of two use cases. These demonstrated that an AI application using data from different sources can be implemented in a responsible way.

Data and algorithms are the building blocks of AI applications. The data is often stored in many places, each with its own access restrictions, so solutions are needed for accessing these sources. Two use cases, namely the analysis of photographs and lab results aimed at COVID-19, demonstrated which solutions are possible for responsible data sharing in these situations.

The report ‘Responsible data sharing’, published in March 2020, described the need for a data sharing infrastructure. The report gave a first outline of what this infrastructure could look like in order to stimulate AI in the Netherlands. Together with the application areas of Health and Care, Public Services, and Energy and Sustainability, a number of technical proofs of concept (PoCs) have been carried out in which so-called ‘ecosystems of trust’ have been set up in practice.

Conclusion: AI needs a generic data component structure
The results show that it is possible to set up a data sharing infrastructure based on international standards (FAIR, IDS). They also show that data does not have to be physically sent to other organizations in order to implement AI successfully (a federated data architecture). This method of data sharing is known in healthcare as the Personal Health Train. If data is shared, the data owner can control the use of their data through agreements and software solutions.
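To make the federated idea concrete, the following is a minimal sketch in Python, not the Coalition's proof of concept software. It assumes a simple case in which each organization computes a local sum and count and only those aggregates are combined centrally. The names `DataStation`, `allowed_purposes`, `run` and `federated_mean` are hypothetical and chosen for illustration only; the purpose check stands in for the agreements through which a data owner controls use.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataStation:
    """Hypothetical data holder: raw records stay inside this object."""
    name: str
    records: list[float]
    # Purposes the data owner has agreed to (an assumption of this sketch).
    allowed_purposes: set[str] = field(default_factory=set)

    def run(self, purpose: str) -> tuple[float, int] | None:
        """Run the computation locally and return only aggregates."""
        if purpose not in self.allowed_purposes:
            return None  # the owner has not agreed to this use
        return sum(self.records), len(self.records)


def federated_mean(stations: list[DataStation], purpose: str) -> float:
    """Combine per-station (sum, count) pairs into one overall mean."""
    total, count = 0.0, 0
    for station in stations:
        result = station.run(purpose)
        if result is None:
            continue  # skip stations that do not permit this purpose
        local_sum, local_count = result
        total += local_sum
        count += local_count
    if count == 0:
        raise ValueError("no station permitted this purpose")
    return total / count


# Illustrative, made-up values; no real data is implied.
stations = [
    DataStation("hospital_a", [1.0, 2.0, 3.0], {"covid_research"}),
    DataStation("lab_b", [5.0], {"covid_research"}),
    DataStation("clinic_c", [100.0], {"billing"}),
]
overall = federated_mean(stations, "covid_research")
```

In this sketch the third station is skipped because its owner has not agreed to the requested purpose, and the individual records of the other two never leave their station. A real deployment would also need identification, authentication and secure transport, which are outside the scope of this example.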

This approach of a generic infrastructure for data sharing has several advantages:

  • Identification, Authentication and Authorization standards are easily applicable (via secure & trusted handshakes).
  • Semantic models according to the FAIR principles are a good basis.
  • Multiple types of AI algorithms can be supported while respecting privacy requirements.

Scaling up to large-scale test environments
The proofs of concept are limited in scope and scale. In the coming period, the group of stakeholders will be expanded in order to realize large-scale test environments, with the aim of eventually reaching operational solutions. In this way experience can be gained within and between application areas, broadening the knowledge base and working towards implementation in practice. This will enable more and better AI implementations. Working with standards makes implementation easier and prevents vendor lock-in.

Are you interested in more information about the specific use case and the underlying implementation? Visit this page for more information. To stimulate the development of AI innovations, the Dutch AI Coalition makes the proof of concept software freely available through GitLab.
