Synthetic data: a solution when developing AI applications?

The availability of and access to data are crucial to the development of AI applications. For many organisations (start-ups and scale-ups in particular), obtaining data quickly is a major stumbling block. The challenges include finding relevant data, the willingness of others to share it, and legislation and regulations (e.g. on privacy) that keep getting stricter. Without data there can be no data-driven innovation using artificial intelligence (AI), so solutions are badly needed.

One possible approach is to use synthetic data. Its promise is underlined by, among others, Gartner*, which predicts that by 2024, 60% of the data used for developing AI and analytics applications will be synthetically generated.

Use of artificial intelligence

What exactly is AI-generated synthetic data? Whereas original data is collected through interactions with individuals, synthetic data is created by a computer algorithm that generates completely new, artificial data points. What is new is the use of AI in the synthesis process: the synthetic data is modelled so that it reproduces the characteristics, relationships and statistical patterns of the original dataset. AI-generated synthetic data can thus provide large quantities of representative data simply and quickly. Syntho, an expert in AI-generated synthetic data, wants to use this approach to build the foundations for data-driven innovation (e.g. using AI) and recently won the Philips Innovation Award with its proposals.
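To make the idea concrete, here is a minimal sketch in Python. It is not Syntho's method: instead of an AI model it fits a simple two-column Gaussian, which is only a stand-in to show the principle of learning statistical patterns from original data and then sampling brand-new records. The columns (age, income), the toy "original" dataset and the function names are all assumptions made for illustration.

```python
import math
import random
import statistics


def fit_gaussian(rows):
    """Estimate means, standard deviations and correlation of two numeric columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (len(rows) - 1)
    return {"mean": (mean_x, mean_y), "sd": (sd_x, sd_y), "rho": cov / (sd_x * sd_y)}


def sample_synthetic(model, n, rng):
    """Draw n new records that follow the fitted means, spreads and correlation."""
    (mean_x, mean_y), (sd_x, sd_y), rho = model["mean"], model["sd"], model["rho"]
    # Guard against tiny negative values caused by floating-point rounding.
    residual = math.sqrt(max(0.0, 1.0 - rho**2))
    records = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean_x + sd_x * z1
        y = mean_y + sd_y * (rho * z1 + residual * z2)
        records.append((x, y))
    return records


# Hypothetical "original" data: age and a loosely age-dependent income.
rng = random.Random(42)
original = []
for _ in range(500):
    age = rng.gauss(40, 10)
    income = 1000 + 50 * age + rng.gauss(0, 300)
    original.append((age, income))

model = fit_gaussian(original)
synthetic = sample_synthetic(model, 500, rng)
```

None of the synthetic records is copied from the original dataset, yet the averages, spreads and the age-income relationship are preserved approximately. Real datasets have many columns, mixed data types and non-linear relationships, which is where AI-based generators come in.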

What challenge does it solve?

The outputs from this use case will answer some of the frequently asked questions about using synthetic data. What is the value of synthetic data? When is it a good solution and when is it less effective? What are its limitations? And what are the pros and cons of synthetic data compared to other privacy-enhancing technologies (PETs)?

Syntho and SAS will work together to compare AI-generated synthetic data against original datasets and assess it in terms of data quality, legal validity and usability. This will give a picture of the added value of synthetic data, show where it is less useful, and indicate what follow-up steps organisations and the NL AIC could and should take to encourage the development and application of AI. Synthetic data will also be placed in a broader perspective by comparing it with existing privacy-enhancing technologies (PETs).
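The article does not describe which measures Syntho and SAS will use. Purely as an illustration of what a data quality comparison could involve, the Python sketch below compares a few basic statistics of an original and a synthetic dataset and counts records that were copied verbatim. The two small datasets, the column names and the function names are hypothetical.

```python
import math
import statistics


def pearson(rows):
    """Pearson correlation between the two columns of a list of (x, y) rows."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def quality_report(original, synthetic, columns=("age", "income")):
    """Absolute differences in basic statistics, plus a count of copied records."""
    report = {}
    for i, name in enumerate(columns):
        orig_col = [r[i] for r in original]
        syn_col = [r[i] for r in synthetic]
        report[f"{name}_mean_diff"] = abs(statistics.fmean(orig_col) - statistics.fmean(syn_col))
        report[f"{name}_sd_diff"] = abs(statistics.stdev(orig_col) - statistics.stdev(syn_col))
    report["correlation_diff"] = abs(pearson(original) - pearson(synthetic))
    # Records appearing in both datasets would be a privacy warning sign.
    report["exact_copies"] = len(set(original) & set(synthetic))
    return report


# Hypothetical toy datasets of (age, income) records.
original = [(25, 2200), (32, 2700), (41, 3100), (47, 3300), (53, 3900), (60, 4100)]
synthetic = [(27, 2300), (30, 2600), (43, 3200), (45, 3400), (55, 3800), (58, 4000)]

report = quality_report(original, synthetic)
for key, value in report.items():
    print(f"{key}: {value}")
```

Small differences suggest that the synthetic data preserves these particular patterns. Such a check says nothing about legal validity or usability, which need to be assessed separately.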

Sharing knowledge with NL AIC affiliates is key

Generating the actual synthetic data makes it possible to compare it against the original data and then assess its data quality, legal validity and usability. The following outcomes will be shared and made available to the NL AIC’s participants, aiming to promote knowledge sharing and answer questions about synthetic data:

  • The quality report.
  • The final presentation.
  • A training session on privacy-enhancing technologies (PETs) that also covers other PETs, such as encryption, pseudonymisation and anonymisation.
  • A synthetic version of a publicly available dataset.

Parties involved

In this use case, Syntho, SAS and the NL AIC are working together to achieve the intended results. Syntho is an expert in AI-generated synthetic data and SAS is the market leader in analytics, providing software for exploring, analysing and visualising data.

More information

If you are interested, go to the Syntho website for more information about synthetic data.
Contact: Wim Kees Janssen, kees@syntho.ai
