suomi.fi
Good practices for Service Developers

Using AI responsibly

Well-working AI requires high-quality data

Data collection is a multi-stage process

AI operates on data, so the properties of that data need increasing ethical consideration. Data is not just static material: working with it is a comprehensive process that includes

  • setting objectives for the system
  • identifying datasets relevant to the objective
  • collecting training datasets: methods and management
  • analysing the quality of the datasets
  • cleaning up and curating data for mechanical processing
  • generating a model and testing it
  • processing production data
  • continuous monitoring; updating the model if necessary.
The "direction" of AI ethics questions has been towards outputs and not inputs; we should focus more on how the data is produced and processed.

– Researcher William Isaac, Google DeepMind
Updated: 9/11/2023

Quality requirements are highlighted when data is shared

In a society based on a data economy where data is shared and used by various authorities and even the private sector, it is not enough for each organisation to have internally consistent data procedures.

Different actors may have different ways of storing and updating their datasets, but structural and semantic differences between datasets make it difficult to share and use the data sensibly and securely.
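As an illustration of structural and semantic differences, the sketch below shows two organisations recording the same fact in different ways and a small mapping that brings both into one shared form. All field names, date formats and code values here are invented for the example; they do not come from any real register or standard.

```python
from datetime import date, datetime

# Hypothetical records: same fact, different structure and semantics.
record_a = {"municipality": "Helsinki", "updated": "2023-11-09", "status": "active"}
record_b = {"kunta": "Helsinki", "paivitetty": "9.11.2023", "tila": 1}


def normalise_a(rec: dict) -> dict:
    """Map organisation A's record (ISO dates, text status) to the shared form."""
    return {
        "municipality": rec["municipality"],
        "updated": date.fromisoformat(rec["updated"]),
        "active": rec["status"] == "active",
    }


def normalise_b(rec: dict) -> dict:
    """Map organisation B's record (d.m.yyyy dates, numeric status) to the shared form."""
    return {
        "municipality": rec["kunta"],
        "updated": datetime.strptime(rec["paivitetty"], "%d.%m.%Y").date(),
        "active": rec["tila"] == 1,  # assumption: 1 means "active" in B's coding
    }


shared_a = normalise_a(record_a)
shared_b = normalise_b(record_b)
```

Once both records are in the shared form they can be compared directly. The mapping functions are also where assumptions about meaning, such as what a status code stands for, become explicit and reviewable.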

Data itself does not contain any solutions or meaning; those qualities are not generated until the data is used. Since each use case is unique, the value and significance of data depend on what end users do with it. Communication and feedback channels between the producers, owners and users of data are therefore necessary.

Read more about the ISO standard for measuring data quality.


AI diversifies opportunities but demands more from oversight

AI technologies have enabled two significant changes in the utilisation of data:

  • data originating from several sources can be analysed simultaneously and crosswise
  • loosely structured or even completely unstructured data can be analysed and used.

The quality of data used in AI systems is no simple matter. Traditional quality factors, such as how current the data is and its internal integrity, are still relevant, but they now have to be assessed across several datasets. Similarly, the integrity, security and compliance of the data have to be assessed in a more multidimensional manner.
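A minimal sketch of what assessing quality across datasets could look like in practice: one check for how current the records are, and one for integrity between two datasets. The datasets, field names and the one-year threshold are assumptions made for this example only.

```python
from datetime import date


def stale_records(records, today, max_age_days):
    """Return ids of records last updated more than max_age_days ago."""
    return [r["id"] for r in records if (today - r["updated"]).days > max_age_days]


def dangling_references(records, key, valid_ids):
    """Return ids of records whose `key` refers to an id missing from another dataset."""
    return [r["id"] for r in records if r[key] not in valid_ids]


# Hypothetical datasets maintained by two different organisations.
services = [
    {"id": "s1", "org_id": "o1", "updated": date(2023, 11, 1)},
    {"id": "s2", "org_id": "o9", "updated": date(2021, 5, 3)},
]
organisations = [{"id": "o1"}, {"id": "o2"}]

today = date(2023, 11, 9)
stale = stale_records(services, today, max_age_days=365)
dangling = dangling_references(services, "org_id", {o["id"] for o in organisations})
```

Here the record `s2` fails both checks: it has not been updated within a year, and it refers to an organisation that the other dataset does not contain. Neither problem would be visible by inspecting one dataset in isolation.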

One partial solution for managing complex and varied data could be the systematic use, classification and indexing of metadata. Metadata helps keep unstructured material “visible”, and it brings material from different sources into a shared frame of meaning.
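The idea of indexing metadata can be sketched as follows. Each item, whatever its format, carries a small metadata record, and an index built from those records lets material from different sources be found through the same classification. The sources, formats and topic labels below are hypothetical.

```python
from collections import defaultdict

# Hypothetical metadata records for unstructured material from two sources.
documents = [
    {"id": "doc-1", "source": "agency-a", "format": "pdf", "topics": ["housing", "benefits"]},
    {"id": "doc-2", "source": "agency-b", "format": "email", "topics": ["benefits"]},
    {"id": "doc-3", "source": "agency-a", "format": "audio", "topics": ["health"]},
]


def build_topic_index(docs):
    """Map each topic label to the set of document ids classified under it."""
    index = defaultdict(set)
    for doc in docs:
        for topic in doc["topics"]:
            index[topic].add(doc["id"])
    return index


topic_index = build_topic_index(documents)

# Material on the same topic is found regardless of source or format.
benefits_docs = sorted(topic_index["benefits"])
```

The index only works if the topic labels are applied consistently, which is why the classification itself needs to be agreed between the organisations that share the data.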



Checklist