6. Publication

This step can be implemented as follows:

The person promoting data sharing in the organisation and the data administrator:

  • prepare comprehensive metadata for the data
  • determine where the data can be published
  • publish the data in a data portal, such as the Open Data service
  • agree on how they will communicate about the publication of the data together with the organisation’s communications unit

The organisation's IT professionals:

  • prepare the data for publication.

Metadata descriptions

This section explains what metadata (data that describes other data) is, why and how the organisation should describe the data to be opened, and where the metadata can be published.


INSPIRE Directive

Under the INSPIRE Directive, EU Member States have an obligation to prepare metadata for the spatial data sets and services covered by the Directive and publish them on a search service, which in Finland is the Paikkatietohakemisto.


Data Governance Act (DGA)

Please note that the Data Governance Act obliges organisations to describe the datasets covered by the Act in the Suojattudata service.

What is metadata?

Datasets usually have two types of metadata:

  • The dataset’s internal metadata, which describes its data fields and their structures, meanings, and links to each other.
    • For example, the data model can specify the date format used in a table or what the table column ‘name’ refers to.
    • For example, the Interoperability Platform’s data models tool can be used to define the data model. 
  • The dataset’s external metadata, which describes the entire set, including its administrator, quality, and how the data was produced. 

Examples of dataset metadata

A dataset’s metadata can include the following:

  • Basic information, such as its name and licence
    • When you publish a dataset in a data portal, there usually are mandatory fields for the necessary basic information
  • A description of the data content and its possible deficiencies
  • A description of data quality using the indicators of the public administration’s common quality criteria
  • The data's production process
    • Any calculation formulas associated with the dataset’s production or similar, if possible.
  • Contact details of the data administrator.
  • The dataset’s update date and update frequency
    • Data distributed through an API (such as weather data) may even be updated in real time, whereas data shared in file format (for example statistical data) may be updated less frequently, for example once a year.
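
As a rough illustration, some of the fields listed above could be collected into a single record. The sketch below is hypothetical: the class name, the field names and the sample values are invented for illustration and do not follow the schema of any particular data portal.

```python
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class DatasetMetadata:
    """External metadata for one dataset (illustrative field names only)."""
    name: str
    licence: str
    description: str
    known_deficiencies: str
    production_process: str
    contact: str
    last_updated: str       # ISO 8601 date
    update_frequency: str   # free text here; a portal may offer a fixed list
    keywords: List[str] = field(default_factory=list)


# Invented sample values, not a real dataset.
example = DatasetMetadata(
    name="Indoor temperatures of municipal buildings",
    licence="CC BY 4.0",
    description="Hourly indoor temperature readings in degrees Celsius.",
    known_deficiencies="Readings are missing for some buildings in July.",
    production_process="Collected automatically from building sensors.",
    contact="opendata@example.org",
    last_updated="2024-01-31",
    update_frequency="daily",
    keywords=["indoor climate", "buildings"],
)

record = asdict(example)  # plain dict, ready to be serialised, e.g. as JSON
```

Keeping the metadata in one structured record makes it easier to check that nothing is missing before the dataset is published.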

Why is describing the dataset important?

Describing the dataset will help both human and machine users to understand the data and thus facilitate its reuse. The metadata that describes the data contains information on, for example, the data’s origin, structure, and terms of use. Without this information, it would be impossible to use the data. When data is published on the Internet, its original context can easily be lost, which makes it particularly important to provide the user with metadata. 

Example

If a dataset contains the number 37 without any explanatory metadata, it could refer to a variety of things, such as indoor temperature, shoe size, seating position, or something else entirely. Only metadata allows you to understand the meaning of that number correctly. In this case, the metadata should state that the numbers in the dataset refer to indoor temperatures and that the temperatures are given in degrees Celsius.
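
The same idea can be sketched in code. The example below is a minimal illustration, assuming a hypothetical column name (`temperature`) and a hand-written metadata dictionary; it only shows how a bare value becomes interpretable once a description and a unit are attached to it.

```python
# Hypothetical internal metadata for one column of a dataset.
column_metadata = {
    "temperature": {"description": "Indoor temperature", "unit": "°C"},
}


def describe(column, value, metadata):
    """Return a readable interpretation of a value, if metadata exists."""
    info = metadata.get(column)
    if info is None:
        # Without metadata the number cannot be interpreted reliably.
        return f"{value} (meaning unknown: no metadata for column '{column}')"
    return f"{info['description']}: {value} {info['unit']}"


with_metadata = describe("temperature", 37, column_metadata)
without_metadata = describe("value", 37, column_metadata)
```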

How should the data to be opened be described?

While the Information Management Board has not published any recommendations on describing data that is to be opened, the metadata published in, for example, the Open Data service uses the EU countries’ common DCAT-AP 2.0 data model. The data.europa.eu portal assesses the metadata harvested for the service with the help of the FAIR principles.
 

What is the DCAT-AP data model?

DCAT-AP is a mutual agreement between the EU Member States on what metadata should be provided for open data. Its purpose is to facilitate the utilisation of data by harmonising the metadata requirements in different countries. When all countries provide similar information on their datasets in a similar format, developing international applications is easier.

The DCAT-AP data model is based on the DCAT (Data Catalog) vocabulary created by the World Wide Web Consortium (W3C). The purpose of the DCAT vocabulary is to harmonise the metadata used by services intended for sharing different datasets.

DCAT-AP harmonises the metadata required by data portals in different countries by providing a common metadata model that can be used by all portals. The metadata model specifies, for example, that all datasets must have a name and description. In addition, DCAT-AP has recommended and optional fields, such as keywords.

There are separate DCAT-AP versions for spatial data (GeoDCAT-AP) and statistical data (StatDCAT-AP). 
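
To give an idea of what such a description can look like in machine-readable form, the sketch below builds a small dataset description as JSON-LD using terms from the DCAT and Dublin Core vocabularies. It is a simplified illustration, not a complete or validated DCAT-AP record: the sample values are invented, and the helper `missing_mandatory` is a hypothetical check that covers only the name and description mentioned above.

```python
import json

# Invented sample description using DCAT / Dublin Core terms.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@type": "dcat:Dataset",
    "dct:title": {"@value": "Indoor temperatures", "@language": "en"},
    "dct:description": {
        "@value": "Hourly indoor temperature readings in degrees Celsius.",
        "@language": "en",
    },
    "dcat:keyword": ["indoor climate", "buildings"],  # recommended, not mandatory
}

# Only the two fields named in the text; real profiles define more rules.
MANDATORY = ("dct:title", "dct:description")


def missing_mandatory(description):
    """List the mandatory properties that are absent or empty."""
    return [prop for prop in MANDATORY if not description.get(prop)]


as_json = json.dumps(dataset, ensure_ascii=False, indent=2)
```

A portal that follows DCAT-AP would normally generate this kind of output for you from the fields you fill in when publishing a dataset.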

Benefits of using the DCAT-AP data model:

  • Datasets are more interoperable and easier to utilise internationally, as data from different countries is described in the same way.
    • For example, a description of the data content is available for all datasets, and the descriptions should preferably also be offered in English.
  • A predefined minimum volume of descriptive data guarantees that the datasets have been described to a sufficiently high standard
    • For example, it is impossible for data users to utilise data whose descriptive data does not contain a licence, as the user cannot know what the data’s terms of use are. 
  • The greater the amount of metadata available, the more discoverable the data is
    • For example, keywords added to metadata help data users find relevant data in different services.

Learn more about DCAT-AP.

Metadata quality assurance at data.europa.eu

Metadata quality assurance checks the metadata during each harvesting process. The purpose of quality assurance is to help data producers and data portals check the quality of metadata and receive suggestions for improvements.

The areas that are assessed in the metadata quality assurance process are derived from the FAIR principles. The areas are:

  • findability,
  • accessibility,
  • interoperability,
  • reusability, and
  • contextuality.

For additional information on metadata quality assurance, visit data.europa.eu.

 

The metadata should be available in both human-readable and machine-readable formats. It is advisable for Finnish organisations to provide their metadata in Finnish, but also in Swedish and English whenever possible. In particular, metadata in English is useful for potential international data users. The datasets described in the Open Data service can be automatically found in the data.europa.eu service, which gathers European data together and thus gives international visibility to open data from different countries.

Publication channels

Publishing metadata in the Open Data service

When you publish information on the Open Data service, you must fill in at least the mandatory metadata fields. This descriptive information is required so that users can discover the published data on the service and use it. The mandatory metadata includes a description and the licence. If you wish, you can also add a separate text file to the dataset, in which you can provide a more detailed description of the data.

Open Data has created and documented its own DCAT-AP extension. The extension makes Open Data compatible with DCAT-AP, enhances the service’s search function, and makes datasets easier to find on data.europa.eu, to which the metadata of the datasets in the service is copied, or harvested. A metadata template for describing datasets can also be found in the Open Data framework (Google Docs).

Read more about DCAT-AP and the Open Data service’s extension.

Using the Interoperability Platform

The Interoperability Platform maintained by the Digital and Population Data Services Agency provides tools for describing the content of data and defining its metadata. It is an open online platform for the creation of both human and machine-readable data descriptions. The platform’s tools are Data models, Terminologies, and Code Lists.

The platform’s aim is to provide data users with one location for discovering the metadata descriptions of data exchanged between different organisations. It also aims to ensure the uniformity and reuse of data specifications used to describe content by encouraging the use of previously created terminologies, code lists, and data models.

The principle of reuse is part of the platform’s interoperability method, which is a way of managing, producing, and maintaining the data specifications and metadata necessary for digital services and data flows. The method guides the preparation of common core concepts, core classes, and code lists used by organisations, as well as how the descriptions produced by an individual organisation should be made available to everyone. At the same time, it also provides guidance on how these common descriptions can be used in the creation of the descriptions used by individual organisations.

The consistent use of concepts makes services easier to plan and understand. The utilisation of ready-made terminologies, code lists, and data models reduces the development costs of information systems. This helps accelerate the information system development cycles of the parties that use these descriptions while also promoting the interoperability of their respective information systems. Another way of describing this is semantic interoperability, which means ensuring that both the producer and user of the data understand it in the same way and allowing them to exchange data without changing its meaning along the way.

The Interoperability Platform’s ready-made data specifications are freely available to everyone. The Platform’s tools are available free of charge to content providers who need them in their terminological work, code list management, and data modelling. Content can be produced by both public administration and actors in the private sector. Data content producers are responsible for their own data specifications and their quality, and for keeping them up to date.

The public documentation on the Interoperability Platform contains necessary information on how to join the Interoperability Platform and basic instructions on how to get started with using its tools.

Practices employed by organisations

Organisations that have already opened their data have published metadata in connection with their datasets, either on a data portal (such as Open Data), on the organisation’s website, or both.

Metadata policies of Paikkatietohakemisto

The Paikkatietohakemisto service maintained by the National Land Survey of Finland is designed for the storage and sharing of metadata for actors that produce spatial data, for both INSPIRE-compliant and other spatial datasets and services.

The description of spatial datasets includes their geographic and temporal coverage, their production process, and any restrictions related to availability. The description of spatial data services includes the datasets offered and links to the actual service. When you enter the keyword avoindata.fi in the metadata of open spatial datasets and services, they are harvested to the Open Data service. The keyword ‘non-inspire’ should be added to datasets and services not within the scope of the INSPIRE Directive to ensure that they do not end up in Finland’s INSPIRE monitoring results.
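
The two keyword rules above can be summarised as a small decision function. This is only a sketch of the logic described in the text: the function name and its parameters are hypothetical, and the keywords still have to be entered in the metadata in Paikkatietohakemisto itself.

```python
from typing import List


def harvesting_keywords(is_open: bool, in_inspire_scope: bool) -> List[str]:
    """Suggest the keywords described above for a spatial dataset or service."""
    keywords = []
    if is_open:
        # Open datasets and services are harvested to the Open Data service.
        keywords.append("avoindata.fi")
    if not in_inspire_scope:
        # Keeps the item out of Finland's INSPIRE monitoring results.
        keywords.append("non-inspire")
    return keywords


open_non_inspire = harvesting_keywords(is_open=True, in_inspire_scope=False)
open_inspire = harvesting_keywords(is_open=True, in_inspire_scope=True)
```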

Nearly 300 producers of metadata have registered with the service, representing approximately 150 organisations. The metadata of more than 1,500 spatial datasets or services has been published. About one half of them are within the scope of INSPIRE.

Metadata practices of the Helsinki Region Infoshare service

Datasets should be given names that can be understood by as wide a range of users as possible. It is also a good idea to write a non-technical description of the dataset, including

  • the type of data it contains,
  • how the data was produced or collected, and
  • any particular features of the data that the user should know about to be able to interpret it correctly.

The description should also openly inform users of any shortcomings and possible errors in the data. It would also be advisable to describe and write out the data attribute information and any abbreviations used in the dataset. The metadata of the dataset should also be translated into English.

Metadata template used by HRI (in Finnish).

Metadata practices of the Finnish Environment Institute

The Finnish Environment Institute uses a tool for describing the metadata (metadata editor) and has a separate metadata service intended for end users.

Metadata can be produced for datasets and information systems, API services, environmental reports and research data. Specific metadata profiles have been implemented for each one of these. The metadata profiles of datasets and API services are compliant with the requirements of the INSPIRE Directive.

Both the metadata editor and the metadata service offer open APIs for harvesting metadata. Through these APIs, descriptions of the Finnish Environment Institute’s datasets and APIs are transferred to the Paikkatietohakemisto service maintained by the National Land Survey of Finland (metadata compliant with the INSPIRE Directive), the Open Data service maintained by the Digital and Population Data Services Agency (metadata of the Finnish Environment Institute’s open datasets) and CSC's Etsin.fi service.

Metadata practices of Statistics Finland

Production of reliable statistics requires a wide array of background information about statistics and the subjects they describe. The information about statistics section contains metadata for the statistics produced by Statistics Finland. The metadata service contains the following metadata sets:

Statistics Finland maintains and publishes national classification recommendations. Most of them are based on international standards confirmed by EU directives. Use of classification recommendations improves the comparability of statistical information produced at different times and in different areas.

Classifications consist of headings, or names of groups, of codes given to them (numerical or alphabetical codes), and of descriptions of groups (definitions). Classification refers to dividing individual items of information present in statistical data according to certain features into different groups where each unit belongs to only one group. In the classification, the groups are named and codes are issued to them. Statistics Finland classifications have also been published through an open API service.
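
The structure of a classification can be illustrated with a short sketch. The codes, headings and descriptions below are invented and do not come from any actual Statistics Finland classification; the point is only that every unit is assigned exactly one code and that each code has a heading and a description.

```python
# Hypothetical classification: code -> heading and description (all invented).
CLASSIFICATION = {
    "A": {"heading": "Residential", "description": "Buildings used mainly for housing."},
    "B": {"heading": "Commercial", "description": "Buildings used mainly for business."},
}


def group_by_code(assignments, classification):
    """Group units by code and reject codes missing from the classification.

    `assignments` maps each unit to a single code, so a unit can never
    end up in more than one group.
    """
    groups = {code: [] for code in classification}
    for unit, code in assignments.items():
        if code not in classification:
            raise ValueError(f"Unknown code {code!r} for unit {unit!r}")
        groups[code].append(unit)
    return groups


assignments = {"unit-1": "A", "unit-2": "B", "unit-3": "A"}
groups = group_by_code(assignments, CLASSIFICATION)
```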

 

Publication

This section describes how and where you should publish a dataset that is to be opened.

Organisations that have already opened their data have usually published the datasets to be opened and their metadata in a public data portal, ensuring that the datasets can be found as easily and quickly as possible. 

Compliance with FAIR principles in data publication

The FAIR principles were originally developed for research data. However, they are also applicable to open data publication, even though not all of the principles can necessarily be followed as such.

FAIR stands for Findable, Accessible, Interoperable, and Re-usable.

 

Image: FAIR principles in brief.

By following these principles, you can ensure that your data can be easily reused and that your published data and its metadata are of a high quality.

Check that the open data you publish complies with the FAIR principles:

  1. Publish the data in a public data portal, such as Open Data. 
  2. Make sure that a unique identifier is assigned to the data.
  3. Publish the data in an open file format.
  4. Provide a comprehensive description of the data.
  5. License the data under an open licence, such as Creative Commons BY 4.0.
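
The checklist above can also be expressed as a simple automated check that you run before publishing. The sketch below is a rough illustration under stated assumptions: the dictionary keys, the sets of open formats and licences, and the function names are all hypothetical, and the sets are examples rather than complete lists.

```python
OPEN_FORMATS = {"csv", "json", "xml", "geojson"}  # illustrative, not exhaustive
OPEN_LICENCES = {"CC BY 4.0", "CC0 1.0"}          # illustrative, not exhaustive


def fair_checklist(dataset):
    """Return one pass/fail result per item in the checklist above."""
    return {
        "published_in_portal": bool(dataset.get("portal_url")),
        "has_identifier": bool(dataset.get("identifier")),
        "open_format": dataset.get("format", "").lower() in OPEN_FORMATS,
        "described": bool(dataset.get("description", "").strip()),
        "open_licence": dataset.get("licence") in OPEN_LICENCES,
    }


def failed_checks(dataset):
    """List the names of the checks that did not pass."""
    return [name for name, ok in fair_checklist(dataset).items() if not ok]


# Invented sample records.
ready = {
    "portal_url": "https://example.org/dataset/indoor-temperatures",
    "identifier": "indoor-temperatures",
    "format": "CSV",
    "description": "Hourly indoor temperature readings in degrees Celsius.",
    "licence": "CC BY 4.0",
}
draft = dict(ready, identifier="", format="xlsx")
```

A check like this cannot judge whether a description is comprehensive; it only catches fields that are missing altogether.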

Read more about the FAIR principles and publishing open data (in Finnish).

Where should the opened data be published?

It is advisable to make the datasets opened by central government actors visible on the Open Data service. This way, the data can reach a large number of people who are interested in using it.

There are several data portals for sharing your data:

  • Anyone can publish their open data on Suomi.fi Open Data.
  • As a rule, the cities in the Helsinki Metropolitan Area publish their data in the HRI service.
  • The Paikkatietohakemisto compiles the metadata of data producers who prepare and publish spatial data and services.
  • The metadata from different international public administration datasets can be searched and browsed on data.europa.eu.

The Open Data service serves as a national open data contact point, and the metadata for the datasets available on it is harvested by and published on the data.europa.eu service administered by the European Commission. The Open Data service also contains a large volume of metadata for datasets published in other Finnish data portals. For example, metadata is harvested from the Helsinki Region Infoshare service and Paikkatietohakemisto.

In addition to publishing the metadata for its dataset in a data portal, the organisation may also publish the data on its website.

Publishing data on Suomi.fi Open Data

Opendata.fi is a national open data service. Its aim is to make all open data in Finland available on a single site.

The advantage of a national portal is that the organisation does not need to use its own resources to develop and maintain a portal. Open Data is constantly being developed, and the portal has an established user base, which makes it easy for different users to discover new data. The service is based on open-source code, and it is available in three languages.

Open Data offers various benefits, such as:

  • A free publishing platform for open datasets. 
  • International visibility. The data.europa.eu service harvests the Open Data service, which means that all data uploaded to the service can also be found in the European open data portal.
  • Statistics. The Open Data service provides various statistics, including on the use of the service and the popularity of organisations and datasets.
  • Support materials for opening and publishing data.


The Open Data service can also be used through an API. Read more about using the Open Data service through an API.
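
As a hedged illustration of what using a data portal through an API can involve, the sketch below builds a dataset search URL. It assumes a CKAN-style Action API with a `package_search` action; this guide does not state which API the Open Data service exposes, so check the service's own API documentation first. The base URL is a placeholder, and the function name is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder base URL: replace with the address given in the portal's API docs.
API_BASE = "https://example.org/api/3/action"


def build_search_url(query, rows=10, api_base=API_BASE):
    """Build a search URL, assuming a CKAN-style `package_search` action."""
    params = urlencode({"q": query, "rows": rows})
    return f"{api_base}/package_search?{params}"


url = build_search_url("indoor temperature", rows=5)
```

The function only constructs the URL and makes no network request, so you can inspect the result before sending it with the HTTP client of your choice.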

Publishing in Paikkatietohakemisto and on Paikkatietoikkuna.fi

Paikkatietohakemisto is a national metadata service maintained by the National Land Survey of Finland, in which data producers compile, publish and update metadata. Read more about publishing metadata in Paikkatietohakemisto (in Finnish).

Paikkatietoikkuna.fi is a national geoportal maintained by the National Land Survey of Finland that showcases spatial data sets and services and their possible uses. The National Land Survey exports map layers published by data producers to Paikkatietoikkuna through API services on request. An agreement is concluded with each data producer, which covers the terms of use and similar. Read more about Paikkatietoikkuna.

Services are a key part of compliance with the INSPIRE Directive in Finland (in Finnish). The INSPIRE Directive places obligations on the authorities that manage or maintain spatial data sets within the scope of the Directive. Read more about obligations under the INSPIRE Directive on the National Land Survey of Finland’s website (in Finnish).

Publishing on the HRI service

The open data of the cities in the Helsinki Metropolitan Area (Helsinki, Espoo, Vantaa and Kauniainen together with their joint municipal authorities) is published on hri.fi, from where the metadata for the datasets is harvested automatically to the Open Data service.

Communications

This section describes how and where you should communicate about the data you intend to share. Communication is one way of improving the discoverability of data, which is why you should spread your message across a range of communication channels.

How should I communicate about the publication of my data?

It is advisable for an organisation to market the datasets it has opened using different communication methods, to encourage the widespread use of its data. Below is a list of good practices for communicating about your opened data.

  • It is a good idea to spread the message of your datasets as widely as possible, but remember to also focus on your target group.
  • Remember to describe what kind of data you have opened and where it has been published.
  • Explain the forms in which it has been made available, what its origins are, and when it was produced.

You should also encourage the users of your datasets to describe the applications or services they have developed on the basis of your open data. You can highlight the best examples of these applications on, for example, your social media feed, to inspire other application developers and data users.

Where should I communicate about the publication of my data?

You can inform people about your opened data in various places, such as

  • your organisation’s website,
  • a newsletter, and
  • social media.

You can also present your data to your organisation’s networks and encourage its use by organising various events around its uses.

It is worth remembering that these communication activities can be carried out by several parties at a time. In the Helsinki Metropolitan Area, for example, both the party opening the data and the HRI service communicate about all newly opened datasets.

Read more about open data-related communications in the Open Data Handbook.

Support materials on the topic

This section contains support material related to the topics discussed in this step.

Training courses on the data.europa.eu website: