3. Datasets and demand
This part can be implemented as follows:
The person promoting data sharing in the organisation, the data administrator and the organisation's data protection expert work together to
- identify data suitable for sharing and the person responsible for it.
- identify the type of data for which there is demand outside the organisation.
- compile a list of datasets that the organisation is planning to open in the future and agree on practical measures and monitoring.
Identification of datasets
This step describes how the organisation can map and classify its datasets and identify valuable datasets that can be opened.
Public administration organisations map and classify their datasets primarily on the basis of their statutory tasks and recommendations. The purpose of mapping the datasets is to identify, among other things, the rights and restrictions affecting the disclosure of datasets and, consequently, the possibilities of sharing the data.
The organisation must first identify what type of data it has in general and what part of this data is suitable for opening. In public administration, for example, tens of thousands of information systems are used for different purposes. Approximately 1,800 information systems have been found to be in use in central government, and the City of Helsinki, for example, uses around 700 to 900 information systems. In addition to information systems, there is a huge number of documents, including spreadsheets, images, audio and videos, which could be open data.
When mapping datasets suitable for sharing, it is advisable to also take their value into account. Value can be perceived in many different ways; it can be approached from the social, economic, ecological and knowledge perspectives.
- Social value promotes equality (minority rights, rights of persons with disabilities, opportunities for participation).
- Economic value can be measured by financial indicators (new jobs, companies, services, tax revenue).
- Ecological value promotes material efficiency and circular economy (material recycling efficiency, natural resources use, recycling).
- Knowledge value promotes knowledge-based decision-making, which often has significant secondary effects (political and societal decision-making).
It is a good idea to prioritise the opening of datasets that have been identified as valuable.
Once the organisation has identified suitable data to be shared and determined the type of data there is demand for, it is advisable to compile a list of datasets that it plans to open in the future.
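As a rough illustration, the list of datasets planned for opening could be kept as structured records and sorted by assessed value. The sketch below is only one possible approach: the `Dataset` fields, the 0 to 3 scoring scale, the tie-breaking rule and the sample entries are all assumptions made for the example, not part of any recommendation.

```python
from dataclasses import dataclass, field

# The four value perspectives named in the text above.
PERSPECTIVES = ("social", "economic", "ecological", "knowledge")


@dataclass
class Dataset:
    """One entry in a hypothetical list of datasets planned for opening."""
    name: str
    owner: str             # person responsible for the data
    shareable: bool        # outcome of the rights and restrictions review
    external_demand: bool  # whether demand outside the organisation was identified
    value: dict = field(default_factory=dict)  # perspective -> score 0..3 (assumed scale)

    def total_value(self) -> int:
        # Perspectives that have not been scored count as 0.
        return sum(self.value.get(p, 0) for p in PERSPECTIVES)


def opening_plan(datasets):
    """Return shareable datasets, highest total value first; identified demand breaks ties."""
    candidates = [d for d in datasets if d.shareable]
    return sorted(candidates, key=lambda d: (-d.total_value(), not d.external_demand, d.name))


# Made-up sample entries for illustration only.
inventory = [
    Dataset("Timetables", "A. Owner", True, True, {"social": 2, "economic": 3, "knowledge": 1}),
    Dataset("Customer records", "B. Owner", False, True, {"economic": 3}),
    Dataset("Air quality", "C. Owner", True, False, {"ecological": 3, "knowledge": 2}),
]

for d in opening_plan(inventory):
    print(d.name, d.total_value())
```

Datasets that did not pass the rights and restrictions review are left out of the plan entirely, regardless of their score.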
Data mapping methods
Data can be mapped by using the methods discussed below, depending on which alternative the organisation has opted for.
Information management model
Government agencies, municipalities and other information management entities referred to in the Public Information Management Act (906/2019) must prepare and maintain an information management model that gives an overview of information management in the entity. Among other things, the information management model shows the titles and uses of information pools as well as information disclosure targets and storage periods of information.
The information management model helps authorities manage the continuously increasing data volumes. It helps to understand and manage the life cycle of data and, consequently, also identify and manage risks related to the use of new digital services. Rather than comprising a one-off obligation to produce a description, the information management model is something that must be maintained whenever changes relating to information management in the information management entity affect its content.
The information management model provides important and useful information content when the organisation is considering the possibilities of sharing data, and it can help choose the data the organisation wishes to open. Read the Information Management Board's recommendation regarding the information management model (Ministry of Finance publications 2020:29).
Information management map
An information management map, which implements the intention of the Public Information Management Act (906/2019), serves as a tool for authorities in financial and operational planning and when assessing the possibility of using data collected by other actors in the performance of their duties. For example, the map helps ministries to prepare regulations on information management and information resources, to guide interoperability of information resources and information systems in their sectors, and to plan ways of implementing interoperability. The map also serves the administration’s customers when a view of how data concerning a person is managed in public administration is needed.
The purpose of the information management map is to provide different actors in society with an overview of
- how information management in public administration has been organised,
- which data are maintained in different information resources, and
- under what conditions the data could be available for the actor's needs.
The map can be used to plan and develop the interoperability of public administration’s information resources and information systems. The map also provides a better starting point for assessing the impacts of major administrative or structural changes on actors responsible for information management.
The first version of the information management map was published in early 2022 on the State Treasury's Exploreadministration.fi service, and it can be used freely for different purposes.
For more information on the information management map, visit the Ministry of Finance's website.
Description to implement document publicity
Under section 28 of the Act on Public Information Management (906/2019), central government agencies, municipalities and other information management entities referred to in the Act must maintain a description of the information pools and case register they manage. This ‘description to implement document publicity’ is one way of helping citizens to know to whom they should address their requests for information.
Among other things, the description must contain information about
- the information systems containing information belonging to the case register or the information management of services,
- the datasets included in the information systems by data groups, and
- the open access to datasets via a technical interface.
The information management entity shall publish the description in a public data network in so far as the information in the description is not secret. Read the Information Management Board's recommendation on preparing descriptions to implement document publicity (in Finnish) (Ministry of Finance publications 2020:22).
The description to implement document publicity has replaced the specifications on information management systems referred to in section 18 of the Act on the Openness of Government Activities (621/1999). Some public administration actors have opened their information system descriptions or lists as open data. In 2016, the Association of Finnish Local and Regional Authorities published on opendata.fi instructions for municipalities on opening the list of information systems (in Finnish), prepared by the six largest cities.
Data balance sheet
The data balance sheet contains an evaluation of how the organisation's data, data protection and information security have been implemented. In particular, its aim is to map the status of data processing, life cycle management in personal data processing and the necessary development measures. The idea is that the data balance sheet would serve as a management and internal control tool, support data protection work and increase the effectiveness of activities.
The data balance sheet plays a key role in monitoring compliance with data protection, in fulfilling the controller's accountability referred to in the General Data Protection Regulation (Article 5 of Regulation (EU) 2016/679) and in demonstrating transparency. The obligation to demonstrate compliance means that the organisation must be able to show that it complies with the General Data Protection Regulation in the processing of personal data and that it also implements the data protection principles in practice. The data balance sheet also serves as a show of trust towards the organisation's stakeholders.
The Association of Finnish Local and Regional Authorities has published a data balance sheet template (in Finnish, docx file), which organisations can use when planning and drawing up their data balance sheets. For more information on this template, see the presentation material of the Association of Finnish Local and Regional Authorities titled Data balance sheet – what and why (in Finnish, pdf) (11 Sep 2019).
Examples of data balance sheets of different public administration organisations:
- Statistics Finland’s data balance sheet (in Finnish, pdf)
- National Land Survey of Finland’s data balance sheet (in Finnish, pdf)
- Digital and Population Data Services Agency’s data balance sheet (in Finnish)
- Finnish Transport and Communications Agency Traficom’s data balance sheet (in Finnish, pdf)
- Finnish Institute for Health and Welfare’s data balance sheet (in Finnish)
- City of Turku’s data balance sheets (in Finnish)
- City of Kokkola’s data balance sheet (in Finnish)
- Municipality of Hämeenkyrö’s data balance sheet (in Finnish, pdf)
Interoperability platform
The interoperability platform maintained by the Digital and Population Data Services Agency enables harmonised specification of data contents as well as effective and transparent cooperation between actors in information management. The interoperability platform consists of glossaries, code sets and information flows as well as information models needed in other aspects of information management.
The interoperability platform makes use of the interoperability method, which helps to create and maintain the semantic interoperability of information, that is, processing in which the meaning of information remains unchanged in information flows. The interoperability method is a shared way of producing, managing and maintaining the information specifications and metadata needed for digital services and information flows. For a more detailed description of the interoperability platform and method, see chapter 5, Metadata description.
Demand for datasets
This step describes how the organisation can determine the type of data for which there could be demand outside the organisation and how the data to be opened would benefit potential users.
There are no official recommendations for identifying the needs of data users.
Organisations that have already opened their data have striven to find out about potential users’ information needs in different ways, as the activities should always be guided by needs.
Methods for mapping data users’ needs
The following are examples of methods that can be used to map demand for data.
Information requests and feedback received by the organisation
Information requests and feedback are recorded in the information management entity’s case register or in the organisation's feedback systems. Analysing these requests and feedback may reveal datasets for which there is a wider need. If possible, the feedback should be processed transparently.
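A minimal sketch of such an analysis is shown below, assuming the requests have been exported as records that carry a `dataset` classification. The field name, the threshold and the sample records are made up for illustration; real case registers will differ.

```python
from collections import Counter

# Made-up records; in practice these would be exported from the case register
# or feedback system. The "dataset" field is an assumed classification.
requests = [
    {"id": 1, "dataset": "timetables"},
    {"id": 2, "dataset": "air quality"},
    {"id": 3, "dataset": "timetables"},
    {"id": 4, "dataset": "building permits"},
    {"id": 5, "dataset": "timetables"},
]


def datasets_in_demand(records, min_requests=2):
    """Return (dataset, count) pairs with at least min_requests requests, most requested first."""
    counts = Counter(r["dataset"] for r in records)
    return [(name, n) for name, n in counts.most_common() if n >= min_requests]


print(datasets_in_demand(requests))
```

The threshold `min_requests` is arbitrary here; an organisation would choose it based on its own request volumes.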
User statistics of the organisation’s website
If the organisation has published descriptive or other information about its datasets on its website, it can analyse the website user statistics and determine which datasets may be of the greatest interest to visitors.
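The following sketch shows one way to total page views per dataset from a statistics export. It assumes a CSV export with `path` and `views` columns and dataset pages published under a `/datasets/` prefix; these names and the sample figures are hypothetical and depend on the analytics tool and site structure in use.

```python
import csv
import io
from collections import defaultdict

# Made-up export; the column names "path" and "views" are assumptions.
EXPORT = """path,views
/datasets/timetables,120
/datasets/air-quality,45
/contact,300
/datasets/timetables,80
"""


def views_per_dataset(csv_text, prefix="/datasets/"):
    """Sum views for pages under the prefix and return (dataset, views), most viewed first."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["path"].startswith(prefix):
            totals[row["path"][len(prefix):]] += int(row["views"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


print(views_per_dataset(EXPORT))
```

Pages outside the assumed prefix, such as the contact page in the sample, are ignored so that only dataset descriptions are compared.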
Surveys
The organisation may publish a survey, for example on social media, in which potential data users are asked what kind of data they would like to access and for what purposes they would use it. For example, the Finnish Meteorological Institute has conducted user surveys to map users’ preferences regarding datasets to be opened.
Piloting the opening of data
Data users’ needs can be investigated and tested by means of pilots: the organisation can begin by only opening a small part of the data and targeting the supply to a limited group of users. The best-case scenario is that pilot users will quickly produce the first applications using the data, which can be presented as examples of data use. Other information needs may also emerge during the piloting.
Piloting is particularly worthwhile if it is likely that several groups will be interested in the data to be opened. The organisation can start with opening data that is the most likely to be of interest to the selected target group in a format that benefits them. When planning piloting, the organisation should also remember that a decision to stop piloting should be made at some point. Rather than proceeding directly from piloting to production, the production system must be designed separately. After the piloting, other datasets can also be opened, and the accessibility of the data can be expanded by using different file formats.
The following are examples of a few different user groups:
- Innovators: The group of innovators consists of individual users who adopt innovations quickly. They may provide good ideas for using the data, but getting this group together may be challenging.
- Individual developers: Private developers are experimental data users who can test interfaces and explore the types of data the organisation has. Testing new ideas with this group can be fast and productive.
- Small companies: Small companies often take up new sources of information faster than their larger counterparts. To work together with a small company, intensive customer support may be required.
- Large companies: Large companies are often driven by business pressures, and development processes can be lengthy and cumbersome. Large companies may insist on the open data being continuously and reliably accessible, with high-quality customer support available 24/7.
- Data aggregators: Data aggregators promote the widest possible use of open data. They collect data from different sources, creating their own sets that they share with others. For example, an aggregator may collect timetable information on means of transport through different interfaces and produce an application where users can find the timetables for all modes of transport.
- Higher education students: Student groups in different fields can be easy to reach, for example by organising courses in cooperation with the educational institution, and students complete high numbers of different projects and theses.
Other ways to identify needs
Data users' needs can also be determined by means of data polls, interviews, identification of the organisation's internal needs, or by providing users with a tool for requesting datasets.
Helsinki Region Infoshare accepts data requests
The organisation should express its willingness to receive data requests and offer a method for receiving them (such as a specific e-mail address or a separate online form). Discussions on users’ preferences can also be conducted at events. For example, Helsinki Region Infoshare (HRI) organises regular developer meetings, at which it receives data requests. From time to time, it presents data that it is still only planning to open, enabling it to listen to future users’ preferences, for example concerning the data form and format.