To describe the concepts involved in open data consistently, we have defined a series of terms in the table below, which summarises our understanding of the terms most commonly used to describe the open data landscape. The terminology is aligned with DCAT-AP (DCAT Application Profile for European data portals) and the metadata standards applied in [...]. When defining those terms, we also referred to existing terminologies related to open data, such as glossary, the Open Data Handbook and the W3C Linked Data Glossary, and made sure that our definitions do not conflict with theirs. The table also identifies synonyms that can be used interchangeably in some circumstances, such as open data catalogue, repository, portal and platform.


| Term | Definition | Synonyms |
| --- | --- | --- |
| Open data | “A piece of data is open if anyone is free to use, reuse, and redistribute it – subject only, at most, to the requirement to attribute and/or share-alike” (Open Data Handbook). | |
| Stakeholder | “any group or individual who can affect or is affected by the achievement of the organisation’s objectives” (Freeman, 1984). | |
| Open data repository | An online data storage/hosting service with no discovery mechanism. This could be as simple as a Web server hosting static files from a single folder, with no additional index or categorisation, except perhaps a ‘landing page’ for each data set. | Open data catalogue, data hub |
| Open data catalogue | A curated collection of metadata about data sets. Compared with “open data repository”, “open data catalogue” focuses on the organisation of data sets, while “open data repository” refers to the actual data storage. The catalogue is typically agnostic about where the data itself is located: (1) the data may all be published on the same Web server as the catalogue, i.e. the catalogue contains a data repository, or (2) the data may be distributed across the Web, with the catalogue simply pointing to those remote locations, in which case the catalogue is also referred to as a “data aggregator” or “data indexer”. | Open data portal, data hub, open data repository |
| Open data portal | Often used synonymously with open data catalogue, but may provide more advanced discovery functionality to complement conventional browse-style catalogue interfaces. For example, there may be text search over the metadata describing the data sets, or the ability to preview/explore the data itself. Distinctions between open data portals and catalogues, and between open data portals and platforms, should be considered fuzzy. | Open data catalogue, data hub, open data platform |
| Open data platform | A piece of software that implements the core features needed to manage open data. Those features include, but are not limited to, user management, data publishing, metadata management, data set storage, access control, search and visualisation. | Open data portal, open data portal software |
| Group | Groups are used to create and manage collections of data sets with some common features. | Collection, category |
| Data set | A conceptual entity that “represents a collection of data, published or curated by a single agent, and available for access or download in one or more formats” (W3C). A data set is usually hosted in an open data repository and can belong to one or more groups. | Package (in CKAN) |
| Distribution | A distribution of a certain data set “represents a specific available form of that data set. Each data set might be available in different forms, and these forms might represent different formats of the data set or different endpoints. Examples of distributions include a downloadable CSV file, an API or RSS feed” (W3C). | Resource (in CKAN) |
| Metadata | The metadata of a data set is a collection of data that describes the data set and provides more information about it, such as title, tags, license and maintainer. The metadata can be provided in different formats, such as JSON, XML and RDF. | |
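
To make the relationships between these terms more concrete, the following is a minimal Python sketch of how a catalogue, its data sets, their distributions and their groups might fit together, and how the metadata of a data set could be serialised as JSON. It is an illustration only: the class and field names (`Catalogue`, `DataSet`, `Distribution`, `access_url`, `media_format`, `in_group`, `metadata_as_json`) and the example values are assumptions made for this sketch, not the DCAT-AP vocabulary or the CKAN data model.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Distribution:
    """One available form of a data set (a "resource" in CKAN)."""
    access_url: str    # hypothetical field name: where this form can be fetched
    media_format: str  # hypothetical field name: e.g. "CSV" or "JSON"


@dataclass
class DataSet:
    """A conceptual data set (a "package" in CKAN) together with its metadata."""
    title: str
    license: str
    maintainer: str
    tags: List[str] = field(default_factory=list)
    groups: List[str] = field(default_factory=list)  # a data set can be in several groups
    distributions: List[Distribution] = field(default_factory=list)


@dataclass
class Catalogue:
    """A curated collection of metadata about data sets.

    The catalogue only stores descriptions; the data itself may live on the
    same server or anywhere else on the Web.
    """
    title: str
    datasets: List[DataSet] = field(default_factory=list)

    def in_group(self, group: str) -> List[DataSet]:
        """Return the data sets that belong to the given group."""
        return [d for d in self.datasets if group in d.groups]


def metadata_as_json(dataset: DataSet) -> str:
    """Serialise the metadata of a data set, including its distributions, as JSON."""
    return json.dumps(asdict(dataset), indent=2, sort_keys=True)


# Illustrative values only; none of these refer to a real data set.
air_quality = DataSet(
    title="Air quality measurements",
    license="CC-BY-4.0",
    maintainer="Example City Council",
    tags=["environment"],
    groups=["environment"],
    distributions=[
        Distribution("https://example.org/air-quality.csv", "CSV"),
        Distribution("https://example.org/api/air-quality", "JSON"),
    ],
)
catalogue = Catalogue("Example catalogue", [air_quality])
print(metadata_as_json(air_quality))
```

In this sketch the catalogue holds only metadata and URLs, which mirrors the distinction drawn above between a catalogue (organisation of data sets) and a repository (storage of the data itself).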

References can be found here: OpenDataMonitor Project – Shared References