General

?What is the main idea behind open data?
In recent years, the open data movement has been steadily gaining momentum, leading to increasing adoption and tangible benefits. The main driving idea is that certain data should be freely available and usable by everyone, without restrictions such as copyright. This is especially important for open governmental data, whose availability strongly contributes to the transparency and efficiency of a democratic system. Improved access to open governmental data has shown highly positive impacts on users and on the development of new digital products and services, thereby stimulating economic and business activity and ultimately providing value for society as a whole.
?Which is the most widely used open data management system?
CKAN is currently the most widely used open source data management system that helps users from different levels and domains (national and regional governments, companies and organisations) to make their data openly available. CKAN has been adopted by various levels of Open Data portals, and a few popular CKAN instances include publicdata.eu, data.gov.uk and data.gouv.fr.

CKAN provides tools to ease the workflow of data publishing, sharing, searching and management. Each data set is given its own page with a rich collection of metadata. Users can publish their data sets via an import feature or through a web interface, and the data sets can then be searched by keywords or tags with exact or fuzzy-matching queries. CKAN provides a rich set of visualisation tools, such as interactive tables, graphs and maps. Moreover, the dashboard function helps administrators monitor statistics and usage metrics for the data sets. Federating with other CKAN nodes is also supported, as is the possibility to build a community with extensions that allow users to comment on and follow data sets. Finally, CKAN provides a rich RESTful JSON API for querying and retrieving data set information. Other platforms include Socrata, Junar, DKAN, Open Government Platform (OGP) and QU.
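CKAN's action API can be queried over plain HTTP. The sketch below builds a `package_search` request URL and parses the standard JSON response envelope that every CKAN action returns; demo.ckan.org is used here only as an illustrative host, and the sample response is invented.

```python
import json
from urllib.parse import urlencode

# Base URL of a CKAN instance; demo.ckan.org is CKAN's public demo site,
# used here purely for illustration.
CKAN_BASE = "https://demo.ckan.org"

def package_search_url(query, rows=5):
    """Build a URL for CKAN's package_search action (v3 API)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{CKAN_BASE}/api/3/action/package_search?{params}"

# Every CKAN action response is a JSON envelope with "success" and "result".
# This sample response is made up for the example.
sample_response = json.loads("""
{
  "success": true,
  "result": {
    "count": 1,
    "results": [{"name": "example-dataset", "title": "Example Dataset"}]
  }
}
""")

if sample_response["success"]:
    for ds in sample_response["result"]["results"]:
        print(ds["name"], "-", ds["title"])
```

In a real client, the URL built by `package_search_url` would be fetched (e.g. with `urllib.request.urlopen`) and the body parsed exactly as the sample above.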
?Which open data platform software is most widely deployed?
There are many existing software platforms that have been used to deploy open data portals. As described above, CKAN is currently the most widely deployed one; other platforms include Socrata, Junar, DKAN, Open Government Platform (OGP) and QU. For more information check the knowledge base.
?What is metadata?
The metadata of a data set is a collection of data that describes the data set and provides more information about it, such as its title, tags, license and maintainer. The metadata can be provided in different formats, such as JSON, XML and RDF.
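As an illustration, a CKAN-style metadata record serialised as JSON might look like the following; the field names mirror common CKAN dataset attributes, and all values are invented for the example.

```python
import json

# An illustrative metadata record; field names follow common CKAN dataset
# attributes, the values are made up.
metadata = {
    "title": "Air Quality Measurements 2015",
    "name": "air-quality-2015",
    "tags": ["environment", "air-quality"],
    "license_id": "cc-by",
    "maintainer": "Open Data Team",
    "resources": [
        {"format": "CSV", "url": "https://example.org/air-quality-2015.csv"}
    ],
}

# The same record can be exchanged as JSON; XML and RDF serialisations
# carry equivalent information.
print(json.dumps(metadata, indent=2))
```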
?What is an open data catalogue?
An open data catalogue is a curated collection of metadata about data sets. Compared with an “open data repository”, an “open data catalogue” focuses on the organisation of data sets, while an “open data repository” refers to the actual data storage. The catalogue is typically agnostic regarding where the data itself is located: (1) it may all be published on the same web server as the catalogue, i.e. the catalogue contains a data repository, or (2) it may be distributed across the web, with the catalogue simply pointing to those remote locations, in which case the catalogue is also referred to as a “data aggregator” or “data indexer”.
?What is a data set?
A data set is a conceptual entity that represents a collection of data, published or curated by a single agent and available for access or download in one or more formats. A data set is usually hosted in an open data repository and can belong to one or more groups.
?What is an open data repository?
An open data repository is an online data storage/hosting service that has no discovery mechanism. In its simplest form, an open data repository could be a simple web server hosting static files from a single folder, with no additional index or categorisation, except perhaps a 'landing page' for each data set.

OpenDataMonitor

?How can I access the harvested raw data?
Just take a look at our Harvester introduction here.
?How can I access the harmonised data?
Take a look at our API manual and video here.
?How can I compare how different countries or catalogues perform?
Just take a look at the Benchmark section on the OpenDataMonitor or take a tour here.
?How are the various metrics calculated?
You can find more information about metrics in the Methodology section of the OpenDataMonitor.
?What determines whether a license is considered open?
A license is considered open if it conforms to the list recommended by the Open Definition, which is maintained by the OpenDataMonitor project and available in the project's GitHub repository.
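The check itself amounts to a lookup against that list. The minimal sketch below uses a small hand-picked set of well-known Open Definition-conformant licence identifiers, not the project's authoritative list:

```python
# A few licence identifiers that conform to the Open Definition; the
# authoritative list lives in the OpenDataMonitor GitHub repository.
OPEN_LICENSES = {
    "cc-by", "cc-by-sa", "cc-zero", "odc-pddl", "odc-by", "odc-odbl", "uk-ogl",
}

def is_open_license(license_id):
    """Return True if the licence identifier appears on the open list."""
    return license_id is not None and license_id.strip().lower() in OPEN_LICENSES

print(is_open_license("CC-BY"))        # matching is case-insensitive here
print(is_open_license("proprietary"))  # not on the open list
```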
?What determines whether a dataset is considered machine-readable?
A dataset is considered machine-readable if it contains at least one distribution in a format that is considered machine-readable. A list of the formats considered machine-readable is available in the project's GitHub repository.
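In code, this rule reduces to checking the format of each distribution against the format list. The sketch below uses an illustrative subset of formats usually treated as machine-readable; the actual list is the one in the repository.

```python
# Illustrative subset of formats usually treated as machine-readable;
# ODM's actual list is maintained in its GitHub repository.
MACHINE_READABLE = {"csv", "json", "xml", "rdf", "xls", "xlsx", "geojson"}

def is_machine_readable(dataset):
    """A dataset counts as machine-readable if at least one of its
    distributions has a machine-readable format."""
    return any(
        dist.get("format", "").strip().lower() in MACHINE_READABLE
        for dist in dataset.get("resources", [])
    )

# One CSV distribution is enough, even if other distributions are PDF-only.
dataset = {"resources": [{"format": "PDF"}, {"format": "CSV"}]}
print(is_machine_readable(dataset))
```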
?Why are several data catalogues not harvested, harmonised and analysed?
Some of the catalogues we have encountered cannot be automatically harvested by the harvesters currently available in the ODM platform (e.g. due to a significantly different structure that our harvesters cannot automatically recognise and parse). Moreover, newly harvested catalogues are harmonised and analysed using the existing configuration files with mappings. Therefore, cases that are not covered by the existing mappings need to be added manually.
?Is the data discovered and harvested automatically?
The ODM platform does not support automatic discovery of open data catalogues; in order for a catalogue to be monitored, it first has to be registered manually. During registration, some information is provided which guides the configuration of the harvesting process. Once this is done, the harvesting process is performed automatically and periodically.
?How is the metadata harmonised?
The metadata is harmonised semi-automatically. A number of configuration files, corresponding to the different metadata attributes being analysed, contain manually defined lists of mappings. These files are used by automatically running scripts that apply them to harmonise the metadata of newly collected data sets. Unhandled mappings are logged to facilitate the maintenance of these lists (e.g. to indicate that new mappings need to be added).
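The process above can be sketched for a single attribute. The mapping table and function names below are hypothetical, invented for illustration; in ODM the mappings live in per-attribute configuration files:

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("harmonise")

# Illustrative mapping table for one attribute (licence); the real
# mappings are kept in the project's configuration files.
LICENSE_MAPPINGS = {
    "creative commons attribution": "cc-by",
    "cc by 4.0": "cc-by",
    "open government licence": "uk-ogl",
}

def harmonise_license(raw_value):
    """Map a raw licence string to a canonical form; log values that no
    mapping covers, so the lists can be extended manually."""
    key = raw_value.strip().lower()
    if key in LICENSE_MAPPINGS:
        return LICENSE_MAPPINGS[key]
    log.warning("unhandled licence mapping: %r", raw_value)
    return raw_value  # keep the raw value until a mapping is added

print(harmonise_license("CC BY 4.0"))
print(harmonise_license("Some Unknown Licence"))  # logged as unhandled
```

The second call leaves its input untouched and emits a warning, mirroring how unhandled mappings are surfaced for manual maintenance.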