The harmonized metadata are accessed through the Analysis Engine. This component calculates a series of key metrics, which were enhanced and enriched during the project, and makes the results available to the demonstration site via a RESTful API (initially introduced in Deliverable D3.3). This API is publicly available.

These metrics are used to analyze and compare catalogues' metadata and fall into two general categories: Quantity and Quality. Quantity metrics report statistics by counting or summing selected attributes of the harvested metadata. For instance, we count all datasets and distributions, or add up the distribution sizes, for every catalogue in the database. Quality metrics, on the other hand, attempt to quantify how good and useful the collected metadata could be to end users. They combine data from different attributes and report how complete the metadata are and how likely they are to lead to adequate and meaningful data for someone interested in using them. For instance, we identify the percentage of a catalogue's datasets that have at least one resource in a machine-readable format (CSV, TSV, JSON, XML or RDF). We also measure each catalogue's accessibility, which means that the following attributes exist for every metadata object and have valid values: a description, at least one valid link and an author email.
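The two categories can be illustrated with a minimal Python sketch. It is not the Analysis Engine's actual implementation: the class and field names (`Dataset`, `Distribution`, `size_bytes`, `author_email`) are hypothetical, the validity checks are deliberately simplistic, and accessibility is reported here as the share of datasets that pass all three checks, which is an assumption about how the metric is expressed.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Formats the text lists as machine-readable.
MACHINE_READABLE = {"CSV", "TSV", "JSON", "XML", "RDF"}


@dataclass
class Distribution:
    # All field names are hypothetical, chosen for illustration only.
    format: str
    url: Optional[str]
    size_bytes: int = 0


@dataclass
class Dataset:
    description: Optional[str]
    author_email: Optional[str]
    distributions: List[Distribution] = field(default_factory=list)


def quantity_metrics(datasets):
    """Count datasets and distributions and sum the distribution sizes."""
    dists = [d for ds in datasets for d in ds.distributions]
    return {
        "datasets": len(datasets),
        "distributions": len(dists),
        "total_size_bytes": sum(d.size_bytes for d in dists),
    }


def has_machine_readable(ds):
    """True if at least one distribution is in a machine-readable format."""
    return any(d.format.strip().upper() in MACHINE_READABLE
               for d in ds.distributions)


def is_accessible(ds):
    """Simplified check: description, one http(s) link and an author email."""
    has_description = bool(ds.description and ds.description.strip())
    has_link = any(d.url and d.url.startswith(("http://", "https://"))
                   for d in ds.distributions)
    has_email = bool(ds.author_email and "@" in ds.author_email)
    return has_description and has_link and has_email


def percentage(part, whole):
    """Percentage of part in whole; 0.0 for an empty catalogue."""
    return 100.0 * part / whole if whole else 0.0


def quality_metrics(datasets):
    """Share of datasets passing each quality check, in percent."""
    n = len(datasets)
    return {
        "machine_readable_pct": percentage(
            sum(has_machine_readable(ds) for ds in datasets), n),
        "accessible_pct": percentage(
            sum(is_accessible(ds) for ds in datasets), n),
    }
```

Calling `quantity_metrics` and `quality_metrics` on the same list of datasets yields independent results: a catalogue can score high on one and low on the other.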

The metrics are independent: quality does not imply quantity, and vice versa. A catalogue with a large number of resources does not necessarily provide those resources in any of the so-called machine-readable formats. The metrics are therefore tools that end users can use to find datasets that are meaningful for what they want to accomplish. The implemented metrics are calculated at two levels:
• Catalogue level, meaning that counts or aggregations are applied to each catalogue;
• European level, which refers to metrics calculated over all datasets harvested from the 24 European countries.

The dashboards used in the demonstration site contain a further level, the country level. Metrics for this level are calculated from the results provided for each catalogue, by grouping all catalogues that belong to a specific country.
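The country-level grouping can be sketched as follows. The record keys (`country`, `datasets`, `machine_readable_datasets`) are hypothetical, and the text does not state how percentages are combined across catalogues. This sketch assumes they are recomputed from summed counts rather than averaged, so that large catalogues weigh more than small ones.

```python
from collections import defaultdict


def aggregate_by_country(catalogue_results):
    """Group per-catalogue results by country and recompute percentages.

    Each input record is a dict with the hypothetical keys
    'country', 'datasets' and 'machine_readable_datasets'.
    """
    totals = defaultdict(
        lambda: {"datasets": 0, "machine_readable_datasets": 0})
    for result in catalogue_results:
        country_total = totals[result["country"]]
        country_total["datasets"] += result["datasets"]
        country_total["machine_readable_datasets"] += (
            result["machine_readable_datasets"])

    aggregated = {}
    for country, total in totals.items():
        # Assumption: derive the percentage from the summed counts.
        if total["datasets"]:
            pct = (100.0 * total["machine_readable_datasets"]
                   / total["datasets"])
        else:
            pct = 0.0
        aggregated[country] = {**total, "machine_readable_pct": pct}
    return aggregated
```

The same function would produce a single European-level figure if every record carried the same grouping key, although the text describes that level as being calculated directly over all harvested datasets.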

Below you can find more information about this topic.

Related Articles

Quality Metrics

Quantity Metrics
