The ODM platform comprises a set of harvesters that extract metadata from the registered catalogues. During the first period of the project, we focused on adapting a CKAN plugin for retrieving metadata from CKAN-based catalogues, and on designing and developing a generic harvester that extracts metadata from the HTML source code of the pages describing a catalogue's datasets (see Deliverable D3.3 for details). During the second period, we added a Socrata harvester to the list of available harvesters and implemented various additions and enhancements to the HTML harvester.

This section provides details on the technical background of the ODM Harvester.
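
To illustrate the kind of metadata extraction a harvester for CKAN-based catalogues performs, the following is a minimal sketch built on the public CKAN Action API (version 3). It is not the ODM implementation: the helper names `build_action_url` and `extract_metadata`, the selection of fields, and the example catalogue URL are assumptions made for illustration only.

```python
from urllib.parse import urlencode


def build_action_url(catalogue_url, action, **params):
    """Build a CKAN Action API (v3) URL from a catalogue's base URL.

    Hypothetical helper, not part of CKAN or of the ODM code base.
    """
    base = catalogue_url.rstrip("/")
    query = f"?{urlencode(params)}" if params else ""
    return f"{base}/api/3/action/{action}{query}"


def extract_metadata(response):
    """Pick a few common fields out of a decoded `package_show` response.

    The choice of fields is an assumption; a real harvester would map
    many more of them onto its own metadata schema.
    """
    if not response.get("success"):
        raise ValueError("CKAN reported a failed request")
    dataset = response["result"]
    return {
        "name": dataset.get("name"),
        "title": dataset.get("title"),
        "license": dataset.get("license_id"),
        # Tags are returned as a list of objects with a "name" key.
        "tags": [tag["name"] for tag in dataset.get("tags", [])],
        # Collect the distinct, non-empty formats of the dataset's resources.
        "formats": sorted(
            {res["format"] for res in dataset.get("resources", []) if res.get("format")}
        ),
    }
```

In practice, the URL returned by `build_action_url` would be fetched over HTTP (for example with `urllib.request.urlopen`), the JSON body decoded, and the result passed to `extract_metadata`. Calling the `package_list` action first yields the dataset names to iterate over.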

Related Articles

CKAN API


HTML Harvester
