The Harmonization Engine applies a set of specified rules (i.e. mappings) to reconcile metadata records collected from different catalogues. During the first period of the project, we specified these rules individually for each newly added catalogue. However, a large part of these mappings is applicable to many catalogues, and only a few are exceptions that have to be specified separately. Thus, to make this process easier to scale and maintain, we have defined a three-level hierarchy of mappings:

  1. top-level: these are global rules, applied by default to all catalogues; only the administrator can define and change them;
  2. middle-level: rules in this level apply to a group of catalogues, i.e. catalogues belonging to the same user;
  3. bottom-level: these are rules that have local scope, i.e. are associated with and applied to a specific catalogue.

Having this hierarchy provides much more flexibility for defining, maintaining and applying mapping rules during the harmonization process. Specifically, the rules that are finally applied to each catalogue’s fields are constructed from all three levels according to their respective scope. That is, we start with the top-level rules that apply to the field we are trying to harmonize and then check successively whether any more specific mappings from the middle or the bottom level apply to this specific catalogue. If so, the mapping with the most specific scope overrides the previous ones. Moreover, mappings that do not exist at the top level are added to the mapping list. This lets us specify generally applicable mappings at the top level while also defining exceptions that apply to specific catalogues or groups of catalogues.
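The construction of the final mapping list can be sketched in a few lines of Python. This is only an illustration of the override logic described above, not the engine's actual code: the function name `effective_mappings`, the use of plain dictionaries, and all sample values other than the Creative Commons entry are assumptions made for the example.

```python
from typing import Dict, Optional

Mapping = Dict[str, str]  # source value -> harmonized value


def effective_mappings(
    top: Mapping,
    middle: Optional[Mapping] = None,
    bottom: Optional[Mapping] = None,
) -> Mapping:
    """Combine the three levels into the list applied to one catalogue.

    Later updates win, so a more specific level overrides a more general
    one, and keys missing from the top level are simply added.
    """
    merged = dict(top)           # start from the global defaults
    merged.update(middle or {})  # group-level rules override or extend
    merged.update(bottom or {})  # catalogue-level rules have the last word
    return merged


# Hypothetical sample data (only the first entry comes from the text).
top_level = {"Creative Commons By 3": "CC BY-3.0", "english": "en"}
group_level = {"english": "eng"}      # overrides a global rule
catalogue_level = {"deutsch": "de"}   # adds a rule unknown at the top level

result = effective_mappings(top_level, group_level, catalogue_level)
```

Here `result` keeps the global Creative Commons mapping, replaces the global rule for `english` with the group-level one, and gains the catalogue-specific entry for `deutsch`.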

For instance, suppose that our mapping list contains the following: Creative Commons By 3: CC BY-3.0. This mapping applies to most of the catalogues. Although the license is designed to ensure worldwide validity, jurisdiction-specific versions exist for certain countries, e.g. Germany (CC BY-3.0 DE), Spain (CC BY-3.0 ES), etc. Thus, we wanted to be able to retain and apply that information easily in our internal database without affecting the general rules or needing to specify multiple versions of the same license for each of the harvested catalogues. Therefore, we can group the German catalogues together and introduce a mapping for this license that points to the German version, defining its scope to be this group.
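As a rough illustration of this license example, the lookup for a single value can be expressed with Python's standard `collections.ChainMap`, which searches a sequence of dictionaries in order and returns the first match. The helper `harmonize`, the variable names, the German target value (written here as CC BY-3.0 DE) and the choice to return unmapped values unchanged are all assumptions made for this sketch.

```python
from collections import ChainMap

# Global rule, taken from the example in the text.
top_level = {"Creative Commons By 3": "CC BY-3.0"}

# Group-level rule for the German catalogues (target value is assumed).
german_group = {"Creative Commons By 3": "CC BY-3.0 DE"}


def harmonize(value, *scopes):
    """Look up `value` in the given scopes, most specific scope first.

    ChainMap returns the first hit, so a catalogue- or group-level rule
    shadows the global one. Unmapped values are returned unchanged
    (an assumption of this sketch).
    """
    return ChainMap(*scopes).get(value, value)


# A catalogue in the German group: no catalogue-level rules of its own.
german = harmonize("Creative Commons By 3", {}, german_group, top_level)

# A catalogue outside any group falls back to the global rule.
other = harmonize("Creative Commons By 3", {}, {}, top_level)
```

With this arrangement the global rule stays untouched, and the German variant is defined once for the whole group rather than once per harvested catalogue.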

More information can be found in Deliverable D3.6.