General distinctions between user groups can be drawn based on sector (private/public) or the degree of organisation (individual/collective/corporate). However, for the questions at stake here, it remains paramount to translate user characteristics and requirements into the necessary functionalities of the OpenDataMonitor platform. We therefore focus on each stakeholder group's interests, requirements, understanding of the topic and level of technical expertise with regard to open data.

Stakeholder Requirements and OpenDataMonitor Potential

Policy makers: parliament, ministries pushing open data, coordination bodies for e-government and ICT, governance structures for cross-level collaboration in e-government and ICT
  Requirements: understand barriers to open data publication and use; understand, develop and enforce widely used standards (formats, structure, licenses etc.)
  ODM potential: benchmark volume and sophistication of the published data as well as its use; highlight coverage of used standards; present usage of open data; metrics per geography (see D2.3)

Commercial users (associations): corporate advocacy groups, business associations, media outlets
  Requirements: detect high-value data sets with minimal and transparent strings attached; detect mashable, harmonised data sets on a large scale
  ODM potential: highlight high-value data sets (e.g. update frequency of a data set); map mashable content (congruent licenses, harmonised structure and vocabulary); highlight coverage of used standards (esp. licenses); metrics per geography and per data set

Civic advocacy groups
  Requirements: advocate the publication of, and detect already published, politically sensitive (politico-administrative) data sets
  ODM potential: highlight and compare sensitive data sets to advocate their publication in other locations; map mashable content; metrics per data set and per geography

Government bodies and associations: inter- and supra-national bodies and associations, coordinating bodies around ICT and e-government, public enterprises in charge of furthering the information society, networks of smart cities, standardisation bodies
  Requirements: advocate the publication of high-value data sets; benchmark volume and sophistication of the published data as well as its use to “name and shame”; understand coverage of used standards in order to align with these; understand what constitutes high-value data sets to advocate their publication
  ODM potential: benchmark volume and sophistication of the published data as well as its use; highlight coverage of used standards; highlight high-value data sets; metrics per geography, catalogue and data set

Data-generating and (potentially) data-providing government bodies
  Requirements: understand what constitutes a high-value data set in their professional domain; learn about open data standards in general and in their professional domain; understand how open data in their professional domain can be used
  ODM potential: highlight high-value data sets by domain or topic; highlight coverage of used standards (licenses, structure and vocabulary) by domain or topic; highlight applications of open data by domain or topic

Technology providers: private technology consultancies, ICT vendors, (public) ICT service providers, open data platform providers, applied research centres
  Requirements: understand widely adopted technologies and standards in order to align with these
  ODM potential: highlight coverage of used infrastructure, technology and standards (formats, licenses); metrics per geography and overall
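The “coverage of used standards” metrics that recur in the table could, for instance, be computed from harvested catalogue metadata. The following sketch is purely illustrative: the field names, license identifiers and format lists are assumptions for the example, not the actual OpenDataMonitor schema or harvesting pipeline.

```python
from collections import defaultdict

# Illustrative sets of "open" licenses and machine-readable formats;
# a real deployment would maintain these lists per applicable standard.
OPEN_LICENSES = {"cc-by", "cc0", "odc-odbl"}
MACHINE_READABLE = {"csv", "json", "xml", "rdf"}

def coverage_per_geography(records):
    """Share of datasets per geography that carry an open license and a
    machine-readable format -- one reading of 'coverage of used standards'."""
    totals = defaultdict(int)
    open_lic = defaultdict(int)
    readable = defaultdict(int)
    for r in records:
        geo = r["geography"]
        totals[geo] += 1
        if r.get("license", "").lower() in OPEN_LICENSES:
            open_lic[geo] += 1
        if r.get("format", "").lower() in MACHINE_READABLE:
            readable[geo] += 1
    return {
        geo: {
            "datasets": n,
            "open_license_share": open_lic[geo] / n,
            "machine_readable_share": readable[geo] / n,
        }
        for geo, n in totals.items()
    }

# Hypothetical harvested records for two geographies.
records = [
    {"geography": "AT", "license": "CC-BY", "format": "CSV"},
    {"geography": "AT", "license": "proprietary", "format": "PDF"},
    {"geography": "UK", "license": "CC0", "format": "JSON"},
]
metrics = coverage_per_geography(records)
```

The same aggregation generalises to the other grains named in the table (per catalogue, per data set, per domain) by swapping the grouping key.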


Groups for which generally little technical expertise has to be presumed are policy-makers, data generators and some of the support units. Nevertheless, these groups are involved in major decisions about open data and shape its conceptualisation and implementation. Policy-makers (parliamentarians, high-level executives) engage with open data at a rather abstract level. However, their commitment to and interest in the topic in general has a significant impact on how the machinery of government approaches and implements open data. It is insightful for policy-makers to see how their sphere of responsibility (jurisdiction, organisation) compares to others in terms of the volume and sophistication of the published data as well as its use. This serves as a basis to benchmark their performance and identify fields of strategic interest. For them it is therefore necessary to see what data is published by other public sector organisations and how frequently this data is used; thereby they can gain a better understanding of high-value datasets. At the moment, administrations often pursue an “availability approach” to publishing data: they publish data that is available in a structured format, at a fairly good quality level and not obviously sensitive, because they lack a profound understanding of what data might be useful. At a more specific level, policy-makers pass laws and issue executive orders or policies about open data that shape how open data is published (e.g. prescribing licenses, formats, meta data standards or even paradigmatic shifts towards considering everything open by default) (see Zuiderwijk & Janssen, 2014). However, these decisions are mostly prepared by ministries or other governmental departments, considered here as support units further below.

Another group that approaches open data from a rather thematic and legal perspective are data generators, who typically hold the data and often consider themselves data owners. They generate data in the course of their regular work and are predominantly responsible for the decisions on whether and which of this data to publish as open data. Besides information about which data from their subject area is published by other organisations (see above), more detailed thematic and technical aspects are relevant for the decisions they make in terms of open data. Information about data structure, vocabulary and measurement scales could provide guidance for data generators on how to publish their data, although they often seem to be unaware of its significance. Here, various European, supra-national and national conventions exist – some codified, others not – in various policy fields which could be built upon, as has been demonstrated with the INSPIRE directive. At a basic level, insights into which meta data schemas are used could be helpful. At a far more sophisticated level, patterns in data structures and vocabulary might assist. Furthermore, data generators appear largely unaware of how open data is used and often seem to lack imagination of its possible uses. In this respect, successful use cases of open data could prove insightful for them. In addition, the legal perspective is especially significant in the public sector. This is in particular true for data generators and for support units (legal department, data protection officer) who are involved in decisions about which data to publish, at which level of detail and under which license. Such information could thus assist their decisions about licensing, liabilities and privacy protection. On the whole, data generators are not fully aware of the topic of open data, do not initially endorse the idea of publishing data and have not yet integrated open data processes into their routine activities. It therefore poses a challenge even to attract this group to information about open data.

IT strategy units, platform providers and private consultancies often have a higher level of technical expertise, although not necessarily in regard to open data and how it is used. They are involved in decisions about portal architecture and publishing processes and, to a varying extent, can set standards for data published in a catalogue (data format, meta data standards, quality). For these decisions, information about the spread of platforms (e.g. CKAN), meta data schemas and data formats could help them establish a state-of-the-art open data portal.

Among the intermediary and end users, advocacy groups stand out as a group which does not necessarily use open data itself, but gathers and publishes information about open data to further their cause (see Davies, 2013; Open Knowledge Foundation, 2013). Thus, they require a breadth of detailed information about open data, in particular for benchmarking purposes. Advocacy groups generally have a sophisticated technical understanding of open data, so it is not necessary to reduce complexity for them in this respect. Quite the contrary: in order to illustrate which catalogue hosts the most exhaustive meta data and points to the most comprehensive and sophisticated datasets, advocacy groups need to look at technical details. For lobbying efforts, it is necessary to trace datasets back to specific territories, policy fields and organisations. Especially the content relation (policy field) seems relevant, since across the various institutional arrangements in European countries, different organisations are responsible for and accommodate the same thematic data. With several catalogues by now federating data from numerous organisations, jurisdictions and even countries, this becomes more important for comparisons.

Among the immediate users of open data (esp. application developers, researchers, data journalists), further differentiation appears necessary. Two different approaches to data detection can be distinguished, which can be termed “data-driven” versus “issue-driven”. Issue-driven users search for datasets in the context of a specific topic: they have a certain interest and know in advance which data they therefore need. They search directly on an open data platform via search terms and specific keywords. For these kinds of users, portals/catalogues provide fairly appropriate search masks. Thus far, however, they can only search in a specific catalogue and find results for the data referenced there. Since portals often contain only meta data about data from a specific jurisdiction or even organisation, users may have to search in several catalogues instead of looking into one meta-catalogue. Furthermore, poor meta data quality often inhibits or restricts the ability of these users to find relevant datasets. A meta-catalogue would thus be even more powerful if it provided a search mask not only for the meta data in the catalogues, but also for the datasets themselves which are hosted in the repositories.
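Such a meta-catalogue could federate one query across several catalogues. The sketch below assumes CKAN-style portals, whose `package_search` API action accepts a free-text query; the catalogue URLs are fictitious and the network call is replaced by a stub so the example is self-contained.

```python
def federated_search(query, catalogues, fetch):
    """Send one search term to several catalogues and merge the results.

    `catalogues` maps a catalogue name to its base URL; `fetch` is any
    callable taking a URL and returning a list of dataset titles (in a
    real deployment it would perform the HTTP request to CKAN's
    package_search action and parse the JSON response).
    """
    merged = []
    for name, base_url in catalogues.items():
        url = f"{base_url}/api/3/action/package_search?q={query}"
        for title in fetch(url):
            merged.append((name, title))
    return merged

# Stubbed responses standing in for two (fictitious) national portals,
# so the sketch runs without network access.
STUB = {
    "https://data.example.at/api/3/action/package_search?q=budget":
        ["Federal budget 2014"],
    "https://data.example.uk/api/3/action/package_search?q=budget":
        ["Local council budgets", "Budget spend over 25k"],
}

results = federated_search(
    "budget",
    {"AT": "https://data.example.at", "UK": "https://data.example.uk"},
    lambda url: STUB.get(url, []),
)
```

Passing the fetch function in keeps the merging logic testable; a production harvester would also have to reconcile the differing meta data schemas behind each catalogue before results become comparable.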

Data-driven users, on the other hand, look for complex, comprehensive and large datasets, irrespective of their specific topical content. Their assumption basically seems to be that a complex dataset can be put to a purposeful use, even without a prior idea. Until now, they find scant support in existing catalogues to identify relevant, sensitive, high-value datasets. Since a number of datasets are simply published because they are at hand, catalogues are stacked with data of little use for these users, and search-term queries are of little help. More relevant would be algorithms that analyse the size of a dataset (columns, data points, whether a dataset contains string and numeric data, or structured and unstructured data), its update frequency, or whether it contains linked or non-linked data.


References can be found here: OpenDataMonitor Project – Shared References.