Various models of the processes in and around (linked) open data have been put forward under different headings. They have been termed the open data life cycle, the open data value chain or simply the open data process (Zuiderwijk, Janssen, Choenni, Meijer, & Alibaks, 2012). The different terminologies reflect different purposes, namely practical guidance (Hyland & Wood, 2011) or analytical separation, as well as different foci. Whereas value chain models focus more on the creation of value during open data usage (Julien, 2012), life cycle models aim to structure the handling of the data itself. Existing process models focus on activities within public administration, such as generating, editing and publishing the data, while paying little attention to use outside the administration.
Most models contain similar elements and differ only in semantics, granularity or the extent of the process. Hyland and Wood (2011) provide a six-step guidance model that contains the steps to (1) identify, (2) model, (3) name, (4) describe, (5) convert and (6) publish the data, plus the reverse activity to maintain it, similar to Villazon-Terrazas et al. (2011). Another model by Hausenblas et al. (2010) also includes the user perspective, adding the steps “discovery”, “integration” and “use cases”. With the ambition to build tools that support the creation of linked data, the LOD2 project developed a more granular eight-step lifecycle model (Auer et al., 2012). LOD2 broadly distinguishes citizens, public administration, politics and industry as the main stakeholder groups, with media and science as additional groups. From these, three user types are derived: “producer and publisher”, “user and producer”, and “user and consumer” (M. Martin, Kaltenböck, Nagy, & Auer, 2011). Synthesizing various models, van den Broek et al. (2011) derive a lifecycle model comprising the steps (1) identification, (2) preparation, (3) publication, (4) re-use and (5) evaluation.
All of these models describe a consecutive, one-dimensional arrangement of activities that an unspecified set of actors repeatedly undertakes in order to provide a previously unexposed body of data to an abstract general public. Furthermore, these models incorporate only one analytical level. They predominantly take the operational day-to-day processes into account (such as extracting, cleaning, publishing and maintaining data), while largely neglecting the strategic processes (such as policy production, decision making and administrative enforcement). Therefore, the decision-making processes that determine which data will be published, who extracts the data, how data are edited, how data can be accessed, which licenses are available, how data privacy and liability issues are treated, who is involved in these decisions, etc. remain under-appreciated. These issues point to another deficiency of most open data process models: they are mostly actor-blind. An exception is van den Broek et al. (2011), who assign five internal stakeholder roles to the various steps of the life cycle: “top management, information manager, legal advisor, community manager and data owner”. Furthermore, they make some reference to the strategic issues, but intermingle them with the operative process.
However, some literature takes a broader perspective on the processes around open data, at the policy-making level as well as in the implementation process (see e.g., Blakemore & Craglia, 2006; Courmont, 2012; Heimstädt et al., 2014; Zuiderwijk & Janssen, 2014). With regard to policy-making, content-related analyses reveal considerably different emphases (Huijboom & van den Broek, 2011), but pay little regard to the stakeholders involved and the roles they might play (see Huijboom & van den Broek, 2011; Zuiderwijk & Janssen, 2014). With special reference to the European level, Blakemore and Craglia (2006) point out the role that the European Commission (EC), in particular its Directorate General responsible for the information market, plays in shaping the understanding and regulation of public sector information (PSI), alongside the national governments represented in the Council of Ministers. The latter largely act as advocates of PSI producers, who in general favour a restrictive understanding of PSI and want to preserve their rights to charge for the dissemination of data (K. Janssen & Dumortier, 2003). Due to the limited authority the EC can exert in this area, national governments have largely retained their autonomy to decide how to disseminate data. Only with regard to geospatial data has a wider agreement been reached, one that also involves conventions about standards (quality, data and metadata harmonization) (Blakemore & Craglia, 2006). Here, the inclusive approach also involved domain experts and various online public consultations (Blakemore & Craglia, 2006). Thus, while the EC lacks far-reaching legal authority, it shapes the discussion by influencing the agenda and reaching various stakeholders.
Looking critically at who these stakeholders are, especially in the PSI re-use industry, Bates (2012) distinguishes between multi-national corporations and conglomerates from various industries, SMEs, micro enterprises, independent developers and voluntary civic hackers (see also Mayer-Schönberger & Zappia, 2011). This distinction is largely based on size, with only the last category taking different motives into account. Nevertheless, Bates draws a distinction between benevolent and naïve transparency activists and profit-seeking and exploitative corporations (Bates, 2012). Similar distinctions are sometimes drawn between transparency and accountability advocacy on the one hand and commercial re-use on the other (K. Janssen, 2012; Yu & Robinson, 2012). With regard to open data, the two groups' interests largely overlap, but also show significant differences in terms of the content, shape and rights of use of the data. Regarding the content of data, “for innovation and economic growth this generally includes geographic data, postcodes, transport data, corporate data and other business information [whereas a]ccountability advocates will rather be interested in budget and spending data, legal information, and procedural items such as meeting minutes and reports” (K. Janssen, 2012). Regarding the shape of the data, the role of technology has become more prominent in open data than in freedom of information, stressing issues like machine-readability, formats etc. (Yu & Robinson, 2012). Open data activists also tend to be more technology-savvy than traditional transparency advocates (K. Janssen, 2012). The most pronounced difference between transparency and re-use lies in the rights debate: whereas transparency is about access rights in the context of freedom of information, re-use of PSI puts stronger emphasis on rights of use in terms of licensing (commercial vs. non-commercial, liabilities etc.).
Considering these differences, we will separate these groups in our stakeholder classification, both in terms of advocacy and as (intermediary) users, e.g. with civic activists coming from the transparency movement and independent developers putting stronger emphasis on re-use.
The political nature of decisions during the implementation of open data (portals) has been discussed by Courmont (2012), who focuses on the politics of legal, economic and technical decisions. The actors involved, such as open data infrastructure providers (e.g. Socrata) and advocacy groups from civil society, are not at the centre of the article and are mentioned only cursorily. Courmont states that these political choices are rarely discussed and often “imposed by public authorities without any debate” (Courmont, 2012), thereby treating public authorities as a monolithic bloc. In somewhat more detail, a distinction is sometimes made between the policy level and the government agencies that actually own the data (see e.g. Huijboom & van den Broek, 2011; van den Broek et al., 2011). However, a more fine-grained understanding of the stakeholders involved in implementation seems necessary to comprehend the shaping of open data in this crucial phase.
References can be found here: OpenDataMonitor Project – Shared References