This subsection analyses models that conceptualise the practices around handling data, from its generation, through the administrative practices involved in the provision of open data by public sector institutions, to its use by third parties. Various models of (linked) open data have been suggested under different terminologies. They have been named the open data life cycle, the open data value chain or simply the open data process (Zuiderwijk, Janssen, Choenni, Meijer, & Alibaks, 2012). The different terminologies reflect different purposes, whether practical guidance (Hyland & Wood, 2011) or analytical understanding, and different foci. Whereas value chain models focus more on the creation of value during open data usage (Julien, 2012), life cycle models aim to structure the handling of the data itself. Existing process models focus on activities within public administration, such as generating, editing and publishing the data, and pay little attention to use outside the administration.
Most models contain similar elements and differ only in semantics, granularity or the extent of the process covered. Hyland and Wood (2011) provide a six-step guidance model that contains the steps to (1) identify, (2) model, (3) name, (4) describe, (5) convert and (6) publish the data, plus the reverse activity of maintaining it, similar to Villazon-Terrazas et al. (2011). Another model by Hausenblas and Karnstedt (2010) also includes the user perspective, adding the steps “discovery”, “integration” and “use cases”. With the ambition of building tools to support the creation of linked data, the LOD2 project developed a more fine-grained eight-step life cycle model (Auer et al., 2012). Synthesising various models, van den Broek et al. (2011) derive a life cycle model comprising the steps (1) identification, (2) preparation, (3) publication, (4) re-use and (5) evaluation.
All of these models describe the life cycle as a sequential, one-dimensional process of activities that an unspecified set of actors repeatedly undertake in order to provide previously unexposed data to an abstract general public. Furthermore, these models include only one analytical level. They exclusively take the operational processes of open data publication into account (such as extracting, cleaning, publishing and maintaining data), while largely ignoring the strategic processes (such as policy production, decision making and administrative enforcement). Thus, decisions about which data will be published, who extracts the data, how data are edited, how data can be accessed, which licenses are available, how data privacy and liability issues are treated, and who is involved in these decisions remain underappreciated. These more general strategic processes around open data refer to the governance structure, which is likely to be connected to an organisation’s ICT and data governance.
The issues outlined point to another blind spot of most open data life cycle models: they are actor-blind. If they are considered at all, institutional characteristics and actor interests are treated as “impediments” (Zuiderwijk et al., 2012) or restrictions hindering an inherently good and beneficial idea (Meijer, de Hoog, Van Twist, van der Steen, & Scherpenisse, 2014). This is especially relevant because the different stakeholders involved, who are outlined below in Section 3.3, have different understandings of and interests in open data, which influences the results (Zuiderwijk & Janssen, 2014). Efforts have thus been undertaken to develop more holistic analytic perspectives on open data, e.g. based on complexity theory (Meijer et al., 2014) or the information ecology approach (Harrison, Pardo, & Cook, 2012).
Furthermore, the data itself is often treated as “a commodity rather than an artifact” (Meijer et al., 2014). However, how (open) data is understood and interpreted is shaped by the institutional and legal context, e.g. by different perceptions of privacy and personal data. In a similar manner, some data can be considered more politicised than others. In addition, different professional perspectives on data that refer to the same material object influence not only the sense-making, but also the judgement of which data is actually important, which metrics of measurement are used, and so on. Taken together, these points might even call into question the viability of a generic life cycle model.
References can be found here: OpenDataMonitor Project – Shared References