Data Power Conference
Panel Session 3c: Data Practices (Chair: Stefania Milan)
Challenges for an Ethnographic Approach to Big Data: Bringing Experiments into the Fieldwork
Tomas Ariztia, Universidad Diego Portales
New digital and transactional datasets (commonly called “Big Data”) have become increasingly central spaces for producing knowledge in markets. In doing so, Big Data knowledge practices and devices have become a critical space in which social forms are enacted or provoked in contemporary knowing capitalism (Ruppert et al. 2013). Nevertheless, Big Data knowledge practices remain a very elusive and difficult research object for social scientists: they are complex knowledge assemblages that involve the mobilization of multiple and different kinds of entities (such as datasets, algorithms, data infrastructures or professionals), which relate to processes and practices often located in different spaces and times.
This paper describes an experimental exercise designed to ease an ethnographic approach to Big Data knowledge practices. Concretely, the paper describes the design and execution of a Big Data project aimed at helping a personal finance startup to “visualize” and analyze the transactions of its users. The paper first discusses the challenges involved in taking an ethnographic approach to Big Data knowledge practices. It then describes the design and execution of an experimental exercise, that is, the artificial recreation of a Big Data consultancy project with the help of engineering students. It concludes by reflecting on some of the implications of provoking such artificial situations for researching Big Data knowledge practices. Taking a pragmatic approach (Muniesa 2014), it argues that experimental situations oriented to provoke specific realities might help social scientists to unpack the often-inaccessible collection of practices and devices that make up the world of Big Data.
The Complexities of Creating Big-Small-Data: Using Public Survey Data to Explore Unfolding Social and Economic Change
Emily Gray and Stephen Farrall, University of Sheffield, Colin Hay, Sciences Po, and Will Jennings, University of Southampton
Bold approaches to data collection and large-scale quantitative advances have long been a preoccupation for social science researchers. In this paper we expand methodological debate on the use of public survey data and official statistics with ‘Big Data’ methodologists. We introduce a new data-set that will be available for public use from October 2015. It integrates approximately thirty years of public data on victimisation, fear of crime, and social and political attitudes with a wide variety of national socio-economic indicators in England and Wales. In presenting this new resource we highlight the frequent complexities of working with this type of secondary data: the validity and reliability of using historical measures, the time-intensive nature of its cleaning and collation, and the methodological and substantive implications for social science researchers of bringing together multiple traditionally ‘small’ data-sets into one ‘big’ compendium.
The Construction of Twitter Databases: Empirical Case Studies on the Socio-Technical Meaning of Twitter Data as a Research Tool
Evelien D'Heer, iMinds-MICT-Ghent University, and Pieter Verdegem, Ghent University
This paper deals with methodological challenges related to Twitter research. In particular, we focus on (1) unfound users and deleted tweets (that resurrect), (2) URLs that do not link (correctly) and (3) the limits of hashtag samples to study conversations. The empirical case studies we present are part of a larger research project on social media, elections and public debate. These issues are not unique to our data, but are of general relevance for anyone working with Twitter data.
Departing from the idea that a database is “anything but a simple collection of items” (Manovich, 2001, p. 194), we scrutinize the way APIs deliver and structure data. Based on our case studies, we understand datasets as textual representations of user activity (e.g. images are stored as URLs), presented in chronological rather than “conversational” order. In addition, whereas data collection is real-time, the manual analysis of the data often is not, resulting in unidentifiable users and tweets. Last, APIs provide “exact matches” for our hashtag-based data requests. However, when we include non-hashtagged responses, we notice the hashtag approach systematically underestimates reciprocity between users.
We departed from a selection of empirical cases to understand Twitter data(bases) as constructions. In general, awareness of the construction of Twitter data is crucial, as we build upon this data to explain socio-cultural phenomena.
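The point about hashtag samples underestimating reciprocity can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the authors' study: the `Tweet` record, the toy tweets, the `#election` hashtag, and the definition of reciprocity as the share of connected user pairs in which both users replied to each other.

```python
from typing import NamedTuple, Optional


class Tweet(NamedTuple):
    """Hypothetical, simplified tweet record (not an actual API schema)."""
    author: str
    reply_to: Optional[str]  # user being replied to, if any
    text: str


def reciprocity(tweets):
    """Share of connected user pairs in which both users replied to each other."""
    # Directed reply edges (author -> addressee), ignoring self-replies.
    edges = {(t.author, t.reply_to) for t in tweets
             if t.reply_to and t.reply_to != t.author}
    # Unordered pairs that have at least one reply between them.
    pairs = {frozenset(e) for e in edges}
    if not pairs:
        return 0.0
    # Pairs with replies in both directions.
    mutual = {frozenset((a, b)) for (a, b) in edges if (b, a) in edges}
    return len(mutual) / len(pairs)


# Toy conversation: the answers to hashtagged tweets carry no hashtag.
tweets = [
    Tweet("ann", "bob", "@bob what about turnout? #election"),
    Tweet("bob", "ann", "@ann turnout was up"),
    Tweet("cat", "ann", "@ann agreed #election"),
    Tweet("ann", "cat", "@cat thanks!"),
    Tweet("dan", "bob", "@bob source? #election"),
]

# An "exact match" hashtag sample keeps only tweets containing the hashtag.
hashtag_only = [t for t in tweets if "#election" in t.text.lower()]

print("hashtag sample:", reciprocity(hashtag_only))
print("with non-hashtagged replies:", reciprocity(tweets))
```

In this toy data every reply that closes a conversational loop lacks the hashtag, so the hashtag-only sample shows no mutual pairs at all, while the fuller sample shows two of three pairs as reciprocal. Real data would be less extreme, but the direction of the bias is the same as the one described above.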
Social Media Marketers and the Limits of Data
Jeremy Shtern, Ryerson University, and Tamara Shepherd, London School of Economics and Political Science
Social media platforms have been said to revolutionize not only social relations among people, but also the relationships between brands and people through new marketing techniques predicated on networked sociality and access to personal demographic and behavioural information. Typically, critical studies of social media marketing focus on the political and ethical dimensions of advertisers’ use of data, cross-referenced within the exponentially expanding sphere of “big data” (Andrejevic 2014; boyd & Crawford 2012). Such studies tend to frame networked sociality – the prevailing organization of communities within ephemeral information networks (Wittel 2001) – as the basis for contemporary marketing techniques that quantify and commodify users’ relationships through data (e.g., Turow 2008; 2011). The typical concern with this quantification process is that it breaches personal privacy in the quest to refine predictive behavioural targeting that will shape users’ consumption patterns and tastes through immanent surveillance (Campbell & Carlson 2002).
To interrogate the validity of these kinds of claims, this paper presents the results of an empirical investigation into how marketing professionals actually interface with social media. These professionals describe their uses of social media within marketing practices through a narrative of learning curves involving a re-casting of traditional advertising campaigns into longer term brand engagement, where the cautious use of data revolves around real-time monitoring and customer relations more so than targeting and predictive advertising. Indeed, respondents often had more to say about the limitations of data collection and use in social media marketing than its benefits. This theme of the limits of data pervades our rejoinder to critical considerations of data-based marketing techniques through social media. By considering how data is actually implemented in the social media practices of working marketers, we suggest that additional conceptual work is needed to account for the ways in which the pragmatics of contemporary marketing might mitigate or at least complicate the potential threats posed by the collection and use of personal data.