— Lora Aroyo (@laroyo) February 6, 2014
— Lora Aroyo (@laroyo) December 3, 2013
7 Nov 2013, Rosarium Amstel Park, Amsterdam
An interesting mix of different sciences and their data-driven research was presented today at the 1st annual Netherlands eScience Center symposium. The diversity was impressive in terms of the size of the data, the types of data, the sources from which it is collected and the numerous use cases showing how it is used in research. The main topic for the event was “Optimizing Discovery in the Era of Big-Data”. Renè van Schaik introduced the goals and challenges of the center and defined its overall scope: to achieve enhanced science = eScience.
Fascinating projects were presented by Willem Bouten, University of Amsterdam, on e-Ecology and how the field has evolved alongside mobile phone and sensor technologies, the web and big data processing. Watch a video on this research at NLeSC YouTube Channel.
A very interesting piece of hydrology history was given by Nick van de Giesen, Technical University Delft, in his talk on eWaterCycle. Since the founding discoveries of hydrology in the 1700s by Perrault, Huygens and Dalton (simplified here: the rain feeds the waters in rivers), there have been few groundbreaking discoveries until the time of the web and large data research, which now provide grounds for grand challenges in water research. Watch a video on this research at NLeSC YouTube Channel.
Before lunch, the winners of the fourth edition of the Enlighten Your Research Award for 2013 (aka the big data challenge) were announced. They were selected by a jury of representatives from surfSARA, surfNET, SURF, NWO and NLeSC, and the winning projects were presented by their project team members. #EYR4
Check out the event timeline on Twitter.
Friday 22 Nov 2013, 16.00-17.00, Science Park 904 (room B0.201)
Julia Noordegraaf & Angela Bartholomew (Faculty of Humanities, UvA)
Modeling Crowdsourcing for Cultural Heritage
The Modeling Crowdsourcing for Cultural Heritage (MOCCA) project aims to help steer more effective crowdsourcing projects for galleries, libraries, archives, and museums. The outcome is a tool that helps cultural heritage professionals design effective projects. A first evaluation of existing models and projects shows that the specific conditions of individual crowdsourcing projects, such as the modalities of the institutions and collections, the level of openness, rewards and other forms of crowd management, greatly contribute to a project’s success or failure. Our challenge has become to model these conditions in a structure that allows heritage professionals to determine the design criteria relevant for their specific purposes. My presentation will focus on this modeling problem as input for a brainstorm and discussion.
Lora Aroyo (Computer Science Department, VU University Amsterdam)
Crowd Truth: Disagreement in Crowdsourcing is not Noise but Signal
One of the critical steps in analytics for big data is creating a human-annotated ground truth. Crowdsourcing has proven to be a scalable and cost-effective approach to gathering ground truth data, but most annotation tasks are based on the assumption that for each annotated instance there is a single right answer. From this assumption it has always followed that ground truth quality can be measured by inter-annotator agreement, and unfortunately crowdsourcing typically results in high disagreement. We have been working on a different assumption: that disagreement is not noise but signal, and that crowdsourcing can in fact be not only cheaper and more scalable but also of higher quality. In this paper we present a framework for continuously gathering, analyzing and understanding large amounts of gold standard annotation disagreement data. We discuss the experimental results demonstrating that there is useful information in human disagreement on annotation tasks. Our results show .98 accuracy in detecting low quality crowdsource workers, and .87 F-measure at recognizing useful sentences for training relation extraction systems.
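To make the idea of disagreement as signal concrete, here is a minimal sketch, not the framework from the abstract. It assumes each worker picks a set of relation labels for one sentence, and scores every worker by the cosine similarity between their label vector and the summed vectors of all other workers. The worker ids, the labels ("treats", "prevents", "causes") and the function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse label-count vectors."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def worker_agreement(annotations):
    """Score each worker against the rest of the crowd for one sentence.

    annotations: {worker_id: set of labels chosen by that worker}
    (a hypothetical input format, assumed for this sketch).
    """
    total = Counter()
    for labels in annotations.values():
        total.update(labels)

    scores = {}
    for worker, labels in annotations.items():
        own = Counter(labels)
        rest = total - own  # votes of everyone except this worker
        scores[worker] = cosine(own, rest)
    return scores


# Hypothetical annotations of one sentence by four workers.
annotations = {
    "w1": {"treats"},
    "w2": {"treats"},
    "w3": {"treats", "prevents"},
    "w4": {"causes"},
}
scores = worker_agreement(annotations)
for worker, score in sorted(scores.items()):
    print(f"{worker}: {score:.3f}")
```

A consistently low score across many sentences may point to a low quality worker, while a sentence on which many otherwise reliable workers diverge may simply be ambiguous. Telling those two cases apart is the kind of information that a single agreement number hides.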
Moderator: Maarten de Rijke (Informatics Institute, UvA)
Date and Time: Friday 22 November 2013, 16.00-17.00 (followed by drinks)
Location: Science Park, room: B0.201, Science Park 904, 1098 XH Amsterdam
This year at the eXtreme Blue symposium, once again I was honored to receive the 2013 IBM Faculty Award for our work on “Crowd Truth”. It was presented by Eric Auvray (General Manager of IBM Benelux, left) and Gerard Smit (CTO IBM Benelux, right).
Following my 2012 IBM Faculty Award, Chris Welty and I formed a team of students (Anca Dumitrache, Oana Inel, Guillermo Soberon and Hui Lin), with support from IBM-NL provided by Robert-Jan Sips and Manfred Overmeen.
Chris Welty and I presented our idea (at WebSci2013) on how to harness the disagreement between crowd text annotators in order to build gold standard data that is closer to how people interpret relations between medical terms in text. You can read nice trip reports on the WebSci2013 and CHI2013 conferences by my colleagues Paul Groth and Victor de Boer. Read the reactions on Twitter for the event.
At WebSci2013 I presented our most recent work on how to use events to provide meaning to cultural collection objects. We designed an evaluation framework for online access to cultural heritage, which enables the assessment of online cultural heritage applications in terms of their provision and support of information and interpretation. It is anchored in digital hermeneutics: the study and theory of the Web as a vehicle of (self)-interpretation. Digital hermeneutics considers the limits of automation and modelling on the one hand, and the interaction of people and technology on the other. We analyzed twelve Web applications, representing the range of the current state of the art in this field. This provides valuable insights into what cultural heritage applications on the Web do, what they can do, and how distinctive goals are to be achieved. We also reported on three user studies with the Agora demonstrator, which made us reconsider a number of assumptions we had made about the user’s needs for information and interpretation. Read the reactions on Twitter for the event.