Building the Research Information Infrastructure (BRII): Some thoughts about Data

Friday, 27 March 2009

Some thoughts about Data

One of the interviews I conducted this week left me thinking about the transformation of data and its different meanings. I was talking to a lady about her work with research activity data. She explained to me how she collects information from various sources, enters it in a spreadsheet and adapts it to her needs. This adaptation involves organizing the data in columns, correcting errors, filtering records and adding new ones from other sources. All this transformed data will be entered in a research portal.

I asked my interviewee whether she could contribute some of the data she has so far. She said I could get all that information from the same sources she used, and that all of those sources were public. And that made me think… On the one hand, I thought she was right. All the information she had came from other sources which we can all access. She had not created new data but just worked on existing data. On the other hand, the sets of data she had been working on represented new pieces of information. The work she carried out on that data transformed it into new data: Data + Work = NewData. So I guess new arrangements of data give new meanings to that information.

A simple example: you can access an international online database and download a list of publications on economics. You enter that list in a spreadsheet and filter the publications belonging to a particular Oxford University author. Then you attach that list to the author’s profile and you get his or her bibliography: all of that author’s publications, including those from before he or she joined Oxford. You can do that for all economists in Oxford and you will get a set of bibliographies of Oxford economists. This list will correspond to the output of Oxford economists across their careers.

You can also use that list to filter all the publications whose authors were affiliated with Oxford University at the time: current staff, or staff who have already left but who produced the publications while they were working in Oxford. This second list is the output of Oxford economists in Oxford.

Both lists come from the same source, and possibly they contain the same set of fields; however, they represent different things.
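The two lists can be sketched in a few lines of Python. This is only an illustration: the record layout (author, title, year, affiliation at the time of publication), the sample records and the function names are all made up for the example, and a real database export would look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    author: str
    title: str
    year: int
    affiliation: str  # institution the author belonged to when publishing


# Made-up sample records standing in for the downloaded list (hypothetical data).
downloaded = [
    Publication("A. Smith", "Trade and growth", 1998, "Another University"),
    Publication("A. Smith", "Labour markets", 2005, "Oxford University"),
    Publication("B. Jones", "Monetary policy", 2003, "Oxford University"),
    Publication("B. Jones", "Public finance", 2008, "Another University"),
    Publication("C. Brown", "Game theory", 2001, "Another University"),
]

# Hypothetical set of economists linked to Oxford (current or former staff).
oxford_economists = {"A. Smith", "B. Jones"}


def career_output(pubs, authors):
    """First list: everything the given authors published, wherever they worked."""
    return [p for p in pubs if p.author in authors]


def output_at(pubs, institution):
    """Second list: only publications produced while affiliated with the institution."""
    return [p for p in pubs if p.affiliation == institution]


across_careers = career_output(downloaded, oxford_economists)
in_oxford = output_at(downloaded, "Oxford University")

print(len(across_careers))  # 4
print(len(in_oxford))       # 2
```

Both results have exactly the same fields and come from the same download, yet the filter that produced each one decides what the list stands for.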

When I did my PhD I came across a book called “Information, systems and information systems: making sense of the field” by Peter Checkland and Sue Holwell (1998), a must-read if you are in the Information Systems field! This book is about information systems, their creation and their relation to IT. In chapter four the authors discuss the concepts of Data, Information and Knowledge, and they introduce the concept of Capta. These concepts may help you to understand these processes of data transformation and how data can acquire different meanings. For Checkland and Holwell (1998), data represent the masses of facts, observations and concepts that exist in the Universe. Once data are captured as part of an information system, a conversation or any kind of interaction, they become Capta. Capta are therefore a subset of data which have been selected through a purposeful process, i.e., according to a criterion which fits a particular purpose. Capta are transformed into Information when they are given meaning and context by their interpreters. Because it depends on interpretation, a subjective process, information can have different meanings to different people. Finally, large structures of information form Knowledge.
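A rough way to picture the step from data to capta to information is the short Python sketch below. It is my own illustration, not something taken from the book: the sample facts, the function names `capture` and `interpret`, and the wording of the contexts are all hypothetical.

```python
# Data: a mass of facts, most of them irrelevant to any one purpose (made-up sample).
data = [
    {"kind": "publication", "author": "A. Smith", "year": 2005},
    {"kind": "room booking", "room": "B12", "year": 2005},
    {"kind": "publication", "author": "B. Jones", "year": 2003},
    {"kind": "weather", "city": "Oxford", "year": 2005},
]


def capture(facts, criterion):
    """Capta: the subset of data selected according to a purposeful criterion."""
    return [fact for fact in facts if criterion(fact)]


def interpret(capta, context):
    """Information: capta given meaning by the interpreter's context."""
    return f"{len(capta)} records read as {context}"


capta = capture(data, lambda fact: fact["kind"] == "publication")

# The same capta, two interpreters, two meanings.
print(interpret(capta, "evidence of departmental output"))
print(interpret(capta, "candidates for a staff bibliography"))
```

The selection criterion is what turns data into capta, and the context supplied by each interpreter is what turns the same capta into different information.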

Now, how can I explain this in the context of the BRII project?
Well, all these processes of transforming research activity data into capta and into information happen all over the University. People acquire research activity data from different internal or external sources and transform them according to their needs. Other people may then use these transformed data and give them new meanings, again according to their own contexts. This seems like a mess, but BRII will sit in the middle, facilitating these processes.

BRII will extract capta from data and store it in the Research Information Infrastructure (RII).

Using Checkland and Holwell’s concepts, we can define Research Activity data as capta: data selected from vast sources which represent and describe only research activities. BRII will ignore data which do not fit this criterion. So the RII will be a container of capta in the sense that it will only host research activity data.

If we look at this from a different angle, from within the universe of the RII, and call its content data again, I can say that BRII will provide the means to reuse that data and transform it into capta: capta for every system or individual that accesses the RII looking for information, each with different purposes and different contexts.

As explained in the example above, most existing sets of data have been filtered and worked on according to criteria which depend on the contexts and purposes of their owners. Another example: the list of researchers on a departmental website is a subset of the researchers of the University. This subset was selected by checking each researcher’s affiliation to the department, or his/her work within a research group or project in the department. The same researcher may appear on another website because he/she is involved in other research activities. However, this researcher does not appear on a thematic website whose subject does not match his/her interests. Ideally the RII will hold a list of all researchers in Oxford, and by accessing the RII people will be able to extract these subsets of data. This data will acquire different meanings depending on the criteria used for its extraction and on the place and way it is shown.

The RII will be a tool to give meaning to huge, disparate, disconnected, complex sets of Research Activity data.

The RII will be a big Information System supporting and allowing other systems to exist on top of it. The RII itself and the web services built on top of it will allow the transformation of data into capta and information by its users.

What am I in this context?
I am the person who looks for these sources of data and capta and tries to understand what they mean to their users and what new meanings can be obtained from them in the future.

Some issues
How do we know the purpose of capta which have been adapted by some people from other sources of capta, also within the University? Should the RII store only sets of raw data and allow its users to transform them? Should we use all these sources, data and capta? How?
Again, the answer lies in the semantic web. Semantic web technologies allow data to be labelled with labels which are meaningful to people and to computers, that is, tagging data with meaning. From the context of the RII, these labels convert data into capta. From the context of the services accessing the RII, tags are the means through which users can convert data into capta for their own needs.
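To make the idea of labels a bit more concrete, here is a small Python sketch that stores tagged records as (subject, label, value) triples and extracts subsets by label. It uses plain tuples rather than any real semantic web library, and the identifiers and label names (`type`, `memberOf`, `interest`) are hypothetical; they are not a vocabulary used by BRII.

```python
# Records tagged with labels, stored as (subject, label, value) triples.
# All identifiers and label names are made up for illustration.
triples = [
    ("person:1", "type", "Researcher"),
    ("person:1", "memberOf", "Department of Economics"),
    ("person:1", "interest", "labour markets"),
    ("person:2", "type", "Researcher"),
    ("person:2", "memberOf", "Department of Economics"),
    ("person:2", "interest", "game theory"),
    ("person:3", "type", "Researcher"),
    ("person:3", "memberOf", "Department of Statistics"),
    ("person:3", "interest", "game theory"),
]


def select(statements, label, value):
    """Return the subjects that carry the given label with the given value, sorted."""
    return sorted({s for s, l, v in statements if l == label and v == value})


# Two services apply different criteria to the same tagged data.
departmental_site = select(triples, "memberOf", "Department of Economics")
thematic_site = select(triples, "interest", "game theory")

print(departmental_site)  # ['person:1', 'person:2']
print(thematic_site)      # ['person:2', 'person:3']
```

Each service picks its own capta from the same store simply by choosing which label to query, which is the role I expect tags to play for services built on the RII.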


Our Goal

Building the Research Information Infrastructure (BRII) aims to support the efficient sharing of Research Activity Data (RAD) captured from a wide range of sources. BRII develops an infrastructure that harvests and archives RAD, and Web services which disseminate and reuse this kind of data by using a lightweight solution based on semantic web technologies. Phases of the project include: a stakeholder analysis to collect views from interested parties (e.g., academics and administrators); an iterative development process which uses information collected in the analysis phase; and an embedding and sustainability phase where user acceptance is assessed and strategies to support the expansion of the research information infrastructure are designed. Additional outputs of the BRII include: an application programming interface (API) for harvesting and querying data; a collection of ontologies and taxonomies used to organise and classify data; a themed Web site; and the Oxford Blue Pages displaying RAD in creative ways. By facilitating access to RAD, BRII expects to improve the research visibility of the institution and its research impact, as well as boost collaboration.

BRII Papers, Reports and Presentations

Rumsey, S. and Loureiro-Koechlin, C. (Forthcoming 2010) The role of an entity registry in scholarly communication: exploring creative uses of research activity data. New Review of Academic Librarianship.

Loureiro-Koechlin, C. (Forthcoming 2010) "Explaining abstract concepts with concrete examples - entity registry and research activity data." Sconul Focus.

BRII Project Completion Report to the JISC.

BRII Project Final Report to the JISC.

Blue Pages Video Clip A short demo of the Blue Pages (recorded 19th March 2010.)

BRII Stakeholder Grid A list of BRII's stakeholders, interests and challenges.

BRII Summative Evaluation report An independent evaluation led and facilitated by Neil Beagrie of Charles Beagrie Limited. (March 2010.)

Loureiro-Koechlin C. (2009) BRII Presentation at the Supporting research students - a unique book launch at Hull University Business School. (5th March 2010.)

Rumsey, S. (2010) BRII registry & other outputs A description of the pilot Research Activity Data Registry functionality, services and other outputs that will be developed by the project end (March 2010) and suggestions for further work.

Adding a researcher profile. Video clip demonstrating how to search for a researcher profile in the ORA registry and then embed this in a content managed website.

Loureiro-Koechlin, C. (2010) Uncovering user perceptions of research activity data (published in Ariadne, January 2010.)

Loureiro-Koechlin C. (2009) BRII Project - Use Cases report (project milestone, February 2010.)

Oxford Blue Pages Screenshots

Rumsey, S. (2009) A case analysis of registering research activity for institutional benefit (published in the International Journal of Information Management, 2009.)

Loureiro-Koechlin C. (2009) Selling an abstract concept to a practical audience (presented at the Modular e-Administration of Teaching (MEAoT) Assembly, Centre for Applied Research in Educational Technologies (CARET), University of Cambridge, 10 December 2009.)

Loureiro-Koechlin C. (2009) Building the Research Information Infrastructure (BRII) (published in Inside OR, November 2009.)

Loureiro-Koechlin C. (2009) Making sense of research activity data (presented at the OR51 conference, University of Warwick, 8-10 September 2009.)

Loureiro-Koechlin C. (2009) BRII Stakeholder Analysis report (project milestone, July 2009.)

Loureiro-Koechlin C. (2009) Reaching out to a big, complex university (presented at the Stakeholder Buy-In Assembly, SERS, Oxford University Library Services, University of Oxford, 9 June 2009.)

Bowtell, A. and Loureiro-Koechlin C. (2009) BRII Stakeholder Analysis and Sample Applications (poster presented at the Making Connections JISC event, 23-24 April 2009, Manchester.)

About this Blog

Cecilia Loureiro-Koechlin
