I asked my interviewee whether she could contribute some of the data she has gathered so far. She said I could get all that information from the same sources she used, and that all of those sources were public. And that made me think. On the one hand, I thought she was right. All the information she had came from other sources which we can all access. She had not created new data but just worked on existing data. On the other hand, the data sets she had been working on represented new pieces of information. The work she carried out on that data transformed it into new data. Data + Work = New Data. So I guess new arrangements of data give new meanings to that information.
A simple example: you can access an international online database and download a list of economics publications. You enter that list in a spreadsheet and filter the publications belonging to a particular Oxford University author. Then you attach that list to the author’s profile and you get his bibliography: all his publications, including those from before he joined Oxford. You can do that for every economist in Oxford and you will get a set of bibliographies of Oxford economists. This list corresponds to the output of Oxford economists across their careers.
You can also take the same downloaded list and filter all the publications whose authors give Oxford University as their affiliation, whether they are current staff or staff who have already left but who produced the publications while they were working in Oxford. This second list is the output of Oxford economists while in Oxford.
Both lists come from the same source, and possibly they contain the same set of fields. However, they represent different things.
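The two lists can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the records, the author names and the `Publication` fields are all hypothetical, not taken from any real database, and the set of current Oxford economists is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    title: str
    author: str
    affiliation: str  # institution the author belonged to when publishing
    year: int


# Hypothetical records standing in for a download from a public database.
SOURCE = [
    Publication("Trade and growth", "A. Smith", "Cambridge", 1995),
    Publication("Labour markets", "A. Smith", "Oxford", 2003),
    Publication("Monetary policy", "B. Jones", "Oxford", 2001),
    Publication("Public debt", "B. Jones", "LSE", 2008),
    Publication("Game theory", "C. Lee", "Harvard", 2005),
]

# Hypothetical: A. Smith is a current Oxford economist; B. Jones has left.
CURRENT_OXFORD_ECONOMISTS = {"A. Smith"}


def career_output(records, authors):
    """Everything the given authors published, wherever they worked."""
    return [r for r in records if r.author in authors]


def output_at(records, institution):
    """Everything published under the given affiliation, by anyone."""
    return [r for r in records if r.affiliation == institution]


careers = career_output(SOURCE, CURRENT_OXFORD_ECONOMISTS)
at_oxford = output_at(SOURCE, "Oxford")

print([p.title for p in careers])
print([p.title for p in at_oxford])
```

Both results are drawn from the same source and carry the same fields, yet the selection criterion gives each a different meaning: the first includes work done before joining Oxford, the second includes work by someone who has since left.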
When I did my PhD I came across a book called “Information, systems and information systems: making sense of the field” by Peter Checkland and Sue Holwell (1998), a must-read if you are in the Information Systems field! This book is about information systems, their creation and their relation to IT. In chapter four the authors discuss the concepts of Data, Information and Knowledge, and they introduce the concept of Capta. These concepts may help you to understand the processes that transform data and how data can acquire different meanings. For Checkland and Holwell (1998), data are the masses of facts, observations and concepts that exist in the Universe. Once data are captured as part of an information system, a conversation or any other kind of interaction, they become Capta. Capta are therefore a subset of data which have been selected through a purposeful process, i.e., according to a criterion which fits a particular purpose. Capta are transformed into Information when they are given meaning and context by their interpreters. Because it depends on interpretation, a subjective process, information can have different meanings for different people. Finally, large structures of information form Knowledge.
Now, how can I explain this in the context of the BRII project?
Well, all these processes of transforming research activity data into capta and into information happen all over the University. People acquire research activity data from different internal or external sources and transform them according to their needs. Other people may then use these transformed data and give them new meanings, again according to their own contexts. This seems like a mess, but:
- BRII will sit in the middle, facilitating these processes.
- BRII will extract capta from data and store it in the Research Information Infrastructure (RII).
- If we look at this from a different angle, from within the universe of the RII, and call its content data again, I can say that BRII will provide the means to reuse that data and transform it into capta for every system or individual that accesses the RII looking for information. (Different purposes and different contexts.)
- The RII will be a tool to give meaning to huge, disparate, disconnected, complex sets of research activity data.
What am I in this context?
I am the person who is looking for these sources of data and capta and trying to understand what they mean to their users and what new meanings can be obtained from them in the future.
How do we know the purpose of capta which some people have adapted from other sources of capta, also within the University? Should the RII store only sets of raw data and allow its users to transform them? Should we use all these sources, data and capta? How?
Again, the answer lies in the semantic web. Semantic web technologies allow data to be labelled with tags which are meaningful both to people and to computers, that is, they tag data with meaning. From the context of the RII, these labels convert data into capta. From the context of the services accessing the RII, tags are the means through which users can convert data into capta for their own needs.
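As a rough sketch of what tagging data with meaning could look like, the snippet below stores facts as subject, predicate, object triples in plain Python and selects capta by label. Everything here is an assumption made for illustration: the `ex:` labels, the identifiers and the records are invented, and this is not the RII data model or a real semantic web vocabulary.

```python
# Hypothetical labelled facts as (subject, predicate, object) triples.
# The "ex:" labels and all identifiers are invented for illustration.
TRIPLES = [
    ("pub:1", "ex:title", "Labour markets"),
    ("pub:1", "ex:author", "person:smith"),
    ("pub:1", "ex:affiliationAtPublication", "org:oxford"),
    ("pub:2", "ex:title", "Trade and growth"),
    ("pub:2", "ex:author", "person:smith"),
    ("pub:2", "ex:affiliationAtPublication", "org:cambridge"),
    ("pub:3", "ex:title", "Monetary policy"),
    ("pub:3", "ex:author", "person:jones"),
    ("pub:3", "ex:affiliationAtPublication", "org:oxford"),
]


def subjects_with(triples, predicate, obj):
    """Return the sorted subjects that carry the given label and value."""
    return sorted({s for s, p, o in triples if p == predicate and o == obj})


# Two services with different purposes select different capta
# from the same labelled data.
by_author = subjects_with(TRIPLES, "ex:author", "person:smith")
at_oxford = subjects_with(TRIPLES, "ex:affiliationAtPublication", "org:oxford")

print(by_author)
print(at_oxford)
```

Because the label says what each value means, a service interested in an author's career and a service interested in Oxford's output can each pull out the subset that fits its own purpose.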