Cecilia: What is an Entity Registry, and how does it differ from conventional stores?
Anusha: Let's first list our entities, so we are clear about what we mean when talking about an entity: person, organisational unit, publication (journal articles, books, chapters...), funder information and research activity information.
Now, the main difference between an entity store and a conventional store (typically a database) is that in a conventional store the columns correspond to the attributes of each entity, and they need to be defined when the database is created. So how is this a problem? Well,
- We need to think of all the attributes relating to the entity (for example, all the attributes that make up a person) at the time of creation, and we cannot change our structure very easily later on (the cost of change rises exponentially with time).
- All the people have to follow the same structure, so you cannot account for variations very easily.
- You cannot accommodate all of the multiple relationships easily. For example, a person belongs to multiple departments or colleges, has multiple roles, has multiple titles and has multiple names.
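To make the contrast concrete, here is a minimal Python sketch of the idea; it is not the actual BRII implementation. It assumes a toy in-memory store in which every fact is an (entity, attribute, value) entry. The class name EntityStore, the identifiers such as person:1 and the attribute names are all hypothetical.

```python
class EntityStore:
    """Toy entity store: facts are (entity, attribute, value) entries.

    Hypothetical illustration only. No attribute has to be declared
    in advance, and any attribute may hold several values.
    """

    def __init__(self):
        # entity id -> attribute name -> list of values
        self._facts = {}

    def add(self, entity, attribute, value):
        """Record one fact, ignoring exact duplicates."""
        values = self._facts.setdefault(entity, {}).setdefault(attribute, [])
        if value not in values:
            values.append(value)

    def get(self, entity, attribute):
        """Return all values of an attribute (empty list if none)."""
        return list(self._facts.get(entity, {}).get(attribute, []))

    def attributes(self, entity):
        """Return the attribute names recorded for this entity."""
        return sorted(self._facts.get(entity, {}))


store = EntityStore()

# One person with several names, memberships and roles.
store.add("person:1", "name", "A. Example")
store.add("person:1", "name", "Alex Example")
store.add("person:1", "memberOf", "dept:physics")
store.add("person:1", "memberOf", "college:example")
store.add("person:1", "role", "lecturer")

# Another person with a different set of attributes.
store.add("person:2", "name", "B. Sample")
store.add("person:2", "homepage", "hypothetical-value")

print(store.get("person:1", "memberOf"))
print(store.attributes("person:2"))
```

Because attributes are just keys added when a fact arrives, a new attribute needs no change to an existing structure, and a multi-valued relationship is simply several values under the same key.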
Cecilia: This sounds obviously relevant to BRII because we are collecting information from all kinds of sources around the University, and most importantly because we do not have control over the content or format of that information.
But beyond these technological advantages, what are the benefits that this way of organising data brings to scholars?
Anusha: The extra benefit to visitors is that we can show them multiple and different relationships between entities very easily (like collaborators, or linking funders, research activity, people and departments in whatever way we want).
Otherwise, the entity registry will be invisible to them, as they will only see services like the Blue Pages.
The entity registry will also be transparent in the sense of being open: anyone will be able to access our data. If they are interested in the data, they can build tools to analyse it in whatever way they want (we haven't done this yet, but will be doing so).
With a service like the Blue Pages, a keen observer will notice the entity registry. Other users will not, and rightly so, as they need not know what is happening behind the scenes (for example, we see nothing of how Google does its work). They can, however, observe that some entities have a lot of information and are linked in multiple ways to other entities, while others have hardly any information. The key thing is that the data can be linked very easily to other entities in multiple ways.
The power of the Blue Pages is mainly derived from three things:
- Quantity of data - The more we have, the better
- Variety - A one-stop shop: visit one website rather than 10 different websites
- The way we present our data and the deductions / analysis we perform on our data (like finding collaborators). If we can think of more deductions like this, it would be useful and make our web service more powerful.
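As an illustration of such a deduction, the sketch below derives collaborators from shared authorship. It is only an assumption about how this could be done, not a description of how the Blue Pages does it; the function name find_collaborators and the sample identifiers are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def find_collaborators(authorship):
    """Map each person to the set of people they share a publication with.

    `authorship` maps a publication id to a list of author ids
    (hypothetical data layout, for illustration only).
    """
    collaborators = defaultdict(set)
    for authors in authorship.values():
        # Every pair of distinct authors on one publication collaborates.
        for a, b in combinations(sorted(set(authors)), 2):
            collaborators[a].add(b)
            collaborators[b].add(a)
    return dict(collaborators)


# Made-up sample data.
authorship = {
    "pub:1": ["person:1", "person:2"],
    "pub:2": ["person:2", "person:3"],
    "pub:3": ["person:4"],  # single author, so no collaborators
}

collab = find_collaborators(authorship)
print(sorted(collab["person:2"]))
```

The same pattern would apply to any other link the registry holds, for example people connected through a shared funder or research activity.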
Cecilia: And how do you see all this helping scholarly communication?
Anusha: Scholarly communication is more than just journals. Journals were, and to some extent still are, the primary sources of communication, but they aren't the only ones. We now have institutional repositories, which are helping with this. Also, "scholarly" in scholarly communication does not refer to the people, but to the type of communication. So it's anyone (not just scholars) communicating on a scholarly topic.
Cecilia: Yes, and I guess that having all these connections facilitates this communication.