The dimensions of open knowledge

Recently, the OpenEcon working group from the Open Knowledge Foundation released its first Open Knowledge Index, which “has been designed to measure and track progress in opening up information, data and knowledge in a broader sense to the public” (data may be downloaded here).

This is very good news, as it provides a way to compare performance among different countries, as in the graph below, and an excellent opportunity to address important theoretical and empirical questions on the role of knowledge and information in democracy and democratization.

Open Knowledge Index score of the countries in the sample.

The Open Knowledge Index is a composite indicator that “captures three dimensions of knowledge” (see technical details here): capability (access to knowledge), legislation (availability of knowledge), and open society (effective use of knowledge and feedback). Each of these dimensions, in turn, is a sub-index created through the combination of different variables.
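To make the idea of a composite indicator concrete, here is a minimal R sketch. It is not the working group's actual method (that is described in the technical details): the three countries and their sub-index scores are made up, and the equal-weight average is only an assumption chosen for simplicity.

```r
# Hypothetical sub-index scores for three made-up countries (NOT real data)
scores <- data.frame(
  country      = c("A", "B", "C"),
  capability   = c(0.8, 0.5, 0.2),
  legislation  = c(0.6, 0.9, 0.3),
  open_society = c(0.7, 0.4, 0.5)
)

# ASSUMPTION: the composite is an equal-weight mean of the three sub-indices.
# The real index may normalize and weight its components differently.
scores$index <- rowMeans(scores[, c("capability", "legislation", "open_society")])

# Rank the countries from lowest to highest composite score
ranked <- scores[order(scores$index), c("country", "index")]
print(ranked)
```

Under any such scheme, a change in how one sub-index is built (for example, the "open society" one) feeds directly into the final ranking, which is why the choice of variables matters.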

The authors acknowledge that the indicators are “in an early testing stage”, and I guess they will be refined in the near future. I am unaware of the discussions behind the design process of these indicators, and probably what I’m going to say has already been discussed. I will focus here on the “open society” index, which I think deserves some discussion.

This sub-index, which tries to capture “the capacity to use the data and feed it back into the open data ecosystem”, is created through the combination of three different variables:

Of these variables, I think that using the number of Wikipedia edits to obtain a good measure of the openness of knowledge in society may present a number of potential problems.

A first problem is that, as noted by Jakob Nielsen some years ago, participation inequality (which is a common problem in political science) takes an extremely skewed form in online communities. Hence the well-known “90-9-1 rule”, by which 90% of users don’t contribute content at all, 9% contribute only from time to time, and 1% produce most of the content, which yields a “long-tailed” distribution. In Wikipedia, their own data tell us that 82,800 active contributors are working on more than 19,800,000 articles. So, what is the value of this variable really telling us about a society when it presumably reflects the behavior of a very small fraction of that society?
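The arithmetic behind this point can be sketched in a few lines of R. The two Wikipedia figures are the ones quoted above; the toy community of 1,000 users and its per-user edit counts (0, 5 and 500) are hypothetical numbers chosen only to mimic the 90-9-1 pattern, not real data.

```r
# Figures quoted in the text (Wikipedia's own statistics)
active_contributors <- 82800
articles <- 19800000

# Articles per active contributor
articles_per_contributor <- articles / active_contributors
print(round(articles_per_contributor))

# Hypothetical community of 1,000 users following the 90-9-1 pattern:
# 900 never edit, 90 edit occasionally, 10 edit heavily.
# The edit counts are made up for illustration.
edits <- c(rep(0, 900), rep(5, 90), rep(500, 10))

# Share of all edits made by the 10 most active users (the top 1%)
top_share <- sum(sort(edits, decreasing = TRUE)[1:10]) / sum(edits)
print(round(top_share, 2))

# The typical (median) user makes no edits at all
print(median(edits))
```

With the quoted figures, this works out to roughly 239 articles per active contributor. In the toy community, the median user makes no edits while ten users account for over 90% of them, so a country-level edit count would mostly describe those ten people.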

Moreover, a second problem is that Wikipedia editing behavior also suffers from a clear cultural and geographical bias: 43% of the active contributors make their contributions in English, and more than 50% of contributions in English come from the United States.

A third problem might be associated with the fact that, while the other two variables in this indicator somehow capture the institutional context of the “openness of knowledge” (especially the World Bank’s governance indicators (AGI)), this third variable represents a behavioral dimension, i.e., how (a small group of) people actually perform in this context. I’m not sure what this behavioral component contributes to this indicator.

In conclusion, as a part of an indicator of the openness of knowledge in society, these systematic error components should, in my opinion, be addressed in the discussions on the validity of the measurement, i.e., what it is that we really want to measure with this variable.

Below is the R code to reproduce the graphic in this post:


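######################################
# DOTPLOT OF THE OPEN KNOWLEDGE INDEX
######################################
# LOAD THE DATA
# (header = TRUE and sep = "," are the read.csv defaults, kept explicit here)
oki <- read.csv("data/open_knowledge_indicator_0.1.csv",
                header = TRUE,
                sep = ",")

# SORT DATA
# Keep columns 1 and 3, which the plot uses as the country labels and the
# index score, then sort the rows in ascending order of the score
order.index <- oki[, c(1, 3)]
order.index <- order.index[order(order.index[, 2]), ]

# PLOT IT IN A DOTCHART
dotchart(order.index[, 2],
         pch = 20,
         labels = order.index[, 1],
         cex = 0.8,
         xlab = "Open Knowledge Index")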
