
PART 1: The slowing growth of Wikipedia: some data, models, and explanations - PARC blog

posted 22 July 2009 | ed h. chi

In September of 2008, we blogged about a curious change in Wikipedia that we had known about for a while but didn’t know how to explain. The ASC group has spent the last 6-9 months or so trying to understand it. The change we were curious about was that the growth rates of Wikipedia have slowed. We were not the only ones wondering about this change. The Economist (archived here), for example, wrote about it.

We are about to publish a paper in WikiSym 2009 on this topic, and I thought we should start to blog about what we found.


Monthly edits and identified revert activity

The conventional wisdom about many Web-related growth processes is that they’re fundamentally exponential in nature. That is, if you wait some fixed amount of time, the content size and number of participants will double. Indeed, prior research on Wikipedia has characterized the growth in content and editors as fundamentally exponential. Some have claimed that Wikipedia article growth is exponential because there is exponential growth in the number of editors contributing to Wikipedia [1]. Current research shows that Wikipedia’s growth rate has slowed, and has in fact plateaued (see figure at right). Since about March of 2007, the growth pattern is clearly not exponential. What has changed, and how should we modify our thinking about how Wikipedia works? Prior research had assumed Wikipedia works on an “edit begets edit” model, that is, a preferential attachment model in which the more edits an article gets, the more likely it is to receive further edits, resulting in exponential growth [2]. Such a model does not preclude some ultimate limitation to growth, although at the time it was presented [2] there was an apparent trend of unconstrained article growth.
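To make the "fixed doubling time" idea concrete, here is a minimal sketch of exponential growth in Python. The starting size and monthly growth rate are made-up illustrative numbers, not values fitted to Wikipedia data, and the function names are our own for this example.

```python
import math

def exponential_size(n0, rate, t):
    """Size at time t under exponential growth: N(t) = N0 * exp(rate * t)."""
    return n0 * math.exp(rate * t)

def doubling_time(rate):
    """Time needed for the size to double; it depends only on the rate."""
    return math.log(2) / rate

# Hypothetical values chosen for illustration only (not measured from Wikipedia).
n0 = 1000.0    # starting number of articles
rate = 0.05    # growth rate per month

td = doubling_time(rate)
for k in range(4):
    # After k doubling times the size has doubled k times.
    size = exponential_size(n0, rate, k * td)
    print(f"after {k} doubling times ({k * td:.1f} months): {size:.0f} articles")
```

The point of the sketch is that under this model the doubling time never changes, no matter how large the system already is. A plateau in monthly edits is exactly what such a model cannot produce.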


Monthly active editor – number of users who have edited at least once in that month

The number of active editors shows exactly the same pattern. The 2nd figure on the right shows that since its peak in March 2007 (820,532), the number of monthly active editors in Wikipedia has been fluctuating between 650,000 and 810,000. This finding suggests that the conclusions in [1][2] may no longer be valid. We have a different process going on in Wikipedia now.


Article growth per month in Wikipedia. Smoothed curves are the growth rates predicted by logistic growth bounded at a maximum of 3, 3.5, and 4 million articles.

Some Wikipedians have modeled the recent data and believe that a logistic model is a much better way to think about content growth. The figure here shows that article growth reached a peak in 2007-2008 and has been declining since then. This result is consistent with a growth process that hits a constraint, for instance due to resource limitations in the system. For example, microbes grown in culture will eventually stop duplicating when nutrients run out. Rather than exponential growth, such systems display logistic growth.
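For readers who want to see why a logistic model produces a peak in growth followed by a decline, here is a rough sketch using the textbook logistic equation dN/dt = r N (1 - N/K). The three ceilings K match the 3, 3.5, and 4 million article bounds in the figure caption. The starting size and rate are assumptions picked for illustration, and this is not necessarily the exact fitting procedure the Wikipedians used.

```python
import math

def logistic_size(t, k, n0, rate):
    """Closed-form solution of dN/dt = rate * N * (1 - N / k)."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-rate * t))

def logistic_growth_rate(n, k, rate):
    """Articles added per unit time when the current size is n."""
    return rate * n * (1.0 - n / k)

def peak_time(k, n0, rate):
    """Time at which growth per month is highest (when N reaches k / 2)."""
    return math.log((k - n0) / n0) / rate

# Hypothetical parameters for illustration only (not fitted to real data).
n0 = 20_000.0   # assumed starting article count
rate = 0.08     # assumed intrinsic growth rate per month

for k in (3.0e6, 3.5e6, 4.0e6):   # ceilings taken from the figure caption
    t_peak = peak_time(k, n0, rate)
    n_peak = logistic_size(t_peak, k, n0, rate)
    top = logistic_growth_rate(n_peak, k, rate)
    print(f"K={k:,.0f}: growth peaks at month {t_peak:.1f} "
          f"with about {top:,.0f} new articles per month")
```

Early on, N is small relative to K and the curve is nearly indistinguishable from exponential growth. Monthly growth is largest when the system reaches half of its ceiling and shrinks afterwards, which is the qualitative shape of the article growth figure.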

We will continue to blog about what we believe might be happening in the next few weeks, as we find time to summarize the results.

[1] Almeida, R.B., Mozafari, B., and Cho, J. On the evolution of Wikipedia. ICWSM 2007, Boulder, CO, 2007.
[2] Spinellis, D., and Louridas, P. The collaborative organization of knowledge. Communications of the ACM, 51(8), 68-73, 2008.




View Comments

July 28th, 2009 at 6:53am Posted by Yan Shikhvarger

These are very interesting points. I’d like to add some additional food for thought… There seem to be 2 issues as you point out: plateau in traffic and plateau in contributors.

1) Traffic: It’s important to note that Wikipedia’s traffic is often tied to search visibility for topics. Google is responsible for over 45% of referring traffic to Wikipedia, with Yahoo! coming in next at 9%, so search visibility is a huge traffic driver (see referral analytics). Yet growth in search has not been exponential either, so that perhaps correlates with what we are seeing here. This TechCrunch post charts search activity: Google is growing (not exponentially, of course, because it is a pretty mature product), while other search engines remain flat. (It is also interesting that Knol has not managed to take any real traffic away from Wikipedia.)

2) Contributors: Wikipedia’s contributor model is based on committed and experienced volunteers. There is a limit to the number of such users, because the learning curve for syntax and guidelines is a barrier to continued growth beyond the core group of contributors. I would also be curious to see if an organization like Mozilla is facing a similar issue in its contributor growth model. There could also be a certain “churn” factor for contributors, as they can simply get burned out doing the work; this can be observed among bloggers, many of whom quit because of the commitment needed.

Wikipedia does seem to be entering a stagnation phase and that is never a good sign.

 

July 28th, 2009 at 11:38am Posted by Ed H. Chi

Thanks for commenting, Yan.

(1) Beyond search visibility, it may simply be a fundamental attention limit. Wikipedia’s site rank (according to Alexa.com) rose until about the summer of 2006, when it became roughly the 8th most-trafficked website in the world. It then became very difficult to get any more traffic. This fundamental attention limit may have had an effect on overall growth. There is simply a fixed amount of user attention to Wikipedia-oriented topics (i.e., things that are more or less encyclopedic).

(2) Our later blog posts will examine these barriers (usability, community guidelines, rule limits), and how they affect the Wikipedia editor population. In a nutshell, newbies experience disproportionate resistance compared to more experienced editors, and this resistance has been increasing over time.

On churn, we have some results showing how the patterns have been changing over time. It would be very interesting to compare it against blogging data. (Do you have pointers to blogging measurements that show saturation in the blogging space?)

It’s unclear whether Wikipedia is entering stagnation or if it needs to switch to a maintenance mode. We will continue to monitor and contribute scientific understanding where we can…

 

July 30th, 2009 at 11:33am Posted by Yan Shikhvarger

These are great points and I am looking forward to the further posts. The notion you mention of experienced vs. newbies is very interesting as well and makes sense.

As far as blogger churn goes, I have not seen in-depth data on this, just a few things from Technorati:

http://ideas.blogs.nytimes.com/2008/11/11/silence-of-the-blogs/

and

http://www.businessweek.com/the_thread/blogspotting/archives/2007/04/blogging_growth.html

 

July 30th, 2009 at 2:35pm Posted by Ed H. Chi

Thanks for the links, Yan. The 2nd link in particular was very useful and interesting, though it doesn’t point to hard evidence of why the behavior is changing. Stay tuned…

 

September 23rd, 2009 at 2:12pm Posted by Population shifts in Wikipedia (part three) - PARC blog

[...] Wikipedia research (previously covered in part one and part two) continues to get media attention, most recently including coverage in Time magazine [...]

 

November 23rd, 2009 at 11:40am Posted by A modified proposed model of Wikipedia growth (part four) - PARC blog

[...] mentioned in our first post on the slowing growth rate of Wikipedia [see also our second and third posts on the topic] it appears that Wikipedia article growth peaked [...]

 

January 28th, 2010 at 2:40pm Posted by The slowing growth of Wikipedia (part two): resistance from dominant editors - PARC blog

[...] one of this post, which shared findings on the slowing growth of Wikipedia, recently received coverage in the New Scientist (as well as Fast Company, Business Insider/ [...]

 
