January 04, 2006

Predicting Guild Survival

In an earlier post, we found that about 21% of guilds are not seen again after a one-month period. Since then, we've focused on creating metrics for comparing guilds, including metrics that take each guild's social network into account. These metrics let us address a question we couldn't address before: how well can we predict whether a guild will survive?

To perform this analysis, we took the first and last 10 days of August and extracted all unique guilds in both samples. We only included a guild if its guild size was greater than one. We then calculated the following metrics for each guild:

Size: The number of unique characters seen to bear this guild tag over the 10 days.

Max Subgraph: The number of members in the largest subgraph. See this post for more information.

Mass Count: The number of subgraphs larger than 3 within a guild. In other words, dyads are not counted.

Total Count: The number of subgraphs in a guild, including dyads.

Density: Connections between guild members can be mapped out as a matrix. The density of a guild is the percentage of cells in that matrix that are filled in.

Centrality: For each guild member, their degree centrality is the number of connections they have divided by the total number of connections they could have (i.e., the guild size - 1). The guild's centrality is the average of all of its characters' centrality scores.
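
As a rough illustration of the definitions above, here is a minimal Python sketch that computes all six metrics for one guild. It rests on several assumptions that the post does not confirm: a "subgraph" is treated as a connected component, connections are undirected, the diagonal of the connection matrix is ignored, and "larger than 3" is read literally as more than 3 members. The function name, the input format, and the character names are hypothetical.

```python
from collections import defaultdict

def guild_metrics(members, edges, mass_threshold=3):
    """Compute the six metrics for one guild.

    members: iterable of character names (hypothetical input format).
    edges: iterable of (a, b) pairs, one per undirected connection.
    mass_threshold: components must be larger than this to count as "mass".
    """
    members = set(members)
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b and a in members and b in members:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Connected components via depth-first search. Members with no
    # connections never appear here, so the smallest component is a dyad.
    seen, component_sizes = set(), []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        component_sizes.append(size)

    n = len(members)
    n_edges = sum(len(v) for v in neighbors.values()) // 2
    possible_edges = n * (n - 1) / 2
    density = n_edges / possible_edges if possible_edges else 0.0
    # Degree centrality per member, then averaged over the whole guild.
    centrality = (
        sum(len(neighbors.get(m, ())) / (n - 1) for m in members) / n
        if n > 1 else 0.0
    )
    return {
        "size": n,
        "max_subgraph": max(component_sizes, default=0),
        "mass_count": sum(1 for s in component_sizes if s > mass_threshold),
        "total_count": len(component_sizes),
        "density": density,
        "centrality": centrality,
    }

# Made-up guild: one group of four, one dyad, one unconnected member.
example = guild_metrics(
    ["a", "b", "c", "d", "e", "f", "g"],
    [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d"), ("e", "f")],
)
print(example)
```

Under these particular assumptions, density and average degree centrality work out to the same number (both equal twice the number of connections divided by n(n - 1)). The values in the actual analysis may differ if the connection matrix was built differently, for example with directed or weighted connections.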

We then looked at the data from the last 10 days of August. If a guild seen in early August was not observed in late August, we marked it as "dead". Otherwise, we marked it as "survived". Using this method, we had 4,259 unique guilds in our sample. Of those, 924 (or 22%) were not seen at the end of August and marked as "dead".
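
The labeling step amounts to a set-membership check between the two samples. The sketch below shows the idea with made-up guild tags; the function name is hypothetical, and only the final two figures (924 and 4,259) come from the analysis itself.

```python
def label_guilds(early, late):
    """Label each guild from the early sample as 'dead' or 'survived'."""
    late = set(late)
    return {g: ("survived" if g in late else "dead") for g in set(early)}

# Made-up guild tags, purely for illustration.
early_sample = {"Alpha", "Bravo", "Charlie", "Delta"}
late_sample = {"Alpha", "Charlie", "Echo"}

labels = label_guilds(early_sample, late_sample)
dead_share = sum(v == "dead" for v in labels.values()) / len(labels)
print(labels, dead_share)

# Figures reported above: 924 of 4,259 guilds were marked "dead".
reported_share = 924 / 4259
print(round(reported_share * 100))
```

Guilds that appear only in the late sample ("Echo" here) are ignored, since the question is what happened to guilds seen in early August.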

We then ran a logistic regression with survival as the dependent variable and the metrics listed above as the predictor variables. To make a long story short, Size and Mass Count were the only two substantial predictors. Both were positively correlated with guild survival. Including the other predictors did not improve our model significantly. The R-squared for the model using these two predictors was 0.12.
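
For readers unfamiliar with logistic regression, the following is a minimal sketch of fitting a two-predictor model by plain gradient ascent, using only the Python standard library. It is not the procedure or software used for the analysis, which the post does not describe. The 16 rows of data are invented so that both predictors help, and the function names, learning rate, and step count are all assumptions.

```python
import math

def standardize(column):
    """Rescale a column to mean 0 and (population) standard deviation 1."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]

def fit_logistic(features, labels, lr=0.5, steps=5000):
    """Return [intercept, w1, w2, ...] fitted by gradient ascent
    on the mean log-likelihood."""
    n, k = len(labels), len(features[0])
    weights = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, y in zip(features, labels):
            z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
            err = y - 1.0 / (1.0 + math.exp(-z))  # observed minus predicted
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w + lr * g / n for w, g in zip(weights, grad)]
    return weights

# Invented data: (size, mass count, survivors out of 4 guilds per cell).
rows = []
for size, mass, survivors in [(2, 0, 1), (2, 1, 2), (10, 0, 2), (10, 1, 3)]:
    rows += [(size, mass, 1)] * survivors + [(size, mass, 0)] * (4 - survivors)

sizes = standardize([r[0] for r in rows])
masses = standardize([r[1] for r in rows])
features = list(zip(sizes, masses))
labels = [r[2] for r in rows]

intercept, w_size, w_mass = fit_logistic(features, labels)
print(intercept, w_size, w_mass)
```

On this toy data both slopes come out positive by construction, mirroring the direction of the effects reported above. The magnitudes say nothing about the real guild data.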

We then went ahead and calculated the survival likelihood of guilds in our sample. Using a strict cut-off, the model that was provided by the logistic regression was accurate in 73% of the "death" cases and 69% of the "survival" cases. The model provided results that were better than chance alone, but not as strong as we would have liked. On the other hand, the exercise showed that simple metrics of individual guilds can be used to predict their long-term survival.
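
The two accuracy figures are per-class rates: of the guilds that actually died, how many did the model call "dead", and likewise for survivors. A minimal sketch of that calculation follows. The post does not state the cut-off value, so the 0.5 used here is an assumption, as are the function name and the made-up probabilities.

```python
def per_class_accuracy(probabilities, outcomes, cutoff=0.5):
    """Return (accuracy on dead guilds, accuracy on surviving guilds).

    probabilities: predicted survival probability for each guild.
    outcomes: 1 if the guild survived, 0 if it was marked dead.
    A class with no cases yields None instead of a rate.
    """
    hits = {0: 0, 1: 0}
    totals = {0: 0, 1: 0}
    for p, y in zip(probabilities, outcomes):
        predicted = 1 if p >= cutoff else 0
        totals[y] += 1
        if predicted == y:
            hits[y] += 1
    rate = lambda c: hits[c] / totals[c] if totals[c] else None
    return rate(0), rate(1)

# Made-up predictions for six guilds: three dead, three survived.
probs = [0.2, 0.4, 0.6, 0.7, 0.3, 0.9]
actual = [0, 0, 0, 1, 1, 1]
dead_acc, survived_acc = per_class_accuracy(probs, actual)
print(dead_acc, survived_acc)
```

Reporting the two rates separately matters here because the classes are unbalanced: a model that always guessed "survived" would be right for most guilds overall while catching none of the deaths.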

Server Sample: RP (High), PvE (Medium), PvE (High), PvP (High), PvP (High)
Sampling Period: First and last 10 days of August.
Sampling Resolution: ~12 minutes
Parsing Method: The sample unit is each unique guild with a guild size greater than 1.
Data Filter: None
Sample Size: 4,259 guilds

Posted by Nick & Nic

Posted at January 4, 2006 05:43 PM
