SOTF Word-Count Project
Posted: Wed Oct 17, 2018 6:41 pm
Welcome to the SOTF Word-Count Project Thread. This thread tracks the word-counts of completed versions of SOTF, mostly as a point of curiosity. It bears constant repeating that word-count is not indicative of character quality in any way. It is sort of indicative of character length, though even as a metric of that it's imperfect. To get a character's full story you'll typically also be reading their threadmates, whose lengths can vary wildly; word-count illustrates only the rough length of a character's personal narrative as presented in their own posts.
Version-wise, word counts may be inaccurate to greater or lesser degrees. Due to my method of calculation, shared text is counted twice (or more, depending on the number of characters sharing it), inflating the total. Text not belonging to any character on the death order (the V4 death squad, mysterious seagulls, etc.) and announcement text are not counted at all, deflating the total. I suspect the inflation significantly outweighs the deflation, but don't care nearly enough to check; per-version counts are a happy accident of the parsing of the data that actually interests me (the character data).
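To make the inflation and deflation concrete, here is a minimal sketch of that kind of tally. It is not the actual parsing method used for this project; the post records, the character names, and the whitespace-based `count_words` helper are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical post records: (characters credited with the text, the text).
posts = [
    (["Alice"], "one two three"),
    (["Alice", "Bob"], "four five six seven"),  # shared text
    ([], "announcement text here"),             # belongs to no character
]

def count_words(text):
    # Assumption: a "word" is any whitespace-separated token.
    return len(text.split())

def per_character_counts(posts):
    # Every credited character gets the full count of a shared chunk.
    totals = defaultdict(int)
    for characters, text in posts:
        for name in characters:
            totals[name] += count_words(text)
    return dict(totals)

def version_total(posts):
    # Sum of character totals: shared text is counted once per sharer,
    # and text with no credited character is not counted at all.
    return sum(per_character_counts(posts).values())

def true_total(posts):
    # Every word counted exactly once, for comparison.
    return sum(count_words(text) for _, text in posts)
```

In this toy data the version total comes out one word higher than the true total: the four shared words are counted twice, while the three announcement words are dropped.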
I'm ordering each version both by death order and by size from short to long. In addition, I am pulling out a long-to-short Top Twenty-Five for all versions including more than 100 characters (every Main version to date) and a Top Ten for every version including fewer than 100 characters (every Mini version to date). I'm also tracking the average length of characters as the game progresses. For my purposes, game progress is measured in announcements, which are in turn roughly reflective of death order. This is a flawed but functional enough metric.
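The orderings described above could be sketched roughly as follows. This assumes each character is a hypothetical `(name, word count, announcement number)` record, and it reads "average length as the game progresses" as the average of the characters who died during each announcement period, which is only one possible interpretation. The names and numbers are invented.

```python
from statistics import mean

# Hypothetical records, listed in death order:
# (name, word count, announcement number covering the death).
characters = [
    ("A", 1200, 1),
    ("B", 300, 1),
    ("C", 4500, 2),
    ("D", 2500, 2),
]

def short_to_long(chars):
    # Size ordering, shortest character first.
    return sorted(chars, key=lambda c: c[1])

def top_n(chars, n):
    # Long-to-short list of the n longest characters.
    return sorted(chars, key=lambda c: c[1], reverse=True)[:n]

def top_list_size(chars):
    # Top Twenty-Five above 100 characters, Top Ten otherwise.
    # Assumption: a version of exactly 100 is treated as a small one.
    return 25 if len(chars) > 100 else 10

def average_by_announcement(chars):
    # Average word count of the characters in each announcement period.
    groups = {}
    for _, words, announcement in chars:
        groups.setdefault(announcement, []).append(words)
    return {a: mean(w) for a, w in sorted(groups.items())}
```

Death order needs no sorting here because the records are assumed to be stored in that order already.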
I've noted characters who fall into a number of unusual circumstances. Most commonly, characters with an @ symbol after their names share a chunk of at least one post with at least one other character, inflating their total. In some cases, this is about 100 shared words in a single post. In others, literally all of a character's story is shared text. I made no effort to differentiate; any line drawn would be arbitrary and add lots of extra work.
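As a rough illustration of the @ marking, the sketch below flags every character credited on at least one shared chunk, without regard to how much text is shared. The data layout and the exact `"Name @"` spacing are assumptions for the example, not a description of how the lists were actually produced.

```python
# Hypothetical post records: (characters credited with the text, the text).
shared_example = [
    (["Alice"], "solo text"),
    (["Alice", "Bob"], "shared text"),
    (["Cara"], "more solo text"),
]

def flag_shared(posts):
    names = set()
    shared = set()
    for characters, _ in posts:
        names.update(characters)
        # Any chunk credited to more than one character marks all of them,
        # whether it is a few words or their entire story.
        if len(characters) > 1:
            shared.update(characters)
    return [f"{name} @" if name in shared else name for name in sorted(names)]
```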
I'm also compiling an all-time ranked count list, updated for each version. This will be available in an overall list containing every character to that point, ranked short-to-long, and a Top Fifty, ranked long-to-short. I have maintained this scheme even for the first version on Main and Mini, producing some quirky results as the price for consistency. Upon overall completion of both sites' counts, I will create one combined short-to-long list and a long-to-short Top One Hundred. I'm not going to try pinning the two sites to each other retrospectively; Mini didn't even exist until V4, and I can't remember the precise timings (and find such fiddliness of little worth). Minis are ordered by start date, as many ran in part simultaneously.
This initial post will serve as an index, with links added as my analysis is completed. Feel free to reply, discuss, etc.; the post order doesn't matter since the index will link directly to each post. I will only be adding versions to this archive upon completion, and once more I caution against drawing any conclusions from this: long characters are not by default better or even more involved or developed. Do not make me regret this by padding word counts going forward. That's a bad time for everyone and makes posts miserable to read.
With no further ado:
Index
V1 Stats / V1 Pre/Post-Game Stats
V2 Stats / V2 Pre/Post-Game Stats
V3 Stats / V3 Pre/Post-Game Stats
V4 Stats / V4 Pre/Post-Game Stats (Part 1, Part 2)
V5 Stats / V5 Pre/Post-Game Stats (Part 1, Part 2)
V6 Stats