Data analysis, round 2.
Fig. 1 – xy
As before, each red dot represents a character in the show, and each blue vector represents one of the personality traits. The direction and magnitude of each vector represent its contribution to the principal components (the data can be summarized in terms of two components which account for most of the variance – Component 1 and Component 2).
However, this time, the data had more variance, so I plotted using three dimensions. The graph above shows Component 1 (x axis) versus Component 2 (y axis), and is the most useful. The graphs below show Component 1 and Component 2 versus Component 3 (z axis), which provide a little more insight into the results.
Figs. 2, 3 – xz, yz
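For anyone who wants to try this kind of plot themselves, here's a minimal sketch in Python of how the dots and vectors in a plot like this can be computed. It is not necessarily the procedure used for the figures above: the pca function, the use of NumPy's SVD, and the random placeholder matrix (40 made-up rows by 7 trait columns) are all assumptions for illustration.

```python
import numpy as np

def pca(data, n_components=3):
    """Return (scores, loadings, explained_ratio) for a rows-by-traits matrix."""
    # Center each trait column so the components describe variance, not the mean.
    centered = data - data.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # where each row lands (the dots)
    loadings = vt[:n_components].T                    # one row per trait (the vectors)
    explained = (s ** 2) / np.sum(s ** 2)             # share of variance per component
    return scores, loadings, explained[:n_components]

# Placeholder input only: random numbers, NOT the survey data.
rng = np.random.default_rng(0)
fake_data = rng.normal(size=(40, 7))

scores, loadings, explained = pca(fake_data)
print("share of variance per component:", explained)
```

Plotting columns 0 and 1 of scores and loadings gives an xy view, and swapping in column 2 gives the xz and yz views.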
So, now for the actual analysis of the results.
When comparing this data set to the previous one, two things jumped out at me. First, this data set contains a lot more variance, which is only to be expected because of the larger sample size. Unfortunately, that's bad for interpreting data – or so I thought. When I analyzed it, the results actually turned out better than the first set, and this is apparent in Fig. 1. If you look at the traits, you'll notice that Humorous, Casual, and Witty are all more or less aligned along Component 1 (x axis), and that Serious, Intellectual, Passionate, and Witty are aligned along Component 2 (y axis).
This is great news for the analysis, because it means that Component 1 can be interpreted as representing light-heartedness, while Component 2 represents seriousness.
Once that interpretation is reached, Figs. 2 and 3 become more important. Fig. 2 (left) represents light-heartedness treating seriousness as a constant, and Fig. 3 (right) does the converse; the graphs show the absoluteness of the two general traits. Component 3 (z axis) can't be correlated with any overarching traits, so it's essentially irrelevant. It does, however, still serve to relate the characters to the traits.
So, here's a character-by-character rundown of the data.
Applejack: Serious, witty, and humorous, with a tendency towards being passionate.
Fluttershy: Difficult to interpret, but shows a slight tendency to seriousness.
Pinkie Pie: Intellectual, serious, and the least casual.
Rainbow Dash: Tending slightly towards serious and passionate. Overall, pretty similar to Fluttershy, but more humorous.
Rarity: Passionate, witty, and prone to image-posting.
Twilight Sparkle: Casual and humorous, sometimes posts images, and with a slight tendency towards being intellectual.
Now, it's interesting to note the differences between these traits and the characters as they appear on the show, especially with Pinkie. That would seem to indicate flaws with the survey itself and how it calculates personalities.
I've also been looking at the statistics of the posters' personalities. The most common is Twilight Sparkle with a hint of Fluttershy, followed by Applejack + Twilight Sparkle. That shows that there's a lot of diversity on the thread, but that there are sort of two main groups of posters – the casual ones and the serious ones. Now, we already knew that, but what's interesting to note is that the serious posters have a more casual side, while the casual posters don't really have a serious side. This can be seen in two ways: by looking at the combinations of characters, or by looking at how one-dimensional the Casual and Serious vectors are in Fig. 1. It's also interesting to see that the shortest vectors are Images and Passionate, which shows that they don't depend so much on personality.
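To put numbers on "shortest vector" and "one-dimensional", here's a rough Python sketch. It assumes each trait vector is given as an (x, y) pair in the Component 1 / Component 2 plane. The vector_summary function, the trait_a/trait_b/trait_c names, and the coordinates are all made up for illustration and are not read off Fig. 1.

```python
import numpy as np

def vector_summary(vectors_2d, names):
    """Map each name to (length, share of squared length along Component 1)."""
    summary = {}
    for name, (x, y) in zip(names, vectors_2d):
        length = float(np.hypot(x, y))
        # 1.0 means the vector lies purely along the x axis, 0.0 purely along y.
        share_c1 = float(x * x / (length * length)) if length > 0 else float("nan")
        summary[name] = (length, share_c1)
    return summary

# Hypothetical coordinates, NOT values from the survey.
example = np.array([[0.9, 0.1],   # long, almost entirely along Component 1
                    [0.1, 0.8],   # long, almost entirely along Component 2
                    [0.2, 0.2]])  # short, split evenly between the two
summary = vector_summary(example, ["trait_a", "trait_b", "trait_c"])

shortest = min(summary, key=lambda name: summary[name][0])
print("shortest vector:", shortest)
```

A share close to 1 or 0 is what "one-dimensional" means here, and a small length means the trait contributes little to either component.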
So, overall, I think this little project has been a success; I've learned a lot about the posting habits of the different users, and now I better understand the dynamics of the thread as a whole. And, as a scientist, I always feel compelled to understand everything, and I hope that whoever bothers to read this gains a better understanding of the thread too.
Sorry about pushing this from the last page.