Civ4 AI Survivor Season 2: The AI-stralian Outback


Conclusions

Season Two of Civ4 AI Survivor concluded in May of 2015 with an outstanding finish in the Championship match. If you haven't seen the results yet and want to remain unspoiled, it's time to stop reading this page right now. The rest of this conclusion section goes into a detailed breakdown of the data from this year's competition. Special thanks go to Livestream viewer LinkMarioSamus for compiling most of this data and emailing it to me. I'm not sure I would have had the time to type up this summary without his help. We'll start with the overall ranking of the 52 leaders, presented in an Excel table screenshot like last year:

The overall shape of the results looked similar to the first Survivor competition, which was to be expected given that the same format was used again. There were fewer early eliminations this time around: last year the first death took place on Turn 124 (Louis) and there were six eliminations on Turn 150 or earlier, while only one such early fall happened this year (De Gaulle). The opening round had fewer eliminations in general, leading to the huge Wildcard game, and there were more total survivors. Perhaps that's because the maps were better balanced this time around, with fewer cases of one AI leader getting boxed in and rolled up early on. Twelve leaders managed to make it through this year's version of the Hunger Games without biting the dust, compared to only eight from last year. Exactly two leaders have survived both years: Zara Yacob and Catherine. I don't think either one comes as a huge surprise; they are both clearly strong AI performers.

It was a rough year for a lot of the multi-leader civilizations. France failed to get any of its three leaders out of the opening round last time, and failed once again this year. A French leader has also been dead last in both seasons, with Louis taking the spot in 2014 and De Gaulle occupying the same position in 2015. In Civ4 AI Survivor, the French leaders apparently are indeed cheese-eating surrender monkeys. England had a good run last year but couldn't get any of its three leaders into the playoffs this time around. Germany was an absolute disaster, with its two leaders occupying 50th and 46th places respectively. Ugh. Was that Weimar Germany in action? America had a poor competition as well, with FDR's 19th place finish in the Wildcard game the only thing salvaging the early exits of Lincoln (48th) and Washington (51st). Greece had the most polarized results, as Alexander tied for 48th place while Pericles made it all the way to the championship, finishing just off the medal stand in fourth.

At the other end of the spectrum, it was an excellent competition for the Mesoamerican civs. The Incans and Mayans took home two of the top three spots, with only the Aztecs having a poor showing. (Oh Monty, don't ever change! So much craziness.) The Celts once again sent both of their leaders into the playoffs, although for a second time neither one was able to make the championship itself. Brennus and Boudica appear to be the Buffalo Bills of the Survivor competition thus far. India also sent both of its leaders to the playoffs, and Russia again had a strong performance, if not quite as good as last year. I'll also mention that Egypt managed to improve its performance to "average" this year, up from "atrocious" after an early double elimination last year. Hatshepsut's troll-tastic survival performance saved the day for the Egyptians in the overall rankings.

We can also expand the leader rankings across both years of competition, with some very interesting results. De Gaulle truly stands out for his ineptitude as an AI leader, with an average finish of 51st place out of 52 competitors. It's truly hard to pull something like that off! The bottom spots are dominated by a mixture of excessive wonder-builders (Ramesses / De Gaulle / Louis) and religious fanatics (Montezuma / Isabella / Charlemagne). Those seem to be the AI personalities that do the worst. Willem's AI personality also appears to be pretty bad, with two early exits despite his incredible trait pairing. I've tried to find a consistent finishing pattern based on things like peace weight or aggression rating, and after two years I still can't see one. It's a mixed bag of leader personality types all the way up and down the list, with economic leaders and militaristic ones jumbled together. Neither one seems to be a guaranteed path to success in these games.

There are some definite surprises at the top of the list. The presence of familiar names like Catherine, Mansa Musa, Zara Yacob, Pacal, etc. is exactly what we would expect without even running the competition. Huayna Capac grades out as the overall top leader after two years, and there's a general consensus that Huayna Capac of Inca is the strongest restricted leader choice in Civ4 (or at least very close to it). Makes sense he would be at the pinnacle of the AI Survivor pyramid. Mao's presence in second place is a real shocker though - I mean, Expansive / Protective? Mao graded out only 0.5 spots behind Huayna, and if the Incan leader hadn't snuck out that cultural victory in the championship, Mao would have been ranked the top leader overall. Just goes to show the strength of the amazing Chinese civilization. I also tended to overlook Cyrus and Pericles during the course of this season, but they both made the championship game, and that was on top of reaching the playoffs in the previous year. Cyrus AI and Pericles AI are apparently stronger than I thought. The Celtic duo round out the bottom of this group, also something that I never would have seen coming. Would you ever pick Brennus (Spiritual / Charismatic) or Boudica (Aggressive / Charismatic), both of them paired with the crummy Celtic civ, to be top ten finishers? They've been doing it for two years now, however, so shows what I know.

Some of the AIs have been models of consistency from year to year. Catherine finished in 9th place last year and 8th place this time, narrowly missing the championship both seasons. Three other leaders were nearly identical, from Mao's continued excellence (7th and 5th) to Hannibal's mediocrity (26th to 24th) to De Gaulle's unrivaled ineptitude (50th to 52nd). Conversely, Lincoln had the most variable performance, swinging 41 spots (!) from 8th place last year to 49th place this year. Other inconsistent leaders were Elizabeth (36 spots worse), Alexander (35 spots worse), and last year's champion Justinian (35 spots worse). Gandhi had the best improvement year over year, finishing 35 positions higher than in 2014. Something about that number 35, I guess. A lot of these leaders make logical sense to appear high on the variability scale, as Gandhi / Elizabeth will either run away with the game economically or get invaded and run over, with the converse result for a warmonger like Alexander who either succeeds with early aggression or fails and gets stunted. Justinian was probably lucky to win last year, and the Byzantine leader accomplished little of note in 2015.

Two years of data also allow us to evaluate the performance of the leader traits again:

These tallies use the leader ranking data from the previous section, averages of both competitions. The second season helped smooth out some of the randomness in results from the first season, producing a trait ranking that looks a lot closer to what we would expect to see. The biggest change was the cratering of the Philosophical trait, which graded out as the best performing trait in 2014. I was surprised to see that result last year, since I'd never seen much evidence that the Philosophical trait boosted AI performance. But seven of the nine Philosophical leaders made it at least to the Wildcard game, five of them reached the playoffs, and two of them were in the championship. Either I had been wrong about the trait, or this was the result of small sample size bias. As it turned out, another year made a huge difference indeed. Philosophical was an awful performing trait in Season Two, with only two of the nine leaders finishing in the top half of the field. Seven of the nine leaders finished in 27th place or lower, including finishes in 48th place (Lincoln) / 48th place (Alexander) / 50th place (Frederick). Average score for a Philosophical leader this year was 32.56, the worst finish by any trait in either season. When averaged together with last year's results, the Philosophical trait grades out at 26.67, right in the middle of the field. If we get another year's worth of data, I expect it to slip below Charismatic and Spiritual. The AI always gets a million Great People on Deity difficulty; getting slightly more of them from Philosophical doesn't help very much.

With the Philosophical aberration from last year reverting back towards its likely normal place, we can see the truly most valuable traits for this competition. Once again Imperialistic is at the top of the pecking order, something that makes perfect sense from an AI perspective. We've repeatedly seen that size = power in these games, and Imperialistic leaders tend to grab more of it on average. Even the most marginal iceball locations are still profitable for the Deity AIs to possess, between their ultra-cheap growth costs and the pittance they pay in maintenance. The other top traits are the familiar faces that we would expect: Financial, then Creative, then Expansive, then Organized. These are the best economic traits in Civ4, and even the AI can't help but benefit from them. I was surprised when Creative graded out as the sixth best trait last year, behind Philosophical and Charismatic. This two year average looks a lot more like what we would expect to see there, with a strong season of Creative performances evening out the weak showing from last year.

The same three traits that finished at the bottom of last year's ranking remained there in this two year average. We tend to regard Aggressive and Protective as the two worst traits in our Multiplayer games; neither one offers much of anything in the way of an economic advantage, and the bonus to direct combat promotions gets nullified by the slower growth curve. I don't care if your Warriors have Combat I promotion if I'm playing Suryavarman and super-charging my early start with Expansive/Creative. (There are situations where Aggressive can be a good option, like short Ancient chokefests or late era games, but we typically don't use these settings.) The Aggressive and Protective traits don't seem to be much better in the hands of the AI. It doesn't help either that a lot of the leaders with these traits are far too militaristic for their own good, picking wars they can't win and stagnating their development in the process. The fact that Tokugawa gets the Aggressive/Protective pairing is pretty much a microcosm of why these traits stink.

And then there's the Industrious trait. This was the worst trait last year, and it was beaten out only by Philosophical for this year's lowest spot. It is firmly in last place in the two year average scores above. Only one Industrious leader made it to the playoffs; the other eight were all shut out, and six of the nine failed to make it even to the Wildcard game. Interestingly, the Industrious trait includes both the top leader in the competition (Huayna Capac) and the dead last finisher (De Gaulle). This trait looks significantly worse without Huayna Capac's championship performance; remove Huayna from the list, and the average score jumps all the way up to 36.63 - yeouch! Even at the current 32.67, this is the worst trait by a fair margin. Why does Industrious perform so poorly? For the same reason that Imperialistic performs so well: it's the anti-landgrab trait. While the other AI leaders are claiming territory, Industrious leaders (which tend to have heavy Build Wonder AI emphasis) are tying up their capitals on slow wonder builds of questionable value. The Industrious leaders wind up with less territory on average, and are therefore more likely to be behind in tech and rolled over by another neighbor. It really does appear to be the kiss of death - for everyone not named Huayna Capac, at least!
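The two Industrious averages quoted above can be cross-checked against each other. Here is a minimal Python sketch, assuming both figures are plain averages over the nine Industrious leaders (and over the remaining eight once Huayna is removed). The variable and function names are purely illustrative and not taken from the original spreadsheet.

```python
# Figures quoted in the text; everything else here is an illustrative assumption.
INDUSTRIOUS_LEADERS = 9        # nine Industrious leaders in the field
AVG_WITH_HUAYNA = 32.67        # two year average for the whole trait
AVG_WITHOUT_HUAYNA = 36.63     # average once Huayna Capac is removed


def implied_score(avg_all, avg_rest, count):
    """Back out one member's score from the group average with and without him."""
    total_all = avg_all * count           # sum of all scores in the group
    total_rest = avg_rest * (count - 1)   # sum of scores without the removed member
    return total_all - total_rest


huayna_score = implied_score(AVG_WITH_HUAYNA, AVG_WITHOUT_HUAYNA, INDUSTRIOUS_LEADERS)
print(f"Implied Huayna Capac score: {huayna_score:.2f}")
```

The implied score comes out at roughly 1, which lines up with Huayna Capac sitting at the top of the overall ranking. The small gap from exactly 1 is what you would expect from the averages being rounded to two decimal places.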

I've also tallied our running kill total for the leaders. Huayna Capac is this year's winner of the Golden Spear award, finishing off five opponents en route to his overall victory. The Incan emperor also picked up three kills in Season One, putting him well out in front of the field at eight in total. I was surprised to see Cyrus in second place, pairing his two kills from 2014 with an outstanding four more from this season. Another signal that I consistently underestimated Cyrus. Catherine also figures prominently into this list, to no one's surprise, and she's put up those five kills in only four games total. Mao wasn't able to sustain all the momentum from last year (he took the Golden Spear award for Season One) despite making the championship and having three games to try and pad his totals. Still, a tie for third isn't bad at all.

Suleiman, Suryavarman, and Zara Yacob all had disappointing seasons and weren't able to keep the magic going this time around. Julius Caesar, Kublai Khan, and Pacal had the opposite result, putting up big numbers in their second appearance. Mansa Musa was slow and steady, consistent as always, exactly what you'd expect from him. He'd probably have more kills without his terrible playoff game draw against Ragnar the troll. There's plenty of weird stuff on this list, like ultra-pacifists Elizabeth and Lincoln having two kills apiece, while Montezuma, Alexander, Charlemagne, Gilgamesh, and Hannibal combine for a grand total of zero. Even Gandhi has one kill! Over Gilgamesh, no less. Somehow Napoleon also has two kills (one per year) despite finishing in 32nd and 38th place, never making it out of the opening round. Meanwhile, Asoka has made the Wildcard game the first year and the playoffs the other year, all without managing to secure a single kill. Zero kills in four games is quite the pacifist. Nineteen leaders in total have failed to claim a scalp thus far. I guess they'll have to wait for next year to try and share in the general bloodlust.

Finally, we finish by looking at the rankings on a per-game basis, to see which matches had the highest level of competition. This table uses the two year average ranking of leaders rather than trying to pick one year or the other. A single year's data wouldn't be very useful here, as it would only claim that all the leaders who made the championship were the strongest. The competitors in each individual match are listed in order of finish from left to right, along with their overall two year power ranking. I've then calculated each game's average leader ranking on the right side.

There are a number of interesting tidbits to digest here. Since the median score in a group of 52 leaders will be 26.50, we would expect that to be the average score for the opening round games, and that's essentially what we do find here. There were seven games with scores higher, seven games with scores lower, and two games with nearly identical average scores (26.17 apiece). The results look close to a normal distribution, with a mean of 26.27 and a standard deviation of 3.89. Unfortunately there are only two years' worth of data here (N=16), which is a bit on the small side. It's still better than the N=2 being used to determine the rankings themselves though! I guess we need to run this for another thirty or so years to get a decent sample size.
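For anyone who wants to check the 26.50 figure, here is a minimal Python sketch. It assumes the leaders simply occupy ranks 1 through 52 with no ties, which is a simplification of the two year averaged rankings used in the table.

```python
from statistics import mean, median

NUM_LEADERS = 52  # size of the field, from the text

# Assumption: one leader per rank, 1 through 52, ignoring ties.
ranks = range(1, NUM_LEADERS + 1)

expected_average = mean(ranks)         # average rank across the whole field
expected_median = median(ranks)        # midpoint between ranks 26 and 27
closed_form = (NUM_LEADERS + 1) / 2    # shortcut for the mean of 1..N

print(expected_average, expected_median, closed_form)
```

Because every leader plays in exactly one opening round game, the game averages have to balance out around this midpoint when the games are all the same size. A strong draw in one game means a weaker draw somewhere else.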

For whatever reason, most of the extreme games on the ranking scale tended to take place in Season Two. Pure random chance at work there. Season Two Game 1 grades out as having the strongest field, with an average score of 18.17. That's two full standard deviations away from the mean, which indicates that this was a rare field indeed. Under a normal distribution, we'd expect a game to land this far from the mean (in either direction) roughly once in every 20 games. The funny thing is that I didn't think much of this collection of leaders when it was drawn. However, Pericles and Mao both currently sit in the top five overall, and they both made the championship this year, plus Elizabeth and Zara Yacob both made the championship last year. That averages out to a strong group indeed. No other opening round game to date has had three leaders present in the overall top ten. This collection of leaders truly was an outlier, as it's a full standard deviation better than the second-strongest field (Season Two Game 8). In fact, this match nearly outranked two of the playoff games with average scores around 17! Too bad the game itself wasn't all that interesting to watch.

There's a similar game at the other end of the distribution. Season Two Game 5 stands out for having an especially weak crop of leaders. At an average score of 34.83, this one is actually more of an outlier than Season Two Game 1 (it's 2.20 standard deviations away from the mean), although for all of the wrong reasons. Hey, remember how we all couldn't understand how Shaka and Tokugawa were able to advance to the playoffs this year? Now we have our answer: the two of them were competing in the weakest field of leaders in any AI Survivor game to date, and by a huge margin. Hannibal has been mediocre thus far, Frederick only managed to advance to the Wildcard round in Season One by rolling a game full of peaceful AIs, and the Isabella / Willem duo are two of the worst performing leaders in the competition. I'm amused at how this was nonetheless a wildly entertaining game, capped off with Shaka's extremely early Domination victory. I actually think having the presence of poor AI leaders tends to make for more interesting games, since we end up with more extreme swings in the performances.
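The "standard deviations away from the mean" figures for the strongest and weakest opening round games can be reproduced from the numbers quoted in the text. This is a minimal Python sketch, assuming the game averages are roughly normally distributed with the stated mean of 26.27 and standard deviation of 3.89. The function names are illustrative only.

```python
from statistics import NormalDist

MEAN = 26.27    # mean of the opening round game averages, as quoted
STDEV = 3.89    # standard deviation of those averages, as quoted


def z_score(game_average, mean=MEAN, stdev=STDEV):
    """How many standard deviations a game's average sits from the mean."""
    return (game_average - mean) / stdev


def two_sided_tail(z):
    """Chance of landing at least this far from the mean in either direction,
    assuming a normal distribution (an approximation with only 16 games)."""
    return 2 * NormalDist().cdf(-abs(z))


strongest = z_score(18.17)   # Season Two Game 1, about -2.08
weakest = z_score(34.83)     # Season Two Game 5, about +2.20

for label, z in (("Season Two Game 1", strongest), ("Season Two Game 5", weakest)):
    print(f"{label}: z = {z:+.2f}, two-sided tail = {two_sided_tail(z):.3f}")
```

Lower scores are better here, so the negative z-score marks the strong field and the positive one marks the weak field. With only 16 games in the sample, the tail probabilities should be read as rough guides rather than precise odds.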

Both of the Wildcard games have nearly identical values, both sitting right around 23. That makes sense to me, as the Wildcard as currently set up will contain a whole bunch of leaders that finish just outside of the playoffs roughly in the 19-27 range. You would expect them to end up with an average score roughly in the middle. The playoff games have shown much more variance, ranging from Season One Playoff 1 (8.50) to Season Two Playoff 3 (17.67). That's a huge difference right there! If we keep running the games, the variance should get even larger with time, as the AI leaders that make the playoffs a single time due to random chance will eventually get ranked appropriately in lower overall spots. We'll be better able to identify the lucky ones with more years of competition in the books. Already some of these playoff fields look like the product of fortunate circumstances; Lincoln and Alexander look quite weak from Season One Playoff 2, and the Gandhi/Ragnar pairing from Season Two Playoff 3 also looks out of place. On the other hand, Season One Playoff 1 had a dynamite field: Huayna Capac, Catherine, Cyrus, Season One champion Justinian, Suryavarman, and then the random presence of Augustus. I should have rolled a better map for that game...

The two championships thus far have had very different average rankings. Season Two's final match has the strongest overall average score to date at 6.67 (lower is better), and a field of some very strong customers. Huayna Capac, Mao Zedong, Cyrus, and Pacal are four of the top five leaders overall - wow! Now they're partly listed so high simply because they made the championship, making it a self-fulfilling prophecy of sorts, but look at how much weaker the first championship game grades out. No one in the top five, and Mansa / Zara the only ones cracking the top ten. In fact, the Season One championship scores out as only an average playoff game right now. That's quite a bit of a drop - it was not a good season for last year's finalists. Will we have a winner's curse in future years?


Huayna Capac, Survivor Season Two Gold Medalist

It's been a ton of work running this competition, but it's also been a great deal of fun. The decision to run the games on Livestream this year was a great success, and the interaction with the viewers made it all worthwhile. Using Google Forms to submit the picking contest entries was another winner on every front. I can't believe that I collected them all manually from forum posts last year, no wonder I made so many mistakes! It also helped that I became a lot better at using Excel in the intervening year - thanks, new job. With luck, we'll be back again next year for another round of this competition, maybe with another few tweaks. Until then, stay safe and never go full Wang Troll Kon in your own Civ4 games!