Tireless Tracker
Analyzing Your Own Cube Drafts
I’ve always adored data and statistics in the context of Magic (I’m a huge Frank Karsten fan). When I fell in love with Cube about three years ago, I was acutely interested in what data-based resources existed for Cube design. But Cube is a subjective enterprise. Different Cubes have different goals in gameplay and drafting, and drafters have preferred playstyles and varied skill levels that can shift a Cube metagame. Couple this with Magic’s inherent complexity, and you have a recipe for disagreements in both card and archetype evaluation.
I’ve been collecting data on my own Cube for almost two years, and I currently help the XMage Cube Group curate and analyze their 3-0 decklist dataset. I’ve analyzed cards like Experimental Frenzy and Demonlord Belzenlok to evaluate their power level through simulation. If a Cube problem exists and can be looked at with data or simulation, I’ll always give it a go.
This is the first installment in a series titled Tireless Tracker. The goal of this series is to answer questions about Magic using real-world data, with a particular focus on laying a statistical foundation for Cube discussion. This is an ambitious goal, as data analysis is never truly objective and is riddled with biases and noise. We’ll be fighting these shortcomings by clearly defining the questions we’re attempting to answer and acknowledging the limitations of our approaches.
Each installment of Tireless Tracker will have at least four components.
- Problem: This section will discuss what aspect of Magic or Cube we’re trying to investigate. I’ll discuss the problem’s features and complexities and what answering the question might tell us.
- Data: In this section, we’ll discuss the dataset. I’ll explain where the data comes from and how it was analyzed. I’ll also cover any confounders in the dataset, which are features that make analysis difficult.
- Analysis: This is where I’ll discuss my analysis of the dataset, what conclusions we can draw, and how confident we are in these conclusions.
- Resources: This section will contain links to the dataset, any code I used to analyze it, and any other resources pertaining to the article.
Analyzing Your Own Cube Drafts
I often talk with people on the MTG Cube Talk Discord about tracking my own Cube data, and I’ve been delighted to see that many also do this or are interested in doing so. The major hurdle everyone faces is finding a way to track Cube data that is both efficient and fruitful. Keeping track of everyone’s first and last picks from each pack is easy but may not tell you much. Keeping track of every individual pick and each decklist’s win rate yields lots of useful data but is too time-consuming.
Problem
How should a Cube owner collect data? I’ve been using this method for about two years now: after every draft, I ask the drafters to take a picture of their deck and send it to me. Later, I transcribe the decklists into a simple text file on my computer, noting the colors of the deck, its archetype, its match and game record, and the cards it contains. I then use Python to parse all these files and combine their information into one dataset. I will provide the Python script and instructions for use in Resources.
Currently, the script outputs a few different analyses:
- Archetype Analysis: This script will analyze the win rates of each archetype and which cards appear most frequently in that archetype. It allows for sub-archetypes — a UB Control Reanimator deck can be classified as both Control and Reanimator. If dates are given to the decklists, it will analyze their win rates over time.
- Card Analysis: This script will analyze which cards appear most in your decklists and their maindeck vs. sideboard rates if sideboards are given. It will also output individual win rates for each card (this feature comes with significant limitations in interpretability, see Analysis).
- Color Balance: This script will analyze which colors are most often drafted in your Cube. It will do this based both on the decks themselves and the cards in the decks. For example, if I have one UB Control deck in my dataset, blue and black share an equal archetype representation (0.5-0.5). But if that deck is playing 5 black cards and 18 blue cards, black will have a 5/23 = 0.217 card representation.
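As a rough illustration of the workflow described above, here is a minimal Python sketch that parses one transcribed decklist and computes the two color measures. It is not the script from Resources: the file layout (`Colors:`, `Archetype:`, and `Record:` header lines followed by one `Card Name | Color` line per card), the single-letter color codes, and the function names `parse_decklist` and `color_representation` are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical decklist format, invented for this sketch.
SAMPLE = """\
Colors: UB
Archetype: Control, Reanimator
Record: 5-2
Counterspell | U
Reanimate | B
Coldsteel Heart | C
"""


def parse_decklist(text):
    """Parse the assumed text format into a plain dict."""
    deck = {"colors": "", "archetypes": [], "wins": 0, "losses": 0, "cards": []}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Colors:"):
            deck["colors"] = line.split(":", 1)[1].strip()
        elif line.startswith("Archetype:"):
            # A deck may list several archetypes (e.g. Control and Reanimator).
            deck["archetypes"] = [a.strip() for a in line.split(":", 1)[1].split(",")]
        elif line.startswith("Record:"):
            wins, losses = line.split(":", 1)[1].strip().split("-")
            deck["wins"], deck["losses"] = int(wins), int(losses)
        else:
            # "|" is the separator because card names can contain commas.
            name, color = (part.strip() for part in line.rsplit("|", 1))
            deck["cards"].append((name, color))
    return deck


def color_representation(decks):
    """Return (deck-level shares, card-level shares) for each color."""
    by_deck, by_card = Counter(), Counter()
    for deck in decks:
        for color in deck["colors"]:
            # Each deck contributes one unit, split evenly among its colors.
            by_deck[color] += 1 / len(deck["colors"])
        for _name, color in deck["cards"]:
            if color != "C":  # assumption: colorless cards are not counted
                by_card[color] += 1
    deck_total = sum(by_deck.values())
    card_total = sum(by_card.values())
    return (
        {c: n / deck_total for c, n in by_deck.items()},
        {c: n / card_total for c, n in by_card.items()},
    )


# A UB deck with 18 blue + 5 black = 23 colored cards.
ub_deck = {
    "colors": "UB",
    "cards": [(f"Blue card {i}", "U") for i in range(18)]
    + [(f"Black card {i}", "B") for i in range(5)],
}
deck_share, card_share = color_representation([ub_deck])
print(deck_share)                 # {'U': 0.5, 'B': 0.5}
print(round(card_share["B"], 3))  # 0.217
```

How multicolor cards should be split between colors is left out of this sketch; a real script would have to pick a convention for them.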
Data
Over the past two years, I’ve collected 404 decklists from drafts of my Cube. It is a Strix Scale 8-F unpowered Cube, and I aim to maximize power and efficiency within the unpowered design restriction (for example, I include Mana Drain and Mind Twist). Reanimator and creature-cheat strategies are well supported, and I have no planeswalker quotas. I typically draft with 4-6 friends, with the occasional full 8-person draft or 2-3 player Winston draft. You can find all the decklists used in this article here.
Confounders
A confounder is any feature of a dataset that prevents accurate analysis. There are two main types of confounders: bias and noise.
A bias is a trend in the data that exists as a result of some external force. In the case of drafted decklists, the primary bias is drafter preference. As a player, I love drafting aggressive decks and will actively pick Goblin Guide and Sulfuric Vortex over most cards in the Cube. Because I am the most experienced player in my playgroup at drafting my Cube, this means that aggressive decks may be overrepresented in terms of win rate.
A similar bias exists for individual cards. If skilled players think that a mediocre card is good, that card may have a high win rate because skilled players draft it. The same is true in reverse: good cards that are underdrafted by skilled players may end up in the hands of less experienced players, resulting in a lower win rate.
Whereas bias refers to a consistent variation in a direction, noise refers to random variation. There are innumerable sources of noise in this dataset. For example, an aggro deck can do poorly in a draft because no aggro cards are opened, or an incredible deck can lose all its games due to random chance. Unlike bias, noise can be reduced with a large enough dataset, but it is never truly eliminated. Noise will be ubiquitous in all the analyses that we do.
Analysis
Archetype and Subarchetype Breakdown
In my own Cube, I’ve chosen to keep a higher order archetype breakdown (Aggro, Midrange, Control), and a sub-archetype breakdown (Ramp, Combo, Reanimator). This means that all decks are either Aggro, Midrange, or Control, but some have sub-archetypes (Control-Reanimator, Midrange-Ramp, etc). Here are their win rates in my Cube:
| Archetype | Decks | Game Record | Win Rate |
|---|---|---|---|
| Aggro | 102 | 445-328 | 0.58 ± 0.03 |
| Midrange | 169 | 656-658 | 0.50 ± 0.03 |
| Control | 135 | 490-532 | 0.48 ± 0.03 |
| Ramp | 63 | 300-214 | 0.58 ± 0.04 |
| Combo | 31 | 122-116 | 0.51 ± 0.06 |
| Reanimator | 25 | 74-97 | 0.43 ± 0.07 |
The win rates do not average to 0.50 because of sub-archetypes. For example, I classify all ramp decks as midrange decks. This leads to ramp decks being counted “twice” in the above table, increasing the overall win rate of midrange. It is clear that aggro and ramp decks are the top dogs in my Cube. Given the natural speed of these decks, this has led to a faster paced Cube environment than many other unpowered Cubes that I have played.
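The table does not say how the ± values were obtained. One plausible reading, and it is only an assumption, is a 95% normal-approximation margin of error on the game win rate. The sketch below applies that formula to the game records from the table; the function name `win_rate_with_margin` is invented for this example.

```python
import math


def win_rate_with_margin(wins, losses, z=1.96):
    """Game win rate plus a normal-approximation margin of error.

    z=1.96 corresponds to a 95% interval. That the article's ± values
    were computed this way is an assumption, not something the text states.
    """
    games = wins + losses
    rate = wins / games
    margin = z * math.sqrt(rate * (1 - rate) / games)
    return rate, margin


# Game records (wins, losses) copied from the archetype table.
records = {
    "Aggro": (445, 328),
    "Midrange": (656, 658),
    "Control": (490, 532),
    "Ramp": (300, 214),
    "Combo": (122, 116),
    "Reanimator": (74, 97),
}

for archetype, (wins, losses) in records.items():
    rate, margin = win_rate_with_margin(wins, losses)
    print(f"{archetype:<10} {rate:.2f} ± {margin:.2f}")
```

Under this assumption the printed values match the table, and the margin visibly widens for the archetypes with fewer games. Games played by the same deck are not independent of each other, so an interval computed this way is probably somewhat optimistic.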
We can also investigate the common cards in each archetype:
| Archetype | Most Common Cards |
|---|---|
| Aggro | Strip Mine, Sulfuric Vortex, Porcelain Legionnaire |
| Midrange | Polluted Delta, Recurring Nightmare, Demonic Tutor |
| Control | Ponder, Coldsteel Heart, Azorius Signet |
| Ramp | Birds of Paradise, Craterhoof Behemoth, Gaea's Cradle |
| Combo | Sneak Attack, Emrakul, the Aeons Torn, Oath of Druids |
| Reanimator | Entomb, Reanimate, Griselbrand |
The most common cards in each archetype tend to either be archetype enablers (Sneak Attack, Reanimate), powerhouses in the archetype (Sulfuric Vortex, Craterhoof Behemoth), or flexible cards that will fit any color deck in the archetype (Strip Mine, Coldsteel Heart). This makes sense given that these cards either pull you into an archetype or are flexible in terms of color commitment.
I keep track of the dates of the decklists, so we can also interrogate the change in archetype win rates over time. To do this, I use a rolling average, which examines the average win rates of archetypes in a certain window of time. This enables us to make comparisons between time frames. In this case I use a 70 deck rolling average.
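For readers who want to try this on their own data, a rolling average of this kind can be sketched in a few lines of Python. The details are assumptions rather than a description of the actual script: decks are dicts with `archetypes`, `wins`, and `losses` keys, they are already sorted by date, the window is a run of consecutive decks of any archetype, and games are pooled within each window. The name `rolling_win_rate` is hypothetical.

```python
def rolling_win_rate(decks, archetype, window=70):
    """Pooled game win rate of `archetype` in each run of `window` decks.

    `decks` must already be sorted by date. Returns one value per window
    position, or None where the archetype played no games in that window.
    """
    rates = []
    for start in range(len(decks) - window + 1):
        chunk = [
            d for d in decks[start:start + window]
            if archetype in d["archetypes"]
        ]
        wins = sum(d["wins"] for d in chunk)
        games = wins + sum(d["losses"] for d in chunk)
        rates.append(wins / games if games else None)
    return rates


# Tiny made-up dataset; a window of 2 keeps the arithmetic easy to follow.
decks = [
    {"archetypes": ["Aggro"], "wins": 6, "losses": 2},
    {"archetypes": ["Control"], "wins": 3, "losses": 4},
    {"archetypes": ["Aggro"], "wins": 2, "losses": 6},
    {"archetypes": ["Midrange", "Ramp"], "wins": 5, "losses": 3},
]
print(rolling_win_rate(decks, "Aggro", window=2))  # [0.75, 0.25, 0.25]
print(rolling_win_rate(decks, "Ramp", window=2))   # [None, None, 0.625]
```

A larger window smooths out more noise but reacts more slowly to real changes in the metagame, which is the trade-off behind any particular window size.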
We can see from this graph that aggro has always been a powerhouse in my Cube, although there have been periods when it was not a top performer. I’ve looked at the decklists from those periods and found that they were when the number of aggro decks per draft increased. The natural conclusion is that aggro’s average win rate decreased because people fought for the archetype. I suspect that this is why aggro does so well in my Cube generally: there are usually only one or two people drafting it. In theory, archetype win rates are self-correcting; players will realize which archetypes are the best and will compete to draft them, lowering their average win rate. Aggro decks likely dodge this self-correction, as many players who play Cube simply do not like playing aggro even if it is “optimal”. For me as a Cube designer, this presents a conundrum. Do I provide tools to other archetypes against aggro, or does this punish drafters for recognizing that aggro is underestimated and drafting it? While I’ve introduced some tools against aggro and ramp decks, such as Pyroclasm, Whipflare, and Plague Engineer, these are questions I haven’t yet answered for myself.
This analysis also supports something I’ve suspected for some time — the performance of an archetype in a Cube depends not just on the cards in the Cube but also on the playgroup. I have played many Cubes where midrange or control strategies were dominant, despite these Cubes supporting aggro and ramp just as well as my own. This is likely because the experienced players in those playgroups enjoy drafting those strategies. This difference trickles down to card evaluations as well; spot removal and wraths are very important in my Cube, where you need to interact or die against aggro and ramp decks. Slower value engines, however, are more important in a Cube where midrange or control decks are popular. This can dramatically affect how we perceive the power level of cards and is important to remember in discussions.
Individual Card Win Rates
When I first started collecting data on my Cube, I hoped to evaluate the strength of individual cards. In theory, strong cards lead to strong decks, so maybe looking at the cards in winning decks could identify the performers and the duds. The easiest approach is to look at card “win rates”. For example, if every deck that has Tinker in it wins every game it plays, then the “win rate” of Tinker is 100%. I analyzed the cards that have been in more than 20 decks (272 cards). I’ve chosen some illuminating examples to show below, but you can view the full table here.
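To make the definition concrete, here is a minimal Python sketch of a card “win rate” computed as the pooled game record of every deck containing the card. Pooling games across decks, the dict layout, and the name `card_win_rates` are assumptions for illustration; the actual script may differ.

```python
from collections import defaultdict


def card_win_rates(decks, min_decks=20):
    """Rank cards by the pooled game win rate of the decks that played them.

    Only cards appearing in more than `min_decks` decks are kept.
    Returns (card, games, win_rate) tuples, best win rate first.
    """
    stats = defaultdict(lambda: [0, 0, 0])  # decks, wins, games
    for deck in decks:
        for card in set(deck["cards"]):  # count each card once per deck
            entry = stats[card]
            entry[0] += 1
            entry[1] += deck["wins"]
            entry[2] += deck["wins"] + deck["losses"]
    rows = [
        (card, games, wins / games)
        for card, (n_decks, wins, games) in stats.items()
        if n_decks > min_decks and games > 0
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Made-up records; the threshold is lowered so the toy data passes it.
decks = [
    {"cards": ["Fireblast", "Jackal Pup"], "wins": 6, "losses": 2},
    {"cards": ["Fireblast", "Ponder"], "wins": 4, "losses": 4},
    {"cards": ["Ponder"], "wins": 2, "losses": 6},
]
print(card_win_rates(decks, min_decks=1))
# [('Fireblast', 16, 0.625), ('Ponder', 16, 0.375)]
```

Note that a number like this describes the decks a card shows up in, not the card in isolation, which is exactly why the biases discussed under Confounders matter here.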
| Rank (Out of 272) | Card | Games | Win Rate |
|---|---|---|---|
| 1 | Fireblast | 192 | 0.646 |
| 2 | Hellrider | 172 | 0.628 |
| 6 | Jackal Pup | 207 | 0.614 |
| 10 | Carnage Tyrant | 158 | 0.608 |
| 11 | Joraga Treespeaker | 298 | 0.607 |
