Mapping the Cube Landscape
What is Cube? This is a question that draws a variety of answers from the Magic community. Colloquially, the “Cube” format is associated with the MTGO Vintage Cube, known for containing the most powerful cards in Magic. In a broad sense, however, Cube is simply a self-curated draft format, composed of cards that fit specific design and gameplay goals.
But this definition does little to communicate the diversity of environments and experiences that are possible using Magic’s entire card pool. As a result, I have always wanted to explore the true design spectrum of Cube, the avenues that players take to express themselves through the game they love.
A few months ago, I had the incredible opportunity to do so. By partnering with Cube Cobra, the most popular site for organizing and designing cubes, I gained access to a trove of data from tens of thousands of cubes. I set out to visualize these data and to finally answer the question “what is Cube?”
The result is the Cube Map, an interactive atlas that positions cubes based on the cards they contain and visualizes the incredible diversity of the format.
Discover just how diverse the Cube format can be, themes you never expected, and where your own cube fits.
This article will explain how the map was created, its limitations, and our plans for it in the future!
If we are to visualize the diversity of cubes, we need a unified foundation to compare them. The most intuitive framework we can use to represent cubes as data is to simply count the number of each card present within a cube. As an example, let’s take four simple cubes: one has Lightning Bolt, one has Ponder, one has both, and one has neither. In this framework, each card represents a dimension on our graph:
So cubes with two or fewer cards are represented as corners on a square (like above). Cubes with three or fewer can be represented as corners of a geometrical cube. Cubes with four or more…
See the problem?
With this framework, there is no effective way to visualize four or more dimensions. Even worse, there are over 21,000 unique cards represented in cubes on Cube Cobra. If we can’t visualize four dimensions, how can we hope to visualize more than 21,000?
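To make the framework concrete, here is a minimal sketch of the four toy cubes as vectors, with one dimension per card. The cube names are hypothetical labels invented for this illustration, and presence is encoded as 0 or 1.

```python
# Hypothetical toy cubes; the names are made up for illustration.
cubes = {
    "bolt_only": {"Lightning Bolt"},
    "ponder_only": {"Ponder"},
    "both": {"Lightning Bolt", "Ponder"},
    "neither": set(),
}

# Every distinct card becomes one dimension (one position in the vector).
cards = sorted(set().union(*cubes.values()))

def to_vector(cube_cards, cards):
    """Return a 0/1 vector: 1 if the cube contains the card, else 0."""
    return [1 if card in cube_cards else 0 for card in cards]

vectors = {name: to_vector(contents, cards) for name, contents in cubes.items()}

for name, vec in vectors.items():
    print(name, vec)
```

With two cards, each cube lands on a corner of the unit square. Every additional card adds another position to each vector, which is why the real dataset ends up with more than 21,000 dimensions.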
Dimensionality reduction is the solution.
The goal of dimensionality reduction, or “projection”, is straightforward: find the best way to represent multi-dimensional data with just two dimensions[1]. A person’s shadow, for example, is a 2D projection of their 3D body. Projection usually loses information, as two dimensions can rarely retain all the information stored in higher dimensions. Continuing the analogy, while a shadow shows how tall and wide someone is, it does not communicate how “thick” they are[2]. The point of dimensionality reduction is to find dimensions that best preserve higher dimensional relationships. We want points that are similar (i.e. close together) in high dimensions to also be close together in our low dimensional projection.
The key to this process is that some dimensions are more important than others. Take the following data as an example:
These data are two dimensional, but imagine that we want to reduce their dimensionality and “project” them into just one dimension. Can you spot the one dimension (i.e. the line) that can best explain the data?
The best dimension is a mix of our two original dimensions! It’s just one dimension, but it explains 98% of the information present in the data[3]. Dimensionality reduction works by combining our original dimensions into one or two dimensions that best explain the data. It leverages relationships between the original dimensions to do this. In the above data, for example, dimension 1 is linearly correlated with dimension 2, so a single line that combines them preserves the most information. In real datasets, the best dimensions are usually a more complicated blend of the original dimensions.
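As a rough sketch of the idea, the snippet below finds the single best direction for a handful of made-up points that lie close to a line and reports how much of the variance that direction explains. The points are invented for illustration and are not the data plotted in this article.

```python
import math

# Made-up points lying close to the line y = x (not the article's data).
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]

n = len(points)
mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n

# Entries of the 2x2 sample covariance matrix [[sxx, sxy], [sxy, syy]].
sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix, written in closed form.
half_trace = (sxx + syy) / 2
spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
largest, smallest = half_trace + spread, half_trace - spread

# (sxy, largest - sxx) is an eigenvector for `largest` when sxy != 0.
vx, vy = sxy, largest - sxx
norm = math.hypot(vx, vy)
direction = (vx / norm, vy / norm)

# Share of the total variance captured by the first direction alone.
explained = largest / (largest + smallest)
print(f"best direction: ({direction[0]:.2f}, {direction[1]:.2f})")
print(f"share of variance explained: {explained:.1%}")
```

Because the two coordinates rise together, the best direction weights them almost equally, and nearly all of the variance survives the projection onto that one line.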
How does this apply to Cube? Let’s again take some example cubes: Cube1, Cube2, Cube3. There are 100 unique cards among these cubes, so these three data points have 100 dimensions. Since each dimension represents an individual card, combinations of dimensions will represent the weighted presence or absence of multiple cards. Can you find the two dimensions that best explain the differences between these cubes?
You’ll notice that each cube has the same red and black cards but includes only two of the three remaining colors: green, blue, and white. Thus our dataset, despite having 100 original dimensions (cards), actually has only two dimensions related to the presence or absence of green, blue, and white cards.
These dimensions explain 100% of the information in these data[4]. Real cubes are messier, but the dimensionality reduction approach remains the same: find cards some cubes play and others do not. These are the cards that best explain differences between cubes and so best preserve information.
Unfortunately, it’s less straightforward to reduce the dimensionality of a dataset that contains tens of thousands of cubes and over 21,000 different cards. It cannot be done by eye: cubes are simply too varied and the dataset is too large.
Luckily, there are algorithms designed to automate dimensionality reduction, each with its own flavor and use cases. The map was eventually made with a more complex variation, but to help build intuition I will explain the first approach I tried: principal component analysis (PCA). It is undoubtedly the most popular dimensionality reduction technique, and in fact, I have already covered it in the toy examples above.
PCA algorithmically finds the best linear dimensions (or “principal components”), which means that the reduced dimensions it finds are linear combinations of the original dimensions. In the example of the data on a line, we went from two dimensions to one, and the best “principal component” was ½×(dimension 1) + ½×(dimension 2). In the example with the cubes missing colors, we went from 100 dimensions (cards) to two. Principal component 1 was the “blue” dimension which gave blue cards a positive weight, while principal component 2 was the “green and white” dimension, giving white cards a positive weight and green cards the same weight but negative.
PCA is popular because it is interpretable. By seeing how the original dimensions contribute to the new dimensions, we can understand the patterns that PCA is using to explain differences between cubes.
To apply it to our dataset, I converted each cube to a series of 0’s and 1’s for each card (1 if the card was present and 0 if it was absent), then ran PCA on the result[5]. The results illuminate some high level differences between cubes:
Notably, these two dimensions explain only 10% of the differences between cubes, which is far worse than our toy examples. To determine what cards separate cubes, we can examine the cards that contribute most to each dimension:
Dimension 1 (Card: weight)
Dimension 2 (Card: weight)
Evolving Wilds: 7.0
Birds of Paradise: 6.2
Terramorphic Expanse: 6.4
Restoration Angel: 6.1
Snapcaster Mage: 6.1
Carrion Feeder: 6.0
Raise the Alarm: 5.8
Thalia, Guardian of Thraben: 6.0
Faith's Fetters: 5.7
Giant Growth: -1.1
Ancestral Recall: -1.7
Black Lotus: -1.7
Dead Weight: -1.3
Vampiric Tutor: -1.8
Wheel of Fortune: -1.8
Dual Lands: -2.0
While we can see how each card contributes to each dimension, interpreting the meaning of these new dimensions is more difficult. My loose interpretation is that dimension 1 looks to be related to cube budget: cards with positive weights are those that are commonly played in cubes without budgets, while cards with negative weights are played in budget cubes. Dimension 2 is less obvious, but it appears to be capturing elements of cube categories and power level. Cards with positive weights are Pauper cards, while cards with negative weights are present in Vintage cubes. These underlying themes are, of course, somewhat overlapping, which reveals that PCA is not as useful for our dataset as it might be for others[6].
From this, it is clear that PCA can extract higher level features of the cubes: power level, budget, and cube categories. But it does a poor job of visually separating cubes in its plot, and we need more granularity. For example, it doesn’t help us understand differences between flavors of Pauper Cubes, and we don’t see more niche design choices like Unset cubes. In short, it doesn’t truly communicate the diversity of approaches designers take to building a cube.
While PCA is poorly equipped for this task, nonlinear dimensionality reduction is a perfect fit.
PCA is so interpretable because it is a linear approach. Each card contributes some amount to the identified dimensions, and we can see how those dimensions are defined. It preserves “global” structure—cubes that are far away in the plot shown earlier are truly very different.
With nonlinear dimensionality reduction, we sacrifice interpretability and global structure to learn more about “local” structure, or differences between cubes that are relatively similar. To simplify, let’s consider some two-dimensional data:
Imagine we want to project this data into one dimension. PCA would find the line that best fits this data, and in doing so, completely fail to explain the shape of the data. A nonlinear approach might find a more complicated representation; in this case, a spiral instead of a line[7]. In doing so, it loses interpretability, as we can no longer clearly discern how the original dimensions contribute to the new dimensions. We also lose global structure: a point at the center is considered very far “along the spiral” from the spiral’s end, even though the distance in our original dimensions wasn’t too far.
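The gap between the two notions of distance is easy to check numerically. The sketch below uses a made-up Archimedean spiral with three turns (the `growth` value and the number of turns are arbitrary choices for illustration) and compares the straight-line distance from the center to the outer end with the distance measured along the curve.

```python
import math

def spiral_point(theta, growth=0.2):
    """Point on an Archimedean spiral with radius r = growth * theta."""
    r = growth * theta
    return (r * math.cos(theta), r * math.sin(theta))

# Sample the spiral densely from its center out to three full turns.
steps = 3000
thetas = [6 * math.pi * i / steps for i in range(steps + 1)]
points = [spiral_point(t) for t in thetas]

# Straight-line (Euclidean) distance between the center and the outer end.
straight = math.dist(points[0], points[-1])

# Distance measured by walking along the spiral, summing short segments.
along = sum(math.dist(p, q) for p, q in zip(points, points[1:]))

print(f"straight-line distance: {straight:.2f}")
print(f"distance along the spiral: {along:.2f}")
```

The walk along the curve is several times longer than the straight line, which is the sense in which a one-dimensional “spiral coordinate” distorts long-range distances while keeping neighboring points next to each other.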
While global structure is lost, local structure is emphasized. By keeping close points together, the nonlinear approach is able to “unfurl” the spiral shape of the data. Practically, the result is that if we use a nonlinear approach, moderate to long range distances are not preserved, but data points (cubes) that are very similar will be located close together[8]. There is a bevy of nonlinear approaches we could use, but I settled on Uniform Manifold Approximation and Projection (UMAP), mostly because it makes the nicest plots[9].
Using UMAP is significantly more subjective than PCA. For example, PCA compares cubes using the Euclidean distance, which is the form of distance with which most people are familiar (the length of the line segment between two points)[10]. UMAP has no such restriction, so we can use other distance metrics that are more suitable for our dataset.
An intuitive alternative is the Hamming distance, which counts the cards on which two cubes differ (cards present in one cube but not the other). Unfortunately, this metric is heavily influenced by cube size, as larger cubes simply have more cards on which to agree or differ. Much of the early work on the map was trying to determine the right distance metric: Euclidean distance resulted in a homogeneous blob, while Hamming distance tended to group larger cubes together regardless of their underlying design goals. I decided to use Jaccard distance, which is one minus the Jaccard similarity: the number of cards shared between two cubes divided by the total number of unique cards they contain.
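A small sketch shows why cube size matters for these metrics. Here cubes are represented as Python sets of placeholder card IDs (made-up numbers standing in for real card names), with Hamming distance counting cards found in exactly one of the two cubes and Jaccard distance computed as one minus shared cards over total unique cards.

```python
def hamming_distance(a, b):
    """Number of cards that appear in exactly one of the two cubes."""
    return len(a ^ b)

def jaccard_distance(a, b):
    """1 - (cards shared) / (unique cards across both cubes)."""
    union = a | b
    if not union:
        return 0.0  # two empty cubes are treated as identical
    return 1 - len(a & b) / len(union)

# Hypothetical cubes built from numbered placeholder cards.
small_a = set(range(0, 100))    # 100 cards
small_b = set(range(50, 150))   # 100 cards, half shared with small_a
large_a = set(range(0, 400))    # 400 cards
large_b = set(range(200, 600))  # 400 cards, half shared with large_a

print(hamming_distance(small_a, small_b), jaccard_distance(small_a, small_b))
print(hamming_distance(large_a, large_b), jaccard_distance(large_a, large_b))
```

Both pairs overlap by exactly half, yet the Hamming distance for the large pair is four times that of the small pair, while the Jaccard distance is identical. Normalizing by the combined card pool is what removes the raw size effect.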
UMAP also has several parameters that control how it reduces dimensionality[11]. Choosing these parameters is entirely subjective, so there are no “right” choices. I chose parameters that encouraged the formation of small clusters because I knew that very small clusters of cubes with unique design restrictions were possible (for example, the emerging Degenerate Micro Cubes). Other equally valid parameter choices merged larger landmasses or led to more islands. Because of this subjectivity, the map is not a ground truth; it is merely one way among many of visualizing these data.
Running UMAP on the dataset resulted in the Cube Map that you see. It does an excellent job of separating out cubes of unique and niche design restrictions. But you’ll also notice that the map colors different “clusters” of cubes that are grouped together. This is not a result of UMAP, which is simply a dimensionality reduction algorithm and does not actually make distinctions between groups of cubes. It does not tell us, for example, whether two or three types of Pauper Cubes exist, even if we visually see different clusters of Pauper cubes. Because its dimensions are not interpretable, it also cannot tell us what cards distinguish different kinds of cubes.
To tackle these questions, we can use clustering algorithms.
Clustering is a process that groups together similar data points and separates them from others. For example, most approaches would group the following two-dimensional data into four clusters:
In our dataset, clustering will draw “boundaries” on our map to identify cubes that are similar to each other and different from others.
Clustering algorithms have trouble with the following kinds of data:
- High dimensional data. Due to something called the curse of dimensionality, clustering data with thousands of dimensions is very difficult[12].
- Data with clusters of different sizes and densities. There is an intrinsic tradeoff with most approaches: if you focus on finding small clusters, you risk erroneously splitting large clusters. If you focus on finding large clusters, small clusters will often be needlessly merged.
- Data with many clusters. Not only does this make computations difficult, it also increases the chances your clustering will be wrong and makes fine-tuning difficult.
Unfortunately, these features perfectly describe our cube data. The dataset is extremely high-dimensional, as each of the 21,000+ cards is a dimension. Cubes with narrow design restrictions (e.g. Alara set cubes) result in high density “islands” on the map, but cubes with more general goals (e.g. EDH or Legacy cubes) form the continuous, lower-density “mainland” portions of the map. Finally, there are hundreds of different kinds of cubes represented in the map.
Of course, that won’t stop us!
There are hundreds of clustering algorithms, and each has its own strengths and weaknesses. For better or worse, our needs are actually quite narrow. Because the scale of our data is so large, we must run clustering after dimensionality reduction with UMAP[13]. This eliminates algorithms that assume a uniform cluster shape or density, as UMAP projections do not fit these assumptions. Additionally, some algorithms require you to prespecify the number of clusters, which is inappropriate in cases like ours where the number of clusters is both large and unknown. After some tinkering, I settled on an algorithm called HDBSCAN, which is designed to handle clusters of different densities[14].
Interestingly, HDBSCAN can choose to not assign points to a cluster if it deems them too far away from existing clusters and not dense enough to form their own clusters. While it calls these points “noise”, these are just cubes with slightly different design goals or compositions that prevent them from fitting nicely into clusters. For example, the algorithm declines to cluster my own cube, likely because I cube Unset cards in an otherwise power-optimized Legacy cube. This phenomenon is widespread—HDBSCAN consistently refuses to cluster around a third of the dataset!
This demonstrates an inherent flaw in our approach. Fundamentally, we are placing boundaries on a continuous spectrum of design choices. The boundaries are data-driven and in some cases quite obvious (e.g. Alara set cubes), but they cannot hope to truly capture the full spectrum of what makes Cube great: our ability to fine-tune an environment to our unique preferences and design goals. I see this wealth of unclassified cubes as a testament to the boundless creativity of Cube designers.
With that said, we can force HDBSCAN to assign all cubes to a cluster. I decided to do this because it leads to a more complete map and allows people to explore what cluster their cube is most similar to, even if it’s not a perfect fit.
HDBSCAN consistently identifies over 300 clusters, ranging in size from a small cluster of cubes designed around terrible cards to a cluster encompassing over a thousand EDH cubes[15]. The clusters overlap fairly well with areas of density on the map.
But HDBSCAN doesn’t tell us what defines these clusters—after all, it was run on UMAP-reduced data, which has uninterpretable dimensions. With cluster assignments, however, we can actually examine the cards played in each cluster. We could, for example, simply examine the most popular cards among cubes within each cluster.
This ends up being reasonably informative, but in reality, it’s not truly what informed the map in the first place. One of the most popular cards in the cluster containing the MTGO Vintage Cube is Lightning Bolt: 93% of these cubes play it. One of the most popular cards in several Pauper cube clusters is… also Lightning Bolt. In fact, Lightning Bolt is played in 50% of cubes globally, and it sits in the top 5 most popular cards of nearly 100 clusters. Looking at popular cards within any given cluster tends to yield globally popular cards and doesn’t help us assess what truly defines each cluster[16].
We should instead ask what cards are enriched in any given cluster relative to other clusters. There are a few different frameworks we can use for this, but because we are dealing with binary data (presence or absence of cards), Fisher’s exact test works excellently. For two clusters A and B and a given card, the test yields a score that tells us how enriched that card is in cluster A relative to cluster B[17]. For each cluster, I calculated this score for each card relative to all other clusters in a pairwise manner. Cards with high scores define that cluster relative to all other clusters.
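To illustrate the kind of comparison involved, here is a minimal sketch of a one-sided Fisher’s exact test for a single card and two clusters, written with only the standard library. The function name, the choice of a one-sided test, and the counts are assumptions made for this example rather than the exact procedure behind the map; a smaller p-value indicates stronger enrichment in cluster A.

```python
from math import comb

def fisher_enrichment_p(a_with, a_total, b_with, b_total):
    """One-sided Fisher's exact test p-value (hypothetical helper).

    Probability, under the hypergeometric null, of seeing at least `a_with`
    cubes playing the card in cluster A, given the margins of the 2x2 table.
    """
    with_card = a_with + b_with
    total = a_total + b_total
    denom = comb(total, a_total)
    upper = min(a_total, with_card)
    p = 0.0
    for k in range(a_with, upper + 1):
        # k cubes in A play the card; the other with_card - k are in B.
        p += comb(with_card, k) * comb(total - with_card, a_total - k) / denom
    return p

# Made-up counts: a card common in cluster A but rare in cluster B...
p_defining = fisher_enrichment_p(a_with=45, a_total=50, b_with=10, b_total=50)
# ...versus a card that is popular in both clusters.
p_popular = fisher_enrichment_p(a_with=46, a_total=50, b_with=45, b_total=50)

print(f"defining card p-value: {p_defining:.3g}")
print(f"popular card p-value:  {p_popular:.3g}")
```

The card that is common in both clusters gets an unremarkable p-value despite being played in almost every cube, while the card concentrated in cluster A stands out sharply. That is the distinction between popularity and enrichment.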
These results end up being more informative and are the globally defining cards you see for each cluster on the map. The most globally defining card for the cluster containing the MTGO Vintage Cube is Sneak Attack, while the most enriching card for the largest Pauper cluster is Phyrexian Rager. A big improvement over Lightning Bolts!
One drawback of this approach is that clusters that are distinct but similar will often have the same globally defining cards. Many clusters in the northern part of the map, for example, have Sneak Attack and Fetchlands in their list of globally defining cards. But the clustering algorithm considers these clusters different enough to be distinct, and we’d like to know why.
To answer this, we can simply examine the Fisher’s exact test results between clusters that are close together. We can, for example, look at the cards that differentiate a given cluster from its 15 closest neighbors. These are the locally defining cards listed in the map for each cluster.
The results are fascinating. The cluster containing the MTGO Vintage Cube, for example, is partially distinguished from its neighbors by cards like Golos, Tireless Pilgrim and Gonti, Lord of Luxury. Pauper cubes are partially differentiated by whether they choose to cube Bouncelands. These comparisons showcase the personal inflections designers use to express themselves and highlight the incredible diversity of our format.
In the future, we’d like to allow direct comparisons of individual clusters. The cluster that contains my cube, for example, is differentiated from the cluster containing the MTGO Vintage Cube not by Power but by the decision to support Black aggro—my cluster is 28 times more likely to play Bloodsoaked Champion and Gutterbones.
These two algorithms, UMAP and HDBSCAN, allow us to compactly visualize the astounding breadth of the Cube format for the first time. We plan to continue exploring and refining the map over time, and if you have any ideas on things to do, please let us know at [email protected]!
This work would not have been possible without Cube Cobra—we at Lucky Paper are incredibly grateful for their partnership and support on this project. I also personally want to express my gratitude for the team at Lucky Paper, who supported this crazy idea and had the web development skills to bring it to life.
Here’s to the best way to play Magic—Cube!
What do the axes on the map mean?
The axes on the map have no direct meaning. This is because UMAP, the algorithm we use to make the map, does not result in interpretable dimensions. High-dimensional data is complicated—no one has devised an approach that is interpretable and preserves local connections between points. Approaches that are interpretable fail miserably on this data.
Why isn’t the cluster containing the MTGO Vintage Cube defined by the Power Nine?
Because many cubes that look otherwise similar to the MTGO Vintage Cube do not play the Power Nine. UMAP has no concept of card importance, so in calculating cube similarities, the inclusion of Black Lotus is no more important than the inclusion of Evolving Wilds. Broadly, this illustrates an important point about the map—just because two cubes are close together does not necessarily mean their gameplay is similar.
Why do many of the clusters overlap? How is it possible that a cube close to mine is in a different cluster?
The confusing answer is that the clusters may not actually overlap, at least not in higher dimensions. Clustering is run on a higher dimensional form of the data, where clusters may not overlap like they appear to in two dimensions. It’s like looking at a picture of a person with someone else in the background—in 3D space they may be separated, but in the 2D picture, they may be overlapping.
I don’t play many of the cards that define my cluster. What gives?
This is relatively common, especially for low density clusters. It may be that your cube (like my own) didn’t fit nicely into any clusters, and your cube was simply placed in the “closest match”. It may also be that the cluster itself is ill-defined. Particularly within larger landmasses, we’re placing boundaries on a continuous spectrum, so there are bound to be ambiguities.
Is there some significance of the location of island X in the map?
In short, no. UMAP does not preserve global structure, so the locations of islands are essentially arbitrary. This also means that long-range distance can’t be easily interpreted. A cube spatially halfway between the MTGO Vintage and MTGO Modern cubes can resemble the former far more than the latter.
Why has the map changed since I last saw it?
UMAP is stochastic, which means that its results can change in different iterations. While we keep UMAP’s random seed the same when we update the map, the map can still change as new data is added. Plus, people change their cubes!
1. We can visualize three dimensions, so why not use three? The answer is that humans are terrible at interpreting three-dimensional plots.
2. Of course this isn’t strictly true. You can make a guess at someone’s “thickness” despite only knowing their height and width because these metrics are correlated. Dimensionality reduction works much better when correlations exist because the effective dimensionality of the data is lower.
3. The term “information” can mean many things. In this context, I am referring to the variance in the data, or essentially the differences between points.
4. Astute readers might wonder: how can two dimensions fully explain presence or absence of three colors? Does that not require three dimensions? Only two dimensions are needed because for these cubes, the presence of blue and green automatically implies the absence of white. So the “third” dimension gives us no additional information. If there were a fourth cube with green, white, and blue cards, a third dimension would be needed.
5. In preparing the data for the map, I removed duplicate cubes. I also filtered out cubes that had fewer than 50 unique cards, as these cubes tend to be unfinished and would be placed semi-randomly on the map. My sincerest apologies to the Hidden Gibbons Cube.
6. To explain a bit more: PCA identifies new dimensions that are uncorrelated (or “orthogonal”). Somewhat confusingly, this does not mean independent. Two variables can be uncorrelated but still dependent if their relationship is nonlinear. As a result, PCA is terrible at capturing complicated relationships in a dataset. As it turns out, the cube ecosystem is full of complicated relationships.
7. Real nonlinear methods do not work like this. The two most popular methods, UMAP and t-SNE, work by establishing connections between similar points in higher dimensional space (“local structure”) and then trying to maintain those connections in a lower dimensional representation.
8. For the Cube Map, this means that the location of “islands” in the map is essentially arbitrary. Different iterations of nonlinear algorithms, for example, will place the Degenerate Micro Cube cluster in different positions. Sometimes it’s next to the MTGO Vintage Cube’s cluster, and sometimes it’s halfway across the map.
9. UMAP is designed to preserve nonlinear relationships (which it calls “manifolds”) between similar data points. It is also incredibly efficient and can run directly on the original data space (tens of thousands of cubes x 21,000 cards). The other popular method, t-SNE, cannot handle the scale of these data and requires you to first reduce the dimensionality to 10-20 dimensions with another algorithm like PCA.
10. PCA does not actually “choose” to use Euclidean distance. In a beautiful bit of math, it arises naturally from PCA’s goal of maximizing the variance explained in the data.
11. These are controlled by the … and min_dist parameters, respectively. The documentation for UMAP is fantastic, and they have made some excellent interactive visualizations to understand exactly how these parameters control UMAP.
12. The curse of dimensionality refers to the fact that high dimensional data do not conform to our low dimensional expectations. In particular, distance between points gets extremely large, so even points that are similar get very far from each other. As a result, clustering in low dimensions is like trying to separate colors of the rainbow, but in high dimensions it’s like trying to distinguish between slightly different shades of green.
13. Clustering on the results of UMAP is contentious because it does not preserve long-range distance and can result in fake clusters. Luckily for us, it is easy to validate clusters by seeing if the cubes share design goals. Clustering on PCA results is more standard because it preserves global distance, but it fails miserably on these data.
14. HDBSCAN stands for Hierarchical Density-Based Spatial Clustering of Applications with Noise. What a mouthful! It is similar to other algorithms in that it builds a hierarchy of clusters, but it begins by transforming the data to increase local density; doing so helps fight “noise”. I ran HDBSCAN on the data reduced by UMAP to 6 dimensions.
15. Here we see the strength of HDBSCAN in handling clusters of different sizes and densities. The island of EDH cubes is large and has low density, but HDBSCAN kept it all in one piece. Reasonably, I think, since there don’t appear to be any obvious subclusters within its mass.
16. This isn’t strictly true. There are clear cases where cluster-popular cards are also cluster-unique: Sheltering Ancient is the most popular card within strictly-worse cubes and is also unique to that cluster.
17. For stats-conversant readers, I am referring to the p-value. Stats-fluent readers, however, will recognize that this is a dubious use of the p-value, since a low p-value does not inherently mean a result is important or the effect size is large. It leaves a bad taste in my mouth too, but it seems to do a reasonable job and the effect sizes are generally quite large.