I built a first proof of concept for my project idea. Here is how it went:
In my previous article I described a tool idea that could help beginners build decks using Scryfall Tagger tags. To realize this, I need to scrape data from EDHREC and from Scryfall Tagger, since neither of them offers an API.
My game plan is the following:
- Take a commander name
- Find the EDHREC page
- Scrape all the cards EDHREC lists for the commander including their synergy score
- Scrape Scryfall Tagger to find the tags for each of the cards
- Calculate scores for each tag to rank them
- Make some recommendations (even though I don’t know how I want to present those yet)
Taking a commander name is simple. I chose Etali, Primal Storm.
Finding the EDHREC page is no problem either. Just smash together https://edhrec.com/commanders/ and the commander name in kebab case: etali-primal-storm.
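The URL step can be sketched in a few lines. This is a minimal sketch, not necessarily what my program does: `toKebabCase` and `edhrecCommanderUrl` are hypothetical helper names, and the handling of apostrophes and accented letters is an assumption I have not checked against every commander page.

```javascript
// Hypothetical helper: convert a commander name to kebab case.
function toKebabCase(name) {
  return name
    .normalize("NFD")                 // split accented letters into base letter + diacritic
    .replace(/[\u0300-\u036f]/g, "")  // drop the diacritics
    .toLowerCase()
    .replace(/['’]/g, "")             // assumption: apostrophes are dropped, not hyphenated
    .replace(/[^a-z0-9]+/g, "-")      // every other run of non-alphanumerics becomes one hyphen
    .replace(/^-+|-+$/g, "");         // trim hyphens at both ends
}

// Hypothetical helper: build the EDHREC commander page URL.
function edhrecCommanderUrl(name) {
  return `https://edhrec.com/commanders/${toKebabCase(name)}`;
}

console.log(edhrecCommanderUrl("Etali, Primal Storm"));
// https://edhrec.com/commanders/etali-primal-storm
```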
Now comes the part that is most interesting to me: scraping websites. Since I’ve never done this before, I started off by asking ChatGPT. For a Node.js environment, it recommended cheerio for static websites or puppeteer for dynamic ones. Scryfall Tagger seemed rather static to me, so I asked ChatGPT for a cheerio code snippet to scrape the site. The snippet ran, but the response contained only the loading screen that is displayed when opening Scryfall Tagger. I had never noticed it before.
Knowing that the website to scrape is dynamic, I switched to puppeteer. It’s quite fascinating to see a browser being driven by your program. I also love how well generating code with ChatGPT works. I literally just copy-pasted the HTML container with the tags from Scryfall Tagger and told ChatGPT to extract those. It worked on the first try. Impressive!
Scraping EDHREC was a bit more challenging, since I needed to instruct puppeteer to navigate to the ungrouped table view of the cards so that the card names and synergy scores are easy to scrape. During the process, I learned that you can enter CSS selectors in the DOM search of a browser’s dev tools. That saved me a lot of time when trying to figure out a CSS selector that only selects the button I want.
To calculate the synergy scores for each tag, I take all recommended cards having the tag and add up their synergy scores.
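Here is a minimal sketch of that calculation. The data shapes are assumptions: `cards` is an array of `{ name, synergy }` objects from EDHREC, and `tagsByCard` is a `Map` from card name to its list of tags. The function names are hypothetical too.

```javascript
// Sum the synergy scores of all recommended cards that carry each tag.
function scoreTags(cards, tagsByCard) {
  const scores = new Map();
  for (const card of cards) {
    // A Set guards against a tag being listed twice for the same card.
    const tags = new Set(tagsByCard.get(card.name) ?? []);
    for (const tag of tags) {
      scores.set(tag, (scores.get(tag) ?? 0) + card.synergy);
    }
  }
  return scores;
}

// Return the n highest-scoring tags, best first.
function topTags(scores, n = 10) {
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, n)
    .map(([tag, score]) => ({ tag, score }));
}
```

Cards without any known tags simply contribute nothing, and negative synergy scores lower a tag’s total.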
Linking all these elements together, I now have a program that takes a commander name in kebab case and returns the top ten highest synergy tags. The program still has a few issues.
First of all, scraping the tags for a single card takes almost one second due to the loading time of Scryfall Tagger. Since EDHREC recommends a little over one hundred cards, my program takes about two minutes for one commander. I see multiple possible solutions here. I could use a threshold and simply ignore every card whose absolute synergy value is less than 10. I could also implement caching, so I won’t need to scrape a card again once my program knows it. A third option would be to scrape only the names of all tags from Scryfall Tagger and then, for each tag, query Scryfall for all the cards having that tag. Saving those relations in a database would then allow me to access the tags of a card much faster.
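The first two ideas could look roughly like this. It is only a sketch under assumptions: `fetchTags` stands for whatever function scrapes the tags of one card, the names `filterBySynergy` and `withCache` are made up, and the cache lives in memory, so it only helps within a single run unless it is persisted somewhere.

```javascript
// Keep only cards whose absolute synergy value reaches the threshold.
function filterBySynergy(cards, threshold = 10) {
  return cards.filter((card) => Math.abs(card.synergy) >= threshold);
}

// Wrap a tag-scraping function so each card is only scraped once.
// The cache stores promises, so parallel requests for the same card
// also share one scrape.
function withCache(fetchTags, cache = new Map()) {
  return (cardName) => {
    if (!cache.has(cardName)) {
      const pending = Promise.resolve(fetchTags(cardName)).catch((err) => {
        cache.delete(cardName); // do not remember failed lookups
        throw err;
      });
      cache.set(cardName, pending);
    }
    return cache.get(cardName);
  };
}
```

Usage would be something like `const getTags = withCache(scrapeTags)` followed by `await getTags(card.name)` for every card that survives `filterBySynergy`, where `scrapeTags` is the existing scraping function.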
My second critique is that some of the resulting tags are meta tags about the name. Etali, Primal Storm apparently has high synergy with alliterative card names. While one could call this a fun fact, it isn’t useful for deck building.
Lastly, some of the gameplay tags don’t seem relevant. Many of the cards recommended for Atla Palani, Nest Tender have a unique type line.
I should probably create a list of tags that is filtered out by default and leave the user the choice to remove them from the filter for the edge cases where the tags are truly relevant.
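Such a filter could be as small as the sketch below. `DEFAULT_HIDDEN_TAGS` and `visibleTags` are hypothetical names, and the list content is an assumption: `alliteration` is spelled as in my output, but the spelling of the type line tag is a guess.

```javascript
// Tags hidden by default. Assumption: the exact list would need curating by hand.
const DEFAULT_HIDDEN_TAGS = new Set([
  "alliteration",
  "unique type line", // guessed spelling of the tag
]);

// Drop hidden tags unless the user explicitly re-enabled them.
function visibleTags(rankedTags, userAllowed = []) {
  const allowed = new Set(userAllowed);
  return rankedTags.filter(
    ({ tag }) => !DEFAULT_HIDDEN_TAGS.has(tag) || allowed.has(tag)
  );
}
```

Calling `visibleTags(ranked)` would hide the default list, while `visibleTags(ranked, ["alliteration"])` brings that one tag back for a deck where it truly matters.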
I want to finish this article with an example of the program’s current output, here for Reaper King:
| tag | synergy score |
|---|---|
| alliteration | 550 |
| changeling | 546 |
| protects-creature | 289 |
| flicker-creature | 287 |
| repeatable token generator | 279 |
| mana filter | 278 |
| tribal-choose | 278 |
| card types in graveyard matter | 243 |
| synergy-artifact | 221 |
| french vanilla | 219 |
