Unzipped: Laying the foundation

Avoiding past mistakes

When we built our first game, Synonym Circuit, we knew absolutely nothing about making a daily web game. We knew what we liked about Wordle, Spelling Bee, and Connections, but we had no clue how the sausage was made.

So when we blindly embarked on developing Synonym Circuit, we made some rookie mistakes that eventually led to us sunsetting the game a year after its launch. The biggest of those mistakes was hitching our wagon to an expensive, third-party API that was at the core of our game. In making this move, we not only limited our ability to make any profit – we also limited the game’s potential growth and improvement. Like a plant in too small a pot, our game could not mature and scale freely. I promised myself that next time would be different…

Goal #2: Scalability

Thinking ahead

Word Zip, like Synonym Circuit, is a word game. But aside from this fact, the two games have some critical differences that influence big decisions like, “What dictionary am I going to use?” Where our first game was all about the meanings of words, our new game would care only about the spelling and relevance of words. This freed me up to get creative with dictionary options and veer away from paid APIs in favor of open-source dictionaries that were free to download and overhaul.

After weeks of scouring the internet for “the right” dictionary, I found that each one I tried was either way too verbose or way too limited. I could not find the “Goldilocks” solution that would make for perfect puzzles. I was close to giving up on the project, thinking it would never scale the way I had envisioned and that it was destined to go the way of the dodo, just like Synonym Circuit. But something about this game made me keep pushing, keep thinking outside the box.

Manual curation can scale?

My solution feels counterintuitive on the surface: I would achieve scalability by manually curating each puzzle… forever. Kind of. Let me set the scene for you:

Problem:

Given 2 random letter pools, automatically create a set of words that can be zipped with those letters.

Solution:

I built an algorithm that would take a secret 10-letter word, split it into 2 pools, and return all the possible words that can be zipped from those pools.

Problem:

That list of possible words was way too long and contained tons of obscure or archaic words.

Solution:

I added the ability to select undesirable words from the list and either remove them from my dictionary or add them to a new “bonus” dictionary.

Problem:

Curating puzzles like this is not feasible long term. It takes too much time and manual effort.

Solution:

By adding the ability to classify undesirable words and permanently remove them from the dictionary, I created a system where every curated puzzle shrinks the pool of words I will ever have to review again. Curating would get faster and faster until, at some point, the task would become truly automated.
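Here is a minimal sketch of that feedback loop, not the actual Puzzlemaker code. It assumes the dictionary and the bonus dictionary are plain Python sets, and that a "bonus" word is moved out of the main dictionary rather than copied (the post does not say which). The function names `review_queue` and `apply_decisions` and the sample words are hypothetical, made up for illustration.

```python
def review_queue(generated: set[str], dictionary: set[str]) -> list[str]:
    """Words the curator still has to look at for a puzzle.

    `generated` stands in for whatever the puzzle generator produced;
    anything already pruned from `dictionary` never shows up again.
    """
    return sorted(generated & dictionary)


def apply_decisions(
    dictionary: set[str],
    bonus: set[str],
    remove: set[str],
    demote: set[str],
) -> None:
    """Record one round of manual curation, permanently.

    `remove`: words deleted from the dictionary outright.
    `demote`: words moved to the bonus dictionary (assumption: moved, not copied).
    Both sets are modified in place.
    """
    dictionary.difference_update(remove | demote)
    bonus.update(demote)


# Placeholder data, not real Word Zip puzzles.
dictionary = {"trace", "crate", "react", "caret", "carte", "cater"}
bonus: set[str] = set()

# Puzzle 1: the curator sees four words and rejects two of them.
puzzle_1 = {"trace", "crate", "caret", "carte"}
print(review_queue(puzzle_1, dictionary))
apply_decisions(dictionary, bonus, remove={"carte"}, demote={"caret"})

# Puzzle 2: the two rejected words are already gone from the queue.
puzzle_2 = {"react", "carte", "caret", "cater"}
print(review_queue(puzzle_2, dictionary))  # ['cater', 'react']
```

Because every decision is stored in the dictionary itself instead of in the individual puzzle, each rejected word only ever costs one click.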

Screenshot of the user interface for "Puzzlemaker", a tool that creates puzzles for Word Zip

This system had the added benefit of giving us full control over each puzzle, allowing us to keep an open dialogue with our users about our word choices. Now that I had a clear path forward, it was time to think about structuring the database in a way that would function at scale.

Keeping the database small

A large database can be slow and expensive once it’s being leveraged by a large user base. I had considered hosting the entire 500,000-word dictionary myself, mainly because I thought future word games could reuse it, but I knew I needed to avoid that, so I started brainstorming alternative structures.

The structure

There was no getting around the fact that the database would grow over time, so I made it my goal to minimize the rate of growth as much as possible. In the end, my method was this:

Create a solution set for a puzzle with the tool I had created
Curate that solution set to be exactly how I wanted it (removing non-optimal words)
Add this puzzle to the “puzzles” table with a unique puzzle ID
Add each solution word to the “solutionset” table as its own row; each of these rows contains a list of puzzle IDs (this word is in the solution set for each of these corresponding puzzles)
Add the bonus words to the “bonus_solutionset” table, mirroring the same structure.

For every new puzzle added to the database, the “puzzles” table would grow by one row, and the solutions tables would potentially grow by multiple rows. However, any time a solution word already existed in the database, that row would be updated with the new puzzle ID instead of creating a duplicate row. This meant that the number of rows in these solution tables would increase at a much slower rate than if I dumped the entire dictionary into the database. My data became cheaper to maintain, faster to access, and easier to manage!
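To make the structure concrete, here is a rough sketch that uses in-memory Python dictionaries in place of real database tables. The table names come from the steps above; everything else is an assumption for illustration, including the `PuzzleStore` class, the `add_puzzle` method, storing the secret word in the “puzzles” table, the sequential IDs, and the placeholder words (which are not real Word Zip solutions).

```python
from dataclasses import dataclass, field


@dataclass
class PuzzleStore:
    # puzzle ID -> secret word (assumed; the post only says each puzzle gets a unique ID)
    puzzles: dict[int, str] = field(default_factory=dict)
    # word -> IDs of every puzzle whose solution set contains that word
    solutionset: dict[str, list[int]] = field(default_factory=dict)
    # same shape, for bonus words
    bonus_solutionset: dict[str, list[int]] = field(default_factory=dict)

    def add_puzzle(self, secret_word: str, solutions: list[str], bonus: list[str]) -> int:
        """Store a curated puzzle and return its ID."""
        puzzle_id = len(self.puzzles) + 1  # simple sequential ID for the sketch
        self.puzzles[puzzle_id] = secret_word
        for word in solutions:
            # Existing row: append the ID. New word: create the row first.
            self.solutionset.setdefault(word, []).append(puzzle_id)
        for word in bonus:
            self.bonus_solutionset.setdefault(word, []).append(puzzle_id)
        return puzzle_id


store = PuzzleStore()
store.add_puzzle("playground", ["play", "ground", "pad"], ["gaol"])
store.add_puzzle("background", ["back", "ground", "bad"], ["gaol"])

print(len(store.puzzles))           # 2
print(len(store.solutionset))       # 5 rows for 6 solution words
print(store.solutionset["ground"])  # [1, 2]
```

The shared word gets one row with two puzzle IDs, which is why the solution tables grow more slowly than the number of solution words added.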

If data optimization isn’t the most exciting topic to you, don’t worry! I’ll be diving into more game design and visual challenges in the next blog post. Thank you for coming with me on my humble journey toward word game nirvana!

Sincerely,

Developer Steve