Wordle is nothing short of a waste of time, so to get it over with as fast as possible, we might as well optimize our play by using the right set of words.

But which are these?

Algorithm

The game is played by submitting up to 6 words, each composed of 5 letters. The goal is to guess a secret word in as few turns as possible. At every turn we submit a word (one that exists) and receive feedback on whether each letter of the word we submitted is in the correct place, or in the secret word but in a different place. One strategy to win the game is to find the letters that compose the secret word, then submit anagrams of them until we find it.
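
To make the feedback rule concrete, here is a minimal sketch of how it could be computed, and how it can be used to narrow down the remaining candidates. The function names and the labels "correct", "present" and "absent" are made up for this illustration, and the handling of repeated letters is an assumption about how the game behaves, not something taken from Wordle's code.

```javascript
// Hypothetical helper: feedback for one guess against a known secret.
// Labels are this sketch's own: "correct" (right place),
// "present" (in the word, wrong place), "absent" (not in the word).
function getFeedback(secret, guess) {
  const feedback = Array(guess.length).fill("absent")
  const unmatched = {}
  // First pass: mark exact matches and count the secret letters left over
  for (let i = 0; i < guess.length; i++) {
    if (guess[i] === secret[i]) {
      feedback[i] = "correct"
    } else {
      unmatched[secret[i]] = (unmatched[secret[i]] || 0) + 1
    }
  }
  // Second pass: a misplaced letter consumes one leftover secret letter,
  // so a letter guessed twice is only reported as often as it occurs
  for (let i = 0; i < guess.length; i++) {
    if (feedback[i] !== "correct" && unmatched[guess[i]] > 0) {
      feedback[i] = "present"
      unmatched[guess[i]]--
    }
  }
  return feedback
}

// Keep only the words that would have produced the same feedback for this guess
function filterCandidates(words, guess, feedback) {
  const expected = feedback.join(",")
  return words.filter(word => getFeedback(word, guess).join(",") === expected)
}

const hints = getFeedback("until", "tulip")
console.log(hints.join(", ")) // present, present, present, correct, absent
console.log(filterCandidates(["until", "arose", "unlit"], "tulip", hints)) // [ 'until' ]
```

Filtering the dictionary this way after each turn is the "submit anagrams until we find it" step done mechanically: only words consistent with everything we have learned so far remain.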

My approach will be to cover as much of the alphabet in as few words as possible, while probing the letters in order of their frequency in the words we’re guessing. We will attribute a score to each letter corresponding to its frequency in the dictionary of 5-letter words. We’ll then give each word a score based on the letters that compose it. From there we’ll order the dictionary by word score, and find the combination of words that gives us the highest score possible.

Brute forcing our way in

Since I’ll run this only once, I won’t attempt any optimization.

The first thing we want to do is find a good set of words to start probing for the letters.

We start by establishing the frequency of each letter in the dictionary. We could take a random dictionary and filter for 5-letter words, but it’s trivial to extract the list of words Wordle uses [1].

const fs = require("fs")
let dict = JSON.parse(fs.readFileSync("assets/2022/wordle.json").toString()).map(word => word.toLowerCase())
let letters = "qwertyuiopasdfghjklzxcvbnm"
let countWordsContainingLetter = ((letter) => dict.filter(word => word.indexOf(letter) >= 0).length)
let letterFrequencies = {}
for (let letter of letters) {
  letterFrequencies[letter] = countWordsContainingLetter(letter)
}
const orderedLetters = Object.keys(letterFrequencies).sort((a, b) => letterFrequencies[b] - letterFrequencies[a])
console.log(`Letters in order: ${orderedLetters}`)

That gives us: s, e, a, o, r, i, l, t, n, u, d, y, c, p, m, h, g, b, k, w, f, v, z, j, x, q

From there we want to find the best starting words, i.e. those whose letters are the most likely to be in the word of the day. We will attribute a score to each word based on the frequency of each individual letter composing it.

function scoreWord(word) {
  let res = 0
  let foundLetters = {}
  for(let letter of word) {
    // We only count the score for each letter once,
    // so that words repeating a high-score letter
    // are not favored
    if (foundLetters[letter] === undefined) {
      res += letterFrequencies[letter]
      foundLetters[letter] = true
    }
  }
  return res
}

function weightAndSortWords(dictionary) {
  return dictionary.map(word => ({ word, score: scoreWord(word) })).sort((a, b) => b.score - a.score)
}

let weightedWords = weightAndSortWords(dict)
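
To see why each letter is only counted once, here is a small self-contained sketch on a toy dictionary. The four words and every name in it (toyDict, toyFrequencies, toyScore, naiveScore) are invented for the illustration; the real run uses the full Wordle list.

```javascript
// Toy dictionary, made up for this sketch
const toyDict = ["arose", "until", "pygmy", "eerie"]

// Number of toy words containing each letter at least once
const toyFrequencies = {}
for (const word of toyDict) {
  for (const letter of new Set(word)) {
    toyFrequencies[letter] = (toyFrequencies[letter] || 0) + 1
  }
}

// Same rule as scoreWord: each distinct letter counts once
function toyScore(word) {
  let score = 0
  for (const letter of new Set(word)) {
    score += toyFrequencies[letter]
  }
  return score
}

// For comparison only: every occurrence counts, repeats included
function naiveScore(word) {
  let score = 0
  for (const letter of word) {
    score += toyFrequencies[letter]
  }
  return score
}

console.log(toyScore("arose"), toyScore("eerie"))     // 7 6
console.log(naiveScore("arose"), naiveScore("eerie")) // 7 10
```

With the naive score, "eerie" would outrank "arose" even though it only probes three different letters, which is the opposite of what we want from a starting word.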

Now we just find the first combination of words that gives us the best score, eliminating the letters we already covered in previous turns to avoid repetition.

function containCommonLetters(a, b){
  for (let letter of a) {
    if (b.indexOf(letter) >= 0) {
      return true
    }
  }
  return false
}

function findWords(weightedWords, howManyWords = 10) {
  weightedWords = Array.from(weightedWords)
  let words = []
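  while(weightedWords.length > 0 && words.length < howManyWords) {
    // Take the best remaining word, then drop every word sharing a letter with it
    // (which also drops the word itself)
    const best = weightedWords[0]
    words.push(best)
    weightedWords = weightedWords.filter(w => !containCommonLetters(best.word, w.word))
  }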
  return words
}

function getTotalScore(words) {
  return words.reduce((tot, word) => tot + word.score, 0)
}

function displayWords(words) {
  console.log(`Total score: ${getTotalScore(words)}. Words: ${words.map(word => `${word.word}: ${word.score}`).join(", ")}`)
}

displayWords(findWords(weightedWords))

That gives us arose, until, pygmy.

I have the best words

That’s not bad, but pygmy has two ys, and that’s not a letter we’d like to repeat. We want to find the series with the best score. Maybe by starting with a word other than arose, we could have done better. Ideally we’d test every combination of words and find which series makes the most sense.
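
One way to rule out words like pygmy altogether would be to drop every word that repeats a letter before searching. This is only a sketch of that idea, not something I ran on the real list, and the helper names are hypothetical.

```javascript
// True when a word uses the same letter more than once, e.g. "pygmy"
function hasRepeatedLetters(word) {
  return new Set(word).size < word.length
}

// Keep only the words made of distinct letters
function withoutRepeatedLetters(words) {
  return words.filter(word => !hasRepeatedLetters(word))
}

console.log(withoutRepeatedLetters(["arose", "until", "pygmy"])) // [ 'arose', 'until' ]
```

Since the entries of weightedWords are objects, the same filter would be applied there as weightedWords.filter(w => !hasRepeatedLetters(w.word)).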

Now the catch is that if the series with the best score has 5 words, it doesn’t help us much because then we don’t have room to try permutations. Instead we want to limit the number of words that we need to cover the alphabet. So we add a constraint of how many words we want in the series.

function findBestWordsCombination(weightedWords, howManyWords) {
  let bestScore = 0
  let bestWords = undefined
  weightedWords = Array.from(weightedWords)
  for (let wordIndex = 0; wordIndex < weightedWords.length; wordIndex++) {
    const words = findWords(weightedWords, howManyWords)
    const totalScore = getTotalScore(words)
    if (totalScore > bestScore) {
      bestWords = words
      bestScore = totalScore
    }
    weightedWords.shift()
  }
  return bestWords
}

Then we’ll try with 3 and 4 words:

console.log(`3 words`)
displayWords(findBestWordsCombination(weightedWords, 3))
console.log(`4 words`)
displayWords(findBestWordsCombination(weightedWords, 4))

What are the words then?

  • 3 words: Total score: 49752. Words: aisle: 23674, tronc: 15560, dumpy: 10518
  • 4 words: Total score: 56994. Words: gawds: 16135, tryke: 16122, colin: 15321, bumph: 9416
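
These series come from restarting the greedy findWords at every possible first word, which is not the same as testing every combination, so a better series could in principle exist. For comparison, here is a sketch of a truly exhaustive search over words with no letter in common. The function names, the bitmask encoding and the toy scores are all assumptions of this sketch, and on the full word list it may well take far too long without further pruning.

```javascript
// Encode the set of letters of a word as a 26-bit mask (lowercase a-z assumed)
function letterMask(word) {
  let mask = 0
  for (const letter of word) {
    mask |= 1 << (letter.charCodeAt(0) - 97)
  }
  return mask
}

// Try every combination of `count` words sharing no letter, and keep the
// one with the highest total score. `entries` is an array of { word, score }.
// Returns { score: -1, words: [] } when no such combination exists.
function findBestExhaustive(entries, count) {
  const items = entries.map(entry => ({ ...entry, mask: letterMask(entry.word) }))
  let best = { score: -1, words: [] }

  function search(start, usedMask, chosen, score) {
    if (chosen.length === count) {
      if (score > best.score) {
        best = { score, words: chosen.map(item => item.word) }
      }
      return
    }
    for (let i = start; i < items.length; i++) {
      // Skip words sharing a letter with the words already chosen
      if ((items[i].mask & usedMask) !== 0) continue
      chosen.push(items[i])
      search(i + 1, usedMask | items[i].mask, chosen, score + items[i].score)
      chosen.pop()
    }
  }

  search(0, 0, [], 0)
  return best
}

// Toy input with made-up scores, not the real Wordle scores
const toyEntries = [
  { word: "raise", score: 9 },
  { word: "until", score: 8 },
  { word: "arose", score: 7 },
  { word: "mound", score: 5 },
]
console.log(findBestExhaustive(toyEntries, 2)) // { score: 15, words: [ 'until', 'arose' ] }
```

On this toy input, starting with the top word "raise" leaves only "mound" for a total of 14, while the exhaustive search finds "until" and "arose" for 15.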

Conclusion

I’m not sure what I accomplished there. There are probably better approaches, but at least I have some help to finally beat my wife at Wordle.

(Solution hidden for those who didn’t complete their daily duty)

Notes

  1. Which a cursory de-obfuscation will easily reveal.