Code Breaking 101: Keyword Guessing

Keyword guessing is a technique that points you toward a puzzle's solution without solving it outright. Two scenarios lend themselves to keyword guessing:

  • given a mix of letters and numbers that could be rearranged to form a passcode (transpositions)
  • an encoding of a passcode that is likely in a known passcode format but the decoding method is not known yet (reverse engineering, usually for substitutions)

The basis for keyword guessing is regular expressions (regex or regexp for short), a way to search for text based on patterns. Traditional searching is based on literal characters: if you search a document for Ingress, it matches the word Ingress, longer words that contain Ingress, and nothing else. With regular expressions, searching for Ingress does exactly the same thing. However, regular expressions become much more powerful once you use special characters. For example, searching for Ingres+ finds Ingres, Ingress, Ingresssssssssssssssssss, and so on, but not Ingre; we’ll see why below.
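To make the difference concrete, here is a minimal Python sketch that compares a literal search with the Ingres+ pattern. The sample strings are made up for illustration, and Python's built-in re module is only one of many regex implementations.

```python
import re

pattern = re.compile(r"Ingres+")
samples = ["Ingre", "Ingres", "Ingress", "Ingresssss"]

# Literal search: does the exact text "Ingress" appear in the sample?
literal_hits = [s for s in samples if "Ingress" in s]

# Regex search: does the pattern Ingres+ match somewhere in the sample?
regex_hits = [s for s in samples if pattern.search(s)]

print(literal_hits)  # ['Ingress', 'Ingresssss']
print(regex_hits)    # ['Ingres', 'Ingress', 'Ingresssss']
```

The literal search misses Ingres, while the pattern accepts it and still rejects Ingre.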

Special Characters

Regular expressions are formed from two types of characters: characters that have special meanings and characters that don’t. We won’t explore all of the special characters, just the ones that would be useful for keyword guessing.

  • ^ means to match the beginning of a string of text
  • $ means to match the end of a string of text
  • [ and ] are used to match a single character out of those enclosed within them. [shaper] would match s, h, a, p, e, or r.
  • { and } are used to specify how many of the previous character (special or not) to match. Ingres{1,3} would match Ingres, Ingress, or Ingresss but not Ingressss. Using a single number matches exactly that many characters; putting a comma before the number matches at most that many; putting a comma after the number matches at least that many.
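The four special characters above can be tried out with a short Python sketch. The helper name matches and the test strings are hypothetical, chosen only for illustration. Note that the {,n} form is accepted by Python's re module, but some other regex flavors treat it differently. The quantifier examples are anchored with ^ and $ so that a longer string such as Ingressss cannot match on just part of itself.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None

# (pattern, text, expected result)
examples = [
    (r"^shaper", "shapers", True),         # ^ anchors to the start
    (r"^shaper", "the shaper", False),
    (r"shaper$", "the shaper", True),      # $ anchors to the end
    (r"shaper$", "shapers", False),
    (r"^[shaper]$", "h", True),            # [] matches one listed character
    (r"^[shaper]$", "x", False),
    (r"^Ingres{1,3}$", "Ingresss", True),  # between 1 and 3 of "s"
    (r"^Ingres{1,3}$", "Ingressss", False),
    (r"^Ingres{2}$", "Ingress", True),     # exactly 2
    (r"^Ingres{,2}$", "Ingresss", False),  # at most 2
    (r"^Ingres{2,}$", "Ingres", False),    # at least 2
    (r"^Ingres{2,}$", "Ingresssss", True),
]

for pattern, text, expected in examples:
    result = matches(pattern, text)
    print(f"{pattern!r:20} {text!r:14} -> {result}")
    assert result == expected
```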

Transposition Example

We’ll use these four primarily for transposition guesses. For example:

https://plus.google.com/+SteinLightman/posts/3AwiCrUWxfF

GlyphHacks.png

gvgzqom2mcor74c7

The code is 16 characters long and the passcode format is:

[2-9][a-z][a-z][a-z][2-9]keyword[a-z][2-9][a-z][2-9][a-z]

There are 5 characters in the prefix and 5 characters in the suffix, leaving 6 (16 – 5 – 5) characters for the keyword:

^[gvgzqom2mcor74c7]{6}$
  • We use ^ and $ together to make sure we find only keywords that are exactly 6 characters long. Leaving out the ^ or the $ may bring in longer words that don’t apply.
  • We put the entire code (including numbers) inside the [] to save time, and also in case the keyword contains a digit, as in ‘3rdlaw’.
  • {6} is used since we are looking for a word made of 6 characters in total, each taken from the 16 characters inside the [] block.
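To see what this pattern does against a word list, here is a minimal Python sketch. The candidates list is made up for illustration and is not the real keyword list. The fits_letter_counts helper is my own addition, not part of the method above. It covers one gap in the regex: a [] block only checks which characters are allowed, not how many times each one is available in the code.

```python
import re
from collections import Counter

code = "gvgzqom2mcor74c7"
keyword_len = len(code) - 5 - 5  # 16 minus the prefix and suffix = 6

# Builds the same pattern as in the article: ^[gvgzqom2mcor74c7]{6}$
pattern = re.compile(f"^[{code}]{{{keyword_len}}}$")

# Hypothetical candidates; "oooooo" is a contrived non-word.
candidates = ["covcom", "shaper", "portal", "3rdlaw", "oooooo", "zoom"]
regex_hits = [w for w in candidates if pattern.search(w)]

def fits_letter_counts(word: str, pool: str) -> bool:
    """True if the word uses each character no more often than the pool has it."""
    return not (Counter(word) - Counter(pool))

exact_hits = [w for w in regex_hits if fits_letter_counts(w, code)]

print(regex_hits)  # ['covcom', 'oooooo']
print(exact_hits)  # ['covcom']
```

The regex alone lets oooooo through because o is in the code; the count check rejects it because the code only contains two of them.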

We have a tool that makes looking up keywords via regex easier:

http://tools.decodeingress.me/#/keywords

Pasting the regex pattern into the tool gives us only one keyword: covcom

As mentioned earlier, regex won’t necessarily provide a solution but it’ll guide you toward one. Using our transposition methods:

gv
gz
qo
m2
mc
or
74
c7

gvgz
qom2
mcor
74c7

gvgzqom2
mcor74c7

From here, we see that the 4×4 grid contains covcom if you read each column upwards, starting from the bottom-left:

7mqg4covcomg7r2z
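The column-by-column reading can be reproduced with a short Python sketch. The variable names are my own, and the format regex is just the passcode pattern from earlier with a capture group placed where the keyword goes.

```python
import re

code = "gvgzqom2mcor74c7"
width = 4

# Lay the code out as a grid: gvgz / qom2 / mcor / 74c7
rows = [code[i:i + width] for i in range(0, len(code), width)]

# Read each column from bottom to top, taking columns left to right.
transposed = "".join(
    rows[r][c]
    for c in range(width)
    for r in reversed(range(len(rows)))
)
print(transposed)  # 7mqg4covcomg7r2z

# Passcode format from the article, with the keyword captured in group 1.
passcode_format = re.compile(
    r"^[2-9][a-z]{3}[2-9]([a-z0-9]+)[a-z][2-9][a-z][2-9][a-z]$"
)
match = passcode_format.match(transposed)
keyword = match.group(1) if match else None
print(keyword)  # covcom
```

If the transposed string fits the format and the captured keyword is the one the regex search suggested, the guess is probably on the right track.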

A few more special characters:

  • . means to match any character
  • + means to match the previous character (special or not special) at least once
  • ( and ) group characters together so that a + or {} can apply to the whole group, or so that the text the group matched can be recalled later in the pattern (with \n, n being the group number; this is called back referencing)
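Here is a small Python sketch of these three special characters in action. The helper name matches and the test strings are hypothetical, chosen only to show one match and one non-match per pattern.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None

# (pattern, text, expected result)
examples = [
    (r"^c.t$", "cat", True),          # . stands for any one character
    (r"^c.t$", "cart", False),
    (r"^Ingres+$", "Ingresss", True), # + needs at least one "s"
    (r"^Ingres+$", "Ingre", False),
    (r"^(na)+$", "nanana", True),     # + applied to the whole group
    (r"^(na)+$", "nan", False),
    (r"^(na){2}$", "nana", True),     # {} applied to the whole group
    (r"^(.)\1$", "oo", True),         # \1 recalls what group 1 matched
    (r"^(.)\1$", "ox", False),
    (r"^(.)(.)\2\1$", "abba", True),  # two groups recalled in reverse order
    (r"^(.)(.)\2\1$", "abab", False),
]

for pattern, text, expected in examples:
    result = matches(pattern, text)
    print(f"{pattern!r:18} {text!r:12} -> {result}")
    assert result == expected
```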

Substitution Example

We’ll use these for substitution codes. Let’s go back to Passcode Decoding Walkthrough #011.3’s Code 1:

piljn bneu rczjg wiwjn soqszizozicw tete zrnjj sijnne zdc zewkc

I mentioned there that we can work out that soqszizozicw becomes substitution because the following characters match:

  • the 1st and 4th positions
  • the 2nd and 8th positions
  • the 5th, 7th, and 9th positions
  • the 6th and 10th positions
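These repeated positions can be found mechanically. The following is a minimal Python sketch; the function name repeated_positions is my own, and positions are counted from 1 to match the list above.

```python
from collections import defaultdict

def repeated_positions(word: str) -> dict:
    """Map each letter that occurs more than once to its 1-based positions."""
    positions = defaultdict(list)
    for index, letter in enumerate(word, start=1):
        positions[letter].append(index)
    return {l: p for l, p in positions.items() if len(p) > 1}

cipher_repeats = repeated_positions("soqszizozicw")
plain_repeats = repeated_positions("substitution")

print(cipher_repeats)  # {'s': [1, 4], 'o': [2, 8], 'z': [5, 7, 9], 'i': [6, 10]}
print(plain_repeats)   # {'s': [1, 4], 'u': [2, 8], 't': [5, 7, 9], 'i': [6, 10]}

# The letters differ, but the repeat structure is identical.
print(list(cipher_repeats.values()) == list(plain_repeats.values()))  # True
```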

Translating this into a regex pattern:

^(.)(.).\1(.)(.)\3\2\3\4..$

Let’s dissect this pattern:

Characters 1, 2, and 3
(.)(.).

We capture the 1st and 2nd characters so they are stored in \1 and \2 respectively. The 3rd character does not need to be captured, so there are no parentheses around it.

Characters 4, 5, and 6
\1(.)(.)

With the \1, we make use of one of the back references we set up earlier: the 4th character must match the 1st. We also capture the 5th and 6th characters into \3 and \4 respectively.

Characters 7, 8, 9, and 10
\3\2\3\4

The 7th character needs to match the 5th character, which is stored in \3. Likewise, the 8th character matches the 2nd, so we use \2. Back references can be used more than once, so we use \3 again for the 9th character. The 10th character matches the 6th, so we use \4 for it.

Characters 11 and 12
..

The last two characters neither match any previous characters nor need to be matched later. Trying this pattern in our keyword finder gives us 1 match:

substitution
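The whole substitution guess can be sketched in Python as follows. The repeat_pattern function is a hypothetical helper that builds a back-reference pattern from any cipher word, and the keywords list is made up for illustration; it is not the real keyword list. The pattern only requires repeated letters to repeat. It does not stop two different groups from matching the same letter.

```python
import re

def repeat_pattern(word: str) -> str:
    """Build an anchored regex that captures the repeat structure of a word."""
    counts = {ch: word.count(ch) for ch in set(word)}
    groups = {}  # letter -> capture group number
    parts = []
    for ch in word:
        if ch in groups:
            parts.append("\\" + str(groups[ch]))  # seen before: back reference
        elif counts[ch] > 1:
            groups[ch] = len(groups) + 1
            parts.append("(.)")                   # repeats later: capture it
        else:
            parts.append(".")                     # occurs once: any character
    return "^" + "".join(parts) + "$"

pattern = repeat_pattern("soqszizozicw")
print(pattern)  # ^(.)(.).\1(.)(.)\3\2\3\4..$

# Hypothetical keyword list, for illustration only.
keywords = ["substitution", "transposition", "intelligence",
            "enlightened", "resistance"]
hits = [w for w in keywords if re.search(pattern, w)]
print(hits)  # ['substitution']
```

Note that intelligence is also 12 letters long but is rejected, because its 4th letter does not repeat its 1st.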

With those special characters and a list of keywords, we can guess the keywords for some of the codes that Niantic releases. These methods will not always work, because we’re assuming that the keyword can be guessed using one of these two methods. If a keyword is found and a passcode is discovered, the assumption was correct; otherwise we’ll need to try another method.

What’s Next?

In the next article, we’ll be exploring more examples of using regex to help us with finding keywords.

By Jack Truong (SQL)