Rogue Amoeba
Tue, 16 Oct 2007

One of the professional hazards of being a programmer is the cold sweat which comes when you suddenly realize that some code you've written has a terrible bug. It's worse when you realize that the bug has already been there for months.

A few months ago I set out to redo our license key system. The old system (RASN) generated a single unique license key based on the name of the purchaser. This had the potential to cause problems when two (or more) people with the same name purchased our software, as they would each receive the same license key.

For the new system (RASN2), we decided to add a unique number to each code to make it different from others generated for the same name. However, we didn't want to make our codes any longer, so I had to cram more information into the same space. RASN used hexadecimal, a 16-digit number system using the digits 0-9 and the letters A-F. By adding in the rest of the alphabet, then eliminating letters and numbers which are easily confused, such as 0 (zero) and O (oh), RASN2 was able to have 27 different digits. This opened up enough room to squeeze in a short unique number next to the rest of the license data. We started using the new system a few months ago and it has worked well for us, well enough that nothing user-facing changed.
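To make the space savings concrete, here is a minimal sketch of a 27-symbol encoding. The alphabet is hypothetical: which nine characters are dropped is an assumption made for illustration, and the names `ALPHABET`, `encode` and `decode` are invented for this example, not taken from RASN2.

```python
import string

# Hypothetical alphabet: 27 symbols, but the exact set is a guess.
# Here nine easily confused characters are removed from 0-9 and A-Z.
CONFUSABLE = set("0O1IL5S2Z")
ALPHABET = "".join(
    c for c in string.digits + string.ascii_uppercase if c not in CONFUSABLE
)
assert len(ALPHABET) == 27


def encode(value: int, width: int) -> str:
    """Encode a non-negative integer as a fixed-width base-27 string."""
    if not 0 <= value < len(ALPHABET) ** width:
        raise ValueError("value does not fit in the requested width")
    digits = []
    for _ in range(width):
        value, remainder = divmod(value, len(ALPHABET))
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode(text: str) -> int:
    """Inverse of encode()."""
    value = 0
    for ch in text:
        value = value * len(ALPHABET) + ALPHABET.index(ch)
    return value


# Distinct values that fit in one four-character group: hex vs. 27 symbols.
print(16 ** 4, 27 ** 4)  # prints "65536 531441"
```

A four-character group holds roughly eight times as many values with 27 symbols as with 16, which is the kind of headroom that lets a short unique number fit without lengthening the key.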


A sample license key

The cold-sweat moment came the other day as I was entering a license key into a copy of Fission. The way some of the letters lined up almost looked like a word, how funny. Hey, you could even get a whole license code made up of four-letter words. Four. letter. words. Oh. S---!

Then I thought, this can't be that bad. After all, the chances of generating some bad word at random must be really low. But I ran the numbers anyway just to be sure. It turns out that the chances of a random license containing the word F--- are actually one in 65,000. That's pretty common, and it's even worse if you count instances split between two groups with a dash in the middle. And of course there are several bad words we can generate, and these odds apply to each one separately.
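For anyone who wants to run this kind of number themselves, here is a rough sketch of the arithmetic. It assumes every character is independent and uniformly random and that the word must fill a whole four-character group. The group count of 6 is a placeholder, not the real key layout, so this is not an attempt to reproduce the one in 65,000 figure, and `aligned_word_probability` is a name invented for the example.

```python
from fractions import Fraction


def aligned_word_probability(
    alphabet_size: int, groups: int, word_len: int = 4
) -> Fraction:
    """Chance that at least one of `groups` independent, uniformly random
    groups of `word_len` characters equals one specific word."""
    p_group = Fraction(1, alphabet_size ** word_len)
    return 1 - (1 - p_group) ** groups


# Assumed layout: 27 symbols, six groups of four characters.
p = aligned_word_probability(27, 6)
print(f"about 1 in {round(1 / p):,}")
```

Counting words that straddle a dash, or several different words, only pushes the odds up from here.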

It turns out that one day in the not-too-distant future, our random number generator gets filthy. On that day, one out of every 128 licenses generated will start with the F-bomb.

Once I recognized the problem, the solution was easy. We built a list of inappropriate words (a list Paul referred to as "my favorite list ever") and now check the code portion of each key against the list before sending it out. If there's a match, we generate a new code automatically. This was quickly put in place and new purchasers can be assured that their license code will never tell them to F***-THIS-S***. Problem solved.
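The check-and-regenerate idea can be sketched in a few lines. This is an illustration under assumptions, not our production code: the alphabet, the mild stand-in blocklist, and the helper names are all hypothetical, and a real key also carries license data and a unique number rather than purely random characters.

```python
import secrets

# Hypothetical 27-symbol alphabet and a stand-in blocklist (mild words only).
ALPHABET = "346789ABCDEFGHJKMNPQRTUVWXY"
BLOCKLIST = frozenset({"DARN", "HECK"})


def contains_blocked(code: str, blocklist=BLOCKLIST) -> bool:
    """True if any blocked word appears anywhere in the undashed code."""
    return any(word in code for word in blocklist)


def generate_code(length: int = 16, blocklist=BLOCKLIST, max_tries: int = 1000) -> str:
    """Draw random codes until one passes the blocklist check."""
    for _ in range(max_tries):
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if not contains_blocked(code, blocklist):
            return code
    raise RuntimeError("could not generate a clean code")


def format_key(code: str, group: int = 4) -> str:
    """Insert a dash between every `group` characters for display."""
    return "-".join(code[i:i + group] for i in range(0, len(code), group))


print(format_key(generate_code()))
```

Checking the code before the dashes go in means a word split across two groups is caught as well. Rejections are rare enough that the retry loop almost never runs more than once.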

This could be a marketing opportunity for us as well. Would you pay an extra $2.99 for a vanity code, containing your favorite four-letter word? We think there are enough depraved individuals out there that this could be the latest in upselling innovation! Some day, perhaps. Some day.

Posted by Mike

Comments


George Browning
Tue Oct 16 15:49:22 2007

Coincidentally, a couple of years ago we had to do the same thing with our license key system for Curio. Fortunately, we already had a huge list of profanity. Er, why you might ask? We created the list for our Sleuth browser client which is included in our Curio K-12 edition. We use it to automatically filter out any resulting pages that contain naughty words (protecting the children as it were). So, a few tweaks later, our License code now verifies the generated key against the same list. Whew!

David S
Tue Oct 16 16:06:04 2007

IF##-KING-LOVE-ROGU-EAMO-EBA

Chris Ryland
Tue Oct 16 17:38:12 2007

We had exactly the same problem as we used only letters in our serials. But since I only needed 16 letters (hex digits swizzled to only letters), I simply didn't include F, U and I (which is confusing anyway) and a few others that might make up naughty words. Hope that's safe. ;-)

Aslak Raanes
Tue Oct 16 17:55:20 2007

I guess this does not, luckily, cover profanities in non-english languages?

Patrick
Tue Oct 16 18:16:23 2007

Why not just exclude all the vowels?

Duncan
Tue Oct 16 18:23:01 2007

Can I get the following vanity key?

IUSE-WIRE-TAPS-TUDI-OINS-TEAD

(Actually, I don't, but you get the idea...)

or,

YOUR-LOGO-HERE-ONLY-$399-PERH-UNDR-EDPI-ECES

pauldwaite
Tue Oct 16 18:32:23 2007

On swearing in general, I’ve never read a better article than this:

http://www.tnr.com/docprint.mhtml?i=20071008&s=pinker100807

Vynce
Tue Oct 16 19:00:15 2007

because "MTHR" still reads as "Mother" to a lot of people, patrick -- and "TH1S" is legible, too.

the solution to the profanity filter, as i see it, is either ignore it, or go much deeper.  I'd prefer ignore it -- i think people need to learn that computers randomly generate stuff sometimes, and they need to not look for meaning where there is none.  otherwise, we're going to have to start watching for and preventing 666, 420, and 911 as well.  and where do you draw the line?  is "hoot" ok?  what about nips?  or hsit?  or that clothing store, fcuk?  and what about leet-sp34k? 455? 455h0le? how far from profane is still profane?

j0k3 th4t, i say.  and j0k3 th3m if they can't take a xpnj.

Tom
Tue Oct 16 19:33:43 2007

FCKGW.

That looks an awful lot like F*CK to many, many people, I bet.

Paul
Tue Oct 16 19:38:19 2007

FYI - Apple had a similar problem with the Password Assistant that suggests memorable passwords based on the system dictionary. It turns out that f*** and s*** and the like are in there, and had to be handled in a similar way. There is a list somewhere of objectionable words, and if one of them turns up in a password suggestion then it is discarded and a new one is generated.

Eric Schneider
Tue Oct 16 21:16:21 2007

Simple problem, simple solution.

No vowels, no dirty words.  Stoplists for bad words is just wrongthing.

  -es

Paul (Rogue Amoeba Staff)
Tue Oct 16 21:16:22 2007

George Browning: Very handy. We made our own list, and then found one off the web.

http://www.noswearing.com/about.php

From the FAQ:

"Anyway, we didn't find a list so we spent almost an entire workday shouting obscenities over cubicles and writing a list. It was quite fun, but inefficient. This is our attempt to prevent others from having the same fun. It started as a simple, accessible, searcheble list; the translator was just for fun."

David S: Well thank you sir!

Chris Ryland: In fact, most swears aren't possible, because we left out letters. But F--- was possible, and so were some other things, so we decided to fix it.

Aslak Raanes: I suppose not, no. And something like 30% of our orders are foreign, though the software is English only, so it's not much of a concern.

Patrick: We could, but it's a bit late now. We excluded look-alikes, which left us with a good set. Excluding more would have hurt us as far as security goes.

Duncan: If you've got the dough, we can make anything happen. I think we'd have to special case those though.

pauldwaite: I saw that last week, it was definitely interesting.


Vynce: Ignoring it would be my preferred method too. However, as Mike points out, 1 in 65,000 is pretty common, and that's just per four-letter swear, and it gets worse on certain days. People should accept randomness, sure, but we're not going to change society. We're better off fixing the issue (it didn't take more than 60 minutes) and not dealing with even one ticked off customer.

Tom: It doesn't to me. But anyhow, there are limits to what we'll filter. The blatant stuff (F---) is out. FCK, FCUK, whatever, that's all fine. If you don't know the word, it won't be obvious.

Paul: That's really interesting. Do you have a link to info on that? I'd love to read more.

brew
Tue Oct 16 21:17:52 2007

FACK-NAYE
FECK-NASS
FOCK-KING
EXCE-LENT

Alex
Tue Oct 16 23:00:20 2007

My job frequently involves large amounts of random 10-character codes.

When we get these from the people who generate them for us, we have to scrub them for bad words for precisely this reason.

All this to say, I feel your pain.

But I also laugh at it.

James Welborn
Tue Oct 16 23:00:26 2007

I had to RMA a drive from Seagate a while back whose serial number seemed to be making fun of me for having a Mac. The first 6 characters were "G4CUNT"

Adam Salter
Tue Oct 16 23:16:58 2007

ROFL-FFSI-DONT-CARE

LKM
Wed Oct 17 01:27:11 2007

But what about us poor foreigners? Will we still have to endure registration keys swearing at us in our native languages? :-P

Steven Fisher
Wed Oct 17 02:36:38 2007

I took out all the vowels in our code generator. The first four letters were FVCK.

sambeau
Wed Oct 17 06:40:02 2007

I used to work at BBC online. Their profanity filter text file was one of the funniest things I have ever read. There were words and phrases that 20+ years of life in Glasgow had never gifted me before. And some that were just too weird to repeat.. whoever wrote it was an evil genius.

sd
Wed Oct 17 10:32:11 2007

We used to have an automated password reset on our help-desk IVR. And I got calls from people who complained that the temporary password they were given was "meatball" or "birdbath". I can only imagine who I would have heard from if the password generator had been more random.... :-p

John Muir
Wed Oct 17 13:02:45 2007

Praise be to NUMERIC serials. Group them in little chunks and I find them easier to deal with as a user (even if a fair bit longer) than alphanumeric runes, especially those deprived of memorable words!

Carl
Wed Oct 17 13:06:45 2007

After our system was cracked back in the beginning of the year we decided to go a completely different route:

http://agilefolks.com/s/63099a3cc

We thought this would be cool and there would be no issues. It was one of the most controversial things we ever did. People seemed to either totally love or totally hate it. We had to add a backup text based system and finally recently added a CLICK HERE to have it automatically added. Throw Entourage Mail into the mix which doesn't allow you to drag and drop it like Apple Mail (thanks Microsoft) and Firefox (same) and what we thought would be a no-brainer gets a little more complicated. At least profanity filtering is a non-issue though. :)

Throughout the process I was a little shocked by the fact that many people seemed to love the long cryptic text codes. Personally as long as I can cut and paste them I don't care. If I have to type them that's a different story. :)

alexr
Wed Oct 17 17:55:29 2007

Bungie's Marathon had a similar profanity exclusion process for its serial numbers. Unfortunately, due to a lack of dead stripping of strings, that list ended up in early releases. People noticed while looking for cheat codes. The C preprocessor is your friend. :-)

Carlo
Thu Oct 18 01:47:42 2007

This never ran across my mind. Hilarious.

lumpi
Fri Oct 19 13:26:01 2007

I always thought that! If I had to program stuff like that it would be among the first stuff that pops my mind. Nice to see a professional confirm it. I didn't waste brain cells on my concerns then.


Copyright © 2008 Rogue Amoeba Software, LLC. All rights reserved.