The block list is 2/3 of the (minified) library. I found this entire choice odd.

First, it's highly incomplete: you can find at least 10x more combinations spelling the same "word", and probably 10x more slurs that aren't in this block list at all. Second, it's hardcoded in your source. Third, there are more elegant solutions.

For example, pick an alphabet that can't spell readable words unless you're trying really hard to read a slur into it. Say this (no vowels or digits):

bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ (length 42)

The full lower+upper+digits alphabet they use has 62 characters. Feels like you're losing a lot, but... not really.

- A 128-bit id in base 62 = 22 letters.

- A 128-bit id in base 42 = 24 letters.

JUST TWO MORE LETTERS. And it's one more letter for a 64-bit id (12 vs 11). And we can avoid this entire silliness. The problem is the author doesn't realize that logN is... logarithmic, I suppose.
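
To sanity-check those lengths, here's a rough sketch that counts how many characters the largest N-bit value needs in a given base. The name idLength is made up for this illustration; it isn't from the library being discussed.

```javascript
// Hypothetical helper: number of base-`base` digits needed
// to write the largest `bits`-bit unsigned value (2^bits - 1).
function idLength(bits, base) {
    let max = (1n << BigInt(bits)) - 1n;
    const b = BigInt(base);
    let len = 0;
    // Strip one digit per iteration until nothing is left.
    while (max > 0n) {
        max /= b; // BigInt division truncates
        len++;
    }
    return len;
}

console.log(idLength(128, 62)); // 22
console.log(idLength(128, 42)); // 24
console.log(idLength(64, 62));  // 11
console.log(idLength(64, 42));  // 12
```

BigInt keeps the count exact, so there's no floating-point rounding to worry about near a boundary.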



A slight mod: I'd remove Y, even though it's not exactly a vowel, and add back the digits that can't be interpreted as vowels.

bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ25679

That gives us base 45. And below is a JS snippet to make an id. There's your lib.

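    function id(num) {
        // Expects a non-negative integer (Number, string, or BigInt).
        num = BigInt(num);
        const dict = "bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ25679";
        let out = '';
        // Digits come out least significant first, which is fine for an opaque id.
        while (num > 0n) {
            out += dict[Number(num % 45n)];
            num /= 45n; // BigInt division truncates
        }
        // Zero (and any negative input) skips the loop and maps to dict[0].
        return out || dict[0];
    }

Example: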

    id(123456789012345678901234567890n);

    "bq99hC6fbtjLrkxLPm"

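If you ever need to map an id back to its number, the encoding is easy to invert. A minimal sketch, assuming the same 45-character dict and the least-significant-first digit order the snippet above produces; decodeId and DICT are names invented here for illustration.

```javascript
// Same alphabet as the encoder above (assumed, must match exactly).
const DICT = "bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ25679";

// Hypothetical inverse of id(): turns an id string back into a BigInt.
function decodeId(id) {
    let num = 0n;
    // The encoder emits the least significant digit first,
    // so walk the string from the end to the start.
    for (let i = id.length - 1; i >= 0; i--) {
        const digit = DICT.indexOf(id[i]);
        if (digit < 0) {
            throw new Error("invalid character in id: " + id[i]);
        }
        num = num * 45n + BigInt(digit);
    }
    return num;
}
```

Note that trailing "b" characters act like leading zeros here, so "c" and "cb" decode to the same number. The encoder never emits them, but a validator might want to reject them.
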

Totally agree. 2/3 is wild, especially given it seems like you could mitigate most of the risk just by removing vowels from the dictionary.



