I looked into doing something like this once and decided it wasn't going to be very effective, for a few different reasons.
JS engines (or even WASM) aren't going to be as fast at this kind of work as native machine code would be. Especially when you consider that libraries like OpenSSL have heavily tuned implementations of the SHA algorithms. Any bot solving a SHA-based challenge would be able to extract the challenge from the page and execute it using native machine code faster than any legitimate user's browser could. And if you increase the difficulty of the challenge, it's just going to punish real users running the challenge in their browser more than it would the bots.
It's also based on the assumption that proof-of-work is going to increase the cost of doing business for the bots in some way and discourage their behavior. Many of the bots I was dealing with were either using cloud compute services fraudulently or running on the compromised machines of unknowing people. They tended not to care how long it took or how high-effort the challenge was; they were very dedicated to getting past it and continuing their malicious behavior.
There's also the risk that any challenge that's sufficiently difficult may also make the user's browser angry that a script is either going unresponsive or eating tons of CPU, which isn't much different from cryptocurrency miner behavior.
Yes, the range of applications where proof of work is viable is really narrow. (In fact, so narrow that I suspect it can't work for anything where the abuse has a monetary motive.)
One way to think about this is by comparing the cost of passing the PoW to the money the same compute resources would make mining a cryptocurrency. I believe a low-end phone mining a CPU-based cryptocurrency would make O(1 cent) per day. Let's say you're willing to cause 1 minute of friction for legit users on low-end devices (already something I'd expect to be unacceptable from a product perspective). One minute is roughly 1/1500 of a day, so congratulations: you just cost the attacker about 1/1500th of a cent. That's orders of magnitude too low to have any impact on the economics of spam, credential stuffing, scraping, or other typical bulk abuse.
Yep, I have also done this and come to a similar conclusion. The best performance I got was with the WebCrypto API, where I got, IIRC, 100,000 SHA512 hashes a second, with a very optimized routine for iterating the hashcash string. SHA512 probably makes the most sense since AFAIK it's the most intensive hash supported by WebCrypto, and the WebCrypto digest API is async, so a whole lot of time is going to be spent awaiting no matter what you do.
I think WebCrypto is available in workers, so you could probably use numerous workers to get a better hashrate. Still, I don't suspect that would jump it past a million, which seems pretty bad for a desktop computer, and it would be a lot worse on a mobile phone.
It might still be a meaningful impediment when combined with other measures, but with the low bar you'd have to set for preimage bits for mobile devices, it's a little questionable.
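For reference, this is roughly the shape of the loop being described; the challenge format, function names, and leading-zero-bits criterion below are illustrative stand-ins, not mCaptcha's actual scheme:

```typescript
// Hashcash-style search using WebCrypto SHA-512. Illustrative only: the
// challenge format and difficulty check are assumptions, not mCaptcha's scheme.
async function solvePow(challenge: string, difficultyBits: number): Promise<number> {
  const enc = new TextEncoder();
  for (let nonce = 0; ; nonce++) {
    const digest = new Uint8Array(
      await crypto.subtle.digest("SHA-512", enc.encode(`${challenge}:${nonce}`))
    );
    if (leadingZeroBits(digest) >= difficultyBits) return nonce;
  }
}

// Count the leading zero bits of a digest.
function leadingZeroBits(bytes: Uint8Array): number {
  let bits = 0;
  for (const b of bytes) {
    if (b === 0) { bits += 8; continue; }
    bits += Math.clz32(b) - 24; // clz32 counts over 32 bits; a byte occupies the low 8
    break;
  }
  return bits;
}
```

Every iteration awaits a `crypto.subtle.digest` call, which is a big part of why the throughput tops out where it does; fanning the nonce range out across workers helps, but only by a constant factor.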
Thank you for your detailed response, you raise some very interesting and valid points!
> JS engines (or even WASM) aren't going to be as fast at this kind of work as native machine code would be
You are right. mCaptcha has WASM and JS polyfill implementations. Native code will definitely be faster than WASM, but in an experiment I ran for fun[0], I found that the WASM version was roughly 2s slower than the native implementation.
> It's also based on the assumption that proof-of-work is going to increase the cost of doing business
mCaptcha is basically a rate-limiter. If an expensive endpoint (say registration, where password hashing and other validation are costly) can handle 4k requests/second and has mCaptcha installed, then the webmaster can force the attacker to slow down to 1 request/second, significantly reducing the load on their server. That isn't to say the webmaster will be able to protect themselves against a sufficiently motivated attacker with botnets. :)
> There's also the risk that any challenge that's sufficiently difficult may also make the user's browser angry that a script is either going unresponsive or eating tons of CPU, which isn't much different from cryptocurrency miner behavior.
Also correct. The trick is in finding the optimum difficulty, one that works for the majority of devices. A survey to benchmark PoW performance of devices in the wild is a work in progress[1], which will help webmasters configure their CAPTCHA better.
[0]: https://mcaptcha.org/blog/pow-performance (The benchmarking platforms weren't optimised for running benchmarks, so kindly take the numbers with a grain of salt. It was a bored Sunday afternoon experiment.)
This is a much better explanation of what it does than "captcha", where I expect "proof-of-human". A PoW-based rate-limiter is a really interesting idea! Usually, the challenge with unauthenticated endpoints (e.g. signups) is that the server has to do more work (DB queries) than the client (making an HTTP request), so it is really easy for the client to bring the server down. With PoW, we're essentially flipping that model: the client has to do more work than the server. Good work!
It looks like your pow_sha256 library is using the "sha2" crate, which is a pure-Rust implementation of SHA-2. So your benchmark is around the delta of your library compiled to native code vs. your library compiled to WASM, which is an interesting benchmark, but I don't think it answers the right question.
A more interesting benchmark would probably answer the question "what would those looking to defeat mCaptcha use and how does that performance compare?" So perhaps an implementation of an mCaptcha challenge solver using OpenSSL would be warranted for that.
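For a ballpark of the native side without writing any C, even a quick Node sketch (node:crypto is OpenSSL-backed) gives a single-core SHA-256 throughput number to compare against; the challenge string here is a placeholder, not mCaptcha's format:

```typescript
import { createHash } from "node:crypto";

// Rough single-core SHA-256 throughput probe via Node's OpenSSL-backed crypto
// module. The challenge string is a placeholder, not mCaptcha's format.
function hashesPerSecond(seconds = 3): number {
  const challenge = "example-challenge:";
  const deadline = Date.now() + seconds * 1000;
  let count = 0;
  while (Date.now() < deadline) {
    createHash("sha256").update(challenge + count).digest();
    count++;
  }
  return count / seconds;
}

console.log(`~${Math.round(hashesPerSecond())} SHA-256 hashes/sec on one core`);
```

Multiply by core count for a rough native baseline; GPU and ASIC numbers go far beyond that.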
You should compare against a GPU implementation of SHA256 (hashcat has a good one).
You may also want to consider ASICs - although you won't be able to build your own ASIC, you can look at the hashrates offered by bitcoin mining hardware, and extrapolate.
Also remember that native code can use multithreading, so if your challenge is something that could be farmed out to multiple CPU threads until one finds the solution, that's another factor in favor of native code performance.
Let's just assume that you solve the "it takes a while to run" thing through some clever bit of hard-to-optimise math that's difficult to parallelise or multithread or whatever.
If all it takes is computer time, then that's a cheap thing for the botnet operator to solve. They can just spin up another instance, and split up the crawling task to another computer (or 200).
I haven't encountered a case where multithreading makes the algorithm weaker, but I do have a variation of the benchmark code (on disk at the moment) that spins up multiple worker threads to compute the PoW.
Hmm, is it a better rate limiter than others? I know that nginx, for example, makes it pretty easy to rate limit based on IP address with the `limit_req` and `limit_req_zone` directives.
In essence, nginx's rate limiter also works by making each request consume a resource, but the resource consumed is an IP address (or range) rather than compute. It seems intuitive that a malicious actor would have an easier time scaling compute than IP addresses, while a legitimate user will _always_ have an IP address but might be on a machine with 1/100000th the compute resources of the malicious actor.
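For reference, the nginx setup in question is only a few lines; the zone name, rate, burst value, and endpoint below are arbitrary examples:

```nginx
# Track clients by IP; allow roughly 1 request/second per IP to the protected
# endpoint, absorbing short bursts of up to 5 queued requests.
limit_req_zone $binary_remote_addr zone=signup:10m rate=1r/s;

server {
    location /signup {
        limit_req zone=signup burst=5 nodelay;
        # ...existing handler configuration...
    }
}
```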
"Can" is true, and a good point. "Should" is a bit more dubious though; if IP-range-based rate limiting is enough, not wasting your users' battery with PoW-based rate limiting seems like a good thing. It seems like a potentially useful tool in your tool belt, which you should probably only deploy if IP-address-based rate limiting proves to be not enough.
Can you elaborate on why you chose SHA256 as the hash function?
Attackers aren't exactly limited to web APIs, and SHA-256 is known to be trivially parallelizable on a GPU. RandomX[0] is one example of a PoW designed to resist that, which reminds me of a similar initiative called RPC-Pay.
Yes, the spam doesn't actually have to be profitable for the seller of the spammed product. Not as long as he thinks it is, and is willing to pay the person spamming on his behalf.
And as you say, it's often stolen resources. There may even be another link in the "think it pays" chain: the spammer may buy hacked instances out of a mistaken belief that he can make money on it, by selling spamming services to merchants who also only think it pays. There's a certain "crime premium" where some people seem willing to pay extra (in money or effort) for the feeling that they're fooling someone.
The goal of CAPTCHA is to tell computers and humans apart. I appreciate the effort towards better UX, but there are already "invisible" CAPTCHAs like Botpoison that discriminate better than this. PoW solutions are just more energy-intensive rate limits.
> I appreciate the effort towards better UX, but there are already "invisible" CAPTCHAs like Botpoison that discriminate better than this.
Interesting project, thank you for sharing! From Botpoison's website[0] under FAQ:
> Botpoison combines:
> - Hashcash, a cryptographic hash-based proof-of-work algorithm.
> - IP reputation checks, cross-referencing proprietary and 3rd party data sets.
> - IP rate-limits.
> - Session and request analysis.
Seems like it is PoW + IP rate-limits. IP rate-limits, though very effective at immediately identifying spam, hurt folks using Tor and those behind CG-NAT[1].
And as for invisibility, CAPTCHA solves in mCaptcha have a lifetime, beyond which they are invalid, so generating the PoW when the checkbox is ticked gives optimum results. But should the webmaster choose to hide the widget, they can always hook it to a form submit event.
Thinking about it a bit more, systems like mCaptcha and Botpoison aren't really CAPTCHA in the strict sense - they solve a somewhat different problem than telling if there's a human at the other end, and IMO that's an important distinction to make (and doesn't necessarily make them inferior to other solutions.)
I still think PoW alone is not enough as it can be automated, albeit at a slower rate. Most of the time I worry more about low-volume automated submissions than high-frequency garbage. The real value is in the combination of factors, especially what BP call the "session and request analysis" and other fingerprinting solutions.
> Thinking about it a bit more, systems like mCaptcha and Botpoison aren't really CAPTCHA in the strict sense
Very true! I chose to use “captcha” because it's easier to convey what it does than, say, calling it a PoW-powered rate-limiter.
> The real value is in the combination of factors, especially what BP call the "session and request analysis" and other fingerprinting solutions.
Also true. I'm not sure if it is possible to implement fingerprinting without tracking activity across the internet --- something that privacy-focused software can't do.
I have been investigating privacy-focused, hash-based spam detection that uses peer reputation[0] but the hash-based mechanism can be broken with a slight modification to the spam text.
I would love to implement spam detection but it shouldn't compromise the visitor's privacy :)
[0]: please see "kavasam" under "Projects that I'm currently working on". I should set up a website for the project soon. https://batsense.net/about
Client-local fingerprinting is not inherently evil; it only becomes so when you combine it with additional signals and use it in violation of users' privacy. AFAIK it's the most reliable way to distinguish unique visitor agents, and under that use case it's far more respectful of personal information than an IP address.
I'm in two minds about how I feel inconveniencing those behind CG-NAT. I don't want to punish the innocent, but the ISPs aren't going to move towards better solutions (IPv6) without a push from their paying customers, and they'll never get that push with sufficient strength if we work tirelessly to make the problem affect us and not affect those subscribers.
In the real world, on the other hand, customers will simply conclude that one site doesn't work while others do, and that the problem must be with the one site that doesn't work.
> The goal of CAPTCHA is to tell computers and humans apart.
This is definitely not true today, and i'm not sure it was ever true. I remember the first (or what i remember to be the first) <form>-submitted CAPTCHAs asking me to copy some text, answer a riddle or perform some simple math... all of which could be bypassed by a script. The goal was to stop a random web-scanning bot from filling your DB with garbage, not to protect against targeted attacks.
Nowadays there are large portions of the web i can't browse because of bad IP reputation (i browse through tor) and because the most widely deployed CAPTCHA systems insist that i'm not human.
Google reCAPTCHA in particular has an obsession with fire hydrants (not even a thing here in France), traffic lights, bicycles and buses. However, it's never clear if i should click the square where just a tiny bit of the object appears, and some images are so ambiguous that i'm not sure the algorithm will recognize what i know to be a bicycle part. So i'm trapped in endless CAPTCHA loops in which the robot claims i'm not human.
I realize a GPT-3-powered bot would probably say this, but i'm 100% human. Meanwhile, malicious bot authors buy CAPTCHA solves for a few cents a dozen from sketchy online marketplaces. CAPTCHAs only prevent legitimate use, as malicious actors have far greater means and resources at their disposal than we ordinary users do.
PS: It's not just Google though. As much as they claim to be the good guys, CloudFlare claims an enormous part of their malicious traffic comes from the tor network. However, given that they block 80-90% of my `GET /` requests as if they were malicious, i wouldn't be surprised if these stats were complete bullshit that did not stand scrutiny. But well, if the algorithm says so... (post PS: even HN a few months back started blocking many GET requests from Tor except on the homepage).
From what I understand, this would allow somebody with reasonable resources to easily crack captchas with a high-end GPU. You could increase the complexity of the PoW, but ultimately you then just end up in a resource race.
Ideally you want an attacker to spend as long as possible using up computation resources, whilst your normal clients spend on average little. I believe you would want a multi-stage approach that involves some human interaction, which is statistically failed more often by computers, and keeps them busy for a long time doing pointless math.
The reason it would have to be multi-staged is that you don't want the attacker doing the computing to realize they are solving a pointless list of problems with no hope of getting a token. A normal user might on average do a -> b -> c, whereas you would make an attacking machine do a -> b -> c -> d -> e -> f and then not get a token, for example. (It would have to be set up statistically so as not to encourage the attacker to bail early.)
I think I would make it a slightly RAM-heavy problem, which limits how many of these problems can be solved in parallel. You would obviously set up the problem in such a way that networking is minimal.
For example, you could send a variation of a Langton's Ant map pre-seeded with random patches on a sufficiently large map, then tell the client to run it for X time steps and give you the result in some specific area.
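As a toy illustration of that idea (grid encoding, wrap-around edges, and "report the first row" are all placeholder choices):

```typescript
// Toy Langton's Ant run: turn right on a white cell, left on a black cell,
// flip the cell, step forward (edges wrap). Grid seeding, size, and which
// region the client reports back are placeholders for whatever the server picks.
function runAnt(grid: Uint8Array, size: number, steps: number): Uint8Array {
  let x = size >> 1;
  let y = size >> 1;
  let dir = 0; // 0 = up, 1 = right, 2 = down, 3 = left
  for (let i = 0; i < steps; i++) {
    const idx = y * size + x;
    dir = (dir + (grid[idx] === 0 ? 1 : 3)) % 4; // right on white, left on black
    grid[idx] ^= 1;                              // flip the cell
    if (dir === 0) y = (y + size - 1) % size;
    else if (dir === 1) x = (x + 1) % size;
    else if (dir === 2) y = (y + 1) % size;
    else x = (x + size - 1) % size;
  }
  return grid.slice(0, size); // e.g. report the first row as the answer
}
```

The catch with puzzles like this is that, unlike hashcash, verification isn't cheap: the server has to run (or precompute) the same simulation to check the answer.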
This isn’t a CAPTCHA. A CAPTCHA should be something that can tell a human user apart from a non-human user with a high degree of accuracy, such that it should be hard for a bot to pass even a single instance.
As noted elsewhere in the thread, this is a rate-limiting system. It cannot protect low-rate critical resources like registrations, but would be useful in mitigating large-scale spam attacks.
According to Wikipedia, CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. This code shouldn't be called CAPTCHA as it makes no attempt to tell humans from machines. (If anything, SHA256 proof-of-work is a problem that's easy for a computer but hard for a human, the opposite of a Turing test.) I'd call it "Proof-Of-Work Client-side RAte Protection" -- it's a POWCRAP, not a CAPTCHA.
(I invented the acronym just now in this post; it could perhaps use some work.)
I think calling this a Captcha isn't a great way to explain what this is about.
Bots spamming your sign up page are playing an economics game: can I do whatever this bot action is quickly and cheaply enough that the end result is worth the cost? Captchas were good because you needed humans to solve them, which raised the cost to around $0.05 per action. But now ML breaks most captchas cheaply and easily.
Proof of work based 'captchas' are another solution. They slow down the bots, and cost compute time, reducing the economic efficiency of the attacker. They make it not worth the effort. And if the attacks continue, you can autoscale the difficulty.
And for humans, it's a minor slowdown without requiring any mental effort from them.
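One hedged sketch of what that autoscaling could look like: derive the difficulty from the request rate the protected endpoint has seen recently. The tiers and numbers below are invented purely for illustration:

```typescript
// Map recent request volume to a PoW difficulty (roughly "expected hashes per
// solution"). Tiers and thresholds are invented for illustration only.
function difficultyForLoad(requestsLastMinute: number): number {
  if (requestsLastMinute < 1_000) return 50_000;    // normal traffic: cheap to solve
  if (requestsLastMinute < 10_000) return 500_000;  // elevated: noticeably slower
  return 5_000_000;                                 // under attack: expensive per request
}
```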
As a spammer, I wouldn't care much about this. Spammers run thousands of requests in parallel and don't care if they take 1 second or 10 seconds to finish. After all, their machines run all day long.
Many of the more professional spammers are running on hacked botnet machines anyway, so they don't really care about maxing out the CPUs.
Compute is cheap and the only loser here is the end user with a low end device.
This looks like a rate limiting API rather than a Captcha. Captcha is a tool to distinguish automated bots from real users. Let’s not confuse the term like we already did with “crypto” please.
Accessibility is critical to mCaptcha. In fact, Codeberg is trying out mCaptcha purely because it is more accessible[0]. That said, it is possible to choose a difficulty factor high enough to deny access to folks with older, slower devices. A survey to benchmark mCaptcha performance on devices in the wild is a work in progress[1]. I hope it will provide insights that help webmasters integrating mCaptcha select difficulty factors that work for their visitors.
The problem is that mCaptcha allows you to deny access to older, slower devices, but does nothing to deny access to bots, which can likely run optimized implementations of the PoW that are hundreds of times faster than the web-based version.
Much, much simpler solution --- an actual time-based rate limiter:
1) Include a hidden, unique token when the login screen is served to the client.
2) On the client, enforce a minimum 3 sec delay from the time of screen load before the login is submitted with the hidden token included.
3) On the server, if the hidden token isn't returned, is unknown/not found, or the delay from the time of issue is less than 3 sec, then reject the login and optionally ban the IP address after too many failed attempts.
The 3 sec delay is more than enough to discourage/prevent brute-force attacks but not enough to annoy legitimate users. (A rough sketch of the server-side check follows.)
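A rough sketch of the server-side check; the in-memory store, 8-byte token, and 3-second threshold follow the description above, everything else is illustrative:

```typescript
import { randomBytes } from "node:crypto";

// Illustrative server-side half of the scheme above: issue a token with the
// login form, reject submissions that come back too fast or with no token.
const issuedAt = new Map<string, number>(); // token -> issue time (ms)

function issueToken(): string {
  const token = randomBytes(8).toString("hex");
  issuedAt.set(token, Date.now());
  return token; // embed as a hidden field in the login form
}

function acceptLogin(token: string | undefined): boolean {
  if (!token || !issuedAt.has(token)) return false; // missing or unknown token
  const age = Date.now() - issuedAt.get(token)!;
  issuedAt.delete(token);                           // single use
  return age >= 3_000;                              // reject if returned in under 3 sec
}
```

In a real deployment you'd also prune tokens older than a few minutes and count failed attempts per IP, as described above.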
But this is totally useless - it doesn't consume any of the spammer's resources! They can just launch a large number of requests (possibly from their army of bot machines to defeat IP limiting) and have them each wait 3s while doing other nefarious things in parallel.
A PoW has at least the redeeming feature of consuming some resources for those 3s (or so...), making it self-limiting; there are only so many such computations a computer can do in parallel.
But it doesn’t though! A typical laptop can easily hold tens of thousands of I/O connections open at once in your favorite async I/O environment - that number can be in the millions with careful optimizations applied. Each connection just needs a sleep(3) applied between the initial form request and the submission.
A 3 second form delay just means the difference between a spammer launching 1000000 requests and posting them immediately to your database, vs them launching 1000000 requests and posting them to your database 3 seconds later.
Who are we kidding here --- most likely, your server can't handle 1000000 simultaneous requests.
My servers don't have enough bandwidth for that. Most of the connections are going to get dropped one way or the other. In my case, they will be intentionally dropped as being a likely denial of service attack.
It doesn't matter that you're waiting 3s, because you can just do it in parallel.
A little math for you:
An 8 char password using only letters and numbers has roughly 1 x 10^14 possible values. Just for the sake of argument, let's assume that your server and your service provider can actually handle 1000000 simultaneous, parallel requests from 1000000 different IPs.
It can't unless you're Google or Facebook running your own data centers but for the sake of argument let's just ignore that reality and push on.
To check every possible 8 character password by making 1 million parallel attempts at guessing the password every 3 seconds would take roughly 10 years: 10^14 passwords at 10^6 guesses per 3-second batch is 10^8 batches, or about 3 x 10^8 seconds. Luck being what it is, you'd probably only need to check half of them ... but that would still take 5 years.
Back in reality land, you'll be out of business before 5 years because you can't serve your paying customers. 1000000 parallel requests hitting a run of the mill server is effectively a "denial of service" attack.
In reality land using servers that I run, your 1000000 different IPs would all be banned after about 30 seconds.
Wouldn’t this exhaust the server resources with storing all the tokens? You would need to keep them for significantly longer than 3 secs. Banning the IP may help, but then you could just do a simple fail2ban instead?
> Wouldn’t this exhaust the server resources with storing all the tokens?
Not really --- only 8 bytes per token and they can be discarded on successful login. Tokens older than X minutes or with more than X attempts can be discarded/rejected too.
How many users are legitimately attempting to log in to your server at the same time?
If you're worried about this, encode the current time into the token using a hashing/encryption/checksum method of your choice. That way, everything needed to validate the attempt is submitted along with the credentials.
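Concretely, with HMAC as the "checksum method of your choice" (the secret and threshold below are placeholders), the stateless variant might look like:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const SECRET = "server-side-secret"; // placeholder; keep the real one out of source control

// Stateless variant: the issue time travels inside the token, authenticated with
// an HMAC, so the server stores nothing between form load and submission.
function issueToken(): string {
  const ts = Date.now().toString();
  const mac = createHmac("sha256", SECRET).update(ts).digest("hex");
  return `${ts}.${mac}`;
}

function validateToken(token: string, minDelayMs = 3_000): boolean {
  const [ts, mac] = token.split(".");
  if (!ts || !mac) return false;
  const expected = createHmac("sha256", SECRET).update(ts).digest("hex");
  if (mac.length !== expected.length) return false;
  if (!timingSafeEqual(Buffer.from(mac), Buffer.from(expected))) return false;
  return Date.now() - Number(ts) >= minDelayMs; // enforce the minimum delay
}
```

The trade-off is that a purely stateless token can be replayed, so you'd still want an expiry window and per-IP limits on top.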
Wouldn't a spammer almost always be more able and more willing to dedicate resources than an actual user?
Let's say it takes a legitimate user 60 seconds of PoW on their phone. That's a dealbreaker. Now take a spammer for whom it takes 15 seconds on their server: they will still spam.
I guess it could rate limit against high throughput of spam (more like DoS prevention than antispam really), but an IP rate limit would probably work as well if not better. An attacker with a large pool of IPs will probably have a large pool of CPUs anyway.
Has to be easy enough for an old laptop to prove in under 5 seconds but hard enough that (if this becomes popular) someone with a mining rig can’t be cracking millions of these a day.
Looks like a decent implementation of HashCash - great! They should implement TOTP-based nonce generation to avoid the need for a database entirely; that would make the system much more lightweight to deploy.
Some people miss the point of this. If you and another person are running away from a cheetah (fastest land animal) then you don't have to outrun the cheetah to survive, just the other person. The same is true for sites getting away from spammers. They will go for sites that are easy to attack before yours, so as long as the effort outweighs the outcome you're safe. It doesn't matter if the solution isn't perfect, it's better than nothing, and if it's not widely adopted it's not worth the effort to bother cracking.
Only recently, yes. WASM performance is tricky. A memory-heavy algorithm will DoS visitors.
That said, there are protections within mCaptcha against ASICs (PoW results have an expiry, and difficulty scales dynamically), but they are yet to be validated. Should they prove insufficient, I'll try a different approach with memory-heavy algorithms.
Check out the latest revision of the gist. I have added some explanations. Do you think this implementation will work more efficiently or less efficiently? What kind of statistics do you collect in the database? Is there anything interesting in these statistics? Is collecting statistics worth the performance slowdown that occurs? How effective is banning a client by IP/IPv6?
I just want to say - people critique every service out there that slows spam and bots. Those critiques are valid from the "it won't stop everything" view, but such services clearly stop a proportion, and the wider the variety of products out there, the less likely a spammer will have a canned answer for a particular site.
Agreed, it's been an interesting discussion so far with lots of interesting ideas. mCaptcha is very niche software that will only work for some use cases, but that's okay as long as whoever's deploying it is aware of its drawbacks. :)
> Web mining is infeasible due to the large memory requirement and the lack of directed rounding support for floating point operations in both Javascript and WebAssembly.
Nice! I had written a little algorithm that one could use to implement something like this (maybe interesting if you want to understand how it could work): https://github.com/fabiospampinato/crypto-puzzle
I think there's something to this: it costs you next to nothing to generate these puzzles, and you get a guaranteed, tunable slowdown factor on attackers (or a cost increase for them, I guess).
This fights spammers that use their own servers, but it will not protect against hijacked computers, which can use their hash power to solve the captcha.
I've used SHA-256 HashCash before, for PoW authentication on an API built to support a mobile app. The nice bit about using HashCash is that you can increase the difficulty for successive failed attempts.
I implemented this in response to our API being hammered by exploratory bots/scripts sending malformed requests. Adding it dropped failed requests by 99%.
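Escalation like that can be as simple as raising the required work with each recent failure from a client, capped so genuine retries stay tolerable. The numbers below are placeholders, not what that API used:

```typescript
// Per-client hashcash difficulty that grows with recent failed attempts.
// Base, step, and cap are placeholder values.
function difficultyFor(recentFailures: number): number {
  const base = 15; // leading-zero bits required of a well-behaved client
  const cap = 26;  // keep the worst case solvable on a real user's device
  return Math.min(base + 2 * recentFailures, cap);
}
```

Each additional bit doubles the expected work, so the cost ramps up quickly for a client that keeps sending garbage.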
Apologies, the project isn't ready to be showcased yet. I literally woke up to a message from a friend that said it was on HN. I wish I could explain it on here, but I'm afraid it isn't that easy. Here's the high level overview:
1. mCaptcha sends a PoW configuration (the first XHR request in the demo widget[0]), which includes a challenge text ("string"), a salt and a difficulty factor.
2. The client generates the proof of work (PoW) by hashing "string" + salt + nonce, incrementing the nonce until the difficulty factor is satisfied (see the sketch below).
3. The client sends the PoW to mCaptcha, which includes the nonce, the original salt and "string" (the second XHR request in the demo widget).
4. mCaptcha computes the hash of "string" + salt + nonce. If the difficulty factor is met (i.e. the resultant hash > difficulty factor), mCaptcha responds with an access token.
5. The client sends the access token to the web service.
6. The web service validates the access token with mCaptcha and only grants access to the protected resource if the token checks out.
I will work on a more detailed specification and report back when it is ready (3 weeks, I think).
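To make step 2 concrete, the client-side search amounts to a loop like the sketch below; treat the threshold mapping as a simplified stand-in rather than the exact comparison mCaptcha performs:

```typescript
// Simplified sketch of the search in step 2. Treating the first 8 bytes of the
// digest as an unsigned integer and deriving a threshold from the difficulty
// factor is a stand-in for the exact comparison mCaptcha performs.
async function generatePoW(str: string, salt: string, difficultyFactor: number) {
  const enc = new TextEncoder();
  const MAX = (1n << 64n) - 1n;
  const threshold = MAX - MAX / BigInt(difficultyFactor); // higher factor = rarer hits
  for (let nonce = 0; ; nonce++) {
    const digest = new Uint8Array(
      await crypto.subtle.digest("SHA-256", enc.encode(str + salt + nonce))
    );
    const prefix = new DataView(digest.buffer).getBigUint64(0);
    if (prefix >= threshold) {
      return { string: str, salt, nonce }; // sent in the second XHR (step 3)
    }
  }
}
```

Verification on the server (step 4) is a single hash over the returned values, which is what keeps the cost asymmetric in the site's favour.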
This is a pro-bot rate-limiter. In reality, most humans use low-spec feature phones with slow CPUs. This punishes those humans and favours bots running on GPUs and hijacked cloud infra.
Would only make sense to do if the browser exposes an optimized argon2 API to web apps. And it would have to be argon2 or some other memory-hard hash for it to not be a joke.
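To show what memory hardness buys, here's a sketch using Node's built-in scrypt as a stand-in for argon2 (all parameters are placeholder choices); each attempt has to touch tens of MiB, which is what caps how many attempts can run in parallel:

```typescript
import { scryptSync } from "node:crypto";

// Memory-hard PoW sketch with scrypt standing in for argon2; the salt, cost
// parameters, and difficulty rule are placeholders. cost=2**15 with blockSize=8
// makes each attempt use roughly 32 MiB.
function solveMemoryHard(challenge: string): number {
  for (let nonce = 0; ; nonce++) {
    const digest = scryptSync(`${challenge}:${nonce}`, "fixed-salt", 32, {
      cost: 2 ** 15,
      blockSize: 8,
      parallelization: 1,
      maxmem: 64 * 1024 * 1024,
    });
    if (digest[0] === 0) return nonce; // ~8 bits of work (~256 attempts on average)
  }
}
```

A GPU still wins on raw hashing speed, but the per-attempt memory footprint narrows the gap relative to plain SHA-256.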
Just checked a commercial captcha-solving service; the reCAPTCHA rate is currently at $1-$2 per thousand solves. Looks like this would be even cheaper to defeat commercially.
I think the Lightning Network will be useful for this role someday. The only proof of work that ever necessarily needs to be run is on the Bitcoin network, and you can transfer Bitcoin, satoshis, or fractions of a satoshi as the representative of work, instead of doing the work directly.
> they will have to generate proof-of-work(a bunch of math that will takes time to compute)
This was tried before and failed! The reason it failed was Maths! Most favour clicking simple images or rearranging visual elements instead of solving Maths problems.
Commercial spammers use high-end resources, whereas users don't have that luxury. So won't it be easy for the spammers to scale up their resources to overcome this?