- First analysis: CAPTCHAs and users
- Second analysis: CAPTCHA and security
- Recommended solutions
- HoneyPot and Time measuring, two simple techniques to put in place to identify bots
- Anti-spam and blacklist solutions to remove bot requests
- A logical or mathematical test, also called textual CAPTCHA
- An email, SMS or phone verification for reinforced security
A CAPTCHA (“Completely Automated Public Turing-test to tell Computers and Humans Apart”) is an automated test aimed at telling apart a human user from a software program. It is often used on forms to prevent spam.
There are several types of CAPTCHA. Most of them are visual tests that ask the user to type a series of distorted letters displayed on the screen.
There are security, privacy, usability and accessibility issues associated with CAPTCHAs. You must not use them unless you both:
- limit their use to cases where you detect suspicious activity (for example, you detect bot-like behaviour and need to test whether the user is human)
- have evidence to show that alternative solutions will not work for your service
Why are CAPTCHAs problematic?
Third-party CAPTCHA services could introduce additional risks, including:
1. security issues – if your provider’s security is compromised, your service and its users may also be affected
2. privacy concerns – for example, third-party services might set cookies, collect analytics and track users across multiple sites
3. performance issues – if you rely on a supplier, it means you’ll be affected by any performance problems or outages they experience
Your service could still be at risk, even with a CAPTCHA in place. Advances in computer imaging and the use of CAPTCHA farms mean some bots will still be able to access your service.
First analysis: CAPTCHAs and users
CAPTCHAs are often problematic, even for savvy users, who frequently need several attempts before giving the right answer. For some users a CAPTCHA is a no-go, plain and simple. For example, a blind user cannot solve a visual CAPTCHA. Even when sites provide alternatives, such as audio CAPTCHAs in addition to visual ones, these seldom work in practice. CAPTCHAs are even the most frequently cited source of difficulty among visually impaired users, according to WebAIM's survey from the end of 2017.
The boom in computing power enables bots and other programs to be more and more efficient. It is quite easy nowadays to write a program that guesses the text shown in an image or spoken in an audio file.
For the CAPTCHA to stay effective against bots, the distortion applied to the picture and the audio file has to increase, which makes it even harder for users to get it right.
This is why Google is trying to find alternatives to differentiate between a human user and a robot. It now offers “No CAPTCHA”, a plain checkbox from the user’s point of view. It relies on additional information such as screen resolution, mouse pointer movements, the time it takes to fill in a form, etc.
Even if this solution is more efficient, it is still not satisfactory from an accessibility point of view because, in case of doubt, a standard CAPTCHA is displayed. This often happens to users who navigate with a keyboard rather than a mouse, and to screen reader users (for example, visually impaired users). You must therefore always provide an alternative means of contact (email, telephone, etc.) for cases where the CAPTCHA cannot be completed.
Between users who cannot enter the CAPTCHA text and those who don’t understand what is expected of them, adding a CAPTCHA has a real cost in terms of a site’s audience. Considering that CAPTCHAs are problematic for many users, the first recommendation is to not use a CAPTCHA.
Second analysis: CAPTCHA and security
In a 2014 article by Google, we read that artificial intelligence gets a 99.8% score when solving “even the most difficult variant of distorted text”, which is better than the score of a real user! Services can guess which font was used in an image, or whether the image contains an object (a cat, a car, a hat, etc.). The more recent article Breaking CAPTCHA Using Machine Learning in 0.05 Seconds reaches the same conclusion.
Considering this (users bouncing from the site in frustration and uncertain security), we come back to our first recommendation: do not use a CAPTCHA.
Recommended solutions
Our idea is, first, to determine the risks by asking ourselves the following questions:
- What are the risks in case of an attack?
- What is the real need: simply turning bots away, or providing a secured solution?
Depending on the answers, we will be able to choose the solution best suited to the problem.
HoneyPot and Time measuring, two simple techniques to put in place to identify bots
These two techniques are transparent for the user, and the risks they pose are very limited.
The first technique, the honeypot, consists of adding a hidden field to the form. A human user never sees this field and so never fills it in. If the server receives a value in that field, the submission almost certainly comes from malicious software.
The second technique consists of measuring the time the user takes to fill in the form. If the form is submitted very quickly, there is a good chance that it was filled in by malicious software.
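The two checks above can be sketched server-side as follows. This is only a minimal illustration, not a reference implementation: the hidden field name `website`, the 3-second threshold and the function name `looks_like_bot` are assumptions made for the example, and the right threshold depends on the length of your form.

```python
import time

# Hypothetical name of the hidden honeypot field (assumption for this sketch).
HONEYPOT_FIELD = "website"
# Assumed minimum time a human needs to fill in the form, in seconds.
MIN_FILL_SECONDS = 3.0


def looks_like_bot(form, rendered_at, now=None):
    """Return True if the submission trips the honeypot or the time check.

    form        -- dict of submitted field names to string values
    rendered_at -- timestamp (seconds) at which the form was served
    now         -- timestamp of the submission; defaults to the current time
    """
    now = time.time() if now is None else now

    # Honeypot: a human never sees the hidden field, so any value is suspicious.
    if (form.get(HONEYPOT_FIELD) or "").strip():
        return True

    # Time measuring: a form submitted faster than the threshold is suspicious.
    return (now - rendered_at) < MIN_FILL_SECONDS
```

For the time check to be meaningful, `rendered_at` would need to come from a value the client cannot forge, for example one stored in the server-side session. The honeypot field should also be hidden from assistive technologies, so that screen reader users do not fill it in by mistake.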
Anti-spam and blacklist solutions to remove bot requests
It is also possible, server-side, to filter submissions with anti-spam software and automatically discard those sent by malicious software, by analysing the content of the submission and the originating IP address.
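As a rough illustration of this kind of filtering, the sketch below combines an IP blacklist with a very naive content analysis. The blocked network, the keyword list, the link limit and the function name `is_spam` are all assumptions chosen for the example. A real service would more likely rely on dedicated anti-spam software and maintained blacklists.

```python
import ipaddress

# Example blacklist (assumption): 203.0.113.0/24 is a documentation-only range.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
# Naive content rules (assumptions for this sketch).
SPAM_KEYWORDS = ("casino", "viagra")
MAX_LINKS = 2


def is_spam(message, ip):
    """Return True if the submission should be discarded."""
    # Originating IP: reject addresses that belong to a blacklisted network.
    address = ipaddress.ip_address(ip)
    if any(address in network for network in BLOCKED_NETWORKS):
        return True

    # Content: reject messages containing known spam keywords.
    text = message.lower()
    if any(keyword in text for keyword in SPAM_KEYWORDS):
        return True

    # Content: reject messages stuffed with links.
    link_count = text.count("http://") + text.count("https://")
    return link_count > MAX_LINKS
```

Rejected submissions could be logged rather than silently dropped, so that false positives can be reviewed.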
A logical or mathematical test, also called textual CAPTCHA
This is done through a simple sentence asking the user to copy a word, to solve a simple mathematical operation, etc.
- “Copy the word: ‘House’”
- “How much is 2 plus 2?”
- “Which animal meows?”
- “What colour is the sky?”
Provided that the questions stay very basic, this kind of CAPTCHA will not cause problems for users.
A good practice in this case is to offer a way to refresh the test, that is, to ask for a different question, which will be very helpful for people with cognitive impairments.
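A textual CAPTCHA with a refresh option could look like the sketch below. The question bank reuses examples from the list above. The accepted answers and the function names `pick_question` and `check_answer` are assumptions made for illustration.

```python
import random

# Question bank: (question text, set of accepted answers in lower case).
# The accepted answers are assumptions made for this sketch.
QUESTIONS = [
    ("Copy the word: 'House'", {"house"}),
    ("How much is 2 plus 2?", {"4", "four"}),
    ("Which animal meows?", {"cat", "a cat", "the cat"}),
]


def pick_question(exclude=None):
    """Pick a random question, skipping index `exclude` (used for refresh)."""
    candidates = [i for i in range(len(QUESTIONS)) if i != exclude]
    index = random.choice(candidates)
    return index, QUESTIONS[index][0]


def check_answer(index, answer):
    """Compare the answer leniently: ignore case and surrounding spaces."""
    return answer.strip().lower() in QUESTIONS[index][1]
```

When the user asks for a refresh, calling `pick_question(exclude=current_index)` guarantees that a different question is shown. The question index would be kept server-side (for example in the session) rather than trusted from the client.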
An email, SMS or phone verification for reinforced security
This solution consists of sending an email or an SMS to the user, or calling them directly, so that they can confirm the transaction by clicking on a link or by entering the code they received.
This solution is both more constraining for the user, who must communicate personal information, and more tedious to put in place, but it is very effective when the security level needs to be high.
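The code-based variant can be sketched as follows. Sending the email or SMS is left out, since it depends on the provider you use. The 6-digit format, the 10-minute lifetime and the function names `issue_code` and `verify_code` are assumptions for this example.

```python
import hmac
import secrets
import time

# Assumed lifetime of a verification code, in seconds (10 minutes).
CODE_TTL_SECONDS = 600


def issue_code(now=None):
    """Generate a random 6-digit code and its expiry timestamp."""
    now = time.time() if now is None else now
    # secrets (not random) is used because the code must be unpredictable.
    code = f"{secrets.randbelow(10**6):06d}"
    return code, now + CODE_TTL_SECONDS


def verify_code(submitted, expected, expires_at, now=None):
    """Return True if the submitted code matches and has not expired."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(submitted.strip().encode(), expected.encode())
```

In practice the expected code and its expiry would be stored server-side against the pending transaction, and the number of attempts would be limited so that the code cannot be guessed by brute force.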
There is no perfect solution, either for the user or security-wise. You should opt for the technique best suited to the service provided. It is also important to keep logs so that, in case of an attack, you can analyse what happened and prepare for further attacks.
- Captcha Accessibility and Thoughts by W3C
- Inaccessibility Of CAPTCHA by W3C
- The Captcha Conundrum & Accessible Alternatives by DigitalA11Y