Automatic solving of Google reCAPTCHA v2
Authors: Ioseba Palop, Óscar Bralo, Álvaro Núñez
Redaction: Carmen Torrano
1 Abstract
CAPTCHAs are designed to distinguish between machines and human beings. Since
automatically solving CAPTCHAs implies that a bot can impersonate a human being, it is very
important to guarantee the effectiveness of CAPTCHAs. This paper presents a mechanism for
automatically solving Google reCAPTCHA v2. In particular, it automatically solves
the audio challenge available for visually impaired individuals.
Although this reCAPTCHA is considered the hardest to break, the presented solution achieves a
92% success rate. This shows that Google reCAPTCHA v2 is not secure. Thus, the problem of
distinguishing humans from bots is still not properly solved.
2 Context
Ever since Alan Turing first proposed his famous Turing test in 1950, the problem of
distinguishing between people and robots has been a challenge in the field of artificial
intelligence. One of the methods proposed for automating such tests is the CAPTCHA
(Completely Automated Public Turing test to tell Computers and Humans Apart). The term
CAPTCHA was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford
from Carnegie Mellon University [1].
CAPTCHAs [2] are automated tests designed to tell computers and humans apart by presenting
users with a problem that humans can solve but current computer programs cannot yet [1].
They are frequently used to prevent automated abuse of online services and to secure different
applications, for example by preventing bots from voting continuously in online polls, automatically
registering millions of spam email accounts, or automatically purchasing tickets to buy out an
event.
CAPTCHAs usually consist of a visual challenge: an image that the user should recognize,
such as distorted characters to decipher, or a question about the image shown (for example,
identifying a house in a given set). However, since visual challenges limit
access for millions of visually impaired individuals, audio challenges were created. In this case, a
set of words, sentences or digits has to be recognized from the audio.
Audio challenges are less frequent than their visual counterparts. It is estimated that nearly 1%
of all CAPTCHAs are delivered as audio [5]. Additionally, Bursztein et al. [5] state that audio
challenges are harder to solve than image ones.
Von Ahn et al. [3] provide an estimate of the effort that humans spend solving CAPTCHAs.
Their results indicate that humans around the world type more than 100 million CAPTCHAs
every day. The authors proposed making this large amount of effort productive by
employing it for useful tasks, such as digitizing books. This is the core philosophy behind the
reCAPTCHA project, which is implemented by more than 40,000 Web sites.
Google acquired reCAPTCHA in September 2009 and later announced that then-current Artificial
Intelligence technology could solve even the most difficult variant of distorted text with 99.8%
accuracy. Consequently, in 2014 Google launched a new version of reCAPTCHA [4]. Its main
novelty is that it distinguishes machines from humans with a single click. That is why this new
version of CAPTCHA is also known as “No CAPTCHA reCAPTCHA”. This distinction relies on
several security considerations introduced in the design. Some of them are detailed further
in section “Security measures of the reCAPTCHA v2 audio challenge”.
Its creators claim that it is designed to provide anti-bot protection; in fact, the slogan is “tough
on bots, easy on humans”. Google also states that reCAPTCHA is the most widely used CAPTCHA in the
world, being used by Snapchat and WordPress among others.
3 Solving Google reCAPTCHA v2 using audio challenge
In order to solve reCAPTCHA, the following steps are required:
1. The user clicks the “I am not a robot” checkbox. In some cases, reCAPTCHA is solved
just by clicking this checkbox. From the outside this behavior appears random; it is
decided by Google's algorithms.
Fig.1. Example of a form with the Google reCAPTCHA v2.
2. If the click has not been classified as human behavior, an image appears (visual
challenge). However, to keep the challenge accessible (for blind people, for example), the
user is presented with a headphones button to get an audio challenge.
Fig. 2. Example of the visual challenge. The headphone
button is located in the bottom left corner.
3. After clicking on the headphones, the user can click the “Play” button and the browser
will play the audio challenge. Alternatively, it is also possible to download the audio file
as MP3 by clicking the download button.
Fig. 3. Example of the screen corresponding to the audio
challenge.
4. The audio contains exactly five digits, spoken by different people, always in
English, with different intonations, accents and pauses.
5. The user is expected to type the digits heard into the box. If the digits are entered
correctly, reCAPTCHA considers that the challenge has been solved by a human. For each
audio challenge there is only one chance to solve it.
Fig. 4. Example of validated reCAPTCHA.
4 Security measures of the reCAPTCHA v2 audio challenge
Bursztein et al. [5] show in their study a comparison of the features corresponding to different
CAPTCHAs. According to the results, in 2010 the Google audio challenge had the following
characteristics: male voice, length of 5 to 15 digits, single-digit charset [0-9], an average
duration of 37.1 seconds, a sample rate of 8000 Hz, no beep and no repeat.
At the time of writing, there is no official information about the security mechanisms implemented in
Google reCAPTCHA v2. However, six main measures have been deduced:
- It detects when a click is simulated, hence distinguishing it from a real mouse click of a
user.
- The audio clips are recorded by different speakers, who vary in pitch, intensity and accent.
- The digits have different pauses between them.
- The typing timing is monitored: if the digits are typed too quickly, the input is flagged
as machine behavior.
- Google monitors the time taken to click the Verify button. If it is clicked too quickly,
for example before the audio track has finished playing, this is considered bot
behavior.
- If a bot is suspected of trying to solve Google reCAPTCHA automatically, the IP
address is banned for a certain period of time.
5 Related Work
In March 2016, Suphannee Sivakorn et al. [17] presented at Black Hat Asia 2016 their paper “I’m
not a human: Breaking the Google reCAPTCHA”, which describes the automatic resolution of Google
reCAPTCHA using the image challenge. They achieved a 70% success rate by feeding their system
beforehand, storing and tagging the images for future resolutions.
To the authors' knowledge, no other automatic solutions have been reported to date.
6 Solution details
The solution has been designed as a client-backend service architecture. The client consists of a
Chrome extension developed in JavaScript and the backend service has been developed in the
.NET Framework.
The extension is designed so that it is enabled automatically when it detects an instance of
reCAPTCHA in the web page the user is currently visiting.
The proposed technique for automatically solving Google reCAPTCHA takes advantage of the
accessibility option, bypassing the audio reCAPTCHA.
The steps to get to the audio challenge and solve it were explained in Sec. “Solving
Google reCAPTCHA v2 using audio challenge”.
The goal is to reproduce the steps that a human being would take without being detected as
a machine.
6.1 Steps
1. To trigger a click on the “I am not a robot” checkbox, the extension detects the
coordinates where the reCAPTCHA iframe is located. To obtain those coordinates,
reCAPTCHA must appear in the visible part of the DOM. This is expected: since
human behavior is being simulated, and a human needs to actually see the
checkbox. Once reCAPTCHA is inside the visible DOM, the Chrome extension is able to get the
coordinates correctly regardless of the window size and position. Then a call to the
backend service is made in order to perform the click on the checkbox coordinates.
2. The backend service triggers a click event in the specified position.
3. When the iframe with the image challenge appears, the extension gets the coordinates
where the headphones button is located, and makes another call to the backend service
with the headphones position.
4. The backend triggers a new click event in the headphones button.
5. As soon as the last iframe is loaded, the Chrome extension is able to obtain the URL
corresponding to the audio file, together with the other coordinates needed (textbox,
Verify button) to perform the last step. Then, this information is sent to the
backend service.
6. The backend service then processes the audio with the Google Speech API in order to
extract the digits spoken in it. The audio file can have one of two
contents. If the behavior is judged as machine-like, the audio will play something similar
to this: “We are sorry, but we have detected that your computer is sending automatic
requests and to protect our users …”. In this case, the process stops. Otherwise, the
audio contains five digits and the process continues with the next steps. The audio
processing details are explained in Sec. “Voice recognition”.
7. The backend triggers a click event on the textbox and writes the digits. To bypass
the protection mechanism related to typing speed and avoid being detected
for typing too quickly, our solution waits for a random time between 0.5 and 1
second after typing each digit. In our experiments this strategy was enough to deceive this
protection mechanism and make the reCAPTCHA algorithms treat the behavior as human-like.
8. Finally, the backend service triggers a click event on the Verify button, the request for
solving reCAPTCHA is sent, and Google replies whether it has been correctly solved.
6.2 Voice recognition
The Google speech recognition API allows the definition of the set of words expected in the
audio file. This contributes to the effectiveness of the recognition when phonetically similar
words appear, maybe because the pronunciation of the speaker is not clear enough. Since
Google reCAPTCHA only uses digits, it is enough to specify a list of numbers from zero to nine.
To start working with Google Speech API, first the audio file has to be converted from MP3 to
FLAC, because this is one of the formats Google API recognizes.
The backend service sends three parallel requests to Google Speech API (one using an unaltered
version of the audio file, one reducing the silences between digits and another one reducing the
speed of the audio) in order to improve the success rate. Then, it stores the results to decide
which one should be used. The criterion for choosing the winner among these three recognition
results is the number of digits recognized by each of them: the higher the number of
digits recognized, the better the method is considered. In case of a tie (same number of digits
recognized), the first result after ordering the alternatives is taken.
We found that entering only three correct digits (not even consecutive, but three digits in any
position) is enough to solve reCAPTCHA. We therefore sort the results by the count of digits
recognized: first five, then four and then three. Any count under three is discarded. If any
result has five digits, that one is used; if not, a four-digit result is used, and so on.
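The ranking rule described above can be sketched as a small, pure function. This is only an illustration of the selection criterion, not the authors' .NET code: the name `pick_transcript`, the two constants and the choice to discard results with more than five digits are assumptions made for the example.

```python
from typing import Optional, Sequence

MIN_DIGITS = 3  # results with fewer digits are discarded (from the text)
MAX_DIGITS = 5  # the challenge contains five digits; longer results are dropped (assumption)


def pick_transcript(candidates: Sequence[str]) -> Optional[str]:
    """Return the candidate with the most recognized digits, or None."""
    # Keep only the ASCII digits of each recognition result.
    cleaned = ["".join(ch for ch in c if ch in "0123456789") for c in candidates]
    usable = [c for c in cleaned if MIN_DIGITS <= len(c) <= MAX_DIGITS]
    if not usable:
        return None
    # max() returns the first maximal element, so a tie goes to the
    # earliest candidate in the input order.
    return max(usable, key=len)


if __name__ == "__main__":
    print(pick_transcript(["12", "4821", "90317"]))
```

Because the function only compares string lengths, it says nothing about whether the recognized digits are correct; that is decided by the verification step.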
6.3 Technical anti bot considerations
One of the security measures of Google reCAPTCHA is that the user needs to make real clicks.
Thus, in order to bypass this protection, and after trying a wide variety of
programmatic solutions, we decided to simulate real clicks by making calls to the
Windows API. This makes it possible to trigger a mouse event that is exactly the same as the
one produced by a human being handling a mouse.
In a small, apparently random percentage of the cases, instead of launching an audio challenge when
the user clicks on the headphones, Google reCAPTCHA launches a text challenge, where the user
has to choose between different proposed words. In this case, our system is not able to
solve the reCAPTCHA automatically, since it targets audio reCAPTCHAs only. This appears to
depend on the machine learning algorithms within reCAPTCHA. The solution to this
situation is to reload the page or ask for a new audio challenge. The resolution time of the
proposed solution is approximately 20 seconds, with a 92% success rate.
6.4 Experiments and Results
For the experiments a set of 1172 audio files was collected. From them, 328 were solved
automatically when clicking the “I am not a robot” checkbox.
We studied the effectiveness of the proposed solution for the remaining cases (844).
Table 1 shows the results obtained. Four cases are possible:
- Solved means that the proposed solution recognized three or more digits from the audio
and Google verified the answer.
- Not Solved covers two different cases. In the first, the speech recognition API
recognizes three or more digits but these are wrong, so Google does not verify the
answer. In the second, reCAPTCHA detects bot-like behavior.
- Incomplete means that fewer than three digits are recognized from the audio file.
- Fail means that zero digits are obtained from the voice system or that an error occurs.
Result       Recognized digit count   Partial (%)   Total (%)
Solved       3                        11.01         92.06
             4                        32.58
             5                        48.47
Not Solved                                          2.84
Incomplete                                          4.74
Fail                                                0.36
Table 1. Performance results.
Table 1 shows that the proposed solution automatically solves Google reCAPTCHA with
a 92.06% success rate, broken down by the number of digits recognized. In other words, in 92% of
the cases it was able to impersonate a human being. Considering that CAPTCHAs are used to protect
services against abuse, this has serious consequences.
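As a quick consistency check on the figures reported in this section, the arithmetic can be reproduced as follows. The percentages and file counts are taken from the text and Table 1; the variable names are only illustrative.

```python
from decimal import Decimal

# Partial percentages of solved challenges, by recognized digit count (Table 1).
solved_by_digits = {3: Decimal("11.01"), 4: Decimal("32.58"), 5: Decimal("48.47")}
# Totals of the remaining outcomes (Table 1).
other_outcomes = {
    "Not Solved": Decimal("2.84"),
    "Incomplete": Decimal("4.74"),
    "Fail": Decimal("0.36"),
}

solved_total = sum(solved_by_digits.values())
grand_total = solved_total + sum(other_outcomes.values())

# Audio files collected minus those solved by the checkbox click alone.
audio_cases = 1172 - 328

if __name__ == "__main__":
    print(solved_total)  # 92.06
    print(grand_total)   # 100.00
    print(audio_cases)   # 844
```

Decimal is used instead of float so that the sums match the two-decimal figures in the table exactly.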
For the cases where the reCAPTCHA was automatically solved, we studied the effectiveness
of each audio processing technique. In 46.98% of the cases the winning technique was the audio with
silence processing, in 38.47% it was the raw audio, and in 14.29% it was the audio
with speed processing.
We would like to mention that experiments cannot be repeated using the same audio twice, as
the verification process can only happen once.
7 Recommendations for strong audio challenges
Since one of the weakest points of Google reCAPTCHA audio is using only five digits, the
recommendation is using longer sequences. Furthermore, numbers do not need to be reduced
to one digit only (for example, numbers from 0 to 999 could be used).
Additionally, the whole alphabet (not only numbers) could be used in order to enlarge the search space.
Increasing the number of possibilities makes it harder for bots to break the CAPTCHA. Even
complete words could be introduced and mixed with letters and numbers.
Furthermore, the experiments reveal that solving only three out of five digits is enough in order
to solve the CAPTCHA. This decreases the search space to one thousand possibilities (three
digits). This again brings us to the known principle in security: a system is as secure as its weakest
link.
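The effect of these choices on the search space can be made concrete with a short calculation. This is a back-of-the-envelope sketch that follows the paper's counting (alphabet size raised to the number of symbols that must be correct); `search_space` is a hypothetical helper, and the 36-symbol alphabet (26 letters plus 10 digits) is an assumption for illustration.

```python
def search_space(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of `length` symbols from the alphabet."""
    return alphabet_size ** length


# Current audio challenge: five digits from 0-9.
five_digits = search_space(10, 5)
# Effective space when any three correct digits are accepted.
three_digits = search_space(10, 3)
# Recommendation: five numbers, each between 0 and 999.
five_numbers_0_999 = search_space(1000, 5)
# Recommendation: five symbols from letters plus digits (assumed 36 symbols).
five_alphanumeric = search_space(36, 5)

if __name__ == "__main__":
    print(five_digits, three_digits, five_numbers_0_999, five_alphanumeric)
```

Accepting three digits instead of five shrinks the space by a factor of 100, while moving to five alphanumeric symbols enlarges it roughly 600-fold compared with five digits.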
Another recommendation would be to introduce distortions in the audio. This would make the
audio more difficult for machines to understand. Background noise would be another
useful addition.
Conclusions
Although Google reCAPTCHA was designed to be easy on humans and hard on bots, this paper
shows that it is not secure. In fact, the proposed solution is able to break it in 92% of the
cases. This shows that the challenge of telling humans and bots apart is still an open
problem.
The proposed solution relies on taking advantage of the audio challenge available for visually
impaired individuals. One of the weaknesses of Google reCAPTCHA v2 is that the audio
challenge asks for five digits only. Furthermore, guessing any three out of the five digits
is enough to solve the challenge. This reduces the search space to only 10^3 possibilities,
which is far from being considered secure.
This paper shows that it is possible to automatically solve Google reCAPTCHA v2 92%
of the time. Considering that this is regarded as the strongest CAPTCHA, the situation is alarming. It
implies being able to impersonate people in scenarios such as e-voting, spam in mail accounts,
denial of service attacks and so on. To achieve a more secure digital world,
stronger CAPTCHAs have to be designed.
Attendee Takeaways
- Notions about CAPTCHA and reCAPTCHA.
- State of the art in CAPTCHAs.
- Description of the Google reCAPTCHA v2 and some security measures it applies.
- Technique for automatically solving Google reCAPTCHA v2. Voice recognition
processing techniques used.
- Recommendations for designing strong audio challenges.
What’s new?
The proposed solution is a fully automated reCAPTCHA solver that takes advantage of the Google
Speech API, and it is the only known solution that is entirely based on the audio challenge. It
achieves a 92% success rate, which is the highest among existing solutions, with no
previous learning and no data storage needed. Since reCAPTCHA is owned by Google, the present
proof of concept breaks a Google service by using another Google service.
Why Black Hat?
The consequences of bypassing CAPTCHAs can be very dramatic since they are designed to
distinguish between humans and bots in actions such as voting continuously in online polls,
automatically registering for millions of spam email accounts, automatically purchasing tickets
to buy out an event, etc. Furthermore, the number of users that interact with reCAPTCHAs is
extremely high.
We consider it vital to protect such scenarios and to offer security that prevents this kind
of abuse.
Given the popularity of Black Hat and the type of audience attending, we consider it the
perfect setting for presenting our research. Given the importance of the consequences of these
attacks and the volume of users affected, we think that it should be presented at a conference
such as Black Hat.
With this talk, we also expect to raise awareness about the importance of designing
strong CAPTCHAs, which remains an unsolved challenge nowadays, and we hope that this helps
to create a more secure and trustworthy society and world.
References
[1] http://www.CAPTCHA.net
[2] L. von Ahn, M. Blum, and J. Langford. “Telling Humans and Computers Apart Automatically,”
Communications of the ACM, vol. 47, no. 2, pp. 57-60, Feb. 2004.
[3] Von Ahn, L., Maurer, B., McMillen, C., Abraham, D., & Blum, M. (2008). reCAPTCHA: Human-
based character recognition via web security measures. Science, 321(5895), 1465-1468.
[4] https://www.google.com/reCAPTCHA/intro/index.html
[5] Bursztein, E., Bethard, S., Fabry, C., Mitchell, J. C., & Jurafsky, D. (2010, May). How Good Are
Humans at Solving CAPTCHAs? A Large Scale Evaluation. In IEEE Symposium on Security and
Privacy (pp. 399-413).
[6] Tam, J., Simsa, J., Hyde, S., & Ahn, L. V. (2008). Breaking audio CAPTCHAs. In Advances in
Neural Information Processing Systems (pp. 1625-1632).
[7] Wilkins, J. (2010). Strong CAPTCHA guidelines.
[8] Tam J, Huggins-Daines JD, von Ahn L, Blum M. (2008, July). Improving audio CAPTCHAs. In
Proceedings of the 2008 symposium on accessible privacy and security (SOAPS 2008), USA.
[9] Houck, C., Lee, J. (2010, August). Decoding reCAPTCHA. DEF CON 18 Hacking Conference.
[10] Adam, C-P, Jeffball. (2012 May). Codename Stiltwalker. Layer ONE hacker conference, USA.
[11] Cruz-Perez, C., Starostenko, O., Uceda-Ponga, F., Alarcon-Aquino, V., Reyes-Cabrera, L.
(2012, June) Breaking reCAPTCHAs with Unpredictable Collapse: Heuristic Character
Segmentation and Recognition, Volume 7329 of the series Lecture Notes in Computer Science
pp 155-165
[12] Chellapilla, K., and Simard, P. (2004). Using Machine Learning to Break Visual Human
Interaction Proofs (HIPs). In Advances in Neural Information Processing Systems 17, Neural
Information Processing Systems (NIPS). MIT Press.
[13] Chellapilla, K., Larson, K., Simard, P., and Czerwinski, M. (2005). Building Segmentation
Based Human-friendly Human Interaction Proofs. In 2nd Int’l Workshop on Human Interaction
Proofs, Springer-Verlag. LNCS 3517.
[14] Bursztein, E., Martin, M., and Mitchell, J. C. (2011). Text-based CAPTCHA strengths and
weaknesses. In Proceedings of the 18th ACM conference on Computer and communications
security. ACM.
[15] Ahmad, E., Ahmad S., Yan, J., and Tayara, M. (2011). The robustness of Google CAPTCHA's.
Computing Science, Newcastle University.
[16] Yan, J. and El Ahmed, A.S. (2008, October). A Low-cost Attack on a Microsoft CAPTCHA. In
15th ACM Conference on Computer and Communications Security (CCS’08). Virginia, USA. ACM
Press. pp. 543-554.
[17] Suphannee Sivakorn, Jason Polakis, and Angelos D. Keromytis. (2016). I’m not a human:
Breaking the Google reCAPTCHA.

Índice del Libro "Ciberestafas: La historia de nunca acabar" (2ª Edición) de ...
 
Índice del Libro "Storytelling para Emprendedores"
Índice del Libro "Storytelling para Emprendedores"Índice del Libro "Storytelling para Emprendedores"
Índice del Libro "Storytelling para Emprendedores"
 
Digital Latches for Hacker & Developer
Digital Latches for Hacker & DeveloperDigital Latches for Hacker & Developer
Digital Latches for Hacker & Developer
 
Índice del libro "Hardening de servidores GNU / Linux 5ª Edición (Gold Edition)"
Índice del libro "Hardening de servidores GNU / Linux 5ª Edición (Gold Edition)"Índice del libro "Hardening de servidores GNU / Linux 5ª Edición (Gold Edition)"
Índice del libro "Hardening de servidores GNU / Linux 5ª Edición (Gold Edition)"
 
WhatsApp INT: OSINT en WhatsApp
WhatsApp INT: OSINT en WhatsAppWhatsApp INT: OSINT en WhatsApp
WhatsApp INT: OSINT en WhatsApp
 
Índice del libro "De la Caverna al Metaverso" de 0xWord.com
Índice del libro "De la Caverna al Metaverso" de 0xWord.comÍndice del libro "De la Caverna al Metaverso" de 0xWord.com
Índice del libro "De la Caverna al Metaverso" de 0xWord.com
 
20º Máster Universitario de Ciberseguridad UNIR
20º Máster Universitario de Ciberseguridad UNIR20º Máster Universitario de Ciberseguridad UNIR
20º Máster Universitario de Ciberseguridad UNIR
 
BootCamp Online en DevOps (and SecDevOps) de GeeksHubs Academy
BootCamp Online en DevOps (and SecDevOps) de GeeksHubs AcademyBootCamp Online en DevOps (and SecDevOps) de GeeksHubs Academy
BootCamp Online en DevOps (and SecDevOps) de GeeksHubs Academy
 
Índice del libro "Ciberseguridad de tú a tú" de 0xWord
Índice del libro "Ciberseguridad de tú a tú"  de 0xWordÍndice del libro "Ciberseguridad de tú a tú"  de 0xWord
Índice del libro "Ciberseguridad de tú a tú" de 0xWord
 
Índice del libro "Open Source INTelligence (OSINT): Investigar personas e Ide...
Índice del libro "Open Source INTelligence (OSINT): Investigar personas e Ide...Índice del libro "Open Source INTelligence (OSINT): Investigar personas e Ide...
Índice del libro "Open Source INTelligence (OSINT): Investigar personas e Ide...
 
Índice del libro "Social Hunters" de 0xWord
Índice del libro "Social Hunters" de 0xWordÍndice del libro "Social Hunters" de 0xWord
Índice del libro "Social Hunters" de 0xWord
 
Índice del libro "Kubernetes para profesionales: Desde cero al despliegue de ...
Índice del libro "Kubernetes para profesionales: Desde cero al despliegue de ...Índice del libro "Kubernetes para profesionales: Desde cero al despliegue de ...
Índice del libro "Kubernetes para profesionales: Desde cero al despliegue de ...
 
Los retos sociales y éticos del Metaverso
Los retos sociales y éticos del MetaversoLos retos sociales y éticos del Metaverso
Los retos sociales y éticos del Metaverso
 
Índice del Libro "Ciberestafas: La historia de nunca acabar" de 0xWord
Índice del Libro "Ciberestafas: La historia de nunca acabar" de 0xWordÍndice del Libro "Ciberestafas: La historia de nunca acabar" de 0xWord
Índice del Libro "Ciberestafas: La historia de nunca acabar" de 0xWord
 
Índice del libro "Docker: SecDevOps" 2ª Edición de 0xWord
Índice del libro "Docker: SecDevOps" 2ª Edición de 0xWordÍndice del libro "Docker: SecDevOps" 2ª Edición de 0xWord
Índice del libro "Docker: SecDevOps" 2ª Edición de 0xWord
 
Índice del libro "Malware moderno: Técnicas avanzadas y su influencia en la i...
Índice del libro "Malware moderno: Técnicas avanzadas y su influencia en la i...Índice del libro "Malware moderno: Técnicas avanzadas y su influencia en la i...
Índice del libro "Malware moderno: Técnicas avanzadas y su influencia en la i...
 

Dernier

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Dernier (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Hacking Google reCaptcha with Google Voice Recognition... and Google Chrome in a Google ChromeBook

Automatic solving of Google reCAPTCHA v2
Authors: Ioseba Palop, Óscar Bralo, Álvaro Núñez
Redaction: Carmen Torrano

1 Abstract

CAPTCHAs are designed to distinguish between machines and human beings. Since automatically solving CAPTCHAs implies that a bot can impersonate a human being, it is very important to guarantee the effectiveness of CAPTCHAs. In this paper, a mechanism for automatically solving Google reCAPTCHA v2 is presented. In particular, it automatically solves the audio challenge available for visually impaired individuals.

Although this reCAPTCHA is considered the hardest to break, the presented solution achieves a 92% success rate. This shows that Google reCAPTCHA v2 is not secure. Thus, the problem of distinguishing humans from bots is still not properly solved.

2 Context

Ever since Alan Turing first proposed his famous Turing test in 1950, the problem of distinguishing between people and robots has been a challenge in the field of artificial intelligence. One of the methods presented for making such tests automatic is the CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart). The term CAPTCHA was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford from Carnegie Mellon University [1].

CAPTCHAs [2] are automated tests designed to tell computers and humans apart by presenting users with a problem that humans can solve but current computer programs cannot yet [1]. They are frequently used to prevent automated abuse of online services and to secure different applications, for example by preventing bots from voting continuously in online polls, automatically registering millions of spam email accounts, automatically purchasing tickets to buy out an event, etc.

CAPTCHAs usually consist of a visual challenge, showing an image that the user should recognize, such as deciphering distorted characters or answering questions related to the image shown (for example, identifying a house from a given set).

However, since visual challenges limit access for millions of visually impaired individuals, audio challenges were created. In this case, a set of words, sentences or digits should be recognized from the audio. Audio challenges are less frequent than their visual counterparts. It is estimated that nearly 1% of all CAPTCHAs are delivered as audio [5]. Additionally, Bursztein et al. [5] affirm that audio challenges are harder to solve than image ones.

Von Ahn et al. [3] provide an estimation of the effort that humans spend solving CAPTCHAs. Their results pointed out that humans around the world type more than 100 million CAPTCHAs every day. The authors proposed making this large amount of effort productive by employing it for useful tasks, such as digitizing books. This is the core philosophy behind the reCAPTCHA project, which is implemented by more than 40,000 Web sites.

When Google acquired reCAPTCHA in September 2009, they announced that current Artificial Intelligence technology can solve even the most difficult variant of distorted text with 99.8% accuracy. Consequently, in 2014 Google launched a new version of reCAPTCHA [4]. Its main novelty is the distinction between machines and humans with a click. That is why this new version of CAPTCHA is also known as “No CAPTCHA reCAPTCHA”. This distinction relies on
several security considerations are introduced in the design. Some of them are further detailed in Sec. 4, “Security measures of the reCAPTCHA v2 audio challenge”. Its creators claim that it is designed to have anti-bot protection; in fact, the slogan is “tough on bots, easy on humans”. Google also affirms that reCAPTCHA is the most widely used CAPTCHA in the world, being used by Snapchat and Wordpress, among others.

3 Solving Google reCAPTCHA v2 using audio challenge

In order to solve reCAPTCHA, the following steps are required:

1. Clicking the “I am not a robot” checkbox. In some cases, the reCAPTCHA will be solved just by clicking this checkbox. Whether this happens is decided by Google's algorithms and appears random to the user.

Fig. 1. Example of a form with the Google reCAPTCHA v2.

2. If the click has not been classified as human behavior, an image will appear (visual challenge). However, in order to make it accessible (for blind people, for example), the user is presented with a headphones button to get an audio challenge.

Fig. 2. Example of the visual challenge. The headphones button is located in the bottom left corner.

3. After clicking on the headphones, the user can click the “Play” button and the browser will play the audio challenge. Alternatively, it is also possible to download the audio file as MP3 by clicking the download button.

Fig. 3. Example of the screen corresponding to the audio challenge.

4. When playing the audio, only five digits are pronounced by different people, always in English, with different intonations, different accents and different pauses.

5. The user is supposed to type the digits heard into the box. If the digits are entered correctly, reCAPTCHA considers that the challenge has been solved by a human. For each audio challenge there is only one chance to solve it.

Fig. 4. Example of validated reCAPTCHA.

4 Security measures of the reCAPTCHA v2 audio challenge

Bursztein et al. [5] show in their study a comparison of the features corresponding to different CAPTCHAs. According to the results, in 2010 the Google audio challenge had the following characteristics: male voice, length of 5 to 15 digits, single-digit charset [0-9], average duration of 37.1 seconds, sample rate of 8000 Hz, no beep and no repeat.

As of this paper, there is no official information about the security mechanisms implemented in Google reCAPTCHA v2. However, the following main measures have been deduced:

- It detects when a click is simulated, hence distinguishing it from a real mouse click of a user.
- Audios are recorded with different speakers, varying in pitch, intensity and accent.
- The digits have different pauses between them.
- The timing when typing is monitored, so, if the digits are typed too quickly, it is flagged as machine behavior.
- Google controls the time spent to click the Verify button. In case it is clicked too quickly, for example before the complete duration of an audio track, it is considered as bot behavior.
- If it is considered that a bot is trying to automatically solve Google reCAPTCHA, the IP address is banned for a certain period of time.

5 Related Work

In March 2016, Suphannee Sivakorn et al. [17] presented at Black Hat Asia 2016 their paper “I’m not a human: Breaking the Google reCAPTCHA”, with the automatic resolution of Google reCAPTCHA using the image challenge. They achieved a 70% success rate, having previously fed their system by storing and tagging all the images for future resolutions. No more automatic solutions have been reported to date.

6 Solution details

The solution has been designed as a client-backend service architecture. The client consists of a Chrome extension developed in JavaScript and the backend service has been developed in the .NET Framework. The extension is designed so that it is enabled automatically when it detects an instance of reCAPTCHA in the web page the user is currently visiting.

The proposed technique for automatically solving Google reCAPTCHA takes advantage of the accessibility option, bypassing the audio reCAPTCHA. The steps to get to the audio challenge and solve it were explained in Sec. 3, “Solving Google reCAPTCHA v2 using audio challenge”. The goal is to reproduce the steps that a human being would take without being detected as machine behavior.

6.1 Steps

1. To trigger a click on the “I am not a robot” checkbox, the extension detects the coordinates where the reCAPTCHA iframe is located. To obtain those coordinates, it is necessary that reCAPTCHA appears in the visible part of the DOM. This is expected since human behavior is being simulated: a human needs to actually see the corresponding checkbox. Once inside the visible DOM, the Chrome extension is able to get the coordinates correctly regardless of the window size and position. Then a call to the backend service is made in order to perform the click on the checkbox coordinates.

2. The backend service triggers a click event in the specified position.

3. When the iframe with the image challenge appears, the extension gets the coordinates where the headphones button is located, and makes another call to the backend service with the headphones position.

4. The backend triggers a new click event on the headphones button.

5. As soon as the last iframe is loaded, the Chrome extension is able to obtain the URL corresponding to the audio file, together with the other needed coordinates (textbox, Verify button) in order to perform the last step. Then, this information is sent to the backend service.

6. The backend service then processes the audio in order to get the numbers that can be heard in it, using Google Speech API. The audio file can have one of two contents. If the behavior is judged as machine-like, the audio will play something similar to this: “We are sorry, but we have detected that your computer is sending automatic requests and to protect our users …”. In this case, the process stops. Otherwise, the audio contains five digits and the process continues with the next steps. The audio processing details are explained in Sec. 6.2, “Voice recognition”.

7. The backend triggers a click event on the textbox and writes the digits. In order to bypass the protection mechanism related to the typing speed and avoid being detected because of typing too quickly, our solution waits for a random time between 0.5 and 1 seconds after typing each digit. This strategy is enough to deceive this protection mechanism and make reCAPTCHA algorithms think that this behavior is human-like.

8. Finally, the backend service triggers a click event on the Verify button, the request for solving reCAPTCHA is sent and Google replies whether it has been correctly solved.

6.2 Voice recognition

The Google speech recognition API allows the definition of the set of words expected in the audio file. This contributes to the effectiveness of the recognition when phonetically similar words appear, for instance because the pronunciation of the speaker is not clear enough. Since Google reCAPTCHA only uses digits, it is enough to specify a list of numbers from zero to nine.

To start working with Google Speech API, the audio file first has to be converted from MP3 to FLAC, because this is one of the formats the Google API recognizes. The backend service sends three parallel requests to Google Speech API (one using an unaltered version of the audio file, one reducing the silences between digits and another one reducing the speed of the audio) in order to improve the success rate. Then, it stores the results to decide which one should be used.

The criterion to decide which of these three recognition results is the winner is based on the number of digits recognized by each of them. The higher the number of digits recognized, the better the method is considered. In case of a tie (same number of digits recognized), any of the results can be taken; in this case, the first one after ordering the alternatives is used. We realized that entering only three correct digits (not even sequential, but three digits in any position) allows the user to solve reCAPTCHA.
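
The selection criterion described above (prefer the recognition result with the most digits, and take the first one on a tie) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' .NET implementation: the function names `digits_of` and `pick_best` are hypothetical, the transcripts are assumed to arrive as plain strings, and the upper bound of five digits is an assumption based on the five-digit challenges described in this paper.

```python
from typing import Iterable, Optional


def digits_of(transcript: str) -> str:
    """Keep only the characters 0-9 from a recognizer transcript."""
    return "".join(ch for ch in transcript if ch in "0123456789")


def pick_best(transcripts: Iterable[str],
              min_digits: int = 3,
              max_digits: int = 5) -> Optional[str]:
    """Return the candidate with the most digits, or None if none qualifies.

    Candidates with fewer than `min_digits` digits are dropped, following the
    observation that three correct digits are enough. Dropping candidates
    with more than `max_digits` digits is an assumption of this sketch.
    """
    candidates = [digits_of(t) for t in transcripts]
    candidates = [c for c in candidates if min_digits <= len(c) <= max_digits]
    if not candidates:
        return None
    # max() returns the first maximal element, so ties keep the input order.
    return max(candidates, key=len)


if __name__ == "__main__":
    # Hypothetical transcripts for the raw, silence-reduced and slowed audio.
    print(pick_best(["4 1 7", "4 1 7 9 2", "41 79"]))
```

Passing the three transcripts in a fixed order (for example raw, silence-reduced, slowed) makes the tie-breaking deterministic.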

We sort the results based on the count of digits recognized: first five, then four and then three. Any count under three is discarded. If any result contains five digits, it is used; if not, a result with four digits is taken, and so on.

6.3 Technical anti-bot considerations

One of the security measures of Google reCAPTCHA is that the user needs to make real clicks. Thus, in order to bypass this protection, and after trying a wide variety of programmatic solutions, we decided to simulate the real clicks by making calls to the Windows API. This solution makes it possible to trigger a mouse event that is exactly the same as the one a human being handling a mouse would produce.

In a random (but small) percentage of the cases, instead of launching an audio challenge when the user clicks on the headphones, Google reCAPTCHA launches a text challenge, where the user should choose between different proposed words. In this case, our system would not be able to automatically solve the reCAPTCHA, since its aim is solving audio reCAPTCHAs. This happens randomly, based on the machine learning algorithms within reCAPTCHA. The solution to this situation is to reload the page or ask for a new audio challenge.

The resolution time of the proposed solution is approximately 20 seconds, with a 92% success rate.

6.4 Experiments and Results

For the experiments, a set of 1172 audio files was collected. Of them, 328 were solved automatically when clicking the “I am not a robot” checkbox. We studied the effectiveness of the proposed solution for the remaining cases (844).
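
The sample sizes given above and the percentages reported in Table 1 of this section can be cross-checked with simple arithmetic. The short Python sketch below only restates figures from the paper; the variable names are illustrative.

```python
# Sample sizes reported in Sec. 6.4.
total_challenges = 1172
solved_by_checkbox = 328
audio_cases = total_challenges - solved_by_checkbox

# Percentages reported in Table 1 (share of the audio cases).
solved_by_digit_count = {3: 11.01, 4: 32.58, 5: 48.47}
other_outcomes = {"Not Solved": 2.84, "Incomplete": 4.74, "Fail": 0.36}

solved_total = round(sum(solved_by_digit_count.values()), 2)
overall = round(solved_total + sum(other_outcomes.values()), 2)

if __name__ == "__main__":
    print(audio_cases)   # 844
    print(solved_total)  # 92.06
    print(overall)       # 100.0
```

The three "Solved" rows add up to the 92.06% success rate, and the four outcome categories together account for all of the audio cases.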

Table 1 shows the results obtained. Four cases are possible:

- Solved means that the proposed solution has resolved the CAPTCHA, recognizing three or more digits from the audio, and Google verified it.
- Not Solved refers to two different cases. One is where the speech recognition API is able to recognize three or more digits but these are wrong and Google did not verify the whole number. The second case is when reCAPTCHA detects bot-like behavior.
- Incomplete is where fewer than three digits are recognized from the audio file.
- Fail occurs when zero digits are obtained from the voice system or any error happens.

Result       Recognized digit count   Partial (%)   Total (%)
Solved       3                        11.01         92.06
             4                        32.58
             5                        48.47
Not Solved                                          2.84
Incomplete                                          4.74
Fail                                                0.36

Table 1. Performance results.

Table 1 shows that the proposed solution is able to automatically solve Google reCAPTCHA with a 92.06% success rate, detailed by the count of digits recognized. It means that 92% of the time it was able to impersonate a human being. Considering that CAPTCHAs are used to protect services against abuse, this has serious consequences.

For those cases where the reCAPTCHA was automatically solved, we studied the effectiveness of each audio processing technique. In 46.98% of the cases the winning variant was the audio with silence processing, in 38.47% it was the raw audio, and in 14.29% it was the audio with speed processing.

We would like to mention that experiments cannot be repeated using the same audio twice, as the verification process can only happen once.

7 Recommendations for strong audio challenges

Since one of the weakest points of the Google reCAPTCHA audio challenge is using only five digits, the recommendation is to use longer sequences. Furthermore, numbers do not need to be restricted to a single digit (for example, numbers from 0 to 999 could be used). Additionally, the whole alphabet (not only numbers) could be used in order to enlarge the search space.
Increasing the number of possibilities makes it harder for bots to break the CAPTCHA. Even complete words could be introduced and mixed with letters and numbers.

Furthermore, the experiments reveal that getting only three out of five digits right is enough to solve the CAPTCHA. This decreases the search space to one thousand possibilities (three digits). This again brings us to the known principle in security: a system is only as secure as its weakest link.

Another recommendation would be to introduce distortions in the audio. This would make the audio more difficult for machines to understand. Background noise could be another good addition.
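
To make the effect of these recommendations concrete, the size of the search space can be computed as the alphabet size raised to the sequence length. The Python sketch below is illustrative only: `search_space` is a hypothetical helper, and the alternative designs (five numbers from 0 to 999, five alphanumeric characters) are example parameters chosen here, not configurations evaluated in the paper.

```python
def search_space(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of `length` symbols from the alphabet."""
    return alphabet_size ** length


# Current design: five single digits, but three correct digits suffice.
five_digits = search_space(10, 5)        # 100000
three_digits = search_space(10, 3)       # 1000

# Example alternatives in the spirit of the recommendations (assumed values).
five_numbers_0_999 = search_space(1000, 5)   # 10**15
five_alphanumeric = search_space(36, 5)      # 60466176

if __name__ == "__main__":
    print(five_digits, three_digits)
    print(five_numbers_0_999, five_alphanumeric)
```

Accepting three digits instead of five shrinks the space by a factor of one hundred, while either alternative alphabet enlarges it by several orders of magnitude.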

Conclusions

Although Google reCAPTCHA was designed to be easy on humans and hard on bots, this paper shows that it is not secure. In fact, the proposed solution is able to break it in 92% of the cases. This shows that the challenge of telling humans and bots apart is still an open problem.

The proposed solution relies on taking advantage of the audio challenge available for visually impaired individuals. One of the weaknesses of Google reCAPTCHA v2 is that the audio challenge asks for five digits only. Furthermore, guessing any three out of the five digits is enough to solve the challenge. This reduces the scope to only 10^3 possibilities, which is far from being considered secure.

This paper shows that it is possible to automatically solve Google reCAPTCHA v2 92% of the time. Considering that this is the strongest CAPTCHA, the situation is alarming. This implies being able to impersonate people in scenarios such as e-voting, spam in mail accounts, performing denial of service attacks and so on. To achieve a more secure digital world, stronger CAPTCHAs have to be designed.

Attendee Takeaways

- Notions about CAPTCHA and reCAPTCHA.
- State of the art in CAPTCHAs.
- Description of Google reCAPTCHA v2 and some security measures it applies.
- Technique for automatically solving Google reCAPTCHA v2. Voice recognition processing techniques used.
- Recommendations for designing strong audio challenges.

What’s new?

The proposed solution is a fully automated reCAPTCHA solver that takes advantage of Google Speech API, and it is the only known solution that is entirely based on the audio challenge. It achieves a 92% success rate, which is the highest among existing solutions, with no previous learning needed and no data storing. Since reCAPTCHA is owned by Google, the present proof of concept breaks a Google service by using another Google service.

Why Black Hat?
The consequences of bypassing CAPTCHAs can be severe, since CAPTCHAs are designed to distinguish between humans and bots in actions such as voting continuously in online polls, automatically registering millions of spam email accounts, automatically purchasing tickets to buy out an event, etc. Furthermore, the number of users who interact with reCAPTCHAs is extremely high. We consider it vital to protect such scenarios and to offer security that prevents these kinds of abuse. Given the importance of the consequences of these attacks, the volume of users affected, and the popularity of Black Hat and the audience it attracts, we consider it the ideal venue for presenting our research.
With this talk, we also hope to raise awareness of the importance of designing strong CAPTCHAs, which is still an unsolved challenge, and to contribute to a more secure and trustworthy society and world.
References
[1] http://www.CAPTCHA.net
[2] L. von Ahn, M. Blum, and J. Langford. “Telling Humans and Computers Apart Automatically,” Communications of the ACM, vol. 47, no. 2, pp. 57-60, Feb. 2004.
[3] Von Ahn, L., Maurer, B., McMillen, C., Abraham, D., & Blum, M. (2008). reCAPTCHA: Human-based character recognition via web security measures. Science, 321(5895), 1465-1468.
[4] https://www.google.com/reCAPTCHA/intro/index.html
[5] Bursztein, E., Bethard, S., Fabry, C., Mitchell, J. C., & Jurafsky, D. (2010, May). How Good Are Humans at Solving CAPTCHAs? A Large Scale Evaluation. In IEEE Symposium on Security and Privacy (pp. 399-413).
[6] Tam, J., Simsa, J., Hyde, S., & Ahn, L. V. (2008). Breaking audio CAPTCHAs. In Advances in Neural Information Processing Systems (pp. 1625-1632).
[7] Wilkins, J. (2010). Strong CAPTCHA guidelines.
[8] Tam, J., Huggins-Daines, J. D., von Ahn, L., & Blum, M. (2008, July). Improving audio CAPTCHAs. In Proceedings of the 2008 Symposium on Usable Privacy and Security (SOUPS 2008), USA.
[9] Houck, C., Lee, J. (2010, August). Decoding reCAPTCHA. DEF CON 18 Hacking Conference.
[10] Adam, C-P, Jeffball. (2012, May). Codename Stiltwalker. Layer ONE hacker conference, USA.
[11] Cruz-Perez, C., Starostenko, O., Uceda-Ponga, F., Alarcon-Aquino, V., Reyes-Cabrera, L. (2012, June). Breaking reCAPTCHAs with Unpredictable Collapse: Heuristic Character Segmentation and Recognition. Lecture Notes in Computer Science, vol. 7329, pp. 155-165.
[12] Chellapilla, K., and Simard, P. (2004). Using Machine Learning to Break Visual Human Interaction Proofs (HIPs). In Advances in Neural Information Processing Systems 17, Neural Information Processing Systems (NIPS). MIT Press.
[13] Chellapilla, K., Larson, K., Simard, P., and Czerwinski, M. (2005). Building Segmentation Based Human-friendly Human Interaction Proofs. In 2nd Int’l Workshop on Human Interaction Proofs, Springer-Verlag. LNCS 3517.
[14] Bursztein, E., Matthieu, M., and John M. (2011). Text-based CAPTCHA strengths and weaknesses. In Proceedings of the 18th ACM conference on Computer and communications security. ACM.
[15] Ahmad, E., Ahmad S., Yan, J., and Tayara, M. (2011). The robustness of Google CAPTCHA's. Computing Science, Newcastle University.
[16] Yan, J. and El Ahmed, A.S. (2008, October). A Low-cost Attack on a Microsoft CAPTCHA. In 15th ACM Conference on Computer and Communications Security (CCS’08). Virginia, USA. ACM Press. pp. 543-554.
[17] Suphannee Sivakorn, Jason Polakis, and Angelos D. Keromytis. (2016). I’m not a human: Breaking the Google reCAPTCHA.