[Full-disclosure] defeating voice captchas
Stelian Ene
stelian.ene at gecadtech.com
Tue Feb 14 08:20:14 GMT 2006
Gadi Evron wrote:
> Therefore, how many times does one have to refresh the page and listen
> to the Captcha to be able to simply learn to identify the Captcha by
> say, an MD5 hash of the audio for each letter?
That is just a bad implementation; when done well, audio Captchas are
probably as secure as their visual counterparts.
"Done well" means that, besides the 10 digits (and/or 26 letters)
recorded by the sexy voice and replayed in a random order, the audio is
mixed with multiple sound sources, different for each generated Captcha.
For example, you can use a symphony(*), random white noise, the sound of
the street, or all of these, at a level of 3 or 6 dB above the voice.
The brain can easily distinguish the secret code from all the background
noise, but it's much more difficult for a computer.
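
To make the mixing step concrete, here is a rough sketch of scaling a background source so that it sits a given number of dB above the voice. The signals are stand-ins (a plain tone for the voice, Gaussian noise for the background), the level is measured as RMS, and the name mix_at_level is my own invention, not part of any real Captcha code.

```python
import math
import random

def rms(samples):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mix_at_level(voice, noise, noise_db_above_voice):
    """Scale `noise` so its RMS is the given number of dB relative to the
    voice RMS, then add it to the voice sample by sample."""
    target_rms = rms(voice) * 10 ** (noise_db_above_voice / 20)
    gain = target_rms / rms(noise)
    return [v + gain * n for v, n in zip(voice, noise)]

# Stand-in signals (assumptions): one second at 8 kHz.
rate = 8000
voice = [0.3 * math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]
rng = random.Random()  # unseeded, so each generated Captcha gets fresh noise
noise = [rng.gauss(0, 1) for _ in range(rate)]

mixed = mix_at_level(voice, noise, 6.0)  # background 6 dB above the voice
```

A real generator would layer several such sources and pick new ones for every Captcha.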
While I'm not an audio expert either, I'm sure this problem is a lot
harder than a simple MD5 - just look at how badly state-of-the-art voice
recognition software performs in almost ideal conditions, i.e. no
background noise etc.
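
A small sketch of why the MD5 approach only works against the bad implementation: a hash table built from clean per-digit clips matches a verbatim replay, but stops matching as soon as any noise is mixed in. The digit "recordings" here are hypothetical stand-ins (tones with a digit-dependent frequency), and the 16-bit PCM encoding is an assumption.

```python
import hashlib
import math
import random
import struct

def to_pcm16(samples):
    """Encode float samples in [-1, 1] as little-endian 16-bit PCM bytes."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    return struct.pack("<%dh" % len(clipped), *(int(s * 32767) for s in clipped))

def digit_clip(digit, rate=8000, n=800):
    """Hypothetical stand-in for a recorded digit: a tone per digit."""
    freq = 300 + 50 * digit
    return [0.4 * math.sin(2 * math.pi * freq * t / rate) for t in range(n)]

# The attacker's table: MD5 of each clean clip -> digit.
table = {hashlib.md5(to_pcm16(digit_clip(d))).hexdigest(): d for d in range(10)}

def lookup(samples):
    """Return the digit whose clean clip hashes identically, or None."""
    return table.get(hashlib.md5(to_pcm16(samples)).hexdigest())

clean = digit_clip(7)
rng = random.Random()
noisy = [s + rng.gauss(0, 0.05) for s in clean]  # light background noise

hit = lookup(clean)    # verbatim replay: the hash matches
miss = lookup(noisy)   # any per-Captcha noise: no match
```

Of course this only defeats exact hashing; the attacker's next step would be real signal processing, which is the harder problem described above.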
(*) Of course, it's better to use sound sources that are hard to
identify, and are ideally not available to the attacker; otherwise he could
obtain the same sounds and subtract them from the audio. I think some
random pitch shifting (vibrato) would help against this.
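
Here is a rough sketch of the subtraction attack and of why pitch shifting hurts it. It assumes the attacker has the exact background source, uses a sum of tones as a stand-in for the "symphony", and approximates pitch shifting with a single random resampling factor (a simplification; a time-varying shift would be harder still). All names are made up for illustration.

```python
import math
import random

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def resample(samples, factor):
    """Linear-interpolation resampling; shifts pitch (and speed) by `factor`."""
    out, pos, n = [], 0.0, len(samples)
    while pos < n - 1:
        i = int(pos)
        frac = pos - i
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        pos += factor
    return out

def subtract_known(mixed, known_background):
    """The attack: subtract the attacker's copy of the background."""
    return [m - b for m, b in zip(mixed, known_background)]

rate = 8000
# Stand-in background that the attacker also has a copy of.
background = [sum(0.3 * math.sin(2 * math.pi * f * t / rate)
                  for f in (220, 330, 440)) for t in range(4000)]
factor = random.Random().uniform(1.02, 1.05)  # secret per-Captcha shift
shifted = resample(background, factor)
voice = [0.2 * math.sin(2 * math.pi * 600 * t / rate)
         for t in range(len(shifted))]

plain_mix = [v + b for v, b in zip(voice, background)]
shifted_mix = [v + b for v, b in zip(voice, shifted)]

# Error left in the attacker's estimate of the voice in each case.
err_plain = rms([e - v for e, v in
                 zip(subtract_known(plain_mix, background), voice)])
err_shifted = rms([e - v for e, v in
                   zip(subtract_known(shifted_mix, background), voice)])
```

Without the shift the voice comes back essentially clean; with it, the leftover error is larger than the voice itself in this toy setup. An attacker could still search for the shift factor, so this raises the cost rather than closing the hole.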
Full-Disclosure is hosted and sponsored by Secunia.