[Full-disclosure] FW: Introducing a new generic approach to detecting SQL injection
Paul J. Morris
mole at morris.net
Fri Apr 22 21:39:05 BST 2005
On Fri, 22 Apr 2005 15:26:41 -0400
Mohit Muthanna <mohit.muthanna at gmail.com> wrote:
> > Once the allowed character set gets beyond $sanitized =
> > preg_replace("/[^a-zA-Z0-9]/", "", $untrusted) especially into the
> > realm
> Don't use simple regexp matching.
Why not? I am not matching known attacks, I am stripping everything but
a small set of known good characters. How are you going to construct a
sql injection attack using the character set [A-Za-z]? Yes, you can
try to overflow preg_replace (or the dbms if I don't truncate your
input), but the set [A-Za-z] isn't going to enable a sql injection
attack. If I have a single field being submitted from a form where the
characters in a legitimate query will only be in the set [A-Za-z], I
know with certainty that $sanitized will not contain a sql injection
attack if I filter it with $sanitized = preg_replace("/[^a-zA-Z]/", "",
$untrusted), regardless of any other dependencies (e.g. with php, I am
not dependent on the settings of safe_mode or of magic_quotes_gpc). If
the set of legitimate characters includes quote characters or slashes or
the like, then I entirely agree with you that escaping and encoding
libraries are an important element.
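
A minimal sketch of the known-good filter described above, written in
Python rather than PHP so it can be run standalone. The function name
sanitize_alpha and the max_len limit are hypothetical, chosen only for
illustration; the regular expression is the same [^a-zA-Z] class used
with preg_replace.

```python
import re

# Matches every character that is NOT in the known-good set [A-Za-z].
_NOT_ALLOWED = re.compile(r"[^A-Za-z]")


def sanitize_alpha(untrusted: str, max_len: int = 64) -> str:
    """Truncate the input, then drop everything outside [A-Za-z].

    max_len is an assumed limit, not a value from the original post.
    """
    return _NOT_ALLOWED.sub("", untrusted[:max_len])


if __name__ == "__main__":
    # Quotes, parentheses, semicolons, spaces and dashes are all removed.
    print(sanitize_alpha("Robert'); DROP TABLE students;--"))
```

Truncating before filtering bounds the work done by the regex engine;
the result can never be longer than max_len and can only contain
letters.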
> This technique, though novel, is really
Agreed, most of the time there are better and more efficient ways to
handle the problem. I find it interesting because it appears (and I'm not
sure that this is true) to rely on passing the known good rather than
filtering out a set of known attacks.
> I'll reiterate; unless your regexp is robustly tested don't use it.
> There are many libraries out there for URL/Base64/Unicode/etc. etc.
> encoding, decoding and escaping. Use them to clean up your input.
I have seen too many discussions of ways to get around escaping of
attack characters by interesting twists on encoding to be sure that the
library I choose has thought of all of the possible ways around the
decoding and escaping. Encoding/decoding/escaping relies on the
library recognizing known attack characters, something it may be very
good at, but something experience has taught us is hard to do.
> If your database API supports it, use prepared statements and
> parameter binding.
Agreed. Prepared statements are a very powerful tool, when
> Don't use simple string interpolation (without quote handling).
I don't see the rationale for this. The rationale for never relying on
filtering out known bad characters is clear, but filtering out all but a small set
of known good characters seems the simplest and surest way of sanitizing
> It's really that easy.
In the realm of multibyte characters with multiple kinds of clients,
I'm not at all convinced it is. I don't know that an attacker isn't
going to encode a query terminating character in a way that is going to
get through the decoding and escaping. The fundamental principle of
escaping is that of recognizing known bad characters - something that
experience teaches us is inferior to allowing only known good characters
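
The difference between the two approaches can be sketched as follows
(in Python, for a runnable illustration). Both function names are
hypothetical, and the probe string uses U+FF07 FULLWIDTH APOSTROPHE as
a stand-in for "a quote the filter author did not think of". Whether
any particular database or driver would ever treat that character as
an ASCII quote is an assumption made only for the sake of the example.

```python
import re


def strip_known_bad(s: str) -> str:
    # Known-bad approach: remove only the characters we thought of
    # (ASCII single quote, double quote, semicolon, backslash).
    return re.sub(r"""['";\\]""", "", s)


def keep_known_good(s: str) -> str:
    # Known-good approach: remove everything outside [A-Za-z].
    return re.sub(r"[^A-Za-z]", "", s)


# Hypothetical probe: fullwidth apostrophe instead of an ASCII quote.
probe = "x\uff07 OR 1=1"

if __name__ == "__main__":
    # The known-bad filter passes the probe through unchanged.
    print(strip_known_bad(probe) == probe)
    # The known-good filter leaves only letters.
    print(keep_known_good(probe))
```

The known-bad filter has to anticipate every encoding of every
dangerous character; the known-good filter only has to enumerate what
a legitimate value may contain.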
> Mohit Muthanna [mohit (at) muthanna (uhuh) com]
Paul J. Morris
Biodiversity Information Manager, The Academy of Natural Sciences
1900 Ben Franklin Parkway, Philadelphia PA, 19103, USA
mole at morris.net AA3SD PGP public key available