Taint Mode ¶
Note:
This article is about taint checking in Python. For the taint mode implementation by Juanjo Conti, see taintmode.py.
Taint mode is a language feature that can highlight injection flaws by tracking the "taintedness" of variables as untrusted user input flows through the code. In taint mode, developers identify untrusted inputs, sanitization functions, and sensitive sinks. Variables holding user input are initially marked as tainted, and taintedness propagates to every new variable derived from a tainted one. A tainted variable can be sanitized, after which it is marked as untainted. Taint mode prevents tainted data from reaching the identified sensitive sinks, such as an interpreter or browser.
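The flow described above (mark, propagate, sanitize, block at the sink) can be sketched in plain Python. This is a minimal illustration, not how taintmode.py or any other implementation listed below works: the names Tainted, TaintError, sanitize and html_sink are hypothetical, and only string concatenation propagates taint here.

```python
import html


class Tainted(str):
    """Hypothetical marker type for strings derived from untrusted input."""

    def __add__(self, other):
        # tainted + anything stays tainted
        return Tainted(str.__add__(self, other))

    def __radd__(self, other):
        # clean + tainted is tainted as well
        if not isinstance(other, str):
            return NotImplemented
        return Tainted(str.__add__(other, self))


class TaintError(Exception):
    """Raised when tainted data reaches a sensitive sink."""


def sanitize(value):
    # html.escape returns a plain str, so the result no longer
    # carries the Tainted marker.
    return html.escape(value)


def html_sink(fragment):
    """Stand-in for a sensitive sink such as a template engine."""
    if isinstance(fragment, Tainted):
        raise TaintError("tainted data reached the HTML sink")
    return fragment


user_input = Tainted("<b>Bob</b>")                 # untrusted input starts out tainted
greeting = "Hello, " + user_input                  # taint propagates to the derived value
safe_greeting = "Hello, " + sanitize(user_input)   # sanitized, so untainted
```

Passing greeting to html_sink raises TaintError, while safe_greeting is accepted. Other string operations (formatting, slicing, join) would silently drop the marker in this sketch, which is one reason real implementations hook into many more operations or modify the interpreter.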
Implementations¶
- taintmode.py
- D. Kozlov and A. Petukhov created a taint mode for Python based on a modified interpreter, described in their paper
- Django: django.utils.safestring.SafeString (and SafeUnicode)
Untrusted Input¶
Any data that is untrusted should be marked as tainted. In particular, any data coming from a user's request should be marked as tainted. This includes:
- GET and POST parameters
- HTTP Headers
- AJAX requests
You may also consider marking data from a persistence layer as tainted. A database may have been tampered with outside the application, or the data could have been intercepted and modified in transit.
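As a rough sketch of marking request data at the boundary, the helpers below wrap query parameters and headers in a marker type as soon as they are parsed. Tainted, taint_query_string and taint_headers are hypothetical names introduced for illustration; only urllib.parse.parse_qs is a real standard-library call.

```python
from urllib.parse import parse_qs


class Tainted(str):
    """Hypothetical marker type for untrusted strings."""


def taint_query_string(query_string):
    """Parse a GET query string (or form-encoded POST body) and taint every value."""
    parsed = parse_qs(query_string)
    return {key: [Tainted(v) for v in values] for key, values in parsed.items()}


def taint_headers(headers):
    """Taint every HTTP header value in a name -> value mapping."""
    return {name: Tainted(value) for name, value in headers.items()}


params = taint_query_string("q=%3Cscript%3E&page=2")
headers = taint_headers({"User-Agent": "example-agent/1.0"})
```

Tainting at the point where the request is parsed means that no later code has to remember which values originated from the user.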
Sanitization Methods¶
Sanitization methods escape, encode, or validate input data so that it is safe to send to a given sink. An example of a sanitization function is Python's cgi.escape (deprecated since Python 3.2 and removed in Python 3.8 in favor of html.escape):
>>> import cgi
>>> cgi.escape("<script>alert('this is an attack')</script>")
"&lt;script&gt;alert('this is an attack')&lt;/script&gt;"
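On Python versions where cgi.escape is no longer available, html.escape from the standard library plays the same role. The snippet below is a small sketch showing both of its modes on the same payload; the variable names are illustrative only.

```python
import html

payload = "<script>alert('this is an attack')</script>"

# quote=False escapes only &, < and >, like cgi.escape did by default.
escaped_basic = html.escape(payload, quote=False)

# The default (quote=True) additionally escapes double and single quotes,
# which matters when the value ends up inside an HTML attribute.
escaped_full = html.escape(payload)

print(escaped_basic)  # &lt;script&gt;alert('this is an attack')&lt;/script&gt;
print(escaped_full)   # &lt;script&gt;alert(&#x27;this is an attack&#x27;)&lt;/script&gt;
```

Which escaping is appropriate depends on the sink: HTML escaping makes a value suitable for an HTML context, but not for an SQL or shell interpreter.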
Sensitive Sinks¶
Some examples of sensitive sinks:
- Browser or HTML template engine
- SQL interpreter
- OS interpreter
- LDAP interpreter
- Python interpreter
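One way to guard sinks like those listed above is a decorator that refuses tainted arguments. The following is a minimal sketch under stated assumptions: Tainted, TaintError, sensitive_sink and render_html are hypothetical names, and the check covers only top-level arguments, not values nested inside lists or dictionaries.

```python
import functools


class Tainted(str):
    """Hypothetical marker type for untrusted strings."""


class TaintError(Exception):
    """Raised when tainted data is passed to a sensitive sink."""


def sensitive_sink(func):
    """Reject any call that passes a Tainted value as a direct argument."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for value in list(args) + list(kwargs.values()):
            if isinstance(value, Tainted):
                raise TaintError(
                    "tainted value passed to sink %s" % func.__name__
                )
        return func(*args, **kwargs)

    return wrapper


@sensitive_sink
def render_html(fragment):
    """Stand-in for a browser or HTML template engine sink."""
    return "<p>" + fragment + "</p>"
```

Calling render_html("hello") returns the wrapped fragment, while render_html(Tainted("<script>")) raises TaintError before the sink body runs.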
