Version 4.10.0

hirondelle.web4j.util
Class EscapeChars

Object
  extended by hirondelle.web4j.util.EscapeChars

public final class EscapeChars
extends Object

Convenience methods for escaping special characters related to HTML, XML, and regular expressions.

To keep you safe by default, WEB4J escapes special characters in your data where appropriate, so you rarely need to think about escaping yourself. As a result, you should seldom need to call the methods of this class directly.

For Model Objects containing free form user input, it is highly recommended that you use SafeText, not String. Free form user input is open to malicious use, such as Cross Site Scripting attacks. Using SafeText will protect you from such attacks, by always escaping special characters automatically in its toString() method.

The following WEB4J classes will automatically escape special characters for you, when needed :


Method Summary
static String forHrefAmpersand(String aURL)
          Escape all ampersand characters in a URL.
static String forHTML(String aText)
          Escape characters for text appearing in HTML markup.
static String forJSON(String aText)
          Escapes characters for text appearing as data in the JavaScript Object Notation (JSON) data interchange format.
static String forRegex(String aRegexFragment)
          Replace characters having special meaning in regular expressions with their escaped equivalents, preceded by a '\' character.
static String forReplacementString(String aInput)
          Escape '$' and '\' characters in replacement strings.
static String forScriptTagsOnly(String aText)
          Disable all <SCRIPT> tags in aText.
static String forURL(String aURLFragment)
          Synonym for URLEncoder.encode(String, "UTF-8").
static String forXML(String aText)
          Escape characters for text appearing as XML data, between tags.
static String toDisableTags(String aText)
          Return aText with all '<' and '>' characters replaced by their escaped equivalents.
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

forHTML

public static String forHTML(String aText)
Escape characters for text appearing in HTML markup.

This method exists as a defence against Cross Site Scripting (XSS) attacks. The idea is to neutralize special characters commonly used in scripts, so that the browser does not execute them. This is done by replacing those characters with their escaped equivalents. See SafeText as well.

The following characters are replaced with corresponding HTML character entities :

Character Replacement
< &lt;
> &gt;
& &amp;
" &quot;
\t &#009;
! &#033;
# &#035;
$ &#036;
% &#037;
' &#039;
( &#040;
) &#041;
* &#042;
+ &#043;
, &#044;
- &#045;
. &#046;
/ &#047;
: &#058;
; &#059;
= &#061;
? &#063;
@ &#064;
[ &#091;
\ &#092;
] &#093;
^ &#094;
_ &#095;
` &#096;
{ &#123;
| &#124;
} &#125;
~ &#126;

Note that JSTL's <c:out> escapes only five of the above characters: <, >, &, ", and '.
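
The table above follows a simple pattern: four characters get named entities, and the rest get a three-digit decimal character reference. The following is a minimal sketch of that mapping, not the WEB4J source; the class name HtmlEscapeSketch and its escape method are hypothetical names used only for illustration.

```java
/** Illustrative only: reproduces the replacement table above. Not the WEB4J implementation. */
final class HtmlEscapeSketch {

    // Characters from the table that are replaced by a numeric reference (&#DDD;).
    private static final String NUMERIC = "\t!#$%'()*+,-./:;=?@[\\]^_`{|}~";

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() * 2);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                default:
                    if (NUMERIC.indexOf(c) >= 0) {
                        // Zero-padded decimal code point, e.g. ' becomes &#039;
                        out.append(String.format("&#%03d;", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("<script>alert('x')</script>"));
    }
}
```

Letters, digits, and spaces pass through unchanged, so ordinary text stays readable while markup and script punctuation are rendered inert.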


forHrefAmpersand

public static String forHrefAmpersand(String aURL)
Escape all ampersand characters in a URL.

Replaces all '&' characters with '&amp;'.

An ampersand character may appear in the query string of a URL. The ampersand character is indeed valid in a URL. However, URLs usually appear as an HREF attribute, and such attributes have the additional constraint that ampersands must be escaped.

The JSTL <c:url> tag does indeed perform proper URL encoding of query parameters. But it does not, in general, produce text which is valid as an HREF attribute, simply because it does not escape the ampersand character. This is a nuisance when multiple query parameters appear in the URL, since it requires a little extra work.
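
As a rough sketch of the replacement described above (the class and method names are hypothetical, and this is not the WEB4J source), escaping the ampersands amounts to a literal string replacement:

```java
/** Illustrative only: replaces every '&' in a URL with '&amp;'. */
final class HrefAmpersandSketch {

    static String escapeAmpersands(String url) {
        // String.replace performs a literal (non-regex) replacement of all occurrences.
        return url.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        System.out.println(escapeAmpersands("list.do?a=1&b=2&c=3"));
    }
}
```

A replacement like this is not idempotent: applying it to text that is already escaped turns '&amp;' into '&amp;amp;', so it should be applied exactly once, just before the URL is written into the HREF attribute.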


forURL

public static String forURL(String aURLFragment)
Synonym for URLEncoder.encode(String, "UTF-8").

Used to ensure that HTTP query strings are in proper form, by escaping special characters such as spaces.

Note that when a query string appears in an HREF attribute, there are two separate issues: the query string must be valid HTTP (its parameter values are URL-encoded), and it must be valid HTML (the ampersands separating parameters are escaped).
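
The two issues can be illustrated with the standard URLEncoder class. This is a sketch under stated assumptions: the class HrefQuerySketch, the page name search.do, and the parameter names q and page are all hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Illustrative only: builds an HREF value that is both URL-encoded and HTML-safe. */
final class HrefQuerySketch {

    // Issue 1: each parameter value is URL-encoded so the query string is valid HTTP.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(ex);
        }
    }

    // Issue 2: the separator between parameters is written as &amp; so the HREF is valid HTML.
    static String searchHref(String query, String page) {
        return "search.do?q=" + encode(query) + "&amp;page=" + encode(page);
    }

    public static void main(String[] args) {
        System.out.println(searchHref("blue jeans & co", "2"));
    }
}
```

The ampersand inside the parameter value is handled by URL encoding (it becomes %26), while the ampersand that separates the two parameters is handled by HTML escaping.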


forXML

public static String forXML(String aText)
Escape characters for text appearing as XML data, between tags.

The following characters are replaced with corresponding character entities :

Character Encoding
< &lt;
> &gt;
& &amp;
" &quot;
' &#039;

Note that JSTL's <c:out> escapes the exact same set of characters as this method. That is, <c:out> is good for escaping to produce valid XML, but not for producing safe HTML.
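
For comparison with forHTML, here is a minimal sketch of the five replacements in the table above. The class name XmlEscapeSketch is hypothetical, and the code is an illustration rather than the WEB4J source.

```java
/** Illustrative only: escapes the five characters listed in the table above. */
final class XmlEscapeSketch {

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;"); break;
                case '>':  out.append("&gt;"); break;
                case '&':  out.append("&amp;"); break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#039;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Tom & Jerry's <show>"));
    }
}
```

Characters such as '(' and ';' pass through untouched here, which is why this smaller set is adequate for well-formed XML but weaker than forHTML as a defence against script injection.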


forJSON

public static String forJSON(String aText)
Escapes characters for text appearing as data in the JavaScript Object Notation (JSON) data interchange format.

The following characters, including the commonly used control characters, are escaped :

Character Escaped As
" \"
\ \\
/ \/
back space \b
form feed \f
line feed \n
carriage return \r
tab \t

See RFC 4627 for more information.
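
The table above can be sketched as a single pass over the text. The class name JsonEscapeSketch is hypothetical, and the code covers only the eight characters listed; it is not the WEB4J source.

```java
/** Illustrative only: escapes the characters listed in the table above. */
final class JsonEscapeSketch {

    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '/':  out.append("\\/"); break;
                case '\b': out.append("\\b"); break;
                case '\f': out.append("\\f"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The escaped text is placed between double quotes to form a JSON string value.
        System.out.println("{\"note\": \"" + escape("say \"hi\"\nbye") + "\"}");
    }
}
```

RFC 4627 also requires the remaining control characters (U+0000 through U+001F) to be escaped, typically in \uXXXX form; this sketch leaves them unchanged, so it is suitable only for text known not to contain them.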


toDisableTags

public static String toDisableTags(String aText)
Return aText with all '<' and '>' characters replaced by their escaped equivalents.


forRegex

public static String forRegex(String aRegexFragment)
Replace characters having special meaning in regular expressions with their escaped equivalents, preceded by a '\' character.

The escaped characters include :
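
As a rough illustration of this style of escaping, the sketch below places a '\' before each character in an assumed set of regular expression metacharacters. The set used here is chosen for the example and is not taken from the WEB4J source; the class name RegexEscapeSketch is likewise hypothetical.

```java
import java.util.regex.Pattern;

/** Illustrative only: backslash-escapes an assumed set of regex metacharacters. */
final class RegexEscapeSketch {

    // Assumption: a representative set of metacharacters, not the WEB4J list.
    private static final String SPECIAL = ".\\?*+{}[]()^$|";

    static String escape(String fragment) {
        StringBuilder out = new StringBuilder(fragment.length() * 2);
        for (int i = 0; i < fragment.length(); i++) {
            char c = fragment.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Pattern literal = Pattern.compile(escape("1+1=2?"));
        System.out.println(literal.matcher("1+1=2?").matches());
    }
}
```

The JDK offers a related facility, Pattern.quote(String), which wraps the whole fragment in \Q...\E instead of escaping characters one at a time.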


forReplacementString

public static String forReplacementString(String aInput)
Escape '$' and '\' characters in replacement strings.

Synonym for Matcher.quoteReplacement(String).

The following methods use replacement strings which treat '$' and '\' as special characters:

If replacement text can contain arbitrary characters, then you will usually need to escape that text, to ensure special characters are interpreted literally.
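
A short sketch of the problem and the fix, using the standard Matcher.quoteReplacement(String) with String.replaceAll. The class name ReplacementSketch, the fill method, and the PRICE placeholder are hypothetical.

```java
import java.util.regex.Matcher;

/** Illustrative only: inserts arbitrary text as a literal replacement. */
final class ReplacementSketch {

    static String fill(String template, String value) {
        // Without quoting, a value such as "$5" would be read as a reference
        // to capturing group 5, which does not exist in the pattern "PRICE".
        return template.replaceAll("PRICE", Matcher.quoteReplacement(value));
    }

    public static void main(String[] args) {
        System.out.println(fill("Total: PRICE", "$5"));
        System.out.println(fill("Folder: PRICE", "C:\\temp"));
    }
}
```

When the search text is itself a literal rather than a pattern, String.replace(CharSequence, CharSequence) avoids the issue entirely, since it treats both arguments literally.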


forScriptTagsOnly

public static String forScriptTagsOnly(String aText)
Disable all <SCRIPT> tags in aText.

Matching is case-insensitive.



Copyright Hirondelle Systems. Published October 19, 2013.