org.weblab_project.core.jaxb
Class XMLStringCleaner

java.lang.Object
  extended by org.weblab_project.core.jaxb.XMLStringCleaner

public class XMLStringCleaner
extends java.lang.Object

Some characters that are valid in Unicode (and in UTF-8) are not valid in XML. Using the information given on the W3C website (W3C XML reference), this class makes it possible to clean Strings accordingly.

Author:
EADS WebLab Team
Date:
2008-05-28

Method Summary
static java.lang.String getXMLRecommendedString(java.lang.String input)
          Removes the chars that are not recommended in XML according to the W3C XML reference.
static java.lang.String getXMLRecommendedString(java.lang.String input, char replacement)
          Replaces the chars in input that are not recommended in XML according to the W3C XML reference with the replacement character.
static java.lang.String getXMLValidString(java.lang.String input)
          Removes the chars that are not valid in XML according to the W3C XML reference.
static java.lang.String getXMLValidString(java.lang.String input, char replacement)
          Replaces the chars in input that are not valid in XML according to the W3C XML reference with the replacement character.
static boolean isXMLRecommended(char c)
          The W3C XML reference says that the use of the following chars is discouraged:
[#x7F-#x84], [#x86-#x9F], [#xFDD0-#xFDDF], [#x1FFFE-#x1FFFF], [#x2FFFE-#x2FFFF], [#x3FFFE-#x3FFFF], [#x4FFFE-#x4FFFF], [#x5FFFE-#x5FFFF], [#x6FFFE-#x6FFFF], [#x7FFFE-#x7FFFF], [#x8FFFE-#x8FFFF], [#x9FFFE-#x9FFFF], [#xAFFFE-#xAFFFF], [#xBFFFE-#xBFFFF], [#xCFFFE-#xCFFFF], [#xDFFFE-#xDFFFF], [#xEFFFE-#xEFFFF], [#xFFFFE-#xFFFFF], [#x10FFFE-#x10FFFF].
static boolean isXMLValid(char c)
          The W3C XML reference says that a valid XML char is one of:
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
Since a Java char cannot be greater than #xFFFF (anything above lies outside the BMP), the last block is not tested.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

getXMLRecommendedString

public static java.lang.String getXMLRecommendedString(java.lang.String input)
Removes the chars that are not recommended in XML according to the W3C XML reference.

Parameters:
input - The String to be cleaned.
Returns:
The same String with the characters that are not recommended in XML removed. null if input was null.
See Also:
isXMLRecommended(char), getXMLRecommendedString(String, char)

getXMLRecommendedString

public static java.lang.String getXMLRecommendedString(java.lang.String input,
                                                       char replacement)
Replaces the chars in input that are not recommended in XML according to the W3C XML reference with the replacement character.

Parameters:
input - The String to be cleaned.
replacement - The character to replace any non-recommended character found.
Returns:
The same String with each char that is not recommended in XML replaced by replacement. null if input was null.
See Also:
isXMLRecommended(char), getXMLRecommendedString(String)
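
The replacement variant described above can be sketched as follows. This is an illustration written from the documented contract, not the source of XMLStringCleaner; the class name RecommendedStringSketch is hypothetical, and the ranges are the ones listed on this page.

```java
final class RecommendedStringSketch {

    /** True if c is valid in XML and outside the discouraged BMP blocks listed on this page. */
    static boolean isXMLRecommended(char c) {
        boolean valid = c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD);
        boolean discouraged = (c >= 0x7F && c <= 0x84)
                || (c >= 0x86 && c <= 0x9F)
                || (c >= 0xFDD0 && c <= 0xFDDF);
        return valid && !discouraged;
    }

    /** Swaps every non-recommended char for the replacement; null in, null out. */
    static String getXMLRecommendedString(String input, char replacement) {
        if (input == null) {
            return null;
        }
        char[] chars = input.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            if (!isXMLRecommended(chars[i])) {
                chars[i] = replacement;
            }
        }
        return new String(chars);
    }
}
```

A typical case is text containing C1 control chars such as #x96: with '-' as the replacement, the sketch turns "a", #x96, "b" into "a-b" and leaves the String length unchanged.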

getXMLValidString

public static java.lang.String getXMLValidString(java.lang.String input)
Removes the chars that are not valid in XML according to the W3C XML reference.

Parameters:
input - The String to be cleaned.
Returns:
The same String with the characters that are invalid in XML removed. null if input was null.
See Also:
isXMLValid(char), getXMLValidString(String, char)

getXMLValidString

public static java.lang.String getXMLValidString(java.lang.String input,
                                                 char replacement)
Replaces the chars in input that are not valid in XML according to the W3C XML reference with the replacement character.

Parameters:
input - The String to be cleaned.
replacement - The character to replace any invalid character found.
Returns:
The same String with each char that is invalid in XML replaced by replacement. null if input was null.
See Also:
isXMLValid(char), getXMLValidString(String)
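
The difference between the two getXMLValidString variants can be sketched as follows. This is a minimal illustration of the documented behaviour (removal versus replacement, null passed through), not the actual source; the class name ValidStringSketch is hypothetical.

```java
final class ValidStringSketch {

    /** Documented validity ranges for a single Java char. */
    static boolean isXMLValid(char c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD);
    }

    /** Drops every char that fails the validity test; null in, null out. */
    static String getXMLValidString(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder cleaned = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isXMLValid(c)) {
                cleaned.append(c);
            }
        }
        return cleaned.toString();
    }

    /** Substitutes the replacement for every char that fails the validity test. */
    static String getXMLValidString(String input, char replacement) {
        if (input == null) {
            return null;
        }
        StringBuilder cleaned = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            cleaned.append(isXMLValid(c) ? c : replacement);
        }
        return cleaned.toString();
    }
}
```

Removal shortens the String, while replacement keeps every char at its original offset, which can matter when other data refers to positions in the text.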

isXMLRecommended

public static boolean isXMLRecommended(char c)
The W3C XML reference says that the use of the following chars is discouraged:
[#x7F-#x84], [#x86-#x9F], [#xFDD0-#xFDDF], [#x1FFFE-#x1FFFF], [#x2FFFE-#x2FFFF], [#x3FFFE-#x3FFFF], [#x4FFFE-#x4FFFF], [#x5FFFE-#x5FFFF], [#x6FFFE-#x6FFFF], [#x7FFFE-#x7FFFF], [#x8FFFE-#x8FFFF], [#x9FFFE-#x9FFFF], [#xAFFFE-#xAFFFF], [#xBFFFE-#xBFFFF], [#xCFFFE-#xCFFFF], [#xDFFFE-#xDFFFF], [#xEFFFE-#xEFFFF], [#xFFFFE-#xFFFFF], [#x10FFFE-#x10FFFF].

Since a Java char cannot be greater than #xFFFF (anything above lies outside the BMP), only the first three blocks are considered.

Parameters:
c - The char to test.
Returns:
true if c is valid in XML (see isXMLValid(char)) and is not part of:
[#x7F-#x84], [#x86-#x9F], [#xFDD0-#xFDDF].
See Also:
isXMLValid(char)
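
The test described above can be sketched as follows. This is written from the ranges listed on this page and is not the actual source; the class name RecommendedCharSketch is hypothetical. Note that the XML 1.0 recommendation gives the third block as [#xFDD0-#xFDEF]; the sketch keeps the upper bound documented here.

```java
final class RecommendedCharSketch {

    /** Documented validity ranges for a single Java char. */
    static boolean isXMLValid(char c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD);
    }

    /** Valid in XML and outside the three discouraged blocks that fit in a char. */
    static boolean isXMLRecommended(char c) {
        boolean discouraged = (c >= 0x7F && c <= 0x84)
                || (c >= 0x86 && c <= 0x9F)
                || (c >= 0xFDD0 && c <= 0xFDDF); // upper bound as documented on this page
        return isXMLValid(c) && !discouraged;
    }
}
```

Under these ranges #x85 (NEL) stays recommended, because it falls in the gap between the first two blocks.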

isXMLValid

public static boolean isXMLValid(char c)
The W3C XML reference says that a valid XML char is one of:
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
Since a Java char cannot be greater than #xFFFF (anything above lies outside the BMP), the last block is not tested.

Parameters:
c - The char to test.
Returns:
true if c is one of:
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD].
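
The range test above can be sketched as follows. This is an illustration built from the documented ranges, not the actual source; the class name ValidCharSketch and the code-point method isXMLValidCodePoint are hypothetical additions that show how the untested last block could be covered.

```java
final class ValidCharSketch {

    /** Mirrors the documented ranges for a single Java char (BMP only). */
    static boolean isXMLValid(char c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD);
    }

    /** Hypothetical extension: tests a full code point, including [#x10000-#x10FFFF]. */
    static boolean isXMLValidCodePoint(int codePoint) {
        if (codePoint >= 0x10000 && codePoint <= 0x10FFFF) {
            return true;
        }
        return codePoint >= 0 && codePoint <= 0xFFFF && isXMLValid((char) codePoint);
    }
}
```

Java stores a character above #xFFFF as two surrogate chars in [#xD800-#xDFFF], and each of them fails the per-char test as documented. Code that relies on this test alone would therefore treat both halves of such a character as invalid, whereas a code-point test like the one sketched here accepts it.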


Copyright © 2004-2009. All Rights Reserved.