indexer understands the following special HTML characters:
&lt; &gt; &amp; &quot;
All SGML ISO-8859-1 entities: &auml; &uuml; and others.
Characters in their numeric code notation: &#234;
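
The three kinds of notation listed above can be illustrated with a short sketch. It is not the indexer's own code: it uses html.unescape from the Python standard library as a stand-in decoder, and the sample strings and the names samples and decoded are made up for the illustration.

```python
import html

# Hypothetical sample strings, one per kind of notation listed above.
samples = {
    "special": "&lt; &gt; &amp; &quot;",  # special HTML characters
    "named": "&auml; &uuml;",             # ISO-8859-1 named entities
    "numeric": "&#234;",                  # numeric code notation
}

# html.unescape is Python's decoder, used here only as a stand-in
# for whatever decoding the indexer performs internally.
decoded = {kind: html.unescape(text) for kind, text in samples.items()}

for kind, text in decoded.items():
    print(kind, "->", text)
```

Under these assumptions, the numeric form &#234; and the character ê decode to the same text, so a document can use either form interchangeably.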