Class HTMLPurifier_Lexer_PEARSax3

Proof-of-concept lexer that uses the PEAR package XML_HTMLSax3 to parse HTML.

PEAR, not surprisingly, also has a SAX parser for HTML. I don't know much about its implementation, but it's fairly well written. However, that abstraction comes at a price: performance. You also need to have the package installed, and if its API changes, our adapter might break. I'm not sure whether it's UTF-8 aware, and it has some trouble parsing entities, both in text and in attributes.

Personally, I don't recommend using the PEAR class, and the defaults don't use it. The unit tests do run against the SAX parser too, but how it handles poorly formed HTML is up to it.

HTMLPurifier_Lexer

HTMLPurifier_Lexer_PEARSax3

Warning: Entity-resolution inside attributes is broken.
Located at xoops_trust_path/libs/htmlpurifier/library/HTMLPurifier/Lexer/PEARSax3.php

Methods summary
`public HTMLPurifier_Token`	# `tokenizeHTML( $string, $config, $context )` Lexes an HTML string into tokens. Parameters: `$string` String HTML; `$config`; `$context`. Returns: `HTMLPurifier_Token` array representation of HTML. Overrides: `HTMLPurifier_Lexer::tokenizeHTML`
`public`	# `openHandler( & $parser, $name, $attrs, $closed )` Open tag event handler, interface is defined by PEAR package.
`public`	# `closeHandler( & $parser, $name )` Close tag event handler, interface is defined by PEAR package.
`public`	# `dataHandler( & $parser, $data )` Data event handler, interface is defined by PEAR package.
`public`	# `escapeHandler( & $parser, $data )` Escaped text handler, interface is defined by PEAR package.
`public`	# `muteStrictErrorHandler( $errno, $errstr, $errfile = null, $errline = null, $errcontext = null )` An error handler that mutes strict errors.
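As a rough illustration of what "an error handler that mutes strict errors" can look like, the sketch below installs a handler with `set_error_handler()` that swallows one error level and records everything else. It is not the body of `muteStrictErrorHandler()`. The names `$mutedLevel`, `$seen`, and `$handler` are hypothetical, and `E_USER_DEPRECATED` is used as a stand-in for strict-level notices, because those cannot be raised from userland code.

```php
<?php
// Sketch only: a muting error handler, assuming E_USER_DEPRECATED as a
// stand-in for the strict-level notices the real method targets.
$mutedLevel = E_USER_DEPRECATED;
$seen = [];

$handler = function (int $errno, string $errstr) use (&$seen, $mutedLevel): bool {
    if ($errno === $mutedLevel) {
        // Returning true tells PHP the error was handled, so it is dropped.
        return true;
    }
    // Every other level is recorded (a real handler might delegate instead).
    $seen[] = $errstr;
    return true;
};

set_error_handler($handler);
trigger_error('noise from legacy code', E_USER_DEPRECATED); // muted
trigger_error('real problem', E_USER_WARNING);              // recorded
restore_error_handler();

echo implode(',', $seen), PHP_EOL; // prints: real problem
```

Restoring the previous handler right after the noisy call keeps the muting scoped to the code that needs it.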

Methods inherited from HTMLPurifier_Lexer
`CDATACallback(), __construct(), create(), escapeCDATA(), escapeCommentedCDATA(), extractBody(), normalize(), parseData(), removeIEConditional()`

Properties summary
`protected array`	`$tokens`	`array()`	# Internal accumulator array for SAX parsers.
`protected`	`$last_token_was_empty`		#
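
The handlers and the `$tokens` accumulator fit the usual SAX pattern: the parser fires one callback per event, and each callback appends a token to an array. The following is a minimal, self-contained sketch of that pattern, not the HTML Purifier source. The class `SketchSaxAccumulator`, its plain-array token shape, and the simulated events are all hypothetical. It also assumes, without confirmation from this page, that a flag like `$last_token_was_empty` exists to skip an extra close event sent after a self-closed tag. The `$parser` argument is passed by value here for simplicity, whereas the documented signatures take it by reference.

```php
<?php
// Sketch only: a hypothetical SAX-style token accumulator.
final class SketchSaxAccumulator
{
    /** Tokens collected in document order. */
    private array $tokens = [];

    /** Assumed purpose: remember that the last tag was self-closed. */
    private bool $lastTokenWasEmpty = false;

    public function openHandler($parser, string $name, array $attrs, bool $closed): bool
    {
        $this->tokens[] = [
            'type'  => $closed ? 'empty' : 'start',
            'name'  => $name,
            'attrs' => $attrs,
        ];
        $this->lastTokenWasEmpty = $closed;
        return true;
    }

    public function closeHandler($parser, string $name): bool
    {
        // Assumption: the parser sends a redundant close event for <br />.
        if ($this->lastTokenWasEmpty) {
            $this->lastTokenWasEmpty = false;
            return true;
        }
        $this->tokens[] = ['type' => 'end', 'name' => $name];
        return true;
    }

    public function dataHandler($parser, string $data): bool
    {
        $this->lastTokenWasEmpty = false;
        $this->tokens[] = ['type' => 'text', 'data' => $data];
        return true;
    }

    public function tokens(): array
    {
        return $this->tokens;
    }
}

// Simulate the events a SAX parser might emit for: <p class="x">Hi<br /></p>
$acc = new SketchSaxAccumulator();
$acc->openHandler(null, 'p', ['class' => 'x'], false);
$acc->dataHandler(null, 'Hi');
$acc->openHandler(null, 'br', [], true);
$acc->closeHandler(null, 'br'); // redundant close, skipped
$acc->closeHandler(null, 'p');

$types = array_column($acc->tokens(), 'type');
echo implode(',', $types), PHP_EOL; // prints: start,text,empty,end
```

Keeping the accumulator as instance state is what lets `tokenizeHTML()`-style code hand the object to a parser, run the parse, and then return the collected array.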

Properties inherited from HTMLPurifier_Lexer
`$_special_entity2str, $tracksLineNumbers`
