In response to mirod:
The module is not meant to adhere to any standard. It is deliberately coded to be fast and practical, and I intentionally accept very loose XML to make it more useful.
Note that I intend to add full mixed content support in a future release; it is just going to require a fair amount of rewriting to ensure no parsing speed is lost.
If you have any questions, please feel free to ask. I will respond and attempt to add the answer to the documentation for the next release.
In response to Peter Edwards:
If you find some XML files that you think are good tests and do not contain private information, please pass them along. I had difficulty finding non-mixed XML files that were large enough to be fair tests.
(Using a mixed XML file would give my parser an obvious unfair advantage.)
As for invalid XML, the parser is pretty blind. I would like to address that in future releases. For now, closing tags are not checked against the last opened XML tag, so mismatched tags result in an incorrect tree. Note that parsing will only fail in rare cases, which I can list in the docs if desired. Parsing is meant to complete with pretty much any sequence of text, valid or invalid. What I think is important is whether the resulting tree is useful to you.
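To make the closing-tag behavior concrete, here is a toy sketch in Python. It is not the module's code or API; `parse_loose` and `outline` are hypothetical names invented for this illustration, and the sketch ignores text, attributes and self-closing tags. It only assumes the behavior described above: a closing tag closes whatever node is currently open, and its name is never compared.

```python
import re

def parse_loose(text):
    """Toy lenient parser (hypothetical): any closing tag pops the current node."""
    root = {"name": "#root", "children": []}
    stack = [root]
    # Only tags are examined; text content and attributes are ignored here.
    for closing, name in re.findall(r"<(/?)([A-Za-z_][\w.-]*)[^>]*>", text):
        if closing:
            # The closing tag's name is deliberately NOT checked against
            # the name of the node on top of the stack.
            if len(stack) > 1:
                stack.pop()
        else:
            node = {"name": name, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
    return root

def outline(node):
    """Render the tree as nested 'name(child,child)' text for inspection."""
    kids = ",".join(outline(child) for child in node["children"])
    return f"{node['name']}({kids})" if kids else node["name"]

# Mismatched input still parses, but </a> closes <b>, so <c> lands under <a>.
print(outline(parse_loose("<a><b></a><c></c></b>")))  # prints #root(a(b,c))
```

The parse completes without an error, which is the point: the cost of skipping the check is a tree that may not match what the document's author intended.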
I agree more tests need to be added to ensure the behavior is consistent in future versions. I've already noticed that certain Perl versions store hashes in a different order, which caused a failure in 0.26.
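One general way to keep such tests from depending on key order is to compare a canonical form with sorted keys. The sketch below uses Python purely for illustration and is not taken from the module's test suite; `canonical` is a hypothetical helper, and the sample data is made up.

```python
import json

def canonical(tree):
    """Serialize with sorted keys so key order cannot affect the comparison."""
    return json.dumps(tree, sort_keys=True)

first = {"name": "item", "id": "1"}
second = {"id": "1", "name": "item"}  # same data, keys inserted in another order

# Iteration order differs, but the canonical forms are identical.
print(list(first) == list(second))            # prints False
print(canonical(first) == canonical(second))  # prints True
```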
To all wondering about the consistency and reliability of the module, I am attempting to make it as stable as possible while continuing to make a lot of changes to it. It is beta, but I am trying to make each release fully usable.
Once version 1.0 rolls around, I am going to break away from some of the defaults and add an option for a 0.X compatibility mode. That will be a while, though. There will be no version 1.0 until the module is rock solid.