GitHub - ndepend/SimpleXmlParser: Simple C# and Java code to parse XML
source link: https://github.com/ndepend/SimpleXmlParser
Simple Xml Parser (C# and Java)
Why Simple Xml Parser
Recently (July 2022) we faced one of the most dreaded situations for an ISV: several users reported simultaneously that some code had stopped working. The code in question had been working fine, untouched, for 8 years!
The code is the NDepend TeamCity plugin. The break occurs when upgrading to TeamCity 2022.04.2 (build 108655) or higher, because this TeamCity version runs on Java 11. The JAXB API was removed in Java 11 to reduce the framework footprint, as explained in JEP 320, and our TeamCity plugin needs to parse some XML. If you are affected by this bug, just download the latest NDepend version and re-install the TeamCity plugin.
To solve this issue we had the choice to:
- Embed our own copy of the Java EE APIs on the classpath or module path.
- Parse the XML ourselves. This is the option we chose because the one above could lead to all sorts of collisions depending on the TeamCity and Java versions installed.
Simple Xml Parser Usage and Capabilities
Since we are much more proficient with C# than with Java, it made sense to first write the code and tests in C# and then convert them to Java. The code mostly uses string, char, StringBuilder and List<T>, which are quite similar on both platforms. Here is the C# version and the Java version.
The idea is to produce our custom DOM: a hierarchy of custom XmlElement and XmlAttribute objects. This hierarchy can then be consumed in your own code to populate your model, like we do with FillInspectionModel.
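As a rough illustration of what such a custom DOM can look like, here is a minimal Java sketch. Only the names XmlElement and XmlAttribute come from the project. The classes below carry a Sketch suffix because their fields, helper methods, and the Report/Issue/Severity sample document are assumptions made for this example, not the project's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical attribute node: a simple name/value pair.
final class XmlAttributeSketch {
    final String name;
    final String value;

    XmlAttributeSketch(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

// Hypothetical element node: a name, its attributes, its child elements and its text.
final class XmlElementSketch {
    final String name;
    final List<XmlAttributeSketch> attributes = new ArrayList<>();
    final List<XmlElementSketch> children = new ArrayList<>();
    String text = "";

    XmlElementSketch(String name) {
        this.name = name;
    }

    // Returns the value of the named attribute, or null when it is absent.
    String attribute(String attrName) {
        for (XmlAttributeSketch a : attributes) {
            if (a.name.equals(attrName)) {
                return a.value;
            }
        }
        return null;
    }
}

final class DomSketchDemo {
    // Builds by hand the tree for the made-up document:
    // <Report><Issue Severity="High">Dead code</Issue><Issue Severity="Low" /></Report>
    static XmlElementSketch sample() {
        XmlElementSketch report = new XmlElementSketch("Report");
        XmlElementSketch high = new XmlElementSketch("Issue");
        high.attributes.add(new XmlAttributeSketch("Severity", "High"));
        high.text = "Dead code";
        XmlElementSketch low = new XmlElementSketch("Issue");
        low.attributes.add(new XmlAttributeSketch("Severity", "Low"));
        report.children.add(high);
        report.children.add(low);
        return report;
    }

    // Consumes the DOM to populate a (trivial) model: the number of
    // direct Issue children with the given severity.
    static int countIssues(XmlElementSketch root, String severity) {
        int count = 0;
        for (XmlElementSketch child : root.children) {
            if (child.name.equals("Issue") && severity.equals(child.attribute("Severity"))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countIssues(sample(), "High"));
    }
}
```

The countIssues method plays the role that FillInspectionModel plays in the plugin: it walks the generic tree and extracts only what the application model needs.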
This XML parser supports:
- <Foo>...</Foo> tags
- <Foo /> tags
- <Foo Attr="Value" /> attributes
- <Foo>&lt; &gt; &amp; &quot;</Foo> escaped special characters (plus the carriage-return and line-feed references), which translate to < > & " \r \n
- <![CDATA[raw content]]> sections
- <!--xyz--> comments
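The translation of escaped special characters can be sketched in a few lines of Java. This is an illustration, not the project's code: the class name EntityDecoderSketch and the decode method are hypothetical, and the spelling of the carriage-return and line-feed references as &#xD; and &#xA; is an assumption.

```java
final class EntityDecoderSketch {
    // Escaped form on the left, decoded character on the right.
    // Assumption: \r and \n are written as the hex references &#xD; and &#xA;.
    private static final String[][] ENTITIES = {
        {"&lt;", "<"}, {"&gt;", ">"}, {"&amp;", "&"}, {"&quot;", "\""},
        {"&#xD;", "\r"}, {"&#xA;", "\n"}
    };

    // Decodes in a single left-to-right scan, so already-decoded output is
    // never rescanned: "&amp;lt;" becomes "&lt;", not "<".
    static String decode(String raw) {
        StringBuilder sb = new StringBuilder(raw.length());
        int i = 0;
        while (i < raw.length()) {
            boolean matched = false;
            if (raw.charAt(i) == '&') {
                for (String[] entity : ENTITIES) {
                    if (raw.startsWith(entity[0], i)) {
                        sb.append(entity[1]);
                        i += entity[0].length();
                        matched = true;
                        break;
                    }
                }
            }
            if (!matched) {
                // Unknown references and ordinary characters are copied as-is.
                sb.append(raw.charAt(i));
                i++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("a &lt; b &amp;&amp; c &gt; d"));
    }
}
```

A single scan is used instead of chained String.replace calls because chained replacements can decode the same text twice when the order of the calls is wrong.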
We believe that for our usage it is bug free because it is tested against all the XML documents our users provided us with. It is also 100% covered by the test suite. However, the purpose was not to support the entire XML specification, so it is certainly buggy for more advanced usages.
Simple Xml Parser Design
Our plan is to only fix potential bugs users might face in the context of the NDepend TeamCity plugin but not to improve the overall XML support. Hence the design and performance were not a priority and they could be improved in numerous ways.
Here is the overall architecture:
- The parser does two passes. The first pass produces some XmlRow objects and the second pass fills the XmlElement and XmlAttribute model from the rows.
- Then FillInspectionModel fills our own model from the DOM.
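The two-pass idea can be sketched as follows. This is a deliberately reduced Java illustration, not the project's implementation: the Row and Node classes and the toRows/toTree methods are hypothetical, and the sketch assumes a row is simply one raw chunk of the document (a tag or a run of text). It ignores attributes, comments, CDATA sections and escaped characters.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class TwoPassSketch {
    // Hypothetical row: one raw chunk, either a tag body (without < and >) or text.
    static final class Row {
        final boolean isTag;
        final String content;
        Row(boolean isTag, String content) {
            this.isTag = isTag;
            this.content = content;
        }
    }

    // Hypothetical element node, reduced to a name, children and text.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        final StringBuilder text = new StringBuilder();
        Node(String name) {
            this.name = name;
        }
    }

    // Pass 1: cut the document into flat rows. Whitespace-only text is dropped.
    static List<Row> toRows(String xml) {
        List<Row> rows = new ArrayList<>();
        int i = 0;
        while (i < xml.length()) {
            if (xml.charAt(i) == '<') {
                int end = xml.indexOf('>', i);
                if (end < 0) throw new IllegalArgumentException("Unclosed tag at index " + i);
                rows.add(new Row(true, xml.substring(i + 1, end).trim()));
                i = end + 1;
            } else {
                int next = xml.indexOf('<', i);
                if (next < 0) next = xml.length();
                String text = xml.substring(i, next);
                if (!text.trim().isEmpty()) rows.add(new Row(false, text));
                i = next;
            }
        }
        return rows;
    }

    // Pass 2: build the tree from the rows, tracking open elements on a stack.
    static Node toTree(List<Row> rows) {
        Deque<Node> open = new ArrayDeque<>();
        Node root = null;
        for (Row row : rows) {
            if (!row.isTag) {
                if (open.isEmpty()) throw new IllegalArgumentException("Text outside the root element");
                open.peek().text.append(row.content);
            } else if (row.content.startsWith("/")) {
                String name = row.content.substring(1).trim();
                if (open.isEmpty() || !open.peek().name.equals(name)) {
                    throw new IllegalArgumentException("Unexpected closing tag: " + name);
                }
                open.pop();
            } else {
                boolean selfClosing = row.content.endsWith("/");
                String name = selfClosing
                        ? row.content.substring(0, row.content.length() - 1).trim()
                        : row.content;
                Node node = new Node(name);
                if (open.isEmpty()) {
                    if (root != null) throw new IllegalArgumentException("Multiple root elements");
                    root = node;
                } else {
                    open.peek().children.add(node);
                }
                if (!selfClosing) open.push(node);
            }
        }
        if (root == null || !open.isEmpty()) throw new IllegalArgumentException("Malformed document");
        return root;
    }

    public static void main(String[] args) {
        Node root = toTree(toRows("<A><B>hi</B><C /></A>"));
        System.out.println(root.name + " has " + root.children.size() + " children");
    }
}
```

Splitting the work this way keeps each pass simple: the first pass only deals with character scanning, and the second only deals with nesting, so errors such as a mismatched closing tag are detected in one place.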