Free lightweight XML Reader (needs road testing)

I wrote this today after looking for something to read XML data. I didn’t want to pull in C#’s 1MB XML library just to load some preferences, and I was a bit shocked that no one seems to have written a lightweight C# parser. There is one on the wiki, but it doesn’t have attribute support, which IMO is needed if you want to store more complex types of data.

So, I dragged out my Perl archives (floppies FTW!) and ported my old XML token parser to C#.

The result is a tiny 20K script (10K without comments, so it ought to be a speck when compiled) that reads an XML-formatted string into a simple hierarchy of objects.

Of course, this needs to be battle tested but so far so good. It parsed the entire script of The Tragedy of Richard the Third (link) which is an 8,600+ line (270K) XML document in around 0.1 second within Unity so speed shouldn’t be an issue. :smile:

Download the Script here: XMLParser.cs (right click and save as)

Usage Example:
Loads XML text into an object hierarchy, then loops through the objects and re-writes the XML as a string to the console.

using UnityEngine;
using System.Collections;

public class XMLTest : MonoBehaviour {

	
	// Use this for initialization
	void Start () {
		
		// Load a text asset XML file from the assets/resources folder
		TextAsset xmlAsset = (TextAsset)Resources.Load("test", typeof(TextAsset));

		// get xml formatted string from text asset
		string xmlString = xmlAsset.text;

		// create XMLParser instance
		XMLParser xmlParser = new XMLParser(xmlString);

		// call the parser to build the IXMLNode objects
		XMLElement xmlElement = xmlParser.Parse();

		// test string to re-build XML from XMLNode objects
		string xmlOutputString = "";

		// recursively re-construct XML string
		WriteXMl(xmlElement, ref xmlOutputString, 0);

		// log re-constructed xml string to the console.
		Debug.Log(xmlOutputString);
	}
	
	// rebuilds xml string in output, ugly little method but it works
	public void WriteXMl(IXMLNode element, ref string output, int depth) {
		
		// tab strings for nicer formatting...
		int i = 0;
		string tabs = "";
		while(i < depth) {
			tabs += "\t";
			i++;	
		}
		
		// if textnode add content to output return early...
		if(element.type == XMLNodeType.Text) {
			output += tabs + element.value + "\n";
			return;
		}
		
		// add opening tag to output
		output += tabs + "<" + element.value;
		
		// add attributes to opening tag
		i = 0;
		int attributeCount = element.Attributes.Count;
		while(i < attributeCount) {
			output += " " + element.Attributes[i].name + " = \"" + element.Attributes[i].value + "\" ";
			i++;
		}
		
		// close opening tag
		output += ">\n";
		
		// recurse through all child elements
		i = 0;
		int childCount = element.Children.Count;
		while(i < childCount) {
			WriteXMl(element.Children[i], ref output, depth+1);
			i++;	
		}
		
		// add closing tag to output string
		output += tabs + "</" + element.value + ">\n";
		
	}
	
}

Bit of a pointless example but it demonstrates both reading XML strings and how to use the XML class objects to get at the data.

The XML Object hierarchy is composed of 2 classes that share an interface and 1 simple Struct:

IXMLNode is the main XML hierarchy interface. XMLText and XMLElement implement it and nothing more.

Accessible properties:
string value - either the tag name (Element node) or the text content (Text node).
enum type - an enum that tells you whether it’s a Text node or an Element node that could have child nodes.
IXMLNode Parent - the parent node in the hierarchy (read only).
List<IXMLNode> Children - a list of child nodes. Text nodes will always return an empty list here.
List<XMLAttribute> Attributes - a list of attributes. Text nodes will always return an empty list here.

XMLAttribute is a simple struct with two public fields:
string name - name of the attribute.
string value - value of the attribute.
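
To make the hierarchy a bit more concrete, here is a rough sketch of how you might look up an attribute, a child element, or an element's text using only the members listed above. This is not part of XMLParser.cs: XMLNodeHelpers, GetAttribute, FindChild, GetText and StubNode are hypothetical names, the enum member XMLNodeType.Element is an assumption, and the stand-in types at the top exist only so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// --- Stand-ins so the sketch compiles without XMLParser.cs. ---
// They only mirror the members described in this post; remove them
// when the real script is in your project.
public enum XMLNodeType { Text, Element } // "Element" is an assumed name

public struct XMLAttribute {
    public string name;
    public string value;
}

public interface IXMLNode {
    string value { get; }
    XMLNodeType type { get; }
    IXMLNode Parent { get; }
    List<IXMLNode> Children { get; }
    List<XMLAttribute> Attributes { get; }
}

// Hypothetical in-memory node, used only to try the helpers below.
public class StubNode : IXMLNode {
    public string value { get; set; }
    public XMLNodeType type { get; set; }
    public IXMLNode Parent { get; set; }
    public List<IXMLNode> Children { get; } = new List<IXMLNode>();
    public List<XMLAttribute> Attributes { get; } = new List<XMLAttribute>();
}

// Hypothetical helpers built only on the IXMLNode members.
public static class XMLNodeHelpers {
    // Value of the first attribute called attributeName, or fallback if there is none.
    public static string GetAttribute(IXMLNode node, string attributeName, string fallback) {
        foreach (XMLAttribute attribute in node.Attributes) {
            if (attribute.name == attributeName) return attribute.value;
        }
        return fallback;
    }

    // First direct child element with the given tag name, or null if there is none.
    public static IXMLNode FindChild(IXMLNode node, string tagName) {
        foreach (IXMLNode child in node.Children) {
            if (child.type == XMLNodeType.Element && child.value == tagName) return child;
        }
        return null;
    }

    // Content of the first direct Text child, or an empty string if there is none.
    public static string GetText(IXMLNode node) {
        foreach (IXMLNode child in node.Children) {
            if (child.type == XMLNodeType.Text) return child.value;
        }
        return "";
    }
}
```

With the real script you would pass in the element returned by xmlParser.Parse(), for example XMLNodeHelpers.GetAttribute(xmlElement, "width", "0"). Returning a fallback or null instead of throwing is just a design choice for the sketch, since preference files often have optional values.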

I also implemented the XMLParser class to be used as an object instead of a static class so that it’s easier to extend and modify. There are heaps of comments all the way through it, so it shouldn’t be too difficult to follow and modify if you need extra features.

XMLParser will break (and cry) if you feed it malformed XML documents!

This shouldn’t be an issue for games though. It also won’t strip extraneous white-space; again, this shouldn’t be an issue for games, as you have tighter control over your resources than in web/feed-based XML tasks.

XML reserved characters (< > ' ") must be written as entity references if you want to use them in the content (non-markup) or attribute values of your documents. XMLParser has some static functions for converting these back and forth. The parser automatically translates named entity references for the characters listed above (for example, it converts &quot; to "). It does not support numeric entity references (such as &#34;). Any other entities will be stripped and replaced with a null char unless you modify the parser to handle them.
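
The entity handling described above can be sketched roughly as follows. This is not the code from XMLParser.cs, whose static function names aren't given here: EntitySketch, Escape and Unescape are hypothetical names. I've also included & (as &amp;), which is an assumption since it isn't in the list above, but escaping can't round-trip safely without it.

```csharp
using System;

// Hypothetical helper, not the static functions that ship with XMLParser.cs.
public static class EntitySketch {
    // Replace reserved characters with named entity references.
    // '&' goes first so the ampersands added by the later replacements
    // are not escaped a second time.
    public static string Escape(string text) {
        return text.Replace("&", "&amp;")
                   .Replace("<", "&lt;")
                   .Replace(">", "&gt;")
                   .Replace("'", "&apos;")
                   .Replace("\"", "&quot;");
    }

    // Reverse of Escape. "&amp;" goes last so that "&amp;lt;" decodes
    // to the literal text "&lt;" rather than to "<".
    public static string Unescape(string text) {
        return text.Replace("&lt;", "<")
                   .Replace("&gt;", ">")
                   .Replace("&apos;", "'")
                   .Replace("&quot;", "\"")
                   .Replace("&amp;", "&");
    }
}
```

You would run something like Escape over attribute values and text content before writing them into a document. Like the parser described above, this sketch leaves numeric references such as &#34; untouched.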

Please report any bugs here and I’ll try my best to fix them up as soon as possible. :smile:

Bah… why didn’t I check the forums before writing my localization routines… I used the attributeless TinyXMLReader from the wiki…

Very nice, I was just looking for something to parse my mocked database data and properties - will check it out!

Thanks for your great work! It really helps a lot and I think it’s much easier to use than TinyXML on the wiki.

Looks useful! Thanks for sharing.

Sorry to ask, but how do I go about using it? I have added the scripts to my Unity project folder, but how do I start parsing XML files? Really sorry, I am new to XML parsing and C#.

Thanks for this, I was intending to expand the Tiny Reader from the wiki… Now I don’t need to…

I’ve found a bug when parsing an element with attributes and both opening and closing signs, like this:

it doesn’t “read” the closing sign, it seems. I’ve sent a PM to Cameron detailing the problem, hope he gets it soon. Great and very useful tool btw :slight_smile:

I encountered this bug too and have made a fix for it. Now the XML reader can parse elements in these formats:

<node/>
<node height="2" width="2"/>

627093–22353–$XMLParser solo tag fixed.zip (4.78 KB)

Only one question: why not xml serialization?

Hi i’m trying this but I got : null

It prints null instead of sun.

Where am I wrong ?

GetValue() is not a method of XMLElement, perhaps it’s some extension method you have defined?

What you want is node.Children[0].value

I have also updated this to include spotlightor’s fix. This has since been used in a published game, so consider it fairly well tested.

Can this import a feed as

image
title
description
image
title
description
etc.