HTML parser for Unity

Does anybody know of an HTML parser I can use in Unity?
Something along the lines of AngleSharp (GitHub: AngleSharp/AngleSharp, "the ultimate angle brackets parser library parsing HTML5, MathML, SVG and CSS to construct a DOM based on the official W3C specifications"), which will parse the HTML tags to retrieve information. I’d hate to have to write my own if there is one already out there that does what I need.

Thanks!

What is wrong with that one?

Hmm… I haven’t been able to get it to play nice in Unity due to some of the code it uses. It claims to have two versions though, so I’ll poke around more on it. I just figured that if someone already had one they have used, it would be easier than writing my own or dealing with getting this one working.

It appears that one targets .NET 4.5 or .NET Core (and they have a 4.0 version, but it has a weird dependency).

Unity needs .NET 3.5 or earlier… so look for a parser that targets 3.5, or has an old stable version that supports 3.5.

Otherwise, you’ll have to wait for the newest Unity (or use its beta version) and resolve the dependencies yourself.

Yep. That’s why I was asking if someone had an HTML parser that they were using, or knew of one that worked with Unity. Was hopeful!

hi folks - i’m going to update this old thread because i found literally no information other than two threads on html tag parsing, both leading to inconclusive results.

so my needs were very simple - all i needed was for a function to go to a website and retrieve the names of all the files on a webserver - basically parsing tags.

i ended up settling on the HtmlAgilityPack - website here: https://html-agility-pack.net
normally it’s distributed as a NuGet package, which is easy to install on Windows, not quite as easy on Macs. since i’m using my Mac to make Windows builds, i went to a DLL-only website, searched for it, and found a semi-oldish version (1.11) to download. i simply dropped it into the Plugins folder, and in my script i was able to access it via the using HtmlAgilityPack directive. this is on Unity 2018.3 with the .NET 4.x Scripting Runtime Version.

so, for parsing you don’t use Unity’s UnityWebRequest. you create an HtmlWeb object, which requests the URL, and loading the page through it gives you an HtmlDocument object. at that point, you can select specific tags to search for and get the strings from them. the library can likely do more, but i only needed it to go to a file server and look for audio file names, so strings were just fine for me.
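For anyone who wants to see the HtmlWeb / HtmlDocument flow described above, here is a minimal sketch, not the poster’s actual script. It assumes the HtmlAgilityPack DLL is already in the Plugins folder; the class name `LinkScraper`, its method names, and the choice to collect `href` values from `<a>` tags are made up for illustration.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper (not from the original post): collects the href value
// of every <a> tag on a page, e.g. the file names in a server directory listing.
public static class LinkScraper
{
    // Pulls href values out of a document that has already been loaded.
    public static List<string> ExtractLinks(HtmlDocument doc)
    {
        var links = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            string href = anchor.GetAttributeValue("href", string.Empty);
            if (href.Length > 0)
            {
                links.Add(href);
            }
        }
        return links;
    }

    // Parses an HTML string that is already in memory.
    public static List<string> ExtractLinksFromHtml(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return ExtractLinks(doc);
    }

    // Lets HtmlWeb request the URL; Load blocks until the page has downloaded.
    public static List<string> ExtractLinksFromUrl(string url)
    {
        HtmlDocument doc = new HtmlWeb().Load(url);
        return ExtractLinks(doc);
    }
}
```

In a MonoBehaviour you would call `LinkScraper.ExtractLinksFromUrl(...)` from something like `Start()` and print the results with `Debug.Log` instead of `Console.WriteLine`. Because `Load` is synchronous, it will stall the frame while the request runs, so a background thread may be worth considering for slow servers.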

there’s documentation on the functions on the website, and additionally there are a few videos on setting up a parser using C# and HtmlAgilityPack, but the scripts are straight C# in Visual Studio, which means there are things like Console.ReadLine(), which has no direct equivalent in Unity. so there’s still a bit of head scratching to make it work in Unity as a MonoBehaviour.

i’m happy to post the script i wrote if anyone’s interested, but it’s really simple. i just thought devs would like to know that there is a free and open source method for html parsing that seems to work very well. CAVEAT - i’ve not tested this on iOS or Android, or Linux.

I forgot about this post. That’s cool to see.

I actually ended up getting AngleSharp to work in Unity, so I stuck with it.

Hi there. It would be very helpful if you could provide some information on how you managed to get AngleSharp working with Unity.

Wow, this is a blast from the past. 😄 Over 2 years ago, and I don’t even work for the company that I originally did this for anymore.

So, I had a look at the AngleSharp repo and it has certainly changed over the years. I took a look at what I originally used, and there are changes in the current repo that seem to break it.

It looks like the implementation has changed as well. If you read the docs in the AngleSharp repo, there is this section.

[Screenshot of that section of the AngleSharp docs: upload_2023-1-20_12-25-57.png]

So, in your Unity project, you can open any script, use the NuGet Package Manager to install the AngleSharp package, run Build Solution, and then copy the folder mentioned into your project. Seemed to work for me at least.
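Once the assembly is in the project, a quick way to check that it loads is to parse a small HTML string and read something back. This is only a sketch under assumptions: it targets AngleSharp 0.10 or later (where `HtmlParser.ParseDocument` lives in `AngleSharp.Html.Parser`), and the class name `AngleSharpSmokeTest` and the method `ExtractHrefs` are invented for the example.

```csharp
using System.Collections.Generic;
using AngleSharp.Dom;
using AngleSharp.Html.Dom;
using AngleSharp.Html.Parser;

// Hypothetical smoke test (not from the original post): parses an HTML string
// and returns the href attribute of every <a> element that has one.
public static class AngleSharpSmokeTest
{
    public static List<string> ExtractHrefs(string html)
    {
        // Assumes AngleSharp 0.10+; older releases exposed a different parser API.
        var parser = new HtmlParser();
        IHtmlDocument document = parser.ParseDocument(html);

        var hrefs = new List<string>();
        // CSS selector: only anchors that actually carry an href attribute.
        foreach (IElement anchor in document.QuerySelectorAll("a[href]"))
        {
            hrefs.Add(anchor.GetAttribute("href"));
        }
        return hrefs;
    }
}
```

Calling `AngleSharpSmokeTest.ExtractHrefs(...)` from a MonoBehaviour’s `Start()` and logging the result with `Debug.Log` should show fairly quickly whether the copied DLL and its dependencies resolved.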

I haven’t used this library, but the readme on the GitHub repo states that it works with Unity 2018.1 (and probably above). So yes, it seems it should work out of the box.

@antmann280 Have you actually tried using the library before you posted here?
