hmm…I haven’t been able to get it to play nice in Unity due to some of the code it uses. It claims to have two versions though, so I’ll poke around more on it. Just figured if someone already had one that they have used, it would be easier than writing my own or dealing with getting this one working.
hi folks - i’m going to update this old thread because i found literally no information other than two threads on html tag parsing, both leading to inconclusive results.
so my needs were very simple - all i needed was for a function to go to a website and retrieve the names of all the files on a webserver - basically parsing tags.
i ended up settling on the HtmlAgilityPack - website here: https://html-agility-pack.net
normally it’s distributed as a NuGet package, which is easy to install on Windows, not quite as easy on Macs. since i’m using my Mac to make Windows builds, i went to a DLL-only website, searched for it, and found a semi oldish 1.11 version to download. i simply dropped the DLL into the Plugins folder, and in my script i was able to access it via the using HtmlAgilityPack directive. this is on Unity 2018.3 with the .NET 4.x Scripting Runtime Version.
so, for parsing you don’t use Unity’s UnityWebRequest. you create an HtmlWeb object, which requests the URL, and its Load call hands you back an HtmlDocument object holding the parsed page. at that point, you can select specific tags to search for and get the strings from them. the library can likely do more, but i only needed it to go to a file server and look for audio file names, so strings were just fine for me.
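for anyone who wants a starting point, here is a minimal sketch of the steps described above. it is not the original script. the class name FileListFetcher, the listingUrl value, and the list of audio extensions are all made up for illustration, and it assumes the file server returns a plain HTML directory listing where each file is an anchor tag with an href.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;
using UnityEngine;

// Hypothetical example component: logs the audio file links found on a page.
public class FileListFetcher : MonoBehaviour
{
    // Assumption: placeholder URL, replace with your own file server.
    [SerializeField] private string listingUrl = "http://example.com/audio/";

    // Assumption: the extensions you care about.
    private static readonly string[] AudioExtensions = { ".wav", ".mp3", ".ogg" };

    private void Start()
    {
        foreach (string fileName in FetchAudioLinks(listingUrl))
        {
            Debug.Log(fileName);
        }
    }

    private static List<string> FetchAudioLinks(string url)
    {
        var results = new List<string>();

        // HtmlWeb requests the URL and returns the parsed page as an HtmlDocument.
        // Load is synchronous, so it blocks until the download finishes.
        var web = new HtmlWeb();
        HtmlDocument doc = web.Load(url);

        // Select every <a> tag that has an href attribute.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return results;
        }

        foreach (HtmlNode anchor in anchors)
        {
            string href = anchor.GetAttributeValue("href", string.Empty);
            foreach (string extension in AudioExtensions)
            {
                if (href.EndsWith(extension, StringComparison.OrdinalIgnoreCase))
                {
                    results.Add(href);
                    break;
                }
            }
        }

        return results;
    }
}
```

since Load blocks the calling thread, calling it from Start will stall the game until the page arrives. that is fine for a quick test, but for anything real you would probably want to move the fetch off the main thread.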
there’s documentation on the functions on the website, and additionally there are a few videos on setting up a parser using C# and HtmlAgilityPack, but the scripts are straight C# in Visual Studio, which means there are things like Console.ReadLine(), which has no direct equivalent in Unity. so there’s still a bit of head scratching to make it work in Unity as a MonoBehaviour.
i’m happy to post the script i wrote if anyone’s interested, but it’s really simple. i just thought devs would like to know that there is a free and open source method for html parsing that seems to work very well. CAVEAT - i’ve not tested this on iOS or Android, or Linux.
Wow, this is a blast from the past. Over 2 years ago and I don’t even work for the company that I originally did this for anymore.
So, I had a look at the AngleSharp repo and it has certainly changed over the years. I then looked at what I originally used, and there are changes in the current repo that seem to break it.
It looks like the implementation has changed as well. If you read the docs in the anglesharp repo, there is this section.
So, in your Unity project, you can open any script, use the NuGet Package Manager to install the AngleSharp package, run Build Solution, and then copy the folder mentioned into your project. Seemed to work for me at least.
I haven’t used this library, though the readme on the GitHub repo states that it works with Unity 2018.1 (and probably above). So yes, it seems it should work out of the box.
@antmann280 Have you actually tried using the library before you posted here?