I don’t know about anyone else, but I frequently find issues with the transfer of both Unity Answers and the old forum to the new Discussions site (also the Blogs, so many of which seem to be missing).
Here is an example I just ran into:
This is how a page generated from a Unity Answers post looks on the Discussions site.
Here is the original Unity Answers page from the Wayback Machine.
There are several issues:
Source code that was fully placed within code tags on the original is broken on the Discussions page. See the post by user ‘matteo1000g’.
A reply at the end of the page from Bunny83 is completely missing! This happens to be after the reply where the source code breaks, so I wonder if that has something to do with it being missing?
The source code breaking is something I’ve seen numerous times, but I just assumed the original post didn’t use code tags correctly. This is the first time I have looked into the issue, and I was surprised to see that the code on the original page was perfectly fine. This suggests that the Discussions site itself is somehow breaking old code even when it was placed in code tags.
I discovered the above pages from a different post.
Here you can see other issues, such as the ‘over here’ link from Bunny83, which is meant to link to their post on the first page I mentioned above, but is broken: ‘page not found’ on Discussions!
I’m not sure what is going on here, and I’m surprised how frequently I seem to find these problems. What really concerns me is that we may not even be aware of the severity of the problems if a page doesn’t have any obvious issue, especially with regard to missing posts in threads.
Blogs is a different team, sorry, I can’t help at the moment.
But for your two points:
The code issue was a known problem with the Answers migration. The old Answers site allowed multiple ways to do code, and the new one didn’t catch them all. As we found broken ones we fixed them.
The missing comment is one we will investigate. I think we had some issues with missing comments, and again we restored them as they were identified.
I’m not sure this is the cause. As I mentioned, the code transfer breaks right in the middle of a code block, and not because of some weird usage of code tags.
That suggests to me that it is a fundamental problem with the parser used. However, as I only have access to the front-end HTML and not how the original code was stored server side, I can’t be sure, though I suspect the cause will still be the same issue: a parsing error.
I do think it’s very weird that the migration tool converted
o += vertices[triangles[i]];
to
o += vertices[triangles*];*
After which point the code migration was broken and the rest of the code is displayed as normal text in the post.
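One way this exact output could arise, and this is only a guess from the rendered result rather than anything known about the real migration tool, is a converter that treats the [i] array index as an italics tag and closes it automatically at the end of the line. The sketch below uses hypothetical function names and assumes the source stored code inside [code] tags; it reproduces the broken line, then shows the same conversion with code blocks set aside first.

```python
import re

# Assumption: the original post stored code inside [code]...[/code] tags.
CODE_BLOCK = re.compile(r"\[code\](.*?)\[/code\]", re.S)


def italics_to_markdown(line: str) -> str:
    """Turn the first [i]...[/i] on a line into *...*.

    If the closing tag is missing, the italics are closed at the end of
    the line, which is what many lenient parsers do.
    """
    head, opener, tail = line.partition("[i]")
    if not opener:
        return line
    body, _closer, rest = tail.partition("[/i]")
    return f"{head}*{body}*{rest}"


def naive_convert(post: str) -> str:
    """Apply the italics rule to every line, including lines of code."""
    return "\n".join(italics_to_markdown(line) for line in post.split("\n"))


def safe_convert(post: str) -> str:
    """Apply the italics rule only outside [code] blocks."""
    # re.split with one capture group alternates: text, code, text, ...
    parts = CODE_BLOCK.split(post)
    converted = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Code content is copied through untouched and fenced.
            converted.append("```\n" + part.strip("\n") + "\n```")
        else:
            converted.append(naive_convert(part))
    return "".join(converted)


post = "[i]Note[/i]: sum the vertices\n[code]\no += vertices[triangles[i]];\n[/code]"
print(naive_convert(post))
print(safe_convert(post))
```

If something like this is the cause, the breakage would affect any code that indexes with i, b, u or s, regardless of how carefully the original poster used code tags.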
Relying on fixing broken code as it’s found doesn’t seem very reliable. At least, the way it’s said seems to imply that code in posts is fixed by hand on a case-by-case basis, instead of fixing the root causes?
I don’t like sounding rude here, but I’m of the opinion that with such a massive change to a new discussions board, it is imperative that we do not lose decades of experience and knowledge through bad migration.
I’m happy you will investigate, though again I have to emphasise my greater concern here: that we appear to have the potential to lose decades of historically important information from the web.
EDIT:
By the way, this new discussion forum seems rather brittle too. I have noticed when quoting that if you want to break up the quote into sections, you must place the closing quote tag on a new line, otherwise it breaks. This might be true for the opening tag too.
Yeah, but that’s not really my point. If failing to place the closing tag of a quote on a new line breaks the appearance of the quote, then the system is brittle and prone to failure.
To be specific, I’m talking about what happens if you type the below in the reply. It will break the quote, and you must place the closing tag on a new line for it to work.
e.g.
[quote="Brach_Unity, post:5, topic:1502539, full:true"]
I tend to use the copy quote to do split up quotes[/quote]
How is this accomplished? Is there somewhere users can report posts with messed up code tagging? Because it seems to me every other page I visit has these problems.
Case in point and why I made this reply
It starts off ok, but the more posts you look through the more messed up code tagging issues become apparent, with several posts just broken.
I do appreciate you taking the time to answer my questions on these points, but I just don’t understand the ‘case-by-case’ approach when there appears to be a fundamental parsing issue from the source material that should be fixed, unless it’s too late now?
So there are about 1-2 million posts from the older Answers migration, of which some have this issue. We don’t know which posts are impacted until they are reported. For Answers we had a broken-code tag which alerted us; that was turned off after we migrated the forums.
It may be that, if we need to, we will re-enable the tag.
You should have an automated way of checking the pre-migration page against the post-migration page to validate content. That would be a mandatory step for any migration.
I mean, migrating millions of pages and not knowing which and how many are broken would pollute your learning base and completely undermine the goal of migrating the content. An unvalidated migration for millions of pages would be willful negligence at that point.
Surely the team should have a list already, if any effort was spent to validate the migration? You can’t be relying on your users to visually identify every page for you?
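As a rough illustration of the kind of automated check being asked for, here is a minimal sketch that compares the code blocks of an original page with those of the migrated page. It assumes both pages are available as HTML and that code is wrapped in pre or code elements; the function names are hypothetical and nothing here reflects how the actual migration was validated.

```python
from html.parser import HTMLParser


class CodeExtractor(HTMLParser):
    """Collect the text found inside <pre> / <code> elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0       # nesting depth of pre/code elements
        self.blocks = []     # finished code blocks
        self._buffer = []    # text of the block currently being read

    def handle_starttag(self, tag, attrs):
        if tag in ("pre", "code"):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in ("pre", "code") and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.blocks.append("".join(self._buffer))
                self._buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self._buffer.append(data)


def code_blocks(html: str) -> list[str]:
    """Return the page's code blocks with whitespace normalised."""
    parser = CodeExtractor()
    parser.feed(html)
    parser.close()
    return [" ".join(block.split()) for block in parser.blocks]


def missing_code_blocks(original_html: str, migrated_html: str) -> list[str]:
    """List code blocks from the original that do not survive intact."""
    migrated = code_blocks(migrated_html)
    return [block for block in code_blocks(original_html) if block not in migrated]


original = "<p>Try this</p><pre><code>o += vertices[triangles[i]];\nn++;</code></pre>"
broken = "<p>Try this</p><pre><code>o += vertices[triangles</code></pre><p><em>];</em> n++;</p>"
print(missing_code_blocks(original, broken))
```

A page would be flagged for review whenever the returned list is not empty. A check like this only catches damage introduced by the conversion; it cannot tell whether the original itself was already malformed.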
So it is true... Answers allowed many ways to enter code, some of which were no longer supported but were still rendered. We validated as many as possible, but with the Answers migration the data quality was bad and many posts were badly formatted. We have recovered about 90% of those.
The validation said it was a match... however, if the original was broken, that won’t mean much.
For most migrations you tend to have a threshold of what is acceptable. For Answers it was about 90%, which was met. For forums, at the moment we have achieved around 95% converted with no issues.
For a migration this size that’s a pretty good result.
Most of those failures are, as I said, where the original was corrupted. We added a manual system to fix those as they were identified. Any migration will have things that don’t work, and you do those by hand as they are identified.
It’s like bugs: you tend to prioritise the effort into fixing the most important or impactful first... otherwise you waste time and effort.
We are working through posts with issues as they are identified, either by automation or by users.
Sorry, I can’t figure out quoting on this platform from mobile.
I don’t agree that any migration requires manual fixing (let alone of hundreds of thousands of entries)... but I will concede, as I don’t want to talk about Unity anymore, and who am I to say what is acceptable for you.
I accept that we disagree and that is OK.
Now to figure out how to unsubscribe from all the notifications.
Edit: figured it out. I can open each thread, scroll all the way to the bottom, wait for like 800 entries to load 10 at a time because there is no paging system, and then I can unsubscribe.
Found another concerning failure of the transfer.
This post has links to Unity’s hwstats page.
While it seems the information is no longer correct, the problem is that the URL link got completely destroyed. I know this because I was directed to that post from this thread on Stack Exchange.
The top answer there includes a quote of the original post and there the url is
https://operate.dashboard.unity3d.com/organizations/<YOUR ORG ID HERE>/projects/<YOUR PROJECT ID HERE>/hwstats
Which even now is getting messed up to Unity Cloud/projects//hwstats
It completely loses vital information in the construction of the url.
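The doubled slash in the mangled link is what you would expect if the angle-bracket placeholders were treated as HTML tags and stripped. That is an assumption based on the output, not a confirmed cause, and it does not explain the rest of the rewriting of the link. A minimal sketch with hypothetical function names:

```python
import re
from html import escape, unescape

# A common but lossy way of removing markup: delete anything that looks like a tag.
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(text: str) -> str:
    """Remove everything between '<' and '>', whether or not it is real HTML."""
    return TAG_PATTERN.sub("", text)


def strip_tags_from_plain_text(text: str) -> str:
    """Treat the input as plain text: escape it first so that literal
    angle brackets are not mistaken for tags, then restore them."""
    return unescape(strip_tags(escape(text, quote=False)))


url = (
    "https://operate.dashboard.unity3d.com/organizations/"
    "<YOUR ORG ID HERE>/projects/<YOUR PROJECT ID HERE>/hwstats"
)
print(strip_tags(url))                  # placeholders are lost
print(strip_tags_from_plain_text(url))  # placeholders are kept
```

The first call leaves /organizations//projects//hwstats, with both placeholders gone; the second returns the URL unchanged.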
Just another thing to put on the list of things to try and fix.
Yes, I noticed that too, but it’s only for the very first comment. The immediate comment after that talks about stencils, which is a term that cannot be found anywhere in this post or that manual page.
Once you get to the third post, the comments are exclusively about stencil and multi-pass rendering! Again, neither of those topics is present in the posts or that manual page. Nor is the new commentator with 3 comments talking to any other commentator. In fact, none of the commentators appear to be replying to each other, so it’s not like one of them went off-topic and others engaged with them.
Of course, as I can’t see the original Answers page any more I can’t be sure, but this just seems very weird in how off-topic (yet focused on shaders) the comments are beyond that first one, which is an obvious misunderstanding.
The only thing that would make sense is if these comments were in reply to a commentator who took the post off-topic, but then deleted their account and that somehow removed all trace of their comments.