Hi, I was wondering if I could get some advice on how to structure my content architecture. (Disclaimer: this is just in terms of how the content gets managed within the editor itself across multiple developers; for the builds I have a different strategy in mind, which will feed off this.)
Our software is a content viewer/configurator of sorts where users can load in different products and environments and export images/videos. We are starting to have some teething pains with our current strategy of building all the content every time we need to update, so I'm building a new system to manage it all in a more modular way.
One problem we are dealing with at the moment is that we've got several hundred gigabytes' worth of content, which we expect to grow by an order of magnitude within the next few years (fingers crossed). This is starting to become a problem for the developers working on content creation, as at the moment everything just exists in one big project (each developer of course works within their own git branch). Import times are a killer and the project is just getting cumbersome to work with.
My plan is to divide the content up into arbitrary git submodules which the content creators can easily manage themselves. (I've managed to integrate LibGit2Sharp so that I can do basic submodule adding/removal within an editor window; merges/conflicts will still be handled externally.)
So we have a build server which houses the project files and all of the content container submodule repos, and each developer (working on their own branch of the main repo) has a selection of these content containers which they manage. The changes to the submodules are then fed back to the main branch, which detects the changes, pulls the submodules and does periodic builds of the content.
Mock interface for managing content within the editor:
(Originally I had planned to have separate Unity projects for systems and content; however, the content needs to be tested within the main project without having to build between iterations.)
Whilst the work is progressing, I keep getting stuck on things and thought perhaps I should take a step back and see if the community could help validate my thinking before sinking more time into it.
Progress so far:
I've structured the submodule repos in a way that makes them modular but still compatible with our current build system
I'm able to manually add and remove the submodules in a branch
I've successfully integrated LibGit2Sharp and am able to execute basic git commands from within Unity (see the sketch after this list)
I've built out the basics of the UI which lists all the content in the loaded submodule repos
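Roughly the kind of thing I mean by the LibGit2Sharp integration (this is a simplified sketch rather than my actual code; the class and menu names are placeholders, and adding a new submodule falls back to the git CLI since I don't believe LibGit2Sharp exposes "submodule add" directly):

```csharp
// Editor-only sketch of submodule management from within Unity (would live in an
// Editor folder). Class/menu names are placeholders, not a real system.
using System.Diagnostics;
using LibGit2Sharp;
using UnityEditor;

public static class ContentSubmoduleTools
{
    const string RepoRoot = "."; // assumption: the Unity project root is also the main repo root

    [MenuItem("Content/List Submodules")]
    public static void ListSubmodules()
    {
        using (var repo = new Repository(RepoRoot))
        {
            foreach (Submodule sm in repo.Submodules)
                UnityEngine.Debug.Log($"{sm.Name} -> {sm.Url} (at {sm.Path})");
        }
    }

    // Initialise and fetch a content container that already exists in .gitmodules.
    public static void InitAndUpdate(string name)
    {
        using (var repo = new Repository(RepoRoot))
        {
            repo.Submodules.Update(name, new SubmoduleUpdateOptions { Init = true });
        }
    }

    // Adding a brand new container shells out to the git CLI.
    public static void AddSubmodule(string url, string path)
    {
        var psi = new ProcessStartInfo("git", $"submodule add {url} {path}")
        {
            WorkingDirectory = RepoRoot,
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (var p = Process.Start(psi)) p.WaitForExit();
    }
}
```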
Questions:
Am I being stupid; is there a much better way to be handling all this (or perhaps an industry convention for handling problems like this)?
Are Git submodules the right tool for the job?
I've heard that people use a mixture of Git and Subversion/Perforce; what would be the advantages of this over keeping everything handled by Git?
Are there obvious limitations to this architecture that I'm ignoring/ignorant of?
What roadblocks am I likely to hit going down this route?
Concerns:
The master list of available submodules needs to be housed somewhere. It could possibly be just a JSON file which I store in an S3 bucket, writing to it from the build server and reading from it on each dev branch; or perhaps I could build a simple server DB which handles adding/querying them. I could also just have the remote addresses listed somewhere and have the devs copy-paste the addresses in to load new repos. (A sketch of the JSON-catalog option follows this list.)
I'm struggling to list out the available content within each repo to help the devs choose which submodules they want loaded into their project.
Having different submodules in each user's branch might get a bit messy, with "empty folders" being tracked or not being tracked.
The files are large so I'm having to use LFS, which can be... well, you know, lol
Merge conflicts in containers being used by multiple devs might prove tricky
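To illustrate the JSON-catalog option from the first concern, this is roughly what I have in mind; the bucket URL, class names and fields below are all hypothetical:

```csharp
// Hypothetical sketch of the "master list as a JSON file in S3" option.
// The URL and field names are placeholders, not an existing system.
using System;
using System.Collections.Generic;
using System.Net;
using UnityEngine;

[Serializable]
public class ContentContainerEntry
{
    public string name;        // e.g. "furniture-pack-01"
    public string remoteUrl;   // git remote of the submodule repo
    public string description; // shown in the editor window so devs can pick what to load
}

[Serializable]
public class ContentCatalog
{
    public List<ContentContainerEntry> containers;
}

public static class ContentCatalogClient
{
    // Assumed location of the catalog written by the build server.
    const string CatalogUrl = "https://example-bucket.s3.amazonaws.com/content-catalog.json";

    public static ContentCatalog Fetch()
    {
        using (var client = new WebClient())
        {
            string json = client.DownloadString(CatalogUrl);
            // JsonUtility needs a wrapper object around the array, hence ContentCatalog.
            return JsonUtility.FromJson<ContentCatalog>(json);
        }
    }
}
```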
Haven't read all of it, but I didn't see any mention of packages. Ever thought about that?
You can put content in separate packages, and you can even put each package in a separate repo if you like.
No need to deal with a complex submodule architecture that your non-technical staff will hate you for.
Packages also bring along dependencies, so you can ensure older versions of the code still work with older versions of the assets.
Don't forget to add "hideInEditor": false to package.json, otherwise asset pickers won't show the assets in that package.
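For reference, a content package's package.json could look roughly like this (the name, version and other values here are just made up):

```json
{
  "name": "com.mycompany.content-furniture",
  "version": "1.0.0",
  "displayName": "Furniture Content",
  "unity": "2019.4",
  "hideInEditor": false
}
```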
Hey, thanks for the response! I did actually take a look into using the package manager, however we're unfortunately stuck on Unity 2019.4 at the moment and the git support in the package manager is still a little sketchy... I will take another look at using the package manager though because you are right, it would take a lot of the management off my plate, thanks.
[edit] Okay, I had a look back through my notes and the reason I moved away from using the package manager was that pre-2021.2 there is no LFS support for git-based packages. We are trying to move up past 2019.4, but we have an essential post-processing effect which requires PPv2, for which we have the last compatible version.
Hi @robrab2000-aa,
Looking at your solution, I think it looks plausible.
I can see what you're trying to do with submodules, where the content is only available to whoever needs it. This is quite similar to the approach CD Projekt Red mentioned at Unite 2019 in their talk Scaling Up Unity, where they break their project down into subprojects. It might be worth watching it to see if there are some ideas in there that you can extract. I can also see that it was part of your original plan, but maybe there's some value in revisiting that approach based on what you've done already?
Another thought that would help is if you could figure out how often content actually changes. If you don't change it much, then you could benefit from using the Unity Accelerator, as that would speed up asset import times quite a bit. It can be leveraged by anyone in your team, and it is available for Unity 2019 and up.
The next thing, which could add a new dimension of complexity, would be to add asset bundles, so that all your content is already imported, but for a particular platform. That would remove import times, but then you'd need to have a bundle per platform, and depending on what platforms you want to ship for, it might not be worth it (plus rebuilding all the asset bundles when a change is introduced).
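If you did go down that route, the bundle build itself is fairly small per platform; something along these lines (the output folder and menu entry are arbitrary choices for illustration):

```csharp
// Editor-only sketch: build every asset that has an AssetBundle label assigned,
// for one target platform. Output path and menu name are arbitrary.
using System.IO;
using UnityEditor;

public static class ContentBundleBuilder
{
    [MenuItem("Content/Build Asset Bundles (WebGL)")]
    public static void BuildWebGLBundles()
    {
        const string outputDir = "AssetBundles/WebGL";
        Directory.CreateDirectory(outputDir);

        BuildPipeline.BuildAssetBundles(
            outputDir,
            BuildAssetBundleOptions.ChunkBasedCompression,
            BuildTarget.WebGL);
    }
}
```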
"Having different submodules in each user's branch might get a bit messy, with 'empty folders' being tracked or not being tracked."
One workaround could be that uninitialized modules end in a tilde "~", as those folders are ignored by the Unity file system, so we never pick them up inside the asset database. Then when you initialize them, the root folder name could change. It sounds a bit hacky, but it may address this issue with tracking of empty folders.
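As a rough illustration of the idea (paths and names are assumptions, and this only really makes sense for the placeholder folders of modules that aren't currently checked out):

```csharp
// Editor-only sketch of the "~" trick: rename a folder so Unity's asset database
// ignores it (hidden) or picks it up again (shown). Paths are assumptions.
using System.IO;
using UnityEditor;

public static class ContentFolderToggle
{
    // Hide an uninitialised container, e.g. "Assets/Content/FurniturePack"
    // becomes "Assets/Content/FurniturePack~" and stops being imported.
    public static void Hide(string folderPath)
    {
        if (Directory.Exists(folderPath) && !folderPath.EndsWith("~"))
        {
            Directory.Move(folderPath, folderPath + "~");
            File.Delete(folderPath + ".meta"); // drop the stale folder .meta
            AssetDatabase.Refresh();
        }
    }

    // Bring an initialised container back into the project.
    public static void Show(string hiddenFolderPath)
    {
        if (Directory.Exists(hiddenFolderPath) && hiddenFolderPath.EndsWith("~"))
        {
            Directory.Move(hiddenFolderPath, hiddenFolderPath.TrimEnd('~'));
            AssetDatabase.Refresh();
        }
    }
}
```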
Thanks! I'm super happy to hear that it's a valid approach and that I'm not going completely in the wrong direction! (It's just me doing all the Unity dev work so second opinions mean a lot!)
I will give this a watch, thanks!
From my understanding, this needs to be run over a local network; unfortunately we are all working remotely. Perhaps it's possible to run it remotely over a VPN, but my concern would be a) it sounds hard to set up lol, and b) the latency in syncing that much data over the web might negate any performance benefits the Accelerator would bring.
I fear that this might add even more complexity; we already use the Addressables system for the deployment of content to the end platform (WebGL, and then a Standalone build which we use for rendering). So managing an additional set of content for use within the editor might be quite a lot... Also, one of the goals is to reduce the amount of physical space used to hold the project on each dev machine... the project already takes up about 400GB per machine, so if we 10x the amount of content over the next year or two, then regardless of fast import times we still need a 4TB partition to even hold the project. Thus, I want to try and split the content. (I may have misunderstood your suggestion here btw; reading it again, you might be advocating hot-swappable asset bundles rather than static ones.)
Also, I have a base ID class which registers all the content into the system... sometimes that class changes, and then I would need to rebuild all the bundles for each change.
This is super useful! I didn't know that; I could definitely see this being very helpful!
Hello! I stepped away from this for a few months but I'm back on it now. We've upgraded our project to 2021 LTS and thus can look at using the package manager (it now supports LFS).
I can see how being able to load and unload our content in packages for a given project would be super useful; however, I'm concerned over how tricky it might be to push updates to that content. From reading a few blogs/posts, it looks like in order to update a package stored in a git repo, one must create a "release". Surely this means that you have to log into the GitHub interface each time you want to update any content, and give access to other developers on the team?
Maybe I've misunderstood this?
Another question is around the project manifest. This is where the list of installed packages resides, from what I can see. This file gets tracked by our main git system so that we can ensure that any packages are at the same version level across developer machines.
If we are using the package manager to control which content each developer has access to locally, then surely we will end up with (admittedly easy to resolve) conflicts in this file each time anyone anywhere commits?
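For context, this is roughly how I understand git-based content packages would end up looking in Packages/manifest.json (package names and URLs invented); from what I've read, the part after the # can be a branch, tag or commit hash:

```json
{
  "dependencies": {
    "com.mycompany.content-furniture": "https://github.com/mycompany/content-furniture.git#1.2.0",
    "com.mycompany.content-environments": "https://github.com/mycompany/content-environments.git#main"
  }
}
```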
Only regarding the last item: there is no concept of "local packages". There is one project under source control and any package or asset added to the project must always be committed to the repository for everyone else.
You can use a branching strategy to have individual developers use packages that are only needed in this particular branch, but it must be clear to the developer that any commit must not rely on said package, otherwise merge results are likely going to cause issues in other branches. But without relying on the contents of a package, the package itself can only provide things that extend the editor, like specific debugging functionality or an editor tool to work with specific asset types. But then again such a package would be safe to add to the main branch, wouldn't it?
At the moment I don't see a valid reason why having different packages installed per developer would be necessary or useful, other than test-driving new or updated packages before adding them to the main branch.
Thanks for the response. What I'm proposing is to use the package manager to manage content, not code. If we have 400GB worth of content and an artist is only working on 5GB of that data, then it's very inconvenient for them to have to download and import all 400GB in order to work on the 5GB of assets. Better to break the content up into smaller packages that can be hot-loaded as needed.
Something that came to mind is Git's sparse-checkout feature:
(granted, Twitter's engineers are using Bazel, but that can be overcome).
You can essentially filter out entire directories from a user's machine if you specify which things to include.
So, one idea could be that a developer with Profile 1 has Packages A, B and C available on their machine, and a developer with Profile 2 has Packages B and F checked out.
This can of course be tricky to debug, and needs a bit of training and/or tooling to make it easy to use, but it could be an approach which "just works"!
Thanks! This feature looks perfect! Unfortunately, from looking around a bit, it's not available yet in any of the major git clients, nor is it available in LibGit2Sharp (which is probably why it's not available from the git clients), which means I'd need to either subject our artists to using the git CLI (which I don't think is an option) or build a CLI wrapper, which I think could be a bit messy.
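If I did end up building a wrapper, I imagine it would be fairly thin; something along these lines (repo path and directory names are made up, error handling omitted, and it assumes git 2.25+ on the PATH):

```csharp
// Thin sketch of a wrapper around the git sparse-checkout CLI.
// Repo path and directory lists are placeholders; error handling omitted.
using System.Diagnostics;

public static class SparseCheckoutWrapper
{
    static void RunGit(string repoPath, string arguments)
    {
        var psi = new ProcessStartInfo("git", arguments)
        {
            WorkingDirectory = repoPath,
            UseShellExecute = false,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };
        using (var p = Process.Start(psi))
        {
            System.Console.WriteLine(p.StandardOutput.ReadToEnd());
            p.WaitForExit();
        }
    }

    // Restrict the working tree to the directories a given artist actually needs.
    // Note: directory names containing spaces would need quoting.
    public static void ApplyProfile(string repoPath, params string[] contentDirs)
    {
        RunGit(repoPath, "sparse-checkout init --cone");
        RunGit(repoPath, "sparse-checkout set " + string.Join(" ", contentDirs));
    }
}

// Hypothetical usage:
// SparseCheckoutWrapper.ApplyProfile(@"C:\Projects\Main", "Assets/Content/PackA", "Assets/Content/PackB");
```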
I've been looking at PlasticSCM and, from what I can see, you're able to selectively check out just the content you need... I've emailed sales to set up a meeting and hopefully this could work.
Hey, I'm still tinkering with this. I keep getting sidetracked by other pressing issues but I'm working my way back to this at the moment. My current thinking is that, seeing as PlasticSCM is quite expensive (£275 per month), we might try again to get by with just git. The plan is that I'm going to manage the submodules manually for now and then build a more automated system once I get that working (fingers crossed).
To give a bit more detail, I'm going to forget integrating LibGit2Sharp into Unity for now and instead use SourceTree to add and remove the git submodules. This is less than ideal because it means that a) we need to keep track of what content is in which submodule manually, and go fetch the git repo links off GitHub manually too, and b) when we remove the submodules, they leave behind the folder metafiles, meaning that Unity will recreate the empty folders (which git will ignore too).
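A rough sketch of the cleanup step I have in mind for b), once it gets automated (the path argument is just wherever the submodule was mounted):

```csharp
// Rough sketch of post-removal cleanup: delete the leftover empty folder and its
// .meta file so Unity stops recreating it. Path handling is deliberately simple.
using System.IO;

public static class SubmoduleCleanup
{
    public static void RemoveLeftovers(string submoduleFolder)
    {
        // Only delete the folder if it is actually empty.
        if (Directory.Exists(submoduleFolder) &&
            Directory.GetFileSystemEntries(submoduleFolder).Length == 0)
        {
            Directory.Delete(submoduleFolder);
        }

        string meta = submoduleFolder.TrimEnd('/', '\\') + ".meta";
        if (File.Exists(meta))
            File.Delete(meta);
    }
}
```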
By "automated" I mean that I will probably build a Xamarin application (we're on Mac and Windows) which acts as a submodule browser. It would then add and remove the submodules for us, doing any cleanup needed.
I'm not quite sure how I'm going to handle pushes/pulls in the submodules yet... I think maybe the artists will just use SourceTree, essentially treating each submodule as a separate repo...