Am I right in thinking that if I build a single mesh representing terrain with Unity’s built-in 2D Perlin noise function, I’ll then be able to add caves to it somehow (with Perlin worms, for example)? Or do I need to find a solution based on 3D noise?
That depends on what kind of ‘cave’ you are going for. In short, no - a 2D noise function on its own cannot represent a cave, because it yields only a single height value for any given horizontal position. A cave inherently runs ‘inside’ the terrain, so some positions would need more than one surface stacked vertically (floor, ceiling, and the ground above), which a heightmap cannot store. With Unity’s terrain system it is possible to make caves another way - by painting holes in the terrain and filling the gap with a separate cave mesh. If you want fully 3D terrain with multiple layers, then your best bet would be to look at voxels.
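To make the difference concrete, here is a rough sketch in plain Python (not Unity code) comparing a heightmap with a 3D solid/air test. Everything in it is an assumption for illustration: `height` is a sine-based stand-in for a 2D noise call, and the straight tunnel in `is_solid` is a stand-in for whatever carving you would actually use, such as Perlin worms or a 3D noise threshold.

```python
import math

def height(x, z):
    # Stand-in for a 2D noise lookup: exactly one height per (x, z),
    # so on its own it can never describe an overhang or a cave.
    return 20.0 + 4.0 * math.sin(x * 0.1) * math.cos(z * 0.1)

def is_solid(x, y, z):
    # Voxel-style test: everything above the heightmap surface is air...
    if y > height(x, z):
        return False
    # ...and everything below is solid, except inside a hypothetical tunnel
    # of radius 3 running along the x axis at y = 10, z = 0.
    in_tunnel = math.hypot(y - 10.0, z) < 3.0
    return not in_tunnel

def column_transitions(x, z, max_y=32):
    # Walk up one vertical column and count solid/air changes.
    cells = [is_solid(x, y, z) for y in range(max_y)]
    return sum(1 for a, b in zip(cells, cells[1:]) if a != b)
```

A column away from the tunnel, such as `column_transitions(0, 10)`, changes from solid to air once, which a single height value can describe. A column through the tunnel, such as `column_transitions(0, 0)`, changes three times (floor, ceiling, surface), and that is exactly the case a heightmap mesh cannot hold. It is why you need either voxels or a hole plus a separate cave mesh.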