DX11 - Geometry Shader, Write to Volume Texture

Hi!

I would like to use the SV_RenderTargetArrayIndex semantic in a geometry shader so I can use a volume (3D) RenderTexture as a render target.

The following example fails in Unity 4 (it does not write to the texture, as confirmed by debugging with PIX). It creates a volume texture, sets it as the render target, then draws with Graphics.DrawProcedural and a geometry shader. It compiles and runs fine, without obvious errors.

Component:

using UnityEngine;
using System.Collections;

public class Test : MonoBehaviour {
	
	public Shader shader;
	RenderTexture tex;
	private Material material;
	public int size = 16;
	
	void CreateResources()
	{
		if (material == null)
		{
			material = new Material(shader);
			material.hideFlags = HideFlags.HideAndDontSave;
		}

		if (tex == null)
		{
			tex = new RenderTexture (size, size, 0, RenderTextureFormat.ARGB32);
			//tex = new RenderTexture (size, size, 0, RenderTextureFormat.ARGBFloat);
			tex.volumeDepth = size;
			tex.isVolume = true;
			tex.Create();

			// Using a plain 2D render target instead works fine:
			//tex = new RenderTexture(size, size, 32, RenderTextureFormat.ARGB32, RenderTextureReadWrite.Default);
		}
	}
	
	void Start () 
	{
		CreateResources();

		material.SetFloat("Size", (float)size);
	}
	
	// Update is called once per frame
	void Update () 
	{
		if (!shader)
			return;

		CreateResources();

		material.SetPass(0);
		Graphics.SetRenderTarget(tex);
		Graphics.DrawProcedural(MeshTopology.Points, size);
	}

}

Shader:

Shader "Custom/Volume Tex (GS)" {
	Properties {
		_MainTex ("Base (RGB)", 2D) = "white" {}
	}
	SubShader 
	{
		Pass
		{
			Tags { "RenderType"="Opaque" }
			ZWrite Off
			Blend One One
			ZTest Always
			Cull Off
		
			CGPROGRAM
				#pragma target 5.0
				#pragma vertex VS_Main
				#pragma fragment FS_Main
				#pragma geometry GS_Main
				#include "UnityCG.cginc" 
				
				struct GS_INPUT
				{
					float4	pos		: SV_POSITION;
				};

				struct FS_INPUT
				{
					float4	pos		: SV_POSITION;
					uint    layer   : SV_RENDERTARGETARRAYINDEX;
				};

				float Size;

				GS_INPUT VS_Main(uint id: SV_VertexID)
				{
					GS_INPUT output = (GS_INPUT)0;
					output.pos.x = (float)id;
					return output;
				}

				[maxvertexcount(6)]
				void GS_Main(point GS_INPUT p[1], inout TriangleStream<FS_INPUT> triStream)
				{
					FS_INPUT output;
					
					uint layer = p[0].pos.x;
					
					output.layer = layer;
				
					output.pos = float4(-1.0f, 1.0f, 0.0f, 1.0f);
					triStream.Append(output);
					output.pos = float4(1.0f, 1.0f, 0.0f, 1.0f);
					triStream.Append(output);
					output.pos = float4(1.0f, -1.0f, 0.0f, 1.0f);
					triStream.Append(output);
					
					output.pos = float4(1.0f, -1.0f, 0.0f, 1.0f);
					triStream.Append(output);
					output.pos = float4(-1.0f, -1.0f, 0.0f, 1.0f);
					triStream.Append(output);
					output.pos = float4(-1.0f, 1.0f, 0.0f, 1.0f);
					triStream.Append(output);
				}

				float4 FS_Main(FS_INPUT input) : SV_Target
				{
					return float4(0.0f, 1.0f, 0.0f, 1.0f);
				}

			ENDCG
		}
	} 

	Fallback Off
}

I rewrote the code in a small DX11 C++ app and it works fine there. I can't tell from PIX which flags Unity creates the 3D texture with, so that might be a difference. I also noticed that CreateRenderTargetView fails if the format parameter is a "TYPELESS" type, and PIX reports Unity's "tex" volume texture as typeless. So maybe that's the difference? Or could it be some silly thing I am forgetting? Changing to a 2D texture allows this code to work fine (the commented line).
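For reference, the working C++ path looks roughly like this (a sketch, not my exact app code; it assumes a valid `ID3D11Device* device`). The two points of interest versus what PIX reports for Unity's texture are a fully typed DXGI format (not `*_TYPELESS`) and an explicit `TEXTURE3D` render target view covering all W slices:

```cpp
// Sketch: create a 16x16x16 volume texture with a *typed* format and a
// TEXTURE3D render target view. CreateRenderTargetView fails if the
// underlying format is a TYPELESS one, which matches what PIX shows.
D3D11_TEXTURE3D_DESC desc = {};
desc.Width = desc.Height = desc.Depth = 16;
desc.MipLevels = 1;
desc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;      // typed, not TYPELESS
desc.Usage = D3D11_USAGE_DEFAULT;
desc.BindFlags = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE;

ID3D11Texture3D* tex = nullptr;
device->CreateTexture3D(&desc, nullptr, &tex);

D3D11_RENDER_TARGET_VIEW_DESC rtvDesc = {};
rtvDesc.Format = desc.Format;
rtvDesc.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE3D;
rtvDesc.Texture3D.MipSlice = 0;
rtvDesc.Texture3D.FirstWSlice = 0;
rtvDesc.Texture3D.WSize = desc.Depth;          // bind all 16 slices

ID3D11RenderTargetView* rtv = nullptr;
device->CreateRenderTargetView(tex, &rtvDesc, &rtv);
```

With the RTV bound via OMSetRenderTargets, the geometry shader's SV_RenderTargetArrayIndex selects which W slice each primitive rasterizes into.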

Mr Aras, if you see this, consider it either a cry for help, a bug report, or a feature request. :slight_smile: It would be really nice to be able to use normal shaders for 3D render targets instead of compute shaders, because hardware blending is much faster than atomic ops. Compute shaders will be my fallback; if I have time I'll compare their performance with an optimized implementation in a DX11 C++ app.

Thanks!

I suspect you can't do this. Check out Aras' reply to this thread.

I did notice your code is missing 'tex.enableRandomWrite = true;', but I doubt that will fix it?

However, I'm finding it difficult to work out what exactly your Volume Tex (GS) shader is writing, as I've only seen DrawProcedural done with ComputeBuffers providing the vertex/input data. Unless you have another script that is creating the point-topology mesh or something?

I did see that thread, and I tried anyway. I suspect it's possible with very few tweaks on the Unity side. Worst case, consider this post a feature request. :slight_smile:

That was in there before; it did not fix it.

The draw call generates 16 vertices with no vertex data; the vertex ID (SV_VertexID) provides an integer slice index into the volume texture. For each point, the geometry shader emits a pair of big triangles covering the whole viewport ((-1,-1) to (1,1) in NDC), one quad per layer of the 3D render target. The fragment shader then draws a solid green colour.

Well, feature requests can be hard to get implemented or noticed in the forum, though one would hope that with the DX11 competition some of the Unity devs would be checking the ShaderLab forum a bit more frequently than normal :wink:

Your best bet is to add a feature request to the 'wishlist' forum or to the Feedback site. Alternatively, if you feel this could be a bug, you can 'report a bug' from within Unity and attach the project. Unity always likes example projects; it helps them track stuff down quickly :wink:

Ah, I see, I think: the DrawProcedural call itself passes the number of vertices to be drawn (and the topology type), but the vertices don't actually have to exist (e.g. as a populated ComputeBuffer, which would itself be indexed via SV_VertexID). That gives you SV_VertexID to use as the slice index. Cool, thanks for explaining.

I really wish Unity had made a crash course on DX/DX11 in Unity; if it weren't for the few Aras samples floating about on the forums, it would be almost impossible to fathom.

Thanks... they should. It's really a bug; I'll probably send a bug report.

You have it right. In the end I hacked together a solution: I used a C++ DX11 plugin, passed in the RenderTexture's native handle, and issued the shader call from C++. Another bug/feature request: ComputeBuffers don't have a function to get their native handle.
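The plugin side of that workaround looks roughly like this (a sketch under assumptions: `g_Device`/`g_Context` are the device and context obtained from Unity's plugin interface, and `RenderIntoVolume` is an illustrative export name; the C# side passes `tex.GetNativeTexturePtr()` in via P/Invoke):

```cpp
// Sketch of the C++ plugin entry point (illustrative names, not a full plugin).
// Unity passes the RenderTexture's native pointer; the plugin reinterprets it
// as a D3D11 resource, builds its own TEXTURE3D render target view (something
// Unity won't do for us), and issues the draw itself.
extern "C" __declspec(dllexport)
void __stdcall RenderIntoVolume(void* nativeTexturePtr, int pointCount)
{
    ID3D11Resource* resource = static_cast<ID3D11Resource*>(nativeTexturePtr);

    D3D11_RENDER_TARGET_VIEW_DESC rtvDesc = {};
    rtvDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;   // must be a typed format
    rtvDesc.ViewDimension = D3D11_RTV_DIMENSION_TEXTURE3D;
    rtvDesc.Texture3D.MipSlice = 0;
    rtvDesc.Texture3D.FirstWSlice = 0;
    rtvDesc.Texture3D.WSize = (UINT)-1;            // all W slices

    ID3D11RenderTargetView* rtv = nullptr;
    if (SUCCEEDED(g_Device->CreateRenderTargetView(resource, &rtvDesc, &rtv)))
    {
        g_Context->OMSetRenderTargets(1, &rtv, nullptr);
        // ... bind the VS/GS/PS equivalents of the shader above, then:
        g_Context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_POINTLIST);
        g_Context->Draw(pointCount, 0);
        rtv->Release();
    }
}
```

The matching C# declaration would be a plain `[DllImport]` taking the `IntPtr` from `GetNativeTexturePtr()`.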