Is it possible to multithread WWW URL requests?

function getData(myurl : String)
{
    var www : WWW = new WWW(myurl);
    yield www; // pause this coroutine until the download completes
    if (www.error != null)
        Debug.LogError("An error occurred: " + www.error);
    else
        thedata += www.text;
}

Is it possible to thread something like this so that hundreds of requests happen simultaneously? Let’s say the results of each are placed into a variable or written to a file.

Thanks,

Dan

whimsica, you can initiate many WWW requests before you begin waiting (yielding) for the responses. I currently have a WWW reader (using coroutines) that initiates 32 separate connections…

Note: I would be cautious about creating too many simultaneous connections. The HTTP 1.1 spec suggests only two at a time, and modern web browsers stay below 16. At some number, servers begin to revolt.
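Roughly, the "start everything, then yield" idea looks like this (a simplified sketch, not my actual 32-connection reader; getAll() is a made-up name):

function getAll(urls : String[])
{
    var requests = new Array();
    for (var u in urls)
        requests.Push(new WWW(u)); // every connection opens up front

    for (var www : WWW in requests)
    {
        yield www; // only waits if this one hasn't finished yet
        if (www.error != null)
            Debug.LogError(www.error);
        else
            thedata += www.text;
    }
}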

Hi. Can you give me a hint on how to set something like that up? I want to keep feeding in URL requests simultaneously. As they are returned, I want the data to be processed. But I don’t want one request to interrupt another, or yielding on one request to stop the others from processing.

Would I just initiate the requests, like this?

var www1 : WWW = new WWW(myurl1);
var www2 : WWW = new WWW(myurl2);
var www3 : WWW = new WWW(myurl3);
…and then yield for each one? Something like yield www1, then yield www2, and so on?

In this case, does yield mean to wait until there is some data in www1?

When each result arrives, I want to process it with:

thedata = www1.text;
processdata(thedata);

I’m still confused about how to set this up. Yielding will halt everything, right?

Dan

You do not need to yield. Instead, on each frame (say, in an object’s Update() function) you can check each WWW’s “isDone” property. If it’s true, the download is finished and you can retrieve the data.
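For example, a minimal sketch of that approach (the pending list is a placeholder name, and processdata() stands for whatever handles the result):

import System.Collections.Generic;

private var pending : List.<WWW> = new List.<WWW>();

function StartDownload(url : String)
{
    pending.Add(new WWW(url)); // the download begins immediately
}

function Update()
{
    // Walk backwards so RemoveAt() doesn't skip entries.
    for (var i = pending.Count - 1; i >= 0; i--)
    {
        var www : WWW = pending[i];
        if (!www.isDone)
            continue;

        pending.RemoveAt(i);
        if (www.error != null)
            Debug.LogError(www.url + ": " + www.error);
        else
            processdata(www.text); // placeholder handler
    }
}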

Okay, I got something to work, although I’m randomly getting errors on different sites. I’m feeding a list of different URLs to 32 WWW variables and then checking in the update loop to see whether they’re done. When one finishes, I send it another URL.

Using this scheme, sometimes a few of them fail; sometimes it works fine.

I get this error:

Resolving host timed out: www.amazon.com

The URL it fails on changes randomly every time.

What determines how long it tries to resolve a host?
Is there a setting somewhere to say “try to resolve a host for at least X seconds”?
I don’t think yield will work here.

Thanks for any suggestions,

Dan

I believe this is up to the user’s browser. As Ostagar pointed out, you should only run a couple of requests simultaneously. So what you should do is keep some kind of counter, say starting at 4. Only initiate a WWW request when that counter is above zero, and decrement the counter by 1 for each request you start. As requests come back in, increment the counter by 1. That way the next request in your queue can start (because the counter is greater than 0).
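A rough sketch of that counter scheme (the names slots, queue, and active are made up, and processdata() again stands in for your handler):

import System.Collections.Generic;

private var slots : int = 4; // how many requests may run at once
private var queue : Queue.<String> = new Queue.<String>();
private var active : List.<WWW> = new List.<WWW>();

function Update()
{
    // Only start a queued request while the counter is above zero.
    while (slots > 0 && queue.Count > 0)
    {
        active.Add(new WWW(queue.Dequeue()));
        slots--; // decrement for each request started
    }

    // As requests come back, increment the counter to free a slot.
    for (var i = active.Count - 1; i >= 0; i--)
    {
        var www : WWW = active[i];
        if (!www.isDone)
            continue;

        active.RemoveAt(i);
        slots++;
        if (www.error == null)
            processdata(www.text);
    }
}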

Though perhaps you should first write it so only one WWW request runs at a time; that way you can confirm whether this is the cause of your errors. If it works perfectly with one request at a time, you can then modify your script to allow X requests simultaneously.

Actually, I’m running this from the Editor; I haven’t tried the browser yet.
I tried a standalone build and the same thing happens.

At 1 request at a time it works, but it is slow.
At 32 it is about 20x faster, which is great.
If I could adjust the timeout, would it still be 20x faster but with fewer errors?

Dan

I do not believe you can adjust the time-out. I may be wrong on this, but I suspect it is operating system and browser dependent.

Try increasing the simultaneous requests (to, say, 4 at a time) and see if it still works with acceptable speed and reliability.

Short of limiting your requests, perhaps you can re-run requests when they time out. If any given request times out more than, say, 5 times, you might consider it a lost cause (that is, the targeted website is simply offline).
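For instance, a retry loop wrapped around the coroutine from the first post might look like this (the maxRetries value of 5 is arbitrary):

function getDataWithRetry(myurl : String)
{
    var maxRetries : int = 5;
    var attempt : int = 1;
    var www : WWW = new WWW(myurl);
    yield www;

    // Re-run the request until it succeeds or runs out of attempts.
    while (www.error != null && attempt < maxRetries)
    {
        attempt++;
        www = new WWW(myurl);
        yield www;
    }

    if (www.error != null)
        Debug.LogError("Giving up on " + myurl + ": " + www.error);
    else
        thedata += www.text;
}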

Again, I recommend queuing your requests rather than hammering the server. If necessary, update your UI to accommodate a longer load time and communicate progress (or at least a “loading” screen) to the user. This should be done anyway, as you have no guarantee of how fast the user’s connection is or how fast/available the targeted website is.

Good idea about the “5 times and you’re out” suggestion.
I was thinking I could resend a timed-out request, but then I was worried about what would happen if it just kept timing out due to a bad URL.
So that solves the problem.

Thanks guys I think I can get it to work now.

Dan

Just curious: Why do you need to access 100s of different URLs at the same time?

Probably for downloading Asset Bundles or config data.

Because you can never download 3D porn fast enough.

This is why God made zip files.

It comes in 3D now?!

You misquoted me! You bastard! And like… welcome to the 21st Century, eh?

Just to bring up a point that has been ignored so far:

WWW is always asynchronous, and since Unity 2.6 the loading of downloaded assets happens on its own thread.

So it’s generally no problem. It becomes problematic if several downloads happen to complete at the same time, because then you tend to see the nice “get thread context” error when they try to load data concurrently and inject it into the rendering realm.