FumeFX Network Render Problems
#1
Hello,

We're having some problems with network rendering FumeFX...

1 - The major issue right now is that in several places the fume system does not advance between frames. In other words, two consecutive frames in a sequence look identical, as if some machines are randomly reading the cache file for the wrong frame (but not for a frame that the specific machine has already rendered). In the same sequence, those same machines will render other frames just fine, so the error is random in where it appears, but it happens consistently with every render we submit.

2 - We are also getting several frames where the fume does not render (the frames are black). I have not tried to troubleshoot that problem yet, so I don't have any specific information about that.

3 - And the last problem is that we are getting some flickering with the lighting (shadows maybe) on the fume atmospherics. In one case, this was fixed by unchecking "read/write illumination map to disk." In another case, it seemed to be caused by having more than one fume system in the scene. Rendering the systems separately got rid of the flickering. But clearly we have some cases where we will need to be able to render two fume systems together, so they will blend correctly.


Any ideas?

Thanks.
#2
Hi,

1 - Did you mean that a render node sometimes uses the FumeFX cache from the last frame it rendered, instead of loading the new cache?

2 - It could be a memory issue: memory can get fragmented, and FumeFX then cannot allocate large enough chunks. Restarting Max sometimes helps. A future improvement will be to cancel the render on that node so that Max can restart.

3 - If you use 2+ fumefx in the same scene, and they use the same cache files, you shouldn't generally use read / write from disk. Is this only a network rendering problem?

How big are the caches? And memory and OS on the render nodes?

Vjekoslav
#3
Hi, thanks for the reply...

1 - no, that's the strange thing -- there are no cases where the cache is being reused from a previous frame that a particular node already rendered. What I mean is that it just simply seems to read the wrong cache frame (1 frame earlier than it should).

Other things we have noticed... so far it seems to only do this when you render a sequence with matte geometry in the scene. I cannot be sure that is the case, but we submitted 3 renders with only FumeFX and no geometry at all and had no bad frames for those sequences so far.

Also, it seems that each machine will only render 1 bad frame at most per rendered sequence. For example if there are 3 machines rendering you will generally get at most 3 bad frames for that sequence, but sending to over 200 machines, you get a LOT of bad frames.

The cache sizes for these sequences are generally very large, but it also happens with very small caches (like a simple flaming torch). The machines vary, and it is totally inconsistent as to which machines will render a bad frame. Some are WinXP 64-bit, 4 GB RAM, Max 8, dual Opteron CPUs, but we are also seeing the same problem on Intel 32-bit XP machines, etc. It's a mix of all types, and I cannot see any pattern.


2 - the blank frames problem is not a big deal right now for us, because those frames are easy to catch, and re-render. Also, this does not happen often.


3 - I should have stated that this is not a network problem. The light flickering happens locally also. The case with 2 FumeFX systems did not have read/write to disk checked. We have also seen this problem with only 1 system and read/write not checked. It looks like the shadows may be turning on and off between frames.


Thanks,
-Kirby.
#4
Hello Kirby,

If you have a scene where this light flickering problem can be reproduced, could you send it to kresimir@afterworks.com? We'd be happy to take a look at it, reproduce the flickering, and fix the bug.

Thank you

Kresimir
#5
Next time this "wrong frame" occurs, please send us the log file (max/network/max.log) from that render node. Mark the frame that rendered wrong, and if possible, see what frame FumeFX loaded instead of the one it should have. You said (1 frame earlier than it should). Does this mean that for rendering frame 20, ffx uses the cache for 21? Is it always like that?
In NR, a node is first assigned a job with 1 or 2 frames, then another job with more frames, then more again, etc. Maybe there is some pattern there?
Vjeko
#6
Good day ;)

Has any solution been found for this flickering problem?
#7
Good day!

So far we haven't received any reproducible Max scene or fxd file to check, so unfortunately we have had nothing to work on.

Regards,

Kresimir
#8
vjeko wrote: Next time this "wrong frame" occurs, please send us the log file (max/network/max.log) from that render node. Mark the frame that rendered wrong, and if possible, see what frame FumeFX loaded instead of the one it should have. You said (1 frame earlier than it should). Does this mean that for rendering frame 20, ffx uses the cache for 21? Is it always like that?
In NR, a node is first assigned a job with 1 or 2 frames, then another job with more frames, then more again, etc. Maybe there is some pattern there?
Vjeko

We are seeing this same issue here. In our case, all frames are batched as single frames, i.e. each job sent to a machine renders exactly one frame. What we are seeing is that, for example, frames 20 and 21 will both use the cache from frame 20, making it look like the sim freezes for a frame and then jumps ahead 2 frames. Requeueing the frame generally fixes the problem, as the frames often render correctly the second time. But, as mentioned earlier in the thread, these types of render errors are tougher to catch, and checking everything that renders is very time consuming.
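
Since checking every rendered frame by eye is the expensive part, here is a minimal sketch of an automated check for both the "frozen" frames and the black frames described in this thread. It is not part of FumeFX or Max. The function names, the thresholds, and the input layout (a dict from frame number to a flat list of pixel values) are all assumptions made for illustration, and loading the images into that form is left to whatever image library is available. Note that with matte geometry or a moving camera the whole image will differ between frames even when the fume does not, so in practice the comparison would probably need to run on a fume-only pass or a cropped region.

```python
from typing import Dict, List, Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute per-value difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixel values")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def find_suspect_frames(
    frames: Dict[int, Sequence[float]],
    frozen_eps: float = 1e-6,  # hypothetical threshold, tune per shot
    black_eps: float = 1e-6,   # hypothetical threshold, tune per shot
) -> Dict[str, List[int]]:
    """Flag black frames and frames that match the frame right before them."""
    frozen: List[int] = []
    black: List[int] = []
    prev_number = None
    for number in sorted(frames):
        pixels = frames[number]
        if max(pixels, default=0.0) <= black_eps:
            # Every value is (nearly) zero: the fume did not render at all.
            black.append(number)
        elif (
            prev_number == number - 1
            and mean_abs_diff(frames[prev_number], pixels) <= frozen_eps
        ):
            # Same image as the previous frame: the sim did not advance.
            frozen.append(number)
        prev_number = number
    return {"frozen": frozen, "black": black}


if __name__ == "__main__":
    demo = {
        20: [0.1, 0.5, 0.9],
        21: [0.1, 0.5, 0.9],  # repeats frame 20
        22: [0.3, 0.6, 0.8],
        23: [0.0, 0.0, 0.0],  # black frame
    }
    print(find_suspect_frames(demo))
```

The flagged frame numbers could then be fed back into the queue for requeueing, which is the manual fix described above.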

I will attempt to post the logs here when I can isolate a bad frame. However, since we are using Rush as the queueing system here, I'm not sure exactly what I can get that would be useful.

-nathan
#9
Here is the chunk of the log file (max.log off of a render machine) that deals with a bad frame:

Code:
2007/05/01 09:47:17 DBG: Starting network
2007/05/01 09:47:17 INF: Loaded c:\rushtemp\v217_008_TD_006.max
2007/05/01 09:47:21 INF:    Job:  c:\rushtemp\v217_008_TD_006.max
2007/05/01 09:47:23 INF: Starting network rendering
2007/05/01 09:47:34 INF: Max is ready to begin render
2007/05/01 09:47:45 INF: Frame 1027 assigned
2007/05/01 09:47:45 INF: Frame started at: 09:47:45
2007/05/01 10:06:44 INF: Frame completed
2007/05/01 10:06:54 DBG: Stop network

and from a different machine:
Code:
2007/05/01 09:47:14 DBG: Starting network
2007/05/01 09:47:14 INF: Loaded c:\rushtemp\v217_008_TD_006.max
2007/05/01 09:47:19 INF:    Job:  c:\rushtemp\v217_008_TD_006.max
2007/05/01 09:47:20 INF: Starting network rendering
2007/05/01 09:47:31 INF: Max is ready to begin render
2007/05/01 09:47:42 INF: Frame 1019 assigned
2007/05/01 09:47:42 INF: Frame started at: 09:47:42
2007/05/01 09:59:03 INF: Frame completed
2007/05/01 09:59:14 DBG: Stop network

So not really much to work with there, sorry.
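
When collecting logs like these from many nodes, a small script can at least pull out which frame each node was assigned and how long it took, which may help with spotting a pattern across machines. This is only a sketch: the line format is inferred from the two excerpts above, not from any documentation, and the function name `summarize_frames` is made up for this example.

```python
import re
from datetime import datetime
from typing import Dict, Iterable, List

# Line layout assumed from the excerpts: "YYYY/MM/DD HH:MM:SS LVL: message"
LINE_RE = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w{3}): (.*)$")
ASSIGNED_RE = re.compile(r"^Frame (\d+) assigned$")


def summarize_frames(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Return one record per completed frame: number, assign time, seconds."""
    results: List[Dict[str, object]] = []
    current = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines or anything in another format
        stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
        message = match.group(3).strip()
        assigned = ASSIGNED_RE.match(message)
        if assigned:
            current = {"frame": int(assigned.group(1)), "assigned": stamp}
        elif message == "Frame completed" and current is not None:
            elapsed = stamp - current["assigned"]
            current["seconds"] = int(elapsed.total_seconds())
            results.append(current)
            current = None
    return results


if __name__ == "__main__":
    sample = [
        "2007/05/01 09:47:45 INF: Frame 1027 assigned",
        "2007/05/01 09:47:45 INF: Frame started at: 09:47:45",
        "2007/05/01 10:06:44 INF: Frame completed",
    ]
    for record in summarize_frames(sample):
        print(record["frame"], record["seconds"])
```

The log itself does not say which cache frame FumeFX loaded, so this only helps correlate bad frames with nodes and render times.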
#10
Interesting. Maybe it has something to do with some of the machines on the farm reading the offset time differently for some reason. The playback might be breaking on some machines?

Do you have a list of the machine makes that error out?
Is it always the same machines?
Or is it happening randomly on different machines?
Or is it always the same machines for the same job submission?
Or is it always the same frames for the same job submission? (I think you answered this one, since you said that requeueing the same "frozen" bad frame generally fixes it, and I'm assuming the requeued frame most likely goes to a different machine.)

If the same machines keep coming up, check this:
Do all the machines loading the Opie package of FumeFX 1.0a have the latest Turbosquid files in their Max root (TS Register and Dcpflics)? There might be some old plugins left over from some old sprays I used to do when I was up there.


I have the feeling that this is machine related. I hope it is, since it will be easier to fix ;)
http://www.cgfluids.com