5. ADM Render Farm
- Latest Notice
- Distributed Rendering Introduction
- Requirements
- Rules
- Deadline Parameters Explained
- Priorities, Machine Limit and Task Size
- Maya Job Submission
- Nuke Job Submission
- Deadline Applications
- Deadline Launcher
- Deadline Monitor
- How to Monitor?
- Job Properties, Black & Whitelist
- Deadline Slave – Join your Workstation
- Render Queue Order, Exceptions and FAQ
- Optimizing Render Times
- Report Problems
Latest Notice
Software versions currently supported:
Maya 2020, 2022, 2023 with Arnold
Maya 2024 only on classroom workstations (still use the farm to distribute)
Nuke 14.0 and 12.2
Houdini 19.5 with Mantra: Job submission is only possible through the Deadline Monitor; submitting from within Houdini is currently not supported. Redshift is hopefully coming soon.
Red Button on Desktop: Please use it instead of logging off or shutting down. The workstation will automatically restart and join the render network.
Distributed Rendering Introduction
We are using Thinkbox Deadline for distributed rendering
Submitting a job means sending your Maya or Nuke script to a Render Manager
The Render Manager will distribute your job to a network of computers, called the Clients, Nodes or Slaves
Your submitting workstation doesn’t have to be turned on, but every PC workstation in the Animation area (FYP and classrooms) can join the network and become a render node. This is important to consider during crunch time: we only have 24 dedicated render nodes but 120 workstations
After submitting your job, follow the progress and manage your job with the Deadline Monitor
Requirements
Video files such as QuickTime can only be rendered by one machine. Render image sequences to make use of distributed rendering or, if a video file is needed, limit the job to one machine
Jobs can be submitted directly from Maya and Nuke
All source files used in your setup, such as footage, textures and referenced models, have to be stored in a network location. If one source file is inaccessible, e.g. still on your desktop, the render job will fail
Accepted source locations are all network shares: animation, share, resources, projects, projects-fyp
The render destination has to be on the network, too. Make sure the Maya project path is set correctly for a network share
Follow the File Naming Convention and avoid spaces and special characters in all folder and file names
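The requirements above can be checked before submitting. The following is a minimal sketch in Python, not part of Deadline. It assumes the shares are reached through UNC paths (\\server\share\...), so a mapped drive letter would be reported as not being a network path. The function name, the server name in the usage example and the exact set of "safe" characters are assumptions made for illustration.

```python
import re
from pathlib import PureWindowsPath

# Share names taken from the list of accepted source locations.
ACCEPTED_SHARES = {"animation", "share", "resources", "projects", "projects-fyp"}

# Assumption: "safe" names contain only letters, digits, underscore, hyphen and dot.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_.-]+$")


def check_path(path: str) -> list[str]:
    """Return a list of problems found in a source or output path."""
    problems = []
    p = PureWindowsPath(path)
    drive = p.drive  # for a UNC path this is \\server\share
    if not drive.startswith("\\\\"):
        problems.append("not a network (UNC) path")
    else:
        share = drive.rsplit("\\", 1)[-1].lower()
        if share not in ACCEPTED_SHARES:
            problems.append(f"share '{share}' is not an accepted location")
    # Check every folder and file name below the drive or share.
    for part in p.parts[1:]:
        if not SAFE_NAME.match(part):
            problems.append(f"space or special character in '{part}'")
    return problems


if __name__ == "__main__":
    # "adm-server" is a made-up server name.
    print(check_path(r"\\adm-server\projects\team\shot_010.ma"))
    print(check_path(r"C:\Users\me\Desktop\my scene.ma"))
```

An empty list means the path passed both checks. Any entry in the list is a reason the render job might fail on the farm.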
Rules
The render farm is a shared resource for all animation projects and during crunch time the render queue can get longer
You are required to monitor your job
Do not just submit and walk away, hoping everything will be fine. Monitor closely and wait for first frames to finish to confirm the job is working. Suspend failing jobs and check error reports
Don’t start a higher priority race
Otherwise, within a few days everyone will submit with high priority. Increasing the priority without limiting the number of nodes is not allowed
Be considerate, don’t waste resources while others are queuing
Wasteful is:
- Blocking farm with a loop of failing tasks instead of suspending
- A first test render at full-size quality
- Long render without having done a short test first
- Extremely long render times per task on many slaves
Join your workstation to the render network
If the queue is long, let your workstation join the render network before going home
See how in Deadline Slave – Join your Workstation
FYP projects are granted priority
Every FYP team has one nominated member with elevated permissions to control jobs
Deadline Parameters Explained
This is a brief introduction to the most relevant parameters you need to set during job submission. They can still be changed in the Monitor after submission, while the job is rendering or queuing
Priority
Higher priority comes first, but it is not the only factor determining the queue order. Submission time and task size weigh in as well
We have rules on setting the job priority, e.g. average Maya job: Priority 50
More in Priorities, Machine Limit and Task Size
Machine Limit
The maximum number of slaves that render the job. The 24 dedicated render slaves have to be shared. If other jobs are in the queue, the limit is 8
Pool
Select the software package to render with, e.g. maya, nuke, houdini, etc.
Not every render farm slave or workstation has all software installed. Selecting the Pool ensures your job is not sent to a machine without the software needed
Group
Select the group of slaves to render on
Recommended selection: farmplus
farm: The 24 dedicated render slaves ADM 1 – 24 only
farmplus: 24 farm nodes and all rarely used workstations
64gb: Only machines with 64 GB RAM
all: All workstations including classrooms, e.g. B1-5G
Priorities, Machine Limit and Task Size
First come, first served does not always work on a render farm. One 3D job might take an hour per frame, another just 10 minutes, and a simple compositing job maybe only one minute
Your job has a long task time – play fair and don’t block the entire farm
Lower your job’s Priority and Machine Limit and accept that other jobs need to render too. Submitting a render with many tasks, a long task time and high priority, without a Machine Limit, is unfair
Which Priority, Machine Limit and Task Size should I use?
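Calculate your total render time: minutes per frame x number of frames, e.g. 10 minutes per frame x 100 frames = 1000 minutes
Task Size: Aim for about 10 minutes per task. Divide 10 by the render time per frame, e.g. 10 / 5 min per frame = Task Size 2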
Fast
<200 min
2 minutes per frame x 100 frames = 200 min
On 6 slaves done in 30 minutes
Average compositing job
Priority >60
Machine Limit 6
Task Size 5-10
Average
<2000 min
20 minutes per frame x 100 frames = 2000 min
On 10 slaves done in 3 hours
Priority 50
Machine Limit 10
Task Size 1-5
Slow
>2000 min
30 minutes per frame x 100 frames = 3000 min
On 8 slaves done in 6 hours
Priority <50
Machine Limit 8
Task Size 1
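The arithmetic behind the three categories above can be sketched in a few lines of Python. This is only an illustration of the rules on this page, not a Deadline feature. The function name and the returned keys are hypothetical, a total of exactly 200 or 2000 minutes is counted in the faster category as in the examples above, and the wall-clock estimate assumes all machines up to the Machine Limit are actually free.

```python
def recommend(minutes_per_frame: float, frames: int) -> dict:
    """Suggest submission settings from the render time per frame."""
    if minutes_per_frame <= 0 or frames <= 0:
        raise ValueError("minutes_per_frame and frames must be positive")

    total = minutes_per_frame * frames

    # Task Size: divide 10 by the time per frame, but never go below 1.
    task_size = max(1, int(10 // minutes_per_frame))

    if total <= 200:
        tier, priority, machine_limit = "Fast", ">60", 6
    elif total <= 2000:
        tier, priority, machine_limit = "Average", "50", 10
    else:
        tier, priority, machine_limit = "Slow", "<50", 8

    return {
        "total_minutes": total,
        "tier": tier,
        "priority": priority,
        "machine_limit": machine_limit,
        "task_size": task_size,
        # Rough estimate, assuming every allowed machine is available.
        "wall_clock_minutes": total / machine_limit,
    }


if __name__ == "__main__":
    for mpf in (2, 20, 30):
        print(mpf, recommend(mpf, 100))
```

For 30 minutes per frame and 100 frames this gives 3000 minutes in total, which on 8 slaves is about 375 minutes, in line with the roughly 6 hours stated above.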
Maya Job Submission
Locate Deadline Shelf
Green icon is the Deadline Submitter
Submit to Deadline
Be patient, it might take a few seconds to open the window
1 – Pool: Software + version used, e.g. maya, nuke
2 – Group: Select farmplus or, during crunch time, all
3 – Priority: Set to 50. Average 3D jobs shall be submitted with priority 50. Slow jobs have to go lower!
4 – Machine Limit: Set to 8 to fairly share the farm with others. If the render queue is empty, you are allowed to set to 0, which uses all nodes
5 – Comment: Let others know that you are using only a certain number of nodes or that you have an urgent but very fast render job and therefore increased the priority
6 – Frames Per Task: Set to 1. Increase if the render time per frame is very short, e.g. at 2 minutes per frame, set Frames Per Task to 5 so that one task takes 10 minutes
7 – Project and Output Path: Confirm these are network paths. If not, change them
8 – Submit Maya Scene File: Enable, unless your setup contains relative paths
9 – Strict Error Checking: Disable
Submit Job: Send job and close window manually
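The submission settings above can also be written down as plain key=value pairs, which is handy for noting what you submitted. The Python sketch below builds such a text. The key names (Plugin, Pool, Group, Priority, MachineLimit, ChunkSize and so on) follow Deadline's manual job submission format as far as we know; treat them as an assumption and check the Deadline documentation for your version. The function name, job name and frame range are made-up examples.

```python
def build_job_info(name: str, frames: str, pool: str = "maya",
                   group: str = "farmplus", priority: int = 50,
                   machine_limit: int = 8, frames_per_task: int = 1,
                   comment: str = "") -> str:
    """Return the recommended Maya settings as key=value lines."""
    settings = {
        "Plugin": "MayaBatch",          # assumed plugin name
        "Name": name,
        "Comment": comment,
        "Pool": pool,                   # 1 - Pool
        "Group": group,                 # 2 - Group
        "Priority": priority,           # 3 - Priority
        "MachineLimit": machine_limit,  # 4 - Machine Limit, 0 uses all nodes
        "Frames": frames,
        "ChunkSize": frames_per_task,   # 6 - Frames Per Task
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items())


if __name__ == "__main__":
    # Hypothetical job name and frame range.
    print(build_job_info("shot_010_test", "1-100",
                         comment="Using 8 nodes only"))
```

The defaults mirror the values recommended above for an average 3D job: Priority 50, Machine Limit 8 and Frames Per Task 1.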
Nuke Job Submission
Locate Render Menu
Select Submit Nuke Job To Deadline
Be patient, it might take a few seconds to open the window
Submit to Deadline
1 – Pool: Select nuke, or if available, the specific version
2 – Group: Select farmplus
3 – Priority: Set to 60 – 70. Nuke jobs are usually fast and are allowed to use higher priority
4 – Machine Limit: Set to 6 – 8 to fairly share the farm with others. If the render queue is empty, you are allowed to set to 0, which uses all nodes
5 – Comment: Let others know that you are using only a certain number of nodes or that you have an urgent but very fast render job and therefore increased the priority
6 – Frames Per Task: Set to 10. If render time per frame is 1 minute, one task will take 10 minutes
7 – Submit Nuke Script File: Enable, unless your setup contains relative paths
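To see what Frames Per Task does to a job, the following Python sketch splits a frame range into tasks. It only illustrates the idea; the function name is hypothetical and Deadline does this splitting for you at submission.

```python
def split_into_tasks(first_frame: int, last_frame: int,
                     frames_per_task: int) -> list[tuple[int, int]]:
    """Split an inclusive frame range into (start, end) tasks."""
    if frames_per_task < 1:
        raise ValueError("frames_per_task must be at least 1")
    tasks = []
    start = first_frame
    while start <= last_frame:
        # The last task may be shorter than the others.
        end = min(start + frames_per_task - 1, last_frame)
        tasks.append((start, end))
        start = end + 1
    return tasks


if __name__ == "__main__":
    tasks = split_into_tasks(1, 100, 10)
    print(len(tasks), "tasks, first:", tasks[0], "last:", tasks[-1])
```

With frames 1 to 100 and Frames Per Task 10, the job consists of 10 tasks. At 1 minute per frame each task takes about 10 minutes, which keeps the overhead of starting a task small compared to the render itself.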
Deadline Applications
These are the 3 Deadline apps on your workstation
Monitor: Control jobs and monitor render queue
Slave: Make your workstation a render node
Launcher: Icon in the notification area to control Deadline settings and start the apps above
Start Deadline Launcher
Deadline Launcher is a Taskbar icon only and might be hidden if you haven’t changed the icon behavior
Launcher Taskbar Icon
Click on the Show hidden icons up-arrow if the icon is not visible
Change Icon Behavior
Change to show icon and notifications
Way more convenient with icons visible
Note
If you can’t start the Deadline Launcher, it means another user is still signed in to the workstation, as only one account can run the Deadline apps. Restart the workstation or, if you have admin permissions, Sign Off the other user
Task Manager
In the Task Manager > Users tab: Confirmed, another user is still signed in
With admin permissions, right-click on the user and select Sign Off
Deadline Launcher
Right-click on Launcher icon
3 menu items are useful for us
Launch Monitor
Launch Slave
Launch Slave at Startup: The green box here means it’s enabled. On your workstation, you usually want it disabled, without green box
Deadline Monitor
Job Window
Showing all jobs rendering and queuing, called the Render Queue
Also showing completed, suspended and failed
Completed jobs are auto-archived after 3 days. Please delete or archive suspended and failed jobs yourself
Locate your job and monitor how many jobs are queuing ahead of you
If slaves are available, your job should start immediately
Job Control
Right-click on job
Suspend Job: Pauses job
Menu will then offer Resume Job
Modify Job Properties: Opens Job Properties window to change many submission parameters such as Pool, Group, Priority, Machine Limit, Machine Whitelist and Blacklist etc.
View Job Report: If your job shows an error count, read the error reports to debug
Job Output: Jump right to the output folder
Task Window
How long does one task render?
Are these expected or suspiciously long times?
Do certain slaves fail?
Right-click on task to access task report, jump to output folder or re-queue task
How to Monitor?
Do not just submit and walk away, hoping everything will be fine. Monitor closely and wait for first frames to finish and confirm the job is working
Is my job rendering?
If there is already a finished frame, check the output immediately to confirm the render is good
Does my job create errors?
Or even fail, maybe on certain slaves? Read the error report and investigate
Suspend your job and avoid blocking the farm with a loop of failing tasks
How long are the render times per task?
Is there enough time for the free machines to finish the job in time? If not, resubmit with a smaller render size and lower quality settings
Are any tasks unusually longer than others? Maybe that slave is hanging or it’s a slower classroom workstation. Remove that slave from your job by adding it to the Blacklist (see below) and re-queue the task
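Spotting unusually long tasks can be done by eye in the Task Window, but the idea is simple enough to sketch in Python. The example compares each task time with the median of all tasks and flags anything more than twice as long. The factor of 2, the function name and the task times are assumptions for illustration; the times would have to be read from the Task Window by hand.

```python
from statistics import median


def flag_slow_tasks(task_minutes: dict[str, float],
                    factor: float = 2.0) -> list[str]:
    """Return the slaves whose task time is far above the median."""
    typical = median(task_minutes.values())
    return sorted(
        slave for slave, minutes in task_minutes.items()
        if minutes > factor * typical
    )


if __name__ == "__main__":
    # Made-up task times in minutes, keyed by slave name.
    times = {"ADM01": 10, "ADM02": 11, "ADM03": 9, "S3D70": 35}
    print(flag_slow_tasks(times))
```

A flagged slave is only a candidate for the Blacklist. A single heavy frame can also take longer, so check the task report before removing the machine.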
Monitor Error Count
This job has created 13 errors. The limit is 100; after that, the job will fail
Job Report
If your job creates errors, read job or task report
Here it’s workstation S3D70 causing problems
Bad slaves are marked and excluded automatically, but it’s better to add them to the Blacklist so they don’t increase the error count unnecessarily
Job Properties, Black & Whitelist
Open Job Properties with right-click on job
In Machine Limit, select slaves from the Slave List and add them to right-side list, then specify whether this is a Black or Whitelist
Other useful settings in Job Properties
Machine Limit: Set the limit on how many machines render the job, here 8
General: Change Pool or Group
Dependencies: Link to another render job which has to finish first before this one starts
Failure Detection: List of slaves marked bad
Deadline Slave – Join your Workstation
During crunch time it is essential to join your workstation to the render farm. More slaves mean faster renders. We have 120 workstations, so start as many as you can
Two easy ways to join a workstation to the farm
1. Start Slave from Launcher
If you don’t want to sign out of your account, this is the way
If you don’t see the Launcher icon, refer to section Deadline Applications to change the icon behavior
Or simply search Start Menu for Deadline Slave
Deadline Slave Window
Shows rendering status and progress
Simply close window to un-join your workstation
If the render is close to 100%, don’t close the window. Use Control menu > Stop After Current Task Completion instead
2. Restart in Render Account
Click the RenderStart icon on desktop
With one click, everything is done
The workstation will restart, automatically sign into our render account and join the render farm
Locked Workstation
The workstation is still signed in to our render account
If you want to use the workstation again, you need to restart
Select Switch Users
Restart Workstation
Click the options arrow of the Shut Down button and select Restart
Render Queue Order, Exceptions and FAQ
Render Queue Order
The render order is determined by Pool, Weight, and First-In-First-Out
Weight is calculated as Priority – Number of Tasks + Submission Time – Errors
So Deadline is smart: Priority is not everything. Job submission time and errors factor in, too
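The weight formula above can be illustrated with a small Python sketch. This is not Deadline's implementation. The function name and the weight factors are hypothetical, the real factors are set by the farm administrators, and "Number of Tasks" is interpreted here as the tasks a job is already rendering.

```python
def job_weight(priority: int, tasks: int, minutes_in_queue: float,
               errors: int, w_priority: float = 1.0, w_tasks: float = 1.0,
               w_time: float = 0.5, w_errors: float = 1.0) -> float:
    """Priority and waiting time raise the weight, tasks and errors lower it."""
    return (w_priority * priority
            - w_tasks * tasks
            + w_time * minutes_in_queue
            - w_errors * errors)


if __name__ == "__main__":
    # Job A: lower priority, but waiting for an hour and error-free.
    job_a = job_weight(priority=50, tasks=0, minutes_in_queue=60, errors=0)
    # Job B: higher priority, already rendering 8 tasks, with 5 errors.
    job_b = job_weight(priority=60, tasks=8, minutes_in_queue=10, errors=5)
    print("A:", job_a, "B:", job_b)
```

With these made-up factors, job A ends up with the higher weight although job B was submitted with the higher priority. This is the behavior described above: priority alone does not decide the queue order.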
Exceptions
Exceptions for 3D jobs to go higher with priority can only be made for extremely short renders, such as a test, with Machine Limit below 4
If someone in front of you has a super long render and yours is a really quick one, limit your job to a few machines and go higher in priority
FAQ
What if there’s no other job in the queue, do I still need Machine Limit?
If your job takes all machines, the next job has to go higher in priority to get any slaves at all
Still, if no one else is rendering at all, take all machines by setting the Machine Limit to 0, but use a priority no higher than 50. Also, please monitor the queue. If other jobs appear, decrease your Machine Limit in the Job Properties
Some slow jobs are in front of me, my job is not fast either, but I don’t want to wait 1 day, can I go ahead with higher priority?
Check the expected finishing time of the jobs in front. Is it really 1 day, or maybe just a few hours? Wouldn’t it be good enough to have your result the next morning? If so, why bother?
Otherwise, if you know who submitted them, talk to them and ask for a few slaves
There are too many equally important jobs of several FYP teams
First, talk to each other and complain about the crazy deadline 😉
If everything is equally urgent, please share the farm equally by assigning each team a number of slaves. If 24 slaves are available and 3 teams want to render, each team simply gets 8 slaves
You can always add more slaves by starting the Deadline Slave on classroom and FYP workstations. We have 120 workstations, so start as many as you can. See how in Deadline Slave – Join your Workstation
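Sharing the farm equally is simple division. The Python sketch below shows it, including the case where the number of slaves does not divide evenly. The function name and the team names are made up, and how leftover slaves are handed out is an assumption; teams are free to agree on something else.

```python
def share_slaves(total_slaves: int, teams: list[str]) -> dict[str, int]:
    """Split the available slaves as evenly as possible between teams."""
    base, extra = divmod(total_slaves, len(teams))
    # The first teams in the list receive one leftover slave each.
    return {
        team: base + (1 if index < extra else 0)
        for index, team in enumerate(teams)
    }


if __name__ == "__main__":
    print(share_slaves(24, ["Team A", "Team B", "Team C"]))
```

Each team then sets the resulting number as its Machine Limit in the Job Properties.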
Optimizing Render Times
Don’t just render your first test sequence with the highest settings
Render size, quality, complex materials and motion blur are the most time-costly settings
What is the render for?
Keep in mind, most of the time the first render is not the final. Depending on the project, several versions will be rendered before it’s really final
Does the first render really have to be full resolution with best quality settings? Most likely not
When combining CG with live action in VFX projects, the CG will be slightly defocused in compositing, so rendering the full resolution is almost never necessary
720p has fewer than half the pixels of Full HD, so rendering 720p instead of Full HD might already cut the render time by half
Decide first what you are trying to achieve with the render and which aspects you want to inspect before pushing the settings to full quality
Is it to check the animation or is it a first lighting pass for compositing?
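The saving from a smaller render size can be estimated from the pixel count. The short Python sketch below compares 720p with Full HD. It assumes that render time scales roughly with the number of pixels, which is only a rule of thumb; the function name is hypothetical.

```python
def pixel_ratio(small: tuple[int, int], large: tuple[int, int]) -> float:
    """Return the pixel count of `small` as a fraction of `large`."""
    return (small[0] * small[1]) / (large[0] * large[1])


if __name__ == "__main__":
    ratio = pixel_ratio((1280, 720), (1920, 1080))
    print(f"720p has {ratio:.0%} of the pixels of Full HD")
```

720p has 921,600 pixels and Full HD has 2,073,600, so a 720p test render has less than half the pixels to compute.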
Motion Blur
Enabling Motion Blur in Maya causes much slower renders, and you really have to push the quality settings to the max to get rid of the noise. In most cases, it’s better and much faster to render the beauty without motion blur, render the vector motion pass as well, and use it to create the motion blur in Nuke. Many tutorials cover this technique; here is one: Vector Motion Blur Tutorial on YouTube
Report Problems
If you see any problems, e.g. a slave is offline or hanging, please report them to Prof Ben, Naga or your render farm work-study