Workers
Isolated workers that scale with your workload
Route jobs by tag, scale horizontally, and deploy workers anywhere. Every job runs in its own isolated environment.
Where your code actually runs
When you run a script or a flow in Windmill, the code does not execute on the server. It is picked up by a worker, an isolated process that runs one job at a time. The server handles the API, UI, and scheduling. Workers handle execution. You can run one worker or hundreds, on the same machine or across continents.
How workers work
Windmill uses a pull model: each worker independently grabs one job at a time from the PostgreSQL queue. There is no coordinator and no message broker. A single worker handles ~26M jobs/month at ~100ms per job; add more workers for more throughput, and scaling stays linear.
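The pull model can be sketched in a few lines. This is a conceptual illustration only — in Windmill the queue lives in PostgreSQL and the workers are separate processes — but it shows the key property: no coordinator assigns work, each worker pulls its next job itself, and every job runs exactly once.

```python
import queue
import threading

# Shared job queue (stands in for the PostgreSQL queue).
job_queue: "queue.Queue[int]" = queue.Queue()
results = []
results_lock = threading.Lock()

def worker(worker_id: int) -> None:
    """Each worker independently pulls one job at a time until the queue is empty."""
    while True:
        try:
            job = job_queue.get_nowait()  # the worker grabs the next job itself
        except queue.Empty:
            return  # queue drained; the worker goes idle
        with results_lock:
            results.append((worker_id, job))
        job_queue.task_done()

for job_id in range(10):
    job_queue.put(job_id)

# Adding workers adds throughput: each pulls jobs without any coordination.
threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(job for _, job in results))  # every job ran exactly once
```

Because each worker claims jobs atomically from the queue, horizontal scaling is just starting more worker processes.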
Worker groups
Assign tags to workers and route jobs to specific groups. Two groups exist by default: "default" for regular language jobs and "native", which runs lightweight SQL and API calls with 8 subworkers per worker. Create custom groups for GPU, high-memory, or environment-specific workloads, and use dynamic tags like $workspace or $args[argName] for flexible routing.
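The routing rule is simple: a job carries a tag, and only workers whose tag list includes it may pull the job. A hypothetical sketch (worker names and tags are illustrative, not Windmill defaults):

```python
# Map of worker name -> tags that worker accepts (illustrative names).
workers = {
    "worker-1": ["deno", "python3"],        # general-purpose group
    "worker-2": ["gpu"],                    # custom GPU group
    "worker-3": ["nativets", "postgresql"], # native-style group
}

def eligible_workers(job_tag: str) -> list[str]:
    """Return the workers allowed to pull a job with this tag."""
    return [name for name, tags in workers.items() if job_tag in tags]

print(eligible_workers("gpu"))      # ['worker-2']
print(eligible_workers("python3"))  # ['worker-1']
```

Custom groups are just tag lists: a "gpu" tag on both a script and a set of workers pins that script's jobs to those machines.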
Dedicated workers
Dedicated workers pin a worker to a single script so the runtime stays warm permanently (Enterprise). Supported for TypeScript (Bun/Deno), Python, and nativets. Execution overhead drops to ~12ms per job versus ~50ms for standard workers, making them 1.35x faster than AWS Lambda for lightweight tasks.
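Why the warm runtime matters can be shown with a small sketch. This is not Windmill's implementation — Windmill keeps the actual language runtime (Bun, Deno, Python) warm — but the shape of the win is the same: initialization is paid once at startup, and each job is then just a function call.

```python
class DedicatedWorker:
    """Illustrative sketch: load and initialize a script once, reuse it per job."""

    def __init__(self, source: str):
        # Paid once at worker startup instead of once per job.
        self._code = compile(source, "<script>", "exec")
        self._globals: dict = {}
        exec(self._code, self._globals)

    def run(self, **args):
        # Per-job overhead is a call into the already-warm module.
        return self._globals["main"](**args)

worker = DedicatedWorker("def main(x):\n    return x * 2\n")
print([worker.run(x=i) for i in range(3)])  # [0, 2, 4]
```

A standard worker re-prepares the script environment per job; a dedicated worker skips that, which is where the ~12ms vs ~50ms difference comes from.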
Agent workers
Agent workers connect to the server over HTTP only, with no direct database access (Enterprise). Deploy workers behind firewalls, in remote data centers, or on edge infrastructure. Run on Linux, Windows, or macOS without Docker.
Init scripts
Run init scripts at worker startup to pre-install dependencies, configure tools, or warm caches. Periodic scripts run maintenance tasks at regular intervals. Both are configured through the Worker Management UI.
Autoscaling
Scale workers automatically based on queue depth. Kubernetes-native autoscaling (which auto-discovers the namespace and credentials), ECS, and custom script-based scaling are all supported. Scaling is driven by occupancy thresholds: 75% to scale out, 25% to scale in, with a configurable cooldown. The autoscaler checks every 30 seconds and supports scaling to zero when idle.
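The occupancy-threshold logic can be sketched as a pure decision function, using the 75%/25% thresholds above (the exact algorithm inside Windmill, including cooldown handling, may differ):

```python
SCALE_OUT_THRESHOLD = 0.75  # occupancy above this -> add workers
SCALE_IN_THRESHOLD = 0.25   # occupancy below this -> remove workers

def scaling_decision(busy_workers: int, total_workers: int,
                     min_workers: int = 0, max_workers: int = 100) -> str:
    """Decide whether to scale out, scale in, or hold, based on occupancy."""
    occupancy = busy_workers / total_workers if total_workers else 1.0
    if occupancy > SCALE_OUT_THRESHOLD and total_workers < max_workers:
        return "scale_out"
    if occupancy < SCALE_IN_THRESHOLD and total_workers > min_workers:
        return "scale_in"  # with min_workers=0, the pool can reach zero when idle
    return "hold"

print(scaling_decision(9, 10))  # scale_out (90% occupancy)
print(scaling_decision(1, 10))  # scale_in  (10% occupancy)
print(scaling_decision(5, 10))  # hold      (50% occupancy)
```

In practice this check runs periodically (every 30 seconds in Windmill) and a cooldown prevents the pool from oscillating between decisions.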
Concurrency and priority
Set global concurrency limits per script to avoid API rate limits. Configure time windows, max executions, and custom concurrency keys. Assign priority levels from 1 to 100 so critical jobs always run first.
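Priority-ordered dequeueing can be sketched with a heap: higher-priority jobs pop first, and jobs of equal priority stay in FIFO order. This is illustrative only — Windmill orders its real queue in PostgreSQL — and the job names are made up.

```python
import heapq

class PriorityJobQueue:
    """Sketch of a queue where priority 1-100 decides dequeue order."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker keeps FIFO order within a priority level

    def push(self, job: str, priority: int) -> None:
        # Negate priority so the highest priority pops first from the min-heap.
        heapq.heappush(self._heap, (-priority, self._counter, job))
        self._counter += 1

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

q = PriorityJobQueue()
q.push("nightly-report", 10)
q.push("incident-page", 100)
q.push("cache-warmup", 10)
print(q.pop())  # incident-page (priority 100 beats 10)
print(q.pop())  # nightly-report (FIFO among equal priorities)
```

Concurrency limits sit on top of this ordering: even if a job is next in line, it waits when its script's concurrency key is already at its maximum.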
Benchmarks
Windmill scales linearly to 100+ workers with near-theoretical throughput. 100 workers achieve 981 jobs/sec on 100ms jobs. Dedicated workers run fibonacci in 54ms versus Lambda's 73ms. Benchmark data is publicly available and reproducible.
Frequently asked questions
Build your internal platform on Windmill
Scripts, flows, apps, and infrastructure in one place.