Learn how to control concurrency and apply rate limits using Prefect’s provided utilities.
Global concurrency limits allow you to manage execution efficiently, controlling how many tasks, flows, or other operations can run simultaneously. They are ideal for optimizing resource usage, preventing bottlenecks, and customizing task execution.
Clarification on the use of the term "tasks"
In the context of global concurrency and rate limits, "tasks" doesn't refer specifically to Prefect tasks, but to concurrent units of work in general, such as those managed by an event loop or `TaskGroup` in asynchronous programming. These general "tasks" could include Prefect tasks when they are part of an asynchronous execution environment.
Rate Limits ensure system stability by governing the frequency of requests or operations. They are suitable for preventing overuse, ensuring fairness, and handling errors gracefully.
When selecting between global concurrency and rate limits, consider your primary goal:
Choose global concurrency limits for resource optimization and task management.
Choose rate limits to maintain system stability and fair access to services.
The core difference between a rate limit and a concurrency limit is the way slots are released. With a rate limit, slots are released at a controlled rate determined by `slot_decay_per_second`. With a concurrency limit, slots are released when the concurrency manager exits.
You can create, read, edit, and delete concurrency limits through the Prefect UI or Python SDK.
When creating a concurrency limit, you can specify the following parameters:
- Name: the name used to reference the concurrency limit in your code. Special characters, such as `/`, `%`, `&`, `>`, and `<`, are not allowed.
- Concurrency Limit: the maximum number of slots that can be occupied on this concurrency limit.
- Slot Decay Per Second: controls the rate at which slots are released when the limit is used as a rate limit. This must be configured if you intend to use the `rate_limit` function.
- Active: whether or not the concurrency limit is in an active state.

Global concurrency limits can be in an `active` or `inactive` state:
- Active: slots can be occupied, and code execution blocks when a slot cannot be acquired.
- Inactive: slots are not occupied, and code execution is not blocked.
To implement rate limiting, you can configure “slot decay”, which determines how rapidly used slots are freed up for new tasks.
When you set up a concurrency limit with slot decay:
- Each time a task occupies a slot, that slot becomes temporarily unavailable to other tasks.
- The occupied slot is then released gradually over time, at a rate determined by the limit's decay setting (`slot_decay_per_second`).

To configure slot decay, set the `slot_decay_per_second` parameter when creating or updating a concurrency limit. This value determines how quickly slots refresh: a higher value frees slots quickly, allowing operations to run more often, while a lower value frees them slowly, spacing operations further apart.
For example:
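A minimal sketch of creating such a limit with the CLI; the limit name `my-rate-limit` and the specific values are illustrative:
```bash
# 10 slots in total, with each used slot freed again after one second
prefect gcl create my-rate-limit --limit 10 --slot-decay-per-second 1.0
```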
Choose a decay rate that balances your required frequency of task execution with the acceptable limit of overall system load. This allows you to fine-tune your workflow’s performance and resource usage.
You can manage global concurrency limits in the Concurrency section of the Prefect UI.
You can manage global concurrency with the Prefect CLI.
To create a new concurrency limit, use the `prefect gcl create` command. You must specify a `--limit` argument, and can optionally specify a `--slot-decay-per-second` and `--disable` argument.
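For example (the limit name `my-concurrency-limit` is illustrative):
```bash
prefect gcl create my-concurrency-limit --limit 5 --slot-decay-per-second 1.0
```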
Inspect the details of a concurrency limit using the `prefect gcl inspect` command:
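For instance, with the illustrative name from above:
```bash
prefect gcl inspect my-concurrency-limit
```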
To update a concurrency limit, use the `prefect gcl update` command. You can update the `--limit`, `--slot-decay-per-second`, `--enable`, and `--disable` arguments:
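For instance, with the illustrative name from above:
```bash
prefect gcl update my-concurrency-limit --limit 10
prefect gcl update my-concurrency-limit --disable
```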
To delete a concurrency limit, use the `prefect gcl delete` command:
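For instance, with the illustrative name from above:
```bash
prefect gcl delete my-concurrency-limit
```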
See all available commands and options by running `prefect gcl --help`.
You can manage global concurrency with the Terraform provider for Prefect.
You can manage global concurrency with the Prefect API.
The `concurrency` context manager
The `concurrency` context manager allows control over the maximum number of concurrent operations. Select either the synchronous (`sync`) or asynchronous (`async`) version, depending on your use case. Here's how to use it:
Concurrency limits are not implicitly created
When using the `concurrency` context manager, if the provided `names` of concurrency limits don't already exist, no limiting is enforced and a warning is logged.
For stricter control, use `strict=True` to instead raise an error if no matching limits exist, blocking execution of the task.
Sync
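A minimal sync sketch, assuming a pre-configured concurrency limit named `database`:
```python
from prefect import flow, task
from prefect.concurrency.sync import concurrency


@task
def process_data(x, y):
    # Occupy one slot on the 'database' limit; blocks while no slot is free
    with concurrency("database", occupy=1):
        return x + y


@flow
def my_flow():
    for x, y in [(1, 2), (2, 3), (3, 4), (4, 5)]:
        process_data.submit(x, y)


if __name__ == "__main__":
    my_flow()
```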
Async
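A comparable async sketch; here the task calls are awaited and gathered directly rather than submitted:
```python
import asyncio

from prefect import flow, task
from prefect.concurrency.asyncio import concurrency


@task
async def process_data(x, y):
    # Occupy one slot on the 'database' limit; waits while no slot is free
    async with concurrency("database", occupy=1):
        return x + y


@flow
async def my_flow():
    # Run the calls concurrently; the 'database' limit caps how many hold a slot at once
    await asyncio.gather(
        *(process_data(x, y) for x, y in [(1, 2), (2, 3), (3, 4), (4, 5)])
    )


if __name__ == "__main__":
    asyncio.run(my_flow())
```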
The code above uses the `prefect.concurrency.sync` module for sync usage and the `prefect.concurrency.asyncio` module for async usage.
It defines a `process_data` task, taking `x` and `y` as input arguments. Inside this task, the `concurrency` context manager controls concurrency, using the `database` concurrency limit and occupying one slot. If another task attempts to run with the same limit and no slots are available, that task is blocked until a slot becomes available.
A flow named `my_flow` is defined. Within this flow, it iterates through a list of tuples, each containing pairs of `x` and `y` values. For each pair, the `process_data` task is submitted with the corresponding `x` and `y` values for processing.

`rate_limit`
The Rate Limit feature provides control over the frequency of requests or operations, ensuring responsible usage and system stability.
Depending on your requirements, you can use `rate_limit` to govern both synchronous (sync) and asynchronous (async) operations.
Here’s how to make the most of it:
Slot decay
When using the `rate_limit` function, the concurrency limit must have a slot decay configured.
Sync
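A minimal sync sketch, assuming a concurrency limit named `rate-limited-api` with slot decay configured:
```python
from prefect import flow, task
from prefect.concurrency.sync import rate_limit


@task
def make_http_request():
    # Blocks until a slot is free, pacing requests according to the limit's
    # slot_decay_per_second setting
    rate_limit("rate-limited-api")
    print("Making an HTTP request...")


@flow
def my_flow():
    for _ in range(10):
        make_http_request.submit()


if __name__ == "__main__":
    my_flow()
```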
Async
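A comparable async sketch; the task calls are awaited directly here:
```python
import asyncio

from prefect import flow, task
from prefect.concurrency.asyncio import rate_limit


@task
async def make_http_request():
    # Waits until a slot is free, pacing requests according to the limit's
    # slot_decay_per_second setting
    await rate_limit("rate-limited-api")
    print("Making an HTTP request...")


@flow
async def my_flow():
    for _ in range(10):
        await make_http_request()


if __name__ == "__main__":
    asyncio.run(my_flow())
```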
The code above uses the `rate_limit` function. Use the `prefect.concurrency.sync` module for sync usage and the `prefect.concurrency.asyncio` module for async usage.
It defines a `make_http_request` task. Inside this task, the `rate_limit` function ensures that the requests are made at a controlled pace.
A flow named `my_flow` is defined. Within this flow, the `make_http_request` task is submitted 10 times.

Use `concurrency` and `rate_limit` outside of a flow
Use `concurrency` and `rate_limit` outside of a flow to control concurrency and rate limits for any operation.
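A minimal sketch, again assuming a `rate-limited-api` limit with slot decay configured; the work here is a plain async function rather than a Prefect task:
```python
import asyncio

from prefect.concurrency.asyncio import rate_limit


async def main():
    for _ in range(5):
        # Pace the loop via the limit's slot decay; no flow or task is involved
        await rate_limit("rate-limited-api")
        print("Making an HTTP request...")


if __name__ == "__main__":
    asyncio.run(main())
```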
Throttling task submission helps you avoid overloading resources, comply with external rate limits, or ensure a steady, controlled flow of work, as in the sketch below.
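A minimal sketch, assuming a concurrency limit named `task-submission` (a hypothetical name) with slot decay configured:
```python
from prefect import flow, task
from prefect.concurrency.sync import rate_limit


@task
def my_task(i):
    return i


@flow
def my_flow():
    for i in range(100):
        # Each submission waits for a slot, so tasks enter the queue at a pace
        # set by the limit's slot_decay_per_second value
        rate_limit("task-submission", occupy=1)
        my_task.submit(i)


if __name__ == "__main__":
    my_flow()
```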
In this scenario the `rate_limit` function throttles the submission of tasks. The rate limit acts as a bottleneck, ensuring that tasks are submitted at a controlled rate, governed by the `slot_decay_per_second` setting on the associated concurrency limit.
Manage the maximum number of concurrent database connections to avoid exhausting database resources.
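A minimal sketch, assuming a concurrency limit named `database` whose limit matches your connection pool size; the SQLite connection is purely illustrative:
```python
import sqlite3

from prefect import flow, task
from prefect.concurrency.sync import concurrency


@task
def database_query(query: str):
    # Request one slot on the 'database' limit. This blocks while all slots
    # (i.e. connections) are in use, so the connection cap is never exceeded.
    with concurrency("database", occupy=1):
        with sqlite3.connect("example.db") as conn:
            return conn.execute(query).fetchall()


@flow
def my_flow():
    queries = ["SELECT 1", "SELECT 2", "SELECT 3"]
    for query in queries:
        database_query.submit(query)


if __name__ == "__main__":
    my_flow()
```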
This scenario uses a concurrency limit named `database`. It has a maximum concurrency limit that matches the maximum number of database connections. The `concurrency` context manager controls the number of database connections allowed at any one time.
Limit the maximum number of parallel processing tasks.
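A minimal sketch, assuming a concurrency limit named `data-processing` with a limit of five slots; the items and the sleep stand in for real work:
```python
import asyncio

from prefect.concurrency.asyncio import concurrency


async def process_data(item):
    print(f"Processing: {item}")
    await asyncio.sleep(1)
    return f"Processed: {item}"


async def main():
    data_items = list(range(100))
    processed = []

    while data_items:
        # Occupy five slots on the 'data-processing' limit; this blocks until
        # five slots are free, then works through the next chunk of five items.
        async with concurrency("data-processing", occupy=5):
            chunk = [data_items.pop() for _ in range(5)]
            processed += await asyncio.gather(*(process_data(item) for item in chunk))

    print(processed)


if __name__ == "__main__":
    asyncio.run(main())
```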
This scenario limits the number of `process_data` tasks to five at any one time. The `concurrency` context manager requests five slots on the `data-processing` concurrency limit. This blocks until five slots are free and then submits five more tasks, ensuring that the maximum number of parallel processing tasks is never exceeded.
In addition to global concurrency limits, Prefect provides several other ways to limit concurrency for fine-grained control.
Unlike global concurrency limits, which are a more general way to control concurrency for any Python-based operation, the following concurrency limit options are specific to Prefect objects: