Commits
38 commits
ead17f9
Batch job POC
jpcamara Feb 2, 2024
bc56edd
Use ActiveSupport::IsolatedExecutionState to honor user isolation lev…
jpcamara Feb 5, 2024
504042b
Ability to retrieve batch from a job
jpcamara Feb 5, 2024
d72d42e
Allow batch jobs to be instances
jpcamara Feb 8, 2024
6ceca41
Use text so the jobs store properly on mysql
jpcamara Mar 23, 2024
953bb32
Handle on_failure and on_success
jpcamara Sep 24, 2024
bd16f4a
Allow enqueueing into a batch instance
jpcamara Sep 24, 2024
2998d74
Block enqueueing if the batch is finished
jpcamara Sep 24, 2024
7c60234
Migration to allow nesting batches
jpcamara Sep 24, 2024
fc319c9
Expanded batch readme
jpcamara Sep 26, 2024
871aef2
Force an initial batch check
jpcamara Sep 26, 2024
2f05ba9
Initial batch lifecycle tests
jpcamara Sep 26, 2024
7274e93
Add job batches to queue_schema.rb as well
jpcamara Nov 22, 2024
3ad729f
Refactor internals and api namespace of batches
jpcamara Aug 29, 2025
bc1efa4
Move away from a batch_processed_at to batch_execution model
jpcamara Sep 5, 2025
bd9a781
Reduce complexity of batches implementation
jpcamara Sep 8, 2025
af0c583
Test updates
jpcamara Sep 8, 2025
55abeaf
Create batch executions alongside ready and scheduled executions
jpcamara Sep 9, 2025
3e24358
Leftover from previous implementation
jpcamara Sep 10, 2025
0a8598a
Move batch completion checks to job
jpcamara Sep 11, 2025
0761fd2
Support rails versions that don't have after_all_transactions_commit
jpcamara Sep 11, 2025
64c3dda
Remove support for nested batches for now
jpcamara Sep 13, 2025
60424d9
Fix starting batch in rails 7.1
jpcamara Sep 13, 2025
58a236f
Helper status method
jpcamara Sep 15, 2025
6ad1be1
Remove parent/child batch relationship, which simplifies the logic
jpcamara Sep 15, 2025
7b8462a
Performance improvements
jpcamara Sep 16, 2025
6effa16
We no longer need to keep jobs
jpcamara Sep 16, 2025
554afd5
Removing pending_jobs column
jpcamara Sep 16, 2025
80af4e0
Update doc to reflect current feature state
jpcamara Sep 16, 2025
a195e25
We always save the batch first now, so we don't need to upsert
jpcamara Sep 16, 2025
6da0e9f
Rubocop
jpcamara Sep 16, 2025
8e583f1
Accidental claude.md
jpcamara Sep 16, 2025
46e117c
Allow omitting a block, which will just enqueue an empty job
jpcamara Sep 17, 2025
fc2f227
Switch batch_id to active_job_batch_id
jpcamara Oct 11, 2025
437a780
Make it so metadata is more ergonomic to include
jpcamara Oct 11, 2025
ca61ca5
Bad query field
jpcamara Oct 11, 2025
cde32c3
Update metadata interface
jpcamara Oct 11, 2025
130ea3b
Give more breathing room for CI test runs
jpcamara Oct 11, 2025
61 changes: 61 additions & 0 deletions README.md
@@ -27,6 +27,7 @@ Solid Queue can be used with SQL databases such as MySQL, PostgreSQL, or SQLite,
- [Performance considerations](#performance-considerations)
- [Failed jobs and retries](#failed-jobs-and-retries)
- [Error reporting on jobs](#error-reporting-on-jobs)
- [Batch jobs](#batch-jobs)
- [Puma plugin](#puma-plugin)
- [Jobs and transactional integrity](#jobs-and-transactional-integrity)
- [Recurring tasks](#recurring-tasks)
@@ -584,6 +585,66 @@ class ApplicationMailer < ActionMailer::Base
Rails.error.report(exception)
raise exception
end
```

## Batch jobs

Solid Queue offers support for batching jobs. This allows you to track the progress of a set of jobs,
and optionally trigger callbacks based on their status. It supports the following:

- Relating jobs to a batch, to track their status
- Three available callbacks to fire:
  - `on_finish`: Fired when all jobs have finished, including retries. Fires even when some jobs have failed.
  - `on_success`: Fired when all jobs have succeeded, including retries. Will not fire if any jobs have failed, but will fire if jobs have been discarded using `discard_on`.
  - `on_failure`: Fired when all jobs have finished, including retries. Will only fire if one or more jobs have failed.
- If a job is part of a batch, it can enqueue more jobs for that batch using `batch#enqueue`
- Attaching arbitrary metadata to a batch

```rb
class SleepyJob < ApplicationJob
  def perform(seconds_to_sleep)
    Rails.logger.info "Feeling #{seconds_to_sleep} seconds sleepy..."
    sleep seconds_to_sleep
  end
end

class BatchFinishJob < ApplicationJob
  def perform(batch) # batch is always the default first argument
    Rails.logger.info "Good job finishing all jobs"
  end
end

class BatchSuccessJob < ApplicationJob
  def perform(batch) # batch is always the default first argument
    Rails.logger.info "Good job finishing all jobs, and all of them worked!"
  end
end

class BatchFailureJob < ApplicationJob
  def perform(batch) # batch is always the default first argument
    Rails.logger.info "At least one job failed, sorry!"
  end
end

SolidQueue::Batch.enqueue(
  on_finish: BatchFinishJob,
  on_success: BatchSuccessJob,
  on_failure: BatchFailureJob,
  user_id: 123 # any other keyword arguments are stored as batch metadata
) do
  5.times.map { |i| SleepyJob.perform_later(i) }
end
```
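
The callback rules listed above can be summarised in a small plain-Ruby sketch. It is not Solid Queue code: `callbacks_to_fire` and its parameters are hypothetical names used only to illustrate which callbacks would be enqueued once every job in a batch has finished, assuming discarded jobs are not counted as failed.

```ruby
# Illustrative only: decide which batch callbacks fire after all jobs finish.
# `failed_jobs` is the number of jobs that ended in failure after retries;
# `configured` lists the callbacks that were given to the batch.
def callbacks_to_fire(failed_jobs:, configured:)
  # on_failure and on_success are mutually exclusive; on_finish fires either way.
  outcome = failed_jobs.positive? ? :on_failure : :on_success
  [ outcome, :on_finish ].select { |name| configured.include?(name) }
end

all_callbacks = %i[on_finish on_success on_failure]
callbacks_to_fire(failed_jobs: 0, configured: all_callbacks) # => [:on_success, :on_finish]
callbacks_to_fire(failed_jobs: 2, configured: all_callbacks) # => [:on_failure, :on_finish]
callbacks_to_fire(failed_jobs: 2, configured: [ :on_success ]) # => []
```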

### Batch options

When a batch is enqueued without any jobs, a `SolidQueue::Batch::EmptyJob` is enqueued so that the batch can still complete.

By default, this job runs on the `default` queue. You can specify an alternative queue for it in an initializer:

```rb
Rails.application.config.after_initialize do # or to_prepare
  SolidQueue::Batch.maintenance_queue_name = "my_batch_queue"
end
```

Review comment (Member):
Maybe we don't need a custom attribute for this 🤔 Since we're setting it in the initialiser, we can also set it like this:

    SolidQueue::Batch::EmptyJob.queue_as "my_batch_queue"

And we can delete the maintenance_queue_name everywhere in Batch.

12 changes: 12 additions & 0 deletions app/jobs/solid_queue/batch/empty_job.rb
@@ -0,0 +1,12 @@
# frozen_string_literal: true

module SolidQueue
  class Batch
    class EmptyJob < (defined?(ApplicationJob) ? ApplicationJob : ActiveJob::Base)
      def perform
        # This job does nothing - it just exists to trigger batch completion
        # The batch completion will be handled by the normal job_finished! flow
      end
    end
  end
end
157 changes: 157 additions & 0 deletions app/models/solid_queue/batch.rb
@@ -0,0 +1,157 @@
# frozen_string_literal: true

module SolidQueue
  class Batch < Record
    include Trackable

    has_many :jobs
    has_many :batch_executions, class_name: "SolidQueue::BatchExecution", dependent: :destroy

    serialize :on_finish, coder: JSON
    serialize :on_success, coder: JSON
    serialize :on_failure, coder: JSON

Comment on lines +10 to +12
Review comment (Member):
I think for these 3, I'd remove some repetition here and below:

    %w[ finish success failure ].each do |callback_type|
      serialize "on_#{callback_type}", coder: JSON

      define_method("on_#{callback_type}=") do |callback|
        super serialize_callback(callback)
      end
    end

    serialize :metadata, coder: JSON

    after_initialize :set_active_job_batch_id
    after_commit :start_batch, on: :create, unless: -> { ActiveRecord.respond_to?(:after_all_transactions_commit) }

Review comment (Contributor Author, @jpcamara, Sep 28, 2025):
There are a couple of places that use after_commit callbacks (or just do things after all transactions have committed using ActiveRecord.after_all_transactions_commit), which means they are susceptible to intermittent errors causing them to never fire. Ideally I would update the concurrency maintenance task to also check that batches actually initialize properly. But I didn't want to add anything like that until I get an overall OK on the PR's approach.


    mattr_accessor :maintenance_queue_name
    self.maintenance_queue_name = "default"

    def enqueue(&block)
      raise "You cannot enqueue a batch that is already finished" if finished?

Review comment (Member, @rosa, Nov 8, 2025):
Perhaps use a new error class here, like SolidQueue::BatchAlreadyFinished? Or something like that.


      transaction do
        save! if new_record?

        Batch.wrap_in_batch_context(id) do
          block&.call(self)
        end

        if ActiveRecord.respond_to?(:after_all_transactions_commit)
          ActiveRecord.after_all_transactions_commit do
            start_batch
          end
        end
      end
    end

    def on_success=(value)
      super(serialize_callback(value))
    end

    def on_failure=(value)
      super(serialize_callback(value))
    end

    def on_finish=(value)
      super(serialize_callback(value))
    end

    def metadata
      (super || {}).with_indifferent_access
    end

    def check_completion!

Review comment (Member):
I think I'd name this just check_completion, without the !, to follow the convention that bang-methods have a counterpart without bang that perform the same action but without raising errors.

Review comment (Member):
Ahh, I see it calls update!, maybe that's why you added the !... hmmm... even with that, since there's no counterpart that doesn't raise, I'd remove the !.

      return if finished? || !ready?

Review comment (Member):
Under which circumstances could this be called for a batch that's not ready (i.e., for which start_batch hasn't been called)? Trying to figure out if we could remove that condition from here.

      return if batch_executions.limit(1).exists?

Review comment (Member):
Why the limit(1).exists? here vs. batch_executions.any? for example?


      rows = Batch
        .where(id: id)
        .unfinished
        .empty_executions
        .update_all(finished_at: Time.current)

      return if rows.zero?

      with_lock do
        failed = jobs.joins(:failed_execution).count
        finished_attributes = {}
        if failed > 0
          finished_attributes[:failed_at] = Time.current
          finished_attributes[:failed_jobs] = failed
        end
        finished_attributes[:completed_jobs] = total_jobs - failed

        update!(finished_attributes)
        execute_callbacks

Review comment (Member):
I'd like to rewrite this method to be more clear and to make it perfectly obvious what it does. But then I realise I'm not quite sure what the two main checks here do, before we update the batch. I mean these two checks:

      return if batch_executions.limit(1).exists?
      rows = Batch.where(id: id).unfinished.empty_executions.update_all(finished_at: Time.current)
      return if rows.zero?

Could you perhaps explain these again? 🙏🏻
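
Editorial sketch, not part of the PR: going only by the code shown in the diff, the first check exits early while any batch execution remains, and the conditional `update_all` returns the number of rows it changed, so only one concurrent caller gets a non-zero result and goes on to run the callbacks. The plain-Ruby stand-in below illustrates that "only one caller wins" pattern. `FakeBatchRow` and `mark_finished_if_done` are hypothetical names, and a `Mutex` plays the role of the database's atomic conditional update.

```ruby
# Hypothetical stand-in for a batch row; not Solid Queue code.
class FakeBatchRow
  attr_reader :finished_at

  def initialize(pending_executions:)
    @pending_executions = pending_executions
    @finished_at = nil
    @lock = Mutex.new
  end

  # Mimics an UPDATE restricted to unfinished rows with no remaining executions.
  # Returns the number of rows changed: 1 for the single winner, 0 otherwise.
  def mark_finished_if_done
    @lock.synchronize do
      return 0 if @finished_at || @pending_executions.positive?

      @finished_at = Time.now
      1
    end
  end
end

row = FakeBatchRow.new(pending_executions: 0)
results = 5.times.map { Thread.new { row.mark_finished_if_done } }.map(&:value)
# Exactly one of the five callers receives 1, so callbacks would be enqueued once.
```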

      end
    end

    private

      def set_active_job_batch_id
        self.active_job_batch_id ||= SecureRandom.uuid
      end

      def as_active_job(active_job_klass)
        active_job_klass.is_a?(ActiveJob::Base) ? active_job_klass : active_job_klass.new
      end

      def serialize_callback(value)
        return value if value.blank?
        active_job = as_active_job(value)
        # We can pick up batch ids from context, but callbacks should never be considered a part of the batch
        active_job.batch_id = nil
        active_job.serialize
      end

Comment on lines +91 to +97
Review comment (Member):
I think I'd inline as_active_job here (as it doesn't feel that important of a step to have its own method, vs. the other steps here), and remove the early return (we'd have either a job class or instance or nil, so we can just return nil). Something like this:

Suggested change

    -def serialize_callback(value)
    -  return value if value.blank?
    -  active_job = as_active_job(value)
    -  # We can pick up batch ids from context, but callbacks should never be considered a part of the batch
    -  active_job.batch_id = nil
    -  active_job.serialize
    -end
    +def serialize_callback(value)
    +  if value.present?
    +    active_job = value.is_a?(ActiveJob::Base) ? value : value.new
    +    # We can pick up batch ids from context, but callbacks should never be considered a part of the batch
    +    active_job.batch_id = nil
    +    active_job.serialize
    +  end
    +end


      def perform_completion_job(job_field, attrs)

Review comment (Member):
I'd rename job_field here, because we refer to this as callback elsewhere. And since these are job callbacks, perhaps this could be called perform_callback_job. So, perform_callback_job(callback_name, attrs). One question about the attrs: is this method always called with attrs = {}? If so, could we remove that parameter?

Review comment (Member):
Thinking more about the names here, we have execute_callbacks, which calls this one for each callback, and this one enqueues the callback jobs. Then, let's just name them by what they do: enqueue_callback_jobs and enqueue_callback_job.

        active_job = ActiveJob::Base.deserialize(send(job_field))
        active_job.send(:deserialize_arguments_if_needed)
        active_job.arguments = [ self ] + Array.wrap(active_job.arguments)
        SolidQueue::Job.enqueue_all([ active_job ])

Review comment (Member):
Oh! Any reason we can't just use enqueue here, so we don't have to go and find the job ID afterwards?


        active_job.provider_job_id = Job.find_by(active_job_id: active_job.job_id).id
        attrs[job_field] = active_job.serialize

Review comment (Member):
We don't do anything with the attrs later, right? But just wondering if I'm missing something.

      end

      def execute_callbacks
        if failed_at?
          perform_completion_job(:on_failure, {}) if on_failure.present?
        else
          perform_completion_job(:on_success, {}) if on_success.present?
        end

        perform_completion_job(:on_finish, {}) if on_finish.present?
      end

      def enqueue_empty_job
        Batch.wrap_in_batch_context(id) do
          EmptyJob.set(queue: self.class.maintenance_queue_name || "default").perform_later

Review comment (Member):
I think, since the maintenance_queue_name already is default by default, we could skip that here:

Suggested change

    -EmptyJob.set(queue: self.class.maintenance_queue_name || "default").perform_later
    +EmptyJob.set(queue: self.class.maintenance_queue_name).perform_later

        end
      end

      def start_batch
        enqueue_empty_job if reload.total_jobs == 0
        update!(enqueued_at: Time.current)
      end

    class << self

Review comment (Member):
I'd move this to the beginning of the file, just to be consistent with other classes in the gem (nothing wrong with having it in the end, it's just for consistency 😅).

      def enqueue(on_success: nil, on_failure: nil, on_finish: nil, **metadata, &block)
        new.tap do |batch|
          batch.assign_attributes(
            on_success: on_success,
            on_failure: on_failure,
            on_finish: on_finish,
            metadata: metadata
          )

          batch.enqueue(&block)
        end
      end

      def current_batch_id
        ActiveSupport::IsolatedExecutionState[:current_batch_id]
      end

      def wrap_in_batch_context(batch_id)
        previous_batch_id = current_batch_id.presence || nil

Review comment (Member):
Suggested change

    -previous_batch_id = current_batch_id.presence || nil
    +previous_batch_id = current_batch_id.presence

        ActiveSupport::IsolatedExecutionState[:current_batch_id] = batch_id
        yield
      ensure
        ActiveSupport::IsolatedExecutionState[:current_batch_id] = previous_batch_id
      end
    end
  end
end
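
Editorial sketch, not part of the PR: `wrap_in_batch_context` follows a save, set, yield, restore pattern, with the restore in an `ensure` clause. The plain-Ruby version below illustrates that pattern without Rails. `BatchContext` is a hypothetical module, and `Thread.current` stands in for `ActiveSupport::IsolatedExecutionState` as an assumption for the sake of a runnable example. The nested call only demonstrates that the previous value is restored; it does not imply that nested batches are supported.

```ruby
# Hypothetical stand-in for the batch context helpers; not Solid Queue code.
module BatchContext
  KEY = :current_batch_id

  def self.current_batch_id
    Thread.current[KEY]
  end

  def self.wrap(batch_id)
    previous = current_batch_id
    Thread.current[KEY] = batch_id
    yield
  ensure
    # Runs even if the block raises, so the outer value is always restored.
    Thread.current[KEY] = previous
  end
end

seen = []
BatchContext.wrap(1) do
  seen << BatchContext.current_batch_id
  BatchContext.wrap(2) { seen << BatchContext.current_batch_id }
  seen << BatchContext.current_batch_id
end
seen << BatchContext.current_batch_id
# seen => [1, 2, 1, nil]
```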
68 changes: 68 additions & 0 deletions app/models/solid_queue/batch/trackable.rb
@@ -0,0 +1,68 @@
# frozen_string_literal: true

module SolidQueue
  class Batch
    module Trackable
      extend ActiveSupport::Concern

      included do
        scope :finished, -> { where.not(finished_at: nil) }
        scope :succeeded, -> { finished.where(failed_at: nil) }
        scope :unfinished, -> { where(finished_at: nil) }
        scope :failed, -> { where.not(failed_at: nil) }
        scope :empty_executions, -> {
          where(<<~SQL)
            NOT EXISTS (
              SELECT 1 FROM solid_queue_batch_executions
              WHERE solid_queue_batch_executions.batch_id = solid_queue_batches.id
              LIMIT 1
            )
          SQL
        }
      end

      def status
        if finished?
          failed? ? "failed" : "completed"
        elsif enqueued_at.present?
          "processing"
        else
          "pending"
        end
      end

      def failed?
        failed_at.present?
      end

      def succeeded?
        finished? && !failed?
      end

      def finished?
        finished_at.present?
      end

      def ready?
        enqueued_at.present?

Review comment (Member):
Something a bit confusing with the status here is that ready? means enqueued_at is set, but #status returns processing for that case 🤔 To me, ready suggests not yet processing, but ready to start being processed. Maybe we could forget about ready and processing here and use something that refers directly to the enqueued_at attribute: enqueued?, and status could also return enqueued.

      end

      def completed_jobs
        finished? ? self[:completed_jobs] : total_jobs - batch_executions.count
      end

      def failed_jobs
        finished? ? self[:failed_jobs] : jobs.joins(:failed_execution).count
      end

      def pending_jobs
        finished? ? 0 : batch_executions.count
      end

      def progress_percentage
        return 0 if total_jobs == 0
        ((completed_jobs + failed_jobs) * 100.0 / total_jobs).round(2)
      end
    end
  end
end
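
Editorial sketch, not part of the PR: the `status` and `progress_percentage` logic above can be exercised without a database. `BatchSnapshot` below is a hypothetical plain-Ruby struct whose fields mirror the column names used in `Trackable`, and the job counts are made-up inputs rather than values read from Solid Queue tables.

```ruby
# Hypothetical value object mirroring the Trackable columns; not Solid Queue code.
BatchSnapshot = Struct.new(
  :enqueued_at, :finished_at, :failed_at, :total_jobs, :completed_jobs, :failed_jobs,
  keyword_init: true
) do
  # Same branching as Trackable#status, using nil checks instead of present?.
  def status
    if finished_at
      failed_at ? "failed" : "completed"
    elsif enqueued_at
      "processing"
    else
      "pending"
    end
  end

  # Same arithmetic as Trackable#progress_percentage.
  def progress_percentage
    return 0 if total_jobs == 0
    ((completed_jobs + failed_jobs) * 100.0 / total_jobs).round(2)
  end
end

now = Time.now
running = BatchSnapshot.new(enqueued_at: now, total_jobs: 8, completed_jobs: 5, failed_jobs: 1)
running.status              # => "processing"
running.progress_percentage # => 75.0
```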
32 changes: 32 additions & 0 deletions app/models/solid_queue/batch_execution.rb
@@ -0,0 +1,32 @@
# frozen_string_literal: true

module SolidQueue
  class BatchExecution < Record
    belongs_to :job, optional: true
    belongs_to :batch

    after_commit :check_completion, on: :destroy

    private
      def check_completion
        batch = Batch.find_by(id: batch_id)
        batch.check_completion! if batch.present?
      end

    class << self
      def create_all_from_jobs(jobs)
        batch_jobs = jobs.select { |job| job.batch_id.present? }
        return if batch_jobs.empty?

        batch_jobs.group_by(&:batch_id).each do |batch_id, jobs|
          BatchExecution.insert_all!(jobs.map { |job|
            { batch_id:, job_id: job.respond_to?(:provider_job_id) ? job.provider_job_id : job.id }
          })

          total = jobs.size
          SolidQueue::Batch.where(id: batch_id).update_all([ "total_jobs = total_jobs + ?", total ])
        end
      end
    end
  end
end
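
Editorial sketch, not part of the PR: `create_all_from_jobs` filters out jobs without a batch, groups the rest by batch, inserts one execution row per job, and increments each batch's `total_jobs` by the group size. The plain-Ruby version below reproduces only the grouping and counting, with no database access. `FakeJob` and `batch_execution_plan` are hypothetical names introduced for illustration.

```ruby
# Hypothetical job stand-in; not Solid Queue code.
FakeJob = Struct.new(:id, :batch_id)

# Returns { batch_id => { rows: [...], increment: n } } for jobs that belong to a batch.
# `rows` corresponds to what would be bulk-inserted, `increment` to the total_jobs bump.
def batch_execution_plan(jobs)
  jobs
    .reject { |job| job.batch_id.nil? }
    .group_by(&:batch_id)
    .transform_values do |batch_jobs|
      {
        rows: batch_jobs.map { |job| { batch_id: job.batch_id, job_id: job.id } },
        increment: batch_jobs.size
      }
    end
end

jobs = [ FakeJob.new(1, "a"), FakeJob.new(2, nil), FakeJob.new(3, "a"), FakeJob.new(4, "b") ]
plan = batch_execution_plan(jobs)
plan["a"][:increment] # => 2
plan["b"][:rows]      # => [{ batch_id: "b", job_id: 4 }]
```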
23 changes: 23 additions & 0 deletions app/models/solid_queue/execution/batchable.rb
@@ -0,0 +1,23 @@
# frozen_string_literal: true

module SolidQueue
  class Execution
    module Batchable
      extend ActiveSupport::Concern

      included do
        after_create :update_batch_progress, if: -> { job.batch_id? }
      end

      private
        def update_batch_progress
          if is_a?(FailedExecution)
            # FailedExecutions are only created when the job is done retrying
            job.batch_execution&.destroy!
          end
        rescue => e
          Rails.logger.error "[SolidQueue] Failed to notify batch #{job.batch_id} about job #{job.id} failure: #{e.message}"
        end
    end
  end
end
2 changes: 1 addition & 1 deletion app/models/solid_queue/failed_execution.rb
@@ -2,7 +2,7 @@

 module SolidQueue
   class FailedExecution < Execution
-    include Dispatching
+    include Dispatching, Batchable

     serialize :error, coder: JSON

5 changes: 3 additions & 2 deletions app/models/solid_queue/job.rb
@@ -4,7 +4,7 @@ module SolidQueue
   class Job < Record
     class EnqueueError < StandardError; end

-    include Executable, Clearable, Recurrable
+    include Executable, Clearable, Recurrable, Batchable

     serialize :arguments, coder: JSON

@@ -62,7 +62,8 @@ def attributes_from_active_job(active_job)
             scheduled_at: active_job.scheduled_at,
             class_name: active_job.class.name,
             arguments: active_job.serialize,
-            concurrency_key: active_job.concurrency_key
+            concurrency_key: active_job.concurrency_key,
+            batch_id: active_job.batch_id
           }
         end
       end