to: Resque
from: Backgroundrb
@kbrock
2010-10-12
Summary

- Background queues let us defer logic outside the browser request and response.

- BackgrounDRb was crashing often for us. We moved to Resque and it hasn't crashed since.

- BackgrounDRb is easier to run out of the box.

- Adding just a little code makes Resque just as easy, without sacrificing its added flexibility.
Why did we upgrade?

- bdrb paged the boss 4 times during my first weekend

    - memory leaks caused crashes

    - monit can't restart workers in backgroundrb

- move to an active project (à la heroku, github, redis)
What does each bring to the table?

                                  bdrb    resque
   adhoc (out of request)         yes     yes
   delay (run/remind)             yes     resque-scheduler
   schedule (cron)                yes     resque-scheduler
   mail (invisible/out of req)    code    resque_mailer
   status reporting               code    resque-meta, web

backgroundrb does most of what we need out of the box
resque has plugins to make up the difference
Bdrb Components

[Diagram: rails enqueue, main queue, queue manager, work, scheduler, workers, mailer, and bdrb yml. Legend: we started / monitored / data.]

simple w/ 1 queue (add started_at for delayed jobs)
scheduler is a special worker - managed by 1 process (is a runner/worker)
Resque Components

[Diagram: rails enqueue, delayed queue, scheduler, schedule, main queues, work, workers, rake, mailer, and resque web, with numbered callouts 1 to 6. Legend: we started / monitored / data.]

many moving parts
simplified in that all workers are the same
scheduler simply adds entries to the queue (instead of being a MetaWorker that runs jobs)
web ui is a nice touch
1. Ad-hoc Enqueuing

                          bdrb    resque
    args                  hash    ruby, checked
    enqueue AR objects    yes
    mail (invisible)      yes     yes

AR objects - crept into the action_mailer deliver calls
Looks like bdrb wins here, but not enqueuing AR objects is best practice anyway
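The point about not enqueuing AR objects can be sketched in plain Ruby. Resque stores each job as a JSON payload with `class` and `args` keys, so arguments should be simple values such as a record id. `Record`, `STORE`, and `SendReport` below are hypothetical stand-ins (no Rails, Redis, or Resque is loaded); the JSON round trip only imitates what the queue does.

```ruby
require 'json'

# Hypothetical stand-in for an ActiveRecord model and its table.
Record = Struct.new(:id, :name)
STORE  = { 42 => Record.new(42, 'weekly report') }

# Hypothetical job class: it receives an id and reloads the record itself.
class SendReport
  @queue = :mail

  def self.perform(record_id)
    record = STORE.fetch(record_id)
    "sending #{record.name}"
  end
end

# Imitate the queue: arguments are serialized to JSON on enqueue...
payload = JSON.generate('class' => 'SendReport', 'args' => [42])

# ...and parsed again by the worker, which looks the class up by name.
job    = JSON.parse(payload)
result = Object.const_get(job['class']).perform(*job['args'])
puts result
```

Passing the id keeps the payload small and means the worker always sees the current state of the record rather than a stale copy.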
Ad-hoc/Delayed (bdrb)
    class JobWorker < BackgrounDRb::MetaWorker
      set_worker_name :job_worker

      # the work itself; runs inside the bdrb worker process
      def purge_job_logs()
        JobLog.purge_expired!
        persistent_job.finish!
      end

      # enqueue for processing as soon as a worker is free
      def self.perform_later(*args)
        MiddleMan.worker(:job_worker).enq_purge_job_logs(
          :job_key => new_job_key, :arg => args)
      end

      # enqueue for processing at a given time (the first argument)
      def self.perform_at(*args)
        time = args.shift
        MiddleMan.worker(:job_worker).enq_purge_job_logs(
          :job_key => new_job_key, :arg => args, :scheduled_at => time)
      end

      # key identifying this enqueued job
      def self.new_job_key()
        "purge_job_logs_#{ActiveSupport::SecureRandom.hex(8)}"
      end
    end
don't need to do a command pattern (our code didn't)
scheduled_at = beauty of SQL
parent class
enqueue knows queue name (code not loaded)
Ad-hoc/Delayed (resque)
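    class PurgeJobLogs
      @queue = :job_worker

      # Resque workers call the class-level perform method
      def self.perform
        JobLog.purge_expired!
      end

      # enqueue for processing as soon as a worker is free
      def self.perform_later(*args)
        Resque.enqueue(self, *args)
      end

      # enqueue at a given time (the first argument); needs resque-scheduler
      def self.perform_at(*args)
        time = args.shift
        Resque.enqueue_at(time, self, *args)
      end
    end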




Enqueue needs worker class to know the name of the queue
(even if called directly into Resque)
interface only (perform_{at,later}) -> abstracted out to parent?
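One way the `perform_later` / `perform_at` interface might be abstracted out is a shared module rather than a parent class. This is only a sketch: `Deferrable` and `RecordingBackend` are hypothetical names, and the backend is an in-memory recorder so the example runs without Redis. In the real app the two backend calls would be `Resque.enqueue` and `Resque.enqueue_at`.

```ruby
# Hypothetical in-memory backend; stands in for Resque so the sketch is runnable.
module RecordingBackend
  def self.jobs
    @jobs ||= []
  end

  def self.enqueue(klass, *args)
    jobs << [:now, klass, args]
  end

  def self.enqueue_at(time, klass, *args)
    jobs << [time, klass, args]
  end
end

# Hypothetical mixin holding the shared enqueue interface.
module Deferrable
  def backend
    RecordingBackend # assumption: swap for Resque in the application
  end

  def perform_later(*args)
    backend.enqueue(self, *args)
  end

  def perform_at(time, *args)
    backend.enqueue_at(time, self, *args)
  end
end

class PurgeJobLogs
  extend Deferrable
  @queue = :job_worker

  def self.perform(*_args)
    # the real job would call JobLog.purge_expired!
  end
end

PurgeJobLogs.perform_later
PurgeJobLogs.perform_at(Time.at(0), 7)
puts RecordingBackend.jobs.length
```

Each job class then only declares its queue and its `perform` method.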
2. Scheduled Enqueuing

                          bdrb      resque
    sched any method      yes x2    command
    scheduler             yes       yes+
    adhoc jobs                      yes

bdrb: need to define the schedule in 2 places, yml and ruby
We ran into a case where this caused a problem
resque: web ui makes it easy to kick off commands ad hoc (very useful in staging)
Scheduled (bdrb)
    :backgroundrb:
      :ip: 127.0.0.1
      :port: 11006
      :environment: development

    :schedules:
      :scheduled_worker:
        :purge_job_logs:
          :trigger_args: 0 */5 * * * *




Evidence of framework - scheduled_worker defined here, need meta worker (so it can be run)
Scheduled (bdrb)

    class ScheduledWorker < BackgrounDRb::MetaWorker
      extend BdrbUtils::CronExtensions
      set_worker_name :scheduled_worker

      threaded_cron_job(:purge_job_logs) { JobLog.purge_expired! }
    end




scheduler = MetaWorker. Defined 2 times - so it calls your code, so can call "any static method"
Scheduled (resque)
     ---
     clear_logs:
       cron: "*/10 * * * *"
       class: PurgeJobLogs
       queue: job_worker
       description: Remove old logs




queue_name (so scheduler does not need to load worker into memory to enqueue)
cron is standard format (remove 'seconds') - commands
scheduler in separate process. (can run when workers are stopped / changed) - minimal env
scheduler injects into queue (vs runs jobs) - so can adhoc inject via web
no ruby code for this
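The notes above can be illustrated with a small sketch of what a scheduler needs in order to enqueue an entry: only the class name and queue name as strings, so the worker code is never loaded. The payload shape (`class` and `args` keys) follows Resque's JSON job format; the `queue:<name>` key and the variable names are illustrative assumptions, and nothing is pushed to Redis here.

```ruby
require 'yaml'
require 'json'

# The schedule entry from the slide, inlined so the sketch is self-contained.
schedule = YAML.load(<<~YML)
  clear_logs:
    cron: "*/10 * * * *"
    class: PurgeJobLogs
    queue: job_worker
    description: Remove old logs
YML

entry = schedule.fetch('clear_logs')

# The class stays a plain string; PurgeJobLogs is never defined or loaded here.
payload   = JSON.generate('class' => entry['class'], 'args' => [])
queue_key = "queue:#{entry['queue']}" # illustrative key name

puts "#{queue_key} <- #{payload}"
```

Because the scheduler only injects a payload like this, the same entry can also be injected ad hoc from the web ui.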
3. Processes/Worker management

                          bdrb    resque
    knows queues          yes     us, command, web
    pids                  yes     us+
    mem leak resistant            yes
    workers/queue         1       <1 - ∞
    pause workers                 yes

Discover previous queues (not all) via 'resque list' / web
bdrb: creates 1 worker/queue (creates pid file + 1 pid for backgroundrb) - monit can't restart
we manage processes: 1+ workers/queue - 1+ queues / worker
pause/restart workers
worker list (resque)
    primary:
      queues: background,mail
    secondary:
      queues: mail,background




can have multiple workers running the same queues
can have multiple queues in 1 worker
worker pool can be generalized, response focused, schedule focused, or changed at runtime
inverted priority list - prevents starvation
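A minimal simulation of why the inverted priority list helps, assuming (as Resque does) that a worker checks its queues in the order listed. The job names and the `reserve` helper are hypothetical; the worker list is the one from the slide.

```ruby
require 'yaml'

# Worker list from the slide, inlined so the sketch is self-contained.
workers = YAML.load(<<~YML)
  primary:
    queues: background,mail
  secondary:
    queues: mail,background
YML

# Hypothetical pending jobs per queue.
pending = { 'background' => %w[b1 b2], 'mail' => %w[m1 m2] }

# Take the first job found, checking queues in the order the worker lists them.
def reserve(pending, queue_list)
  queue_list.split(',').each do |name|
    job = pending.fetch(name, []).shift
    return job if job
  end
  nil
end

first_round = workers.map { |name, cfg| [name, reserve(pending, cfg['queues'])] }.to_h
puts first_round.inspect
```

With both queues busy, `primary` takes a background job while `secondary` takes a mail job, so neither queue waits for the other to drain.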
4. Running Workers

require 'yaml'

namespace :resque do
  desc 'start all background resque daemons'
  task :start_daemons do
    # the scheduler runs as its own daemon, plus one daemon per worker entry
    mrake_start "resque_scheduler resque:scheduler"
    workers_config.each do |worker, config|
      mrake_start "resque_#{worker} resque:work QUEUE=#{config['queues']}"
    end
  end
  desc 'stop all background resque daemons'
  task :stop_daemons do
    sh "./script/monit_rake stop resque_scheduler"
    workers_config.each do |worker, config|
      # QUIT lets the worker finish its current job before exiting
      sh "./script/monit_rake stop resque_#{worker} -s QUIT"
    end
  end
  # worker name => { 'queues' => 'a,b' }, read from WORKER_YML or the default path
  def self.workers_config
    YAML.load_file(ENV['WORKER_YML'] || 'config/resque_workers.yml')
  end
  # start a rake task in the background through the monit_rake wrapper
  def self.mrake_start(task)
    sh "nohup ./script/monit_rake start #{task} RAILS_ENV=#{ENV['RAILS_ENV']} >> log/daemons.log &"
  end
end
Deploying (cap)
namespace :resque do
  desc "Stop the resque daemon"
  task :stop, :roles => :resque do
    run "cd #{current_path} && RAILS_ENV=#{rails_env} WORKER_YML=#{resque_workers_yml} rake resque:stop_daemons; true"
  end

  desc "Start the resque daemon"
  task :start, :roles => :resque do
    run "cd #{current_path} && RAILS_ENV=#{rails_env} WORKER_YML=#{resque_workers_yml} rake resque:start_daemons"
  end
end
5. Monitoring Workers (monit.erb)
    check process resque_scheduler
        with pidfile <%= @rails_root %>/tmp/pids/resque_scheduler.pid
        group resque
        alert errors@domain.com
        start program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake start resque_scheduler resque:scheduler'"
        stop program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake stop resque_scheduler'"

    <% YAML.load_file(File.join(Rails.root.to_s, 'config/production/resque/resque_workers.yml')).each_pair do |worker, config| %>
    check process resque_<%=worker%>
        with pidfile <%= @rails_root %>/tmp/pids/resque_<%=worker%>.pid
        group resque
        alert errors@domain.com
        start program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake start resque_<%=worker%> resque:work QUEUE=<%=config['queues']%>'"
        stop program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake stop resque_<%=worker%>'"

    <% end %>




use template to generate monit file
Monitoring Rake Processes
#!/bin/sh
# wrapper to daemonize rake tasks: see also http://mmonit.com/wiki/Monit/FAQ#pidfile

usage() {
  echo "usage: ${0} [start|stop] name target [arguments]"
  printf '\tname is used to create or read the log and pid file names\n'
  printf '\tfor start: target and arguments are passed to rake\n'
  printf '\tfor stop: target and arguments are passed to kill (e.g.: -n 3)\n'
  exit 1
}
[ $# -lt 2 ] && usage

cmd=$1
name=$2
shift ; shift

pid_file=./tmp/pids/${name}.pid
log_file=./log/${name}.log

# ...
Monitoring Processes
case $cmd in
  start)
     if [ ${#} -eq 0 ] ; then
        printf '\nERROR: missing target\n\n'
        usage
     fi
     pid=`cat ${pid_file} 2> /dev/null`
     if [ -n "${pid}" ] ; then
        # refuse to start if the recorded pid is still alive
        ps ${pid}
        if [ $? -eq 0 ] ; then
           echo "ensure process ${name} (pid: ${pid_file}) is not running"
           exit 1
        fi
     fi
     # record our own pid, then exec so rake keeps that pid
     echo $$ > ${pid_file}
     # send both stdout and stderr to the log file
     exec rake "$@" >> ${log_file} 2>&1 ;;
  stop)
     pid=`cat ${pid_file} 2> /dev/null`
     [ -n "${pid}" ] && kill "$@" ${pid}
     rm -f ${pid_file} ;;
  *) usage ;;
esac
Monitoring Web
6. Running Web
namespace :resque do
  task :setup => :environment

  desc 'kick off resque-web'
  task :web => :environment do
    $stdout.sync=true
    $stderr.sync=true
    puts `env RAILS_ENV=#{RAILS_ENV} resque-web #{RAILS_ROOT}/config/initializers/resque.rb`
  end
end
initializer
#this runs in sinatra and rails - so don't use Rails.env
rails_env = ENV['RAILS_ENV'] || 'development'
rails_root=ENV['RAILS_ROOT'] || File.join(File.dirname(__FILE__),'../..')

redis_config = YAML.load_file(rails_root + '/config/redis.yml')
Resque.redis = redis_config[rails_env]

require 'resque_scheduler'
require 'resque/plugins/meta'
require 'resque_mailer'

Resque.schedule = YAML.load_file(rails_root+'/config/resque_schedule.yml')
Resque::Mailer.excluded_environments = [:test, :cucumber]
5. Monitoring Work

                      bdrb                 resque
    ad-hoc queries    SQL                  redis query
    did it run?       custom               resque-meta
    did it fail?      hoptoad              yes
    rerun                                  yes
    have id           yes                  resque-meta
    queue health      sample controller    yes
Did the job run?
resque assumes every job worked and only tells you about failures. That was not good enough for us.
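A rough sketch of one way to answer "did it run?" is to record a status around each job. Resque treats class methods whose names start with `around_perform` as hooks, passes them the job arguments, and runs the job when the hook yields. `RecordsStatus` and the in-memory `STATUS` hash are hypothetical (the deck itself relies on resque-meta for this), and the hook is called by hand here because no worker is running.

```ruby
# Hypothetical in-memory status store; a real setup would use Redis or SQL.
STATUS = {}

# Hypothetical mixin that records whether a job started, finished, or failed.
module RecordsStatus
  def around_perform_record_status(*args)
    key = [name, args]
    STATUS[key] = :started
    yield
    STATUS[key] = :finished
  rescue StandardError
    STATUS[key] = :failed
    raise
  end
end

class PurgeJobLogs
  extend RecordsStatus
  @queue = :job_worker

  def self.perform
    :purged # the real job would call JobLog.purge_expired!
  end
end

# Call the hook by hand, the way a worker would wrap perform.
PurgeJobLogs.around_perform_record_status { PurgeJobLogs.perform }
puts STATUS[['PurgeJobLogs', []]]
```

A status page or health check can then read the store to confirm that a scheduled job actually completed.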
Pausing Workers

   signal        what happens                     when to use
   QUIT          wait for child & exit            graceful shutdown
   TERM / INT    immediately kill child & exit    shutdown now
   USR1          immediately kill child           stale child
   USR2          don't start any new jobs
   CONT          start to process new jobs
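The USR2 / CONT pair in the table can be sketched with plain Ruby signal handlers. This is not Resque's code: `paused` is a hypothetical flag standing in for a worker's paused state, and the process signals itself so the example runs on its own (POSIX systems only).

```ruby
# Hypothetical flag standing in for a worker's "paused" state.
paused = false

Signal.trap('USR2') { paused = true  } # stop picking up new jobs
Signal.trap('CONT') { paused = false } # resume picking up new jobs

# Signal ourselves, as an operator would with `kill -USR2 <pid>`.
Process.kill('USR2', Process.pid)
sleep 0.1 # give the handler time to run
paused_after_usr2 = paused

Process.kill('CONT', Process.pid)
sleep 0.1
paused_after_cont = paused

puts "after USR2: #{paused_after_usr2}, after CONT: #{paused_after_cont}"
```

A worker loop would check such a flag before reserving the next job, which is what makes a pause safe for the job already in progress.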
Testing Worker

                        bdrb        resque
    testing queue       mid-easy    resque_unit
    testing command                 yes
    all workers same                yes
    interface only                  yes
Mail

Resque::Mailer.excluded_environments = [:test, :cucumber]
Extending with Hooks

    resque hooks
    around_enqueue    no
    after_enqueue     yes
    before_perform    yes
    around_perform    yes/no
    after_perform     yes

all plugins want to extend enqueue - not compatible
need to be able to alter arguments (e.g.: add id for meta plugins)
Conclusion

- Boss got no pages in the first month after the switch

    - no memory leaks, great uptime (don't need monit...)

- Fast

    - generalized workers increase throughput (nightly vs 1 hour)

- minimal custom code

- still some intimidation

- Eating flavor of the month
References

- coders: @kbrock and @wpeterson

- great company: PatientsLikeMe (encouraged sharing this)

- resque_mailer

- resque-scheduler

- resque-meta

- monit, hoptoad, rpm_contrib

Contenu connexe

En vedette

Gearman, Supervisor and PHP - Job Management with Sanity!
Gearman, Supervisor and PHP - Job Management with Sanity!Gearman, Supervisor and PHP - Job Management with Sanity!
Gearman, Supervisor and PHP - Job Management with Sanity!Abu Ashraf Masnun
 
Gearman and asynchronous processing in PHP applications
Gearman and asynchronous processing in PHP applicationsGearman and asynchronous processing in PHP applications
Gearman and asynchronous processing in PHP applicationsTeamskunkworks
 
Scale like a pro with Gearman
Scale like a pro with GearmanScale like a pro with Gearman
Scale like a pro with GearmanAmal Raghav
 
Distributed Queue System using Gearman
Distributed Queue System using GearmanDistributed Queue System using Gearman
Distributed Queue System using GearmanEric Cho
 
Gearman: A Job Server made for Scale
Gearman: A Job Server made for ScaleGearman: A Job Server made for Scale
Gearman: A Job Server made for ScaleMike Willbanks
 

En vedette (6)

Gearman, Supervisor and PHP - Job Management with Sanity!
Gearman, Supervisor and PHP - Job Management with Sanity!Gearman, Supervisor and PHP - Job Management with Sanity!
Gearman, Supervisor and PHP - Job Management with Sanity!
 
Gearman and asynchronous processing in PHP applications
Gearman and asynchronous processing in PHP applicationsGearman and asynchronous processing in PHP applications
Gearman and asynchronous processing in PHP applications
 
Scale like a pro with Gearman
Scale like a pro with GearmanScale like a pro with Gearman
Scale like a pro with Gearman
 
Gearman for MySQL
Gearman for MySQLGearman for MySQL
Gearman for MySQL
 
Distributed Queue System using Gearman
Distributed Queue System using GearmanDistributed Queue System using Gearman
Distributed Queue System using Gearman
 
Gearman: A Job Server made for Scale
Gearman: A Job Server made for ScaleGearman: A Job Server made for Scale
Gearman: A Job Server made for Scale
 

Similaire à Migrating from Backgroundrb to Resque

FireWorks workflow software
FireWorks workflow softwareFireWorks workflow software
FireWorks workflow softwareAnubhav Jain
 
Introduction to Python Celery
Introduction to Python CeleryIntroduction to Python Celery
Introduction to Python CeleryMahendra M
 
To Batch Or Not To Batch
To Batch Or Not To BatchTo Batch Or Not To Batch
To Batch Or Not To BatchLuca Mearelli
 
Spring Batch Behind the Scenes
Spring Batch Behind the ScenesSpring Batch Behind the Scenes
Spring Batch Behind the ScenesJoshua Long
 
MEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop MicrosoftMEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop MicrosoftLee Stott
 
High Performance Computing - Cloud Point of View
High Performance Computing - Cloud Point of ViewHigh Performance Computing - Cloud Point of View
High Performance Computing - Cloud Point of Viewaragozin
 
Perly Parsing with Regexp::Grammars
Perly Parsing with Regexp::GrammarsPerly Parsing with Regexp::Grammars
Perly Parsing with Regexp::GrammarsWorkhorse Computing
 
Hanborq optimizations on hadoop map reduce 20120221a
Hanborq optimizations on hadoop map reduce 20120221aHanborq optimizations on hadoop map reduce 20120221a
Hanborq optimizations on hadoop map reduce 20120221aSchubert Zhang
 
Async and parallel patterns and application design - TechDays2013 NL
Async and parallel patterns and application design - TechDays2013 NLAsync and parallel patterns and application design - TechDays2013 NL
Async and parallel patterns and application design - TechDays2013 NLArie Leeuwesteijn
 
Big data unit iv and v lecture notes qb model exam
Big data unit iv and v lecture notes   qb model examBig data unit iv and v lecture notes   qb model exam
Big data unit iv and v lecture notes qb model examIndhujeni
 
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...Paolo Negri
 
Hanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduceHanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduceHanborq Inc.
 
Background processing with Resque
Background processing with ResqueBackground processing with Resque
Background processing with ResqueNicolas Blanco
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce ParadigmDilip Reddy
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce ParadigmDilip Reddy
 
Closing the DevOps gaps
Closing the DevOps gapsClosing the DevOps gaps
Closing the DevOps gapsdev2ops
 
Background Jobs - Com BackgrounDRb
Background Jobs - Com BackgrounDRbBackground Jobs - Com BackgrounDRb
Background Jobs - Com BackgrounDRbJuan Maiz
 
A Tale of a Server Architecture (Frozen Rails 2012)
A Tale of a Server Architecture (Frozen Rails 2012)A Tale of a Server Architecture (Frozen Rails 2012)
A Tale of a Server Architecture (Frozen Rails 2012)Flowdock
 

Similaire à Migrating from Backgroundrb to Resque (20)

FireWorks workflow software
FireWorks workflow softwareFireWorks workflow software
FireWorks workflow software
 
Introduction to Python Celery
Introduction to Python CeleryIntroduction to Python Celery
Introduction to Python Celery
 
To Batch Or Not To Batch
To Batch Or Not To BatchTo Batch Or Not To Batch
To Batch Or Not To Batch
 
Spring Batch Behind the Scenes
Spring Batch Behind the ScenesSpring Batch Behind the Scenes
Spring Batch Behind the Scenes
 
MEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop MicrosoftMEW22 22nd Machine Evaluation Workshop Microsoft
MEW22 22nd Machine Evaluation Workshop Microsoft
 
High Performance Computing - Cloud Point of View
High Performance Computing - Cloud Point of ViewHigh Performance Computing - Cloud Point of View
High Performance Computing - Cloud Point of View
 
Perly Parsing with Regexp::Grammars
Perly Parsing with Regexp::GrammarsPerly Parsing with Regexp::Grammars
Perly Parsing with Regexp::Grammars
 
Hanborq optimizations on hadoop map reduce 20120221a
Hanborq optimizations on hadoop map reduce 20120221aHanborq optimizations on hadoop map reduce 20120221a
Hanborq optimizations on hadoop map reduce 20120221a
 
Async and parallel patterns and application design - TechDays2013 NL
Async and parallel patterns and application design - TechDays2013 NLAsync and parallel patterns and application design - TechDays2013 NL
Async and parallel patterns and application design - TechDays2013 NL
 
Big data unit iv and v lecture notes qb model exam
Big data unit iv and v lecture notes   qb model examBig data unit iv and v lecture notes   qb model exam
Big data unit iv and v lecture notes qb model exam
 
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...
Distributed and concurrent programming with RabbitMQ and EventMachine Rails U...
 
Hanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduceHanborq Optimizations on Hadoop MapReduce
Hanborq Optimizations on Hadoop MapReduce
 
Background processing with Resque
Background processing with ResqueBackground processing with Resque
Background processing with Resque
 
MySQL Proxy tutorial
MySQL Proxy tutorialMySQL Proxy tutorial
MySQL Proxy tutorial
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce Paradigm
 
MapReduce Paradigm
MapReduce ParadigmMapReduce Paradigm
MapReduce Paradigm
 
Closing the DevOps gaps
Closing the DevOps gapsClosing the DevOps gaps
Closing the DevOps gaps
 
Background Jobs - Com BackgrounDRb
Background Jobs - Com BackgrounDRbBackground Jobs - Com BackgrounDRb
Background Jobs - Com BackgrounDRb
 
Kanban vs scrum
Kanban vs scrumKanban vs scrum
Kanban vs scrum
 
A Tale of a Server Architecture (Frozen Rails 2012)
A Tale of a Server Architecture (Frozen Rails 2012)A Tale of a Server Architecture (Frozen Rails 2012)
A Tale of a Server Architecture (Frozen Rails 2012)
 

Dernier

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 

Dernier (20)

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 

Migrating from Backgroundrb to Resque

  • 2. Summary
      • Background queues let us defer logic outside the browser request and response.
      • BackgrounDRb (bdrb) was crashing for us often. We moved to Resque and it hasn't crashed since.
      • BackgrounDRb is easier to run out of the box.
      • Adding just a little code makes Resque just as easy to run, without sacrificing its added flexibility.
  • 3. Why did we upgrade?
      • bdrb paged the boss 4 times during my first weekend
          • memory leaks caused crashes
          • monit can't restart workers in backgroundrb
      • move to an active project (à la heroku, github, redis)
  • 4. What does each bring to the table?

                                        bdrb    resque
      adhoc (out of request)            ✓       ✓
      delay (run/remind)                ✓       resque-scheduler
      schedule (cron)                   ✓       resque-scheduler
      mail (invisible/out of request)   code    resque_mailer
      status reporting                  code    resque-meta, web

      backgroundrb does most of what we need out of the box; resque has plugins to make up the difference.
  • 5. Bdrb Components
      [Diagram. Labels: rails enqueue, main queue, queue manager, work, scheduler, workers, mailer. Legend: we started, monitored, data.]
      • bdrb yml is simple, with 1 queue (add started_at for delayed jobs)
      • the scheduler is a special worker, managed by 1 process (it is a runner/worker)
  • 6. Resque Components
      [Diagram with parts numbered 1 to 6. Labels: rails enqueue, rake, resque web, delayed queue, scheduler, schedule, main queue, work, workers, mailer. Legend: we started, monitored, data.]
      • many moving parts
      • simplified in that all workers are the same
      • the scheduler simply adds entries to the queue (instead of a MetaWorker running the jobs)
      • the web UI is a nice touch
  • 7. 1. Ad-hoc Enqueuing

                             bdrb    resque
      args                   hash    ruby, checked
      enqueue AR objects     ✓
      mail (invisible)       ✓       ✓

      AR objects crept into the action_mailer deliver calls.
      It looks like bdrb wins here, but not enqueuing AR objects is best practice.
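The best-practice note on slide 7 follows from how Resque stores jobs: the class name and arguments are serialized to JSON, so only simple values survive the trip through the queue. A minimal sketch of passing an id instead of a record; `JOB_LOGS`, the `JobLog` struct and `ReportJobLog` are hypothetical stand-ins for an ActiveRecord model and a job, and the round trip is simulated with the standard `json` library rather than a real Redis queue:

```ruby
require 'json'

# Hypothetical stand-in for an ActiveRecord model (not from the slides).
JobLog = Struct.new(:id, :message)
JOB_LOGS = { 42 => JobLog.new(42, 'nightly purge') }

class ReportJobLog
  @queue = :job_worker

  # Receives only the id; the record is looked up when the job runs,
  # so the worker always sees current data.
  def self.perform(job_log_id)
    record = JOB_LOGS.fetch(job_log_id)
    "processed #{record.message}"
  end
end

# Simulate the queue round trip: the payload is plain JSON text.
payload = JSON.generate('class' => 'ReportJobLog', 'args' => [42])
decoded = JSON.parse(payload)
result  = Object.const_get(decoded['class']).perform(*decoded['args'])
```

Enqueuing the record itself would at best store a snapshot of its attributes, which can be stale by the time a worker picks the job up.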
  • 8. Ad-hoc/Delayed (bdrb)

      class JobWorker < BackgrounDRb::MetaWorker
        set_worker_name :job_worker

        def purge_job_logs()
          JobLog.purge_expired!
          persistent_job.finish!
        end

        # enqueue for processing as soon as a worker is free
        def self.perform_later(*args)
          MiddleMan.worker(:job_worker).enq_purge_job_logs(
            :job_key => new_job_key, :arg => args)
        end

        # enqueue for processing at a given time (the first argument)
        def self.perform_at(*args)
          time = args.shift
          MiddleMan.worker(:job_worker).enq_purge_job_logs(
            :job_key => new_job_key, :arg => args, :scheduled_at => time)
        end

        def self.new_job_key()
          "purge_job_logs_#{ActiveSupport::SecureRandom.hex(8)}"
        end
      end

      • don't need to use a command pattern (our code didn't)
      • scheduled_at = beauty of SQL
      • the parent class's enqueue knows the queue name (code not loaded)
  • 9. Ad-hoc/Delayed (resque)

      class PurgeJobLogs
        @queue = :job_worker

        # Resque workers invoke the class-level perform method
        def self.perform()
          JobLog.purge_expired!
        end

        def self.perform_later(*args)
          Resque.enqueue(self, *args)
        end

        # the first argument is the time to run at (needs resque-scheduler)
        def self.perform_at(*args)
          time = args.shift
          Resque.enqueue_at(time, self, *args)
        end
      end

      • enqueue needs the worker class to know the name of the queue (even if called directly into Resque)
      • interface only (perform_{at,later}) -> abstract out to a parent?
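One way to act on the "abstract out to a parent?" question on slide 9 is a small mixin that every job extends. This is only a sketch: the `Deferrable` module and `RecordingBackend` class are hypothetical names, not part of Resque or the slides. In a real application the backend would be `Resque` itself with resque-scheduler loaded (which supplies `enqueue_at`); here an in-memory recorder stands in so the example runs on its own.

```ruby
# Hypothetical mixin: jobs extend it to get perform_later / perform_at
# without repeating the enqueue calls in every class.
module Deferrable
  # Object that receives the enqueue calls (Resque in a real app).
  def self.backend
    @backend
  end

  def self.backend=(backend)
    @backend = backend
  end

  def perform_later(*args)
    Deferrable.backend.enqueue(self, *args)
  end

  def perform_at(time, *args)
    Deferrable.backend.enqueue_at(time, self, *args)
  end
end

# In-memory stand-in for Resque that records what was enqueued.
class RecordingBackend
  attr_reader :calls

  def initialize
    @calls = []
  end

  def enqueue(klass, *args)
    @calls << [:enqueue, klass, args]
  end

  def enqueue_at(time, klass, *args)
    @calls << [:enqueue_at, time, klass, args]
  end
end

class PurgeJobLogs
  extend Deferrable
  @queue = :job_worker

  def self.perform(*args)
    # JobLog.purge_expired! would run here
  end
end

Deferrable.backend = RecordingBackend.new
PurgeJobLogs.perform_later(30)
PurgeJobLogs.perform_at(Time.at(0), 30)
```

Keeping the backend injectable also makes the enqueue interface testable without Redis.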
  • 10. 2. Scheduled Enqueuing
      [Table, columns bdrb / resque: sched any method ✓x2 command scheduler ✓ ✓+ adhoc jobs ✓]
      • bdrb: the schedule must be defined in 2 places, yml and ruby. We ran into a case where this caused a problem.
      • resque: web UI for easily kicking off resque commands ad hoc (very useful in staging)
  • 11. Scheduled (bdrb)

      :backgroundrb:
        :ip: 127.0.0.1
        :port: 11006
        :environment: development
      :schedules:
        :scheduled_worker:
          :purge_job_logs:
            :trigger_args: 0 */5 * * * *

      Evidence of framework: scheduled_worker is defined here and needs a meta worker (so it can be run).
  • 12. Scheduled (bdrb)

      class ScheduledWorker < BackgrounDRb::MetaWorker
        extend BdrbUtils::CronExtensions
        set_worker_name :scheduled_worker

        threaded_cron_job(:purge_job_logs) { JobLog.purge_expired! }
      end

      • scheduler = MetaWorker
      • defined 2 times (yml and ruby); the scheduler calls your code, so it can call "any static method"
  • 13. Scheduled (resque)

      ---
      clear_logs:
        cron: "*/10 * * * *"
        class: PurgeJobLogs
        queue: job_worker
        description: Remove old logs

      • queue name is given (so the scheduler does not need to load the worker into memory to enqueue)
      • cron is the standard format (remove 'seconds') - commands
      • the scheduler runs in a separate process (can run when workers are stopped / changed), with a minimal env
      • the scheduler injects into the queue (vs. runs jobs), so jobs can also be injected ad hoc via web
      • no ruby code for this
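Because the resque schedule is plain YAML, it can be sanity-checked before deploying. The sketch below is an illustration under stated assumptions: `schedule_problems` is a hypothetical helper, and treating `cron`, `class` and `queue` as all required reflects the setup on slide 13, not a rule enforced by resque-scheduler. It also flags the six-field (seconds-first) cron style from the bdrb config, which slide 13 says must lose its seconds field.

```ruby
require 'yaml'

SCHEDULE_YAML = <<~YAML
  clear_logs:
    cron: "*/10 * * * *"
    class: PurgeJobLogs
    queue: job_worker
    description: Remove old logs
YAML

# Hypothetical helper: returns a list of problems found in a schedule hash.
def schedule_problems(schedule)
  schedule.flat_map do |name, entry|
    problems = []
    %w[cron class queue].each do |key|
      problems << "#{name}: missing #{key}" unless entry[key]
    end
    fields = entry['cron'].to_s.split
    unless fields.size == 5
      problems << "#{name}: cron needs 5 fields, got #{fields.size}"
    end
    problems
  end
end

schedule = YAML.load(SCHEDULE_YAML)
problems = schedule_problems(schedule)

# A bdrb-style trigger with a leading seconds field is reported.
bdrb_style = {
  'purge' => { 'cron' => '0 */5 * * * *', 'class' => 'PurgeJobLogs', 'queue' => 'job_worker' }
}
bdrb_problems = schedule_problems(bdrb_style)
```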
  • 14. 3. Processes/Worker management

                             bdrb    resque
      knows queues           ✓       us, command, web
      pids                   ✓       us+
      mem leak resistant             ✓
      workers/queue          1       <1 - ∞
      pause workers                  ✓

      • resque: discover previous queues (not all) via 'resque list' / web
      • bdrb: creates 1 worker/queue (creates pid file + 1 pid for backgroundrb); monit can't restart
      • resque: we manage processes: 1+ workers/queue, 1+ queues/worker, pause/restart workers
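The "mem leak resistant" row on slide 14 comes from Resque's process model: the worker forks a child for each job, so whatever the job allocates is released when the child exits. The slides don't show this mechanism; the following is a minimal sketch of the idea using plain `fork`, with `run_in_child` as a hypothetical helper name rather than Resque's own code.

```ruby
# Runs the block in a forked child and waits for it, as a sketch of the
# fork-per-job model. Returns true when the child exits cleanly.
def run_in_child
  pid = fork do
    yield
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  Process.wait(pid)
  $?.success?
end

# The "job" grows an array, but only inside the child's copy of memory.
leaked = []
ok = run_in_child { 100_000.times { |i| leaked << "leak #{i}" } }
```

After the call, `leaked` in the parent is still empty: the long-running worker process never accumulates the job's garbage.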
  • 15. worker list (resque)

      primary:
        queues: background,mail
      secondary:
        queues: mail,background

      • can have multiple workers running the same queues
      • can have multiple queues in 1 worker
      • worker pool can be generalized, response focused, schedule focused, or changed at runtime
      • inverted priority list prevents starvation
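The inverted priority lists on slide 15 work because a Resque worker checks its queues in the order they are listed and takes the first job it finds. A small simulation of that rule, assuming in-memory arrays in place of Redis lists; `reserve` here is a hypothetical helper written for this sketch:

```ruby
require 'yaml'

WORKERS_YAML = <<~YAML
  primary:
    queues: background,mail
  secondary:
    queues: mail,background
YAML

# Returns the [queue, job] pair a worker would take next, or nil when all
# of its queues are empty. Queues are checked in the order listed.
def reserve(queue_names, queues)
  queue_names.each do |name|
    job = queues[name].shift
    return [name, job] if job
  end
  nil
end

workers = YAML.load(WORKERS_YAML).transform_values { |cfg| cfg['queues'].split(',') }
queues  = { 'background' => %w[b1 b2], 'mail' => %w[m1] }

first  = reserve(workers['primary'], queues)   # primary prefers background
second = reserve(workers['secondary'], queues) # secondary prefers mail
third  = reserve(workers['secondary'], queues) # mail is empty, falls back
```

With both queues busy, each worker drains its preferred queue first, so neither queue waits behind the other; when one queue empties, its worker helps with the remaining one.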
  • 16. 4. Running Workers

      namespace :resque do
        desc 'start all background resque daemons'
        task :start_daemons do
          mrake_start "resque_scheduler resque:scheduler"
          workers_config.each do |worker, config|
            mrake_start "resque_#{worker} resque:work QUEUE=#{config['queues']}"
          end
        end

        desc 'stop all background resque daemons'
        task :stop_daemons do
          sh "./script/monit_rake stop resque_scheduler"
          workers_config.each do |worker, config|
            sh "./script/monit_rake stop resque_#{worker} -s QUIT"
          end
        end

        def self.workers_config
          YAML.load(File.open(ENV['WORKER_YML'] || 'config/resque_workers.yml'))
        end

        def self.mrake_start(task)
          sh "nohup ./script/monit_rake start #{task} RAILS_ENV=#{ENV['RAILS_ENV']} >> log/daemons.log &"
        end
      end
  • 17. Deploying (cap)

      namespace :resque do
        desc "Stop the resque daemon"
        task :stop, :roles => :resque do
          run "cd #{current_path} && RAILS_ENV=#{rails_env} WORKER_YML=#{resque_workers_yml} rake resque:stop_daemons; true"
        end

        desc "Start the resque daemon"
        task :start, :roles => :resque do
          run "cd #{current_path} && RAILS_ENV=#{rails_env} WORKER_YML=#{resque_workers_yml} rake resque:start_daemons"
        end
      end
  • 18. 5. Monitoring Workers (monit.erb)

      check process resque_scheduler
        with pidfile <%= @rails_root %>/tmp/pids/resque_scheduler.pid
        group resque
        alert errors@domain.com
        start program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake start resque_scheduler resque:scheduler'"
        stop program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake stop resque_scheduler'"

      <% YAML.load(File.open(Rails.root+'/config/production/resque/resque_workers.yml')).each_pair do |worker, config| %>
      check process resque_<%=worker%>
        with pidfile <%= @rails_root %>/tmp/pids/resque_<%=worker%>.pid
        group resque
        alert errors@domain.com
        start program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake start resque_<%=worker%> resque:work QUEUE=<%=config['queues']%>'"
        stop program = "/bin/sh -c 'cd <%= @rails_root %>; RAILS_ENV=production ./script/monit_rake stop resque_<%=worker%>'"
      <% end %>

      Use a template to generate the monit file.
  • 19. Monitoring Rake Processes

      #!/bin/sh
      # wrapper to daemonize rake tasks: see also http://mmonit.com/wiki/Monit/FAQ#pidfile

      usage() {
        echo "usage: ${0} [start|stop] name target [arguments]"
        echo "\tname is used to create or read the log and pid file names"
        echo "\tfor start: target and arguments are passed to rake"
        echo "\tfor stop: target and arguments are passed to kill (e.g.: -n 3)"
        exit 1
      }

      [ $# -lt 2 ] && usage

      cmd=$1
      name=$2
      shift ; shift

      pid_file=./tmp/pids/${name}.pid
      log_file=./log/${name}.log
      # ...
  • 20. Monitoring Processes

      case $cmd in
        start)
          if [ ${#} -eq 0 ] ; then
            echo -e "\nERROR: missing target\n"
            usage
          fi
          pid=`cat ${pid_file} 2> /dev/null`
          if [ -n "${pid}" ] ; then
            ps ${pid}
            if [ $? -eq 0 ] ; then
              echo "ensure process ${name} (pid: ${pid_file}) is not running"
              exit 1
            fi
          fi
          echo $$ > ${pid_file}
          exec 2>&1 rake $* 1>> ${log_file}
          ;;
        stop)
          pid=`cat ${pid_file} 2> /dev/null`
          [ -n "${pid}" ] && kill $* ${pid}
          rm -f ${pid_file}
          ;;
        *)
          usage
          ;;
      esac
  • 22. 6. Running Web

      namespace :resque do
        task :setup => :environment

        desc 'kick off resque-web'
        task :web => :environment do
          $stdout.sync = true
          $stderr.sync = true
          puts `env RAILS_ENV=#{RAILS_ENV} resque-web #{RAILS_ROOT}/config/initializers/resque.rb`
        end
      end
  • 23. initializer

      # this runs in sinatra and rails - so don't use Rails.env
      rails_env  = ENV['RAILS_ENV'] || 'development'
      rails_root = ENV['RAILS_ROOT'] || File.join(File.dirname(__FILE__), '../..')

      redis_config = YAML.load_file(rails_root + '/config/redis.yml')
      Resque.redis = redis_config[rails_env]

      require 'resque_scheduler'
      require 'resque/plugins/meta'
      require 'resque_mailer'

      Resque.schedule = YAML.load_file(rails_root + '/config/resque_schedule.yml')
      Resque::Mailer.excluded_environments = [:test, :cucumber]
  • 24. 5. Monitoring Work

                          bdrb                resque
      ad-hoc queries      SQL                 redis query
      did it run?         custom              resque-meta
      did it fail?        hoptoad             ✓
      rerun                                   ✓
      have id             ✓                   resque-meta
      queue health        sample controller   ✓

      Did the job run? Resque assumes everything worked and only tells you about failures. That was not good enough for us.
  • 25. Pausing Workers

      signal        what happens                       when to use
      quit          wait for child & exit              gracefully shutdown
      term / int    immediately kill child & exit      shutdown now
      usr1          immediately kill child             stale child
      usr2          don't start any new jobs
      cont          start to process new jobs
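The signal table on slide 25 can be wrapped in a small helper so that pausing or resuming a worker is a single call. A sketch, assuming pid files like the ones `monit_rake` writes under `tmp/pids`; `WORKER_SIGNALS`, its action names and `signal_worker` are hypothetical and not part of Resque:

```ruby
# Signal names as listed on slide 25, keyed by hypothetical action names.
WORKER_SIGNALS = {
  graceful_stop: 'QUIT', # wait for the child, then exit
  stop_now:      'TERM', # kill the child and exit immediately
  kill_child:    'USR1', # kill a stale child, keep the worker
  pause:         'USR2', # don't start any new jobs
  resume:        'CONT'  # start processing new jobs again
}.freeze

# Reads the worker's pid from its pid file and sends the signal for the
# requested action. Returns the signal name that was sent.
def signal_worker(action, pid_file)
  signal = WORKER_SIGNALS.fetch(action) # KeyError on unknown actions
  pid = Integer(File.read(pid_file).strip)
  Process.kill(signal, pid)
  signal
end
```

For example, `signal_worker(:pause, 'tmp/pids/resque_primary.pid')` before a deploy and `signal_worker(:resume, 'tmp/pids/resque_primary.pid')` afterwards, using the pid file naming from the monit template.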
  • 26. Testing Worker
      [Table, columns bdrb / resque: testing queue: mid-easy / resque_unit; testing command ✓; all workers same ✓; interface only ✓]
  • 28. Extending with Hooks

      resque hooks
      around_enqueue     ✗
      after_enqueue      ✓
      before_perform     ✓
      around_perform     ✓/✗
      after_perform      ✓

      • all plugins want to extend enqueue, so they are not compatible with each other
      • need to be able to alter arguments (e.g.: add id for meta plugins)
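Resque finds hooks by method-name prefix: a job class that responds to methods starting with `before_perform` or `after_perform` gets them called around `perform`. The sketch below uses that convention to record that a job actually ran, which is the gap noted on slide 24. It is an illustration only: `RunLog` is a hypothetical plugin module, and `run_with_hooks` is a hand-written stand-in for the worker so the example runs without Resque or Redis.

```ruby
# Hypothetical plugin: records when a job starts and finishes.
module RunLog
  def run_log
    @run_log ||= []
  end

  def before_perform_record_start(*args)
    run_log << [:started, args]
  end

  def after_perform_record_finish(*args)
    run_log << [:finished, args]
  end
end

class PurgeJobLogs
  extend RunLog
  @queue = :job_worker

  def self.perform(days)
    # JobLog.purge_expired! would run here
  end
end

# Stand-in for the worker: call before hooks, perform, then after hooks,
# discovering the hooks by their method-name prefix.
def run_with_hooks(job, *args)
  job.methods.grep(/\Abefore_perform/).sort.each { |hook| job.send(hook, *args) }
  job.perform(*args)
  job.methods.grep(/\Aafter_perform/).sort.each { |hook| job.send(hook, *args) }
end

run_with_hooks(PurgeJobLogs, 30)
```

In production the log would live somewhere shared, such as Redis, rather than in a class-level array.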
  • 29. Conclusion
      • Boss got no pages in the first month of implementation
      • no memory leaks, great uptime (don't need monit...)
      • Fast
      • generalized workers increase throughput (nightly vs 1 hour)
      • minimal custom code
      • still some intimidation
      • Eating flavor of the month
  • 30. References
      • coders: @kbrock and @wpeterson
      • great company: PatientsLikeMe (encouraged sharing this)
      • resque_mailer
      • resque-scheduler
      • resque-meta
      • monit, hoptoad, rpm_contrib