Parallel Each (for ruby with threads)

It is pretty common to have iterations over Arrays that can be safely run in parallel. With multicore chips becoming pretty common, single threaded processing is about as cool as Pog. Unfortunately, standard Ruby hates real threads pretty hardcore at the present time; however, for some ruby projects alternate VMs like JRuby do give multicores some lovin'. Peach exists to make this power simple to use with minimal code changes.

Functions like map, each, and delete_if are often used in a functional, side-effect free style. If the operation in the block is computationally intense, performance can often be gained by multithreading the process. That's where Peach comes in. In the simplest case, you are one letter away from harnessing the power of parallelism and unlocking the secret of a guilt-free tan. At this stage, the goggles are purely optional.

Using Peach

Suppose you are going about your day job hacking away at code for the WOPR when you stumble upon the code:

cities.each {|city| thermonuclear_war(city)}

Clearly, the only winning move is to declare war in parallel. With Peach, the new code is:

require 'peach'

cities.peach {|city| thermonuclear_war(city)}

Requiring peach.rb monkey patches Array into submission. Currently Peach provides peach, pmap, and pdelete_if. Each of these functions takes an optional argument n, which represents the desired number of worker threads with the default being one thread per Array element. For cheaper operations on a large number of elements, you probably want to set n to something reasonably low.

(0...10000).to_a.pmap(4) {|x| process(x)}

Constructing the threads and adding on a few layers of indirection does add a bit of overhead to the iteration especially on MRI. Keep this in mind and remember to benchmark when unsure.

Syntax (without all the words)

require 'peach'

[1,2,3,4].peach{|x| f(x)} #Spawns 4 threads, => [1,2,3,4]
[1,2,3,4].pmap{|x| f(x)} #Spawns 4 threads, => [f(1),f(2),f(3),f(4)]
[1,2,3,4].pdelete_if{|x| x > 2} #Spawns 4 threads, => [3,4]

[1,2,3,4].peach(2){|x| f(x)} #Spawns 2 threads, => [1,2,3,4]
[1,2,3,4].pmap(2){|x| f(x)} #Spawns 2 threads, => [f(1),f(2),f(3),f(4)]
[1,2,3,4].pdelete_if(2){|x| x > 2} #Spawns 2 threads, => [3,4]


Q: I use normal ruby (MRI 1.8 or 1.9), will Peach confer superpowers and great performance upon my code?
A: No, on MRI your code will be slightly slower because of the increased overhead for Thread creation. MRI is singlethreaded so Peach will not make it magically parallel.

Q: Why should I switch to JRuby to get the benefits of Peach?
A: Switching to JRuby for code that needs better performance is a good idea even without Peach. JRuby is insanely fast and a good idea. The multithreading and ability to use this humble utility is just another feature.

Q: Benchmarks?
A: I am pretty bad at benchmarking code, but I do have a simple test comparing performance between map and pmap on MRI and JRuby. Headius helped in the preparation of these materials. Check it out. JRuby on Java 1.6 is even faster. If you come up with any benchmarks, do let me know.

Eat a Peach

Peach is distributed as a gem from github, so:
gem install schleyfox-peach --source=