ssh-keygen -t dsa (no pass phrase unless you want to use ssh-agent).
cp ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
.ssh/known_hosts Easiest way: just log in once to
the hosts that you will use.
for n in hippo newt python rhino; do ssh $n hostname; done
#As usual: export PYTHONPATH=~/myInstalls/python/lib/python #Needed to find start up scripts: export PATH=~/myInstalls/python/bin:$PATH
>>> from nws.sleigh import Sleigh, sshcmd >>> s = Sleigh()This creates three “generic” workers. Let's see where the processes are actually running:
>>> from socket import gethostname >>> s.eachWorker(gethostname) ['newt', 'newt', 'newt']
eachWorker runs the specified function once on each worker process.
We see, not surprisingly, that by
default the three worker processes are started on the local machine (why three?
(i) good test number; (ii) two CPUs with
two cores are common on clusters and they or quad cores will soon be common on desktops).
Obviously, we would like to be able to use other machines too. So let's shut this down, and try something a little different.
>>> s.stop() >>> s = Sleigh(nodeList=['hippo', 'newt', 'python', 'rhino'], launch=sshcmd) >>> s.eachWorker(gethostname) ['rhino', 'hippo', 'newt', 'python']To take a peek under the covers for more details, browse around with the web interface.
Note that eachWorker is not usually run to carry out a computation per se,
so no effort is made to return the results in any particular order. E.g.:
>>> s.eachWorker(gethostname) ['python', 'hippo', 'rhino', 'newt']We can use
eachWorker to build up appropriate state and to start
compute servers (a bit like a specialized Linda eval).
We can use a sleigh to carry out a master/worker-style computation by using
eachElem. This method takes a function and a list, and returns a
list of the results of applying the function to each element in the input list.
>>> r = s.eachElem(lambda x: x*x*x, range(100)) >>> len(r) 100 >>> r[2] 8 >>> r = s.eachElem(lambda x: x*x*x, range(100)) >>> len(r) 100 >>> r[2:5] [8, 27, 64]We see that the results are returned in order (and, for that matter, the tasks are evaluated in order too — write a little code that checks this). We won't go into the details here, but
eachElem is capable of handling quite general functions
with interleaved varying and fixed arguments.
This example illustrates how sleigh can be used to simplify running computations like that seen in the previous lecture:
# worker
def f(x): return x*x*x
ws = nws.client.NetWorkSpace('table test')
while 1: ws.store('r', f(ws.fetch('x')))
# master
ws = nws.client.NetWorkSpace('table test')
for x in range(10): ws.store('x', x)
for x in range(10): print 'f(%d) = %d'%(x, ws.fetch('r'))
Let's now look at an example with a more variable (synthetic) workload.
>>> from time import sleep
>>> from random import randint
>>> def nap(x):
... t = randint(0, 5)
... sleep(t)
... return (x, t)
...
>>> r = s.eachElem(nap, range(10))
>>> r
[SleighTaskException("Task invocation failed: global name 'randint' is not defined"), ... , SleighTaskException("Task invocation failed: global name 'randint' is not defined")]
That's not what we wanted.
The workers need to import added functionality just
like the master (there are exception — we've already seen one:
gethostname; since this was the function invoked, sleigh took care
of the “bookkeeping” for us, but it doesn't do this recursively).
We need to have each worker import appropriate support functions:
>>> s.eachWorker('from time import sleep')
[None, None, None, None]
>>> s.eachWorker('from random import randint')
[None, None, None, None]
Two notes:
>>> r = s.eachElem(nap, range(10)) >>> r [(0, 2), (1, 5), (2, 0), (3, 0), (4, 0), (5, 3), (6, 3), (7, 0), (8, 4), (9, 4)]That's better! But who wants to wait around?
>>> r = s.eachElem(nap, range(10), blocking=False) >>> r.check() 7 >>> r.check() 6 >>> r.check() 4 >>> r.check() 1 >>> r.check() 0 >>> r = r.wait() >>> r [(0, 4), (1, 2), (2, 0), (3, 2), (4, 4), (5, 5), (6, 1), (7, 3), (8, 5), (9, 3)]When run with
blocking=False, the sleigh immediately returns a
SleighPending object. Its check method returns the
count of outstanding tasks. Its wait method returns the results
(blocking if need be to wait for laggards).
Workers have available a unique identifier, their rank:
>>> s.eachWorker('SleighRank')
[1, 2, 3, 0]
>>> s.eachWorker('SleighRank*111')
[111, 333, 222, 0]
This can be used for a variety of purposes. It is a global variable and may be
referenced in the code snippet or function of either a
eachWorker or eachElem computation.
Use SleighRank with the example above to test out dynamic load
balancing of eachElem computations.
Here's an example illustrating coordination between a non-blocking (or
asynchronous) sleigh execution and the controlling python session that
makes use of another global, SleighNws, the object encapsulating
the workspace in which the sleigh runs:
>>> r = s.eachWorker('SleighNws.find("wake up!")', blocking=False)
>>> r.check()
4
>>> r.check()
4
>>> s.nws.store('wake up!', 123)
>>> r.check()
0
>>> r.wait()
[123, 123, 123, 123]
The combination of eachWorker or eachElem (with a one
element vector) and blocking=False can
be used to provide functionality very similar to Linda's eval.
Using eachWorker and SleighNws, write a new version
of the cubing example from the previous lecture — one that makes use of the
sleigh's workers to carry out the computation (do not simply copy the
eachElem above: make explicit use of store and
fetch, but use your sleigh's workspace and compute servers).