1. Joshua Shinavier
Linked Process
An Internet-scale distributed computing
framework
Center For Nonlinear Studies
August 5th, 2009
2. Overview
• Internet-scale distributed computing
• eXtensible Messaging and Presence Protocol
(XMPP)
• Linked Process specification
• Current Linked Process implementation
• Demos
3. Abstract
The LANL-based Linked Process project takes a new approach to
Internet-scale distributed computing. While existing large-scale
grid computing projects are typically very constrained in the kinds
of computational tasks which can be performed, the kinds of
devices which can participate in computation, and in the overall
architecture of the system, the Linked Process specification
provides the foundation for a much larger and more general-
purpose distributed computing platform. Any device supporting
the eXtensible Messaging and Presence Protocol (XMPP), be it a
supercomputer or a cellular phone, is a potential node in a global
compute cloud, communicating with other nodes in a manner
similar to human chat. The implementation currently under
development provides a simple API and supports a number of
popular scripting languages, allowing software developers to write
distributed applications with ease. This presentation will provide
an overview of the Linked Process specification and discuss a range
of potential uses of the technology.
4. Contributors (to date)
• Marko A. Rodriguez (LANL)
• http://markorodriguez.com/
• Joshua Shinavier (RPI / LANL)
• http://fortytwo.net/
• Peter Neubauer (Neo Technology)
• http://www.linkedin.com/neubauer/
• Max O. Bond (Santa Fe Complex)
• Mick Thompson (Santa Fe Complex)
• http://davidmichaelthompson.com
5. Internet-scale distributed computing
• distributed computing
• combines computational power of multiple
machines
• makes effective use of local resources
• Berkeley Open Infrastructure for Network
Computing (BOINC)
• supports SETI@home, PrimeGrid, etc.
• cloud computing
• Amazon EC2
• Google App Engine
6. XMPP
• eXtensible Messaging and Presence Protocol
• deals with “presence” and asynchronous message passing
among clients and servers
• open standard
• based on machine-independent Jabber identifiers (JIDs)
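The JIDs used throughout these slides follow the XMPP form [local@]domain[/resource]. As a minimal sketch (the `Jid` type and `parse_jid` helper are illustrative names, not part of any XMPP library or of LoP), a JID can be split into its parts like this:

```python
from typing import NamedTuple, Optional


class Jid(NamedTuple):
    local: Optional[str]     # part before "@", e.g. "lp1"
    domain: str              # server part, e.g. "gmail.com"
    resource: Optional[str]  # part after the first "/", e.g. "LoPVillein/1234"

    @property
    def bare(self) -> str:
        """The JID without its resource (the "bare" JID)."""
        return f"{self.local}@{self.domain}" if self.local else self.domain


def parse_jid(text: str) -> Jid:
    """Split a JID of the form [local@]domain[/resource]."""
    # The resource starts at the first "/" and may itself contain "/".
    address, slash, resource = text.partition("/")
    local, at, domain = address.rpartition("@")
    if not domain:
        raise ValueError(f"JID has no domain part: {text!r}")
    return Jid(local if at else None, domain, resource if slash else None)


jid = parse_jid("lp1@gmail.com/LoPVillein/1234")
print(jid.bare)      # lp1@gmail.com
print(jid.resource)  # LoPVillein/1234
```

This sketch only splits the string; it does not apply the normalization rules a real XMPP library would.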
8. Linked Process
• uses XMPP messaging for inter-machine
communication
• any XMPP-enabled device may participate
• bring mobile devices into the cloud
• augment the compute power of a single device
• grid computing
• specification is called LoP, for “Linking Open
Processors”
12. Jobs: units of computation
• a job is a task to be performed by a virtual
machine, e.g.
• computationally intensive operations
• manipulation of local resources
• LoP allows you to:
• submit a job -- <submit_job/>
• check on the status of a job -- <job_status/>
• abort a job -- <abort_job/>
13. Example: submitting a job
request:
<iq from="lp1@gmail.com/LoPVillein/1234"
    to="lp2@gmail.com/LoPVM/EFGH"
    type="get" id="xxxx">
  <submit_job xmlns="http://linkedprocess.org/2009/06/VirtualMachine#"
              vm_password="abc123pass">
    var temp=0;
    for(i=0; i&lt;10; i++) {
      temp = temp + 1;
    }
    temp;
  </submit_job>
</iq>
response:
<iq from="lp2@gmail.com/LoPVM/EFGH"
    to="lp1@gmail.com/LoPVillein/1234"
    type="result" id="xxxx">
  <submit_job xmlns="http://linkedprocess.org/2009/06/VirtualMachine#">
    10
  </submit_job>
</iq>
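A client has to build the request stanza and read the result out of the response. The following is a minimal sketch using only Python's standard library; `build_submit_job` and `read_job_result` are hypothetical helper names, not part of the LoP API. Because the script is carried as XML text, characters such as "<" must be escaped, which ElementTree does automatically.

```python
import xml.etree.ElementTree as ET

VM_NS = "http://linkedprocess.org/2009/06/VirtualMachine#"


def build_submit_job(from_jid, to_jid, vm_password, script, stanza_id):
    """Return a <submit_job/> IQ stanza as a string (illustrative helper)."""
    iq = ET.Element("iq", {"from": from_jid, "to": to_jid,
                           "type": "get", "id": stanza_id})
    # Writing xmlns as a plain attribute keeps <iq/> itself out of the VM namespace.
    job = ET.SubElement(iq, "submit_job",
                        {"xmlns": VM_NS, "vm_password": vm_password})
    job.text = script  # "<" and "&" are escaped on serialization
    return ET.tostring(iq, encoding="unicode")


def read_job_result(stanza_xml):
    """Extract the job's return value from a successful response stanza."""
    iq = ET.fromstring(stanza_xml)
    job = iq.find(f"{{{VM_NS}}}submit_job")
    if iq.get("type") != "result" or job is None:
        raise ValueError("not a successful submit_job response")
    return (job.text or "").strip()


script = "var temp=0;\nfor(i=0; i<10; i++) {\n  temp = temp + 1;\n}\ntemp;"
request = build_submit_job("lp1@gmail.com/LoPVillein/1234",
                           "lp2@gmail.com/LoPVM/EFGH",
                           "abc123pass", script, "xxxx")

response = (
    '<iq from="lp2@gmail.com/LoPVM/EFGH" to="lp1@gmail.com/LoPVillein/1234" '
    'type="result" id="xxxx">'
    f'<submit_job xmlns="{VM_NS}">10</submit_job></iq>'
)
print(read_job_result(response))  # 10
```

Sending the stanza over an XMPP connection is left out; any XMPP client library could carry the string produced here.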
14. Virtual machines: addressable
“computers” in the cloud
• VM is maintained by an XMPP client
• manages jobs and data-typed
“bindings” (variables) -- <manage_bindings/>
• provides a scripting environment using a
particular language (e.g. JavaScript, Ruby, etc.)
• may be terminated at any time --
<terminate_vm/>
15. Example: spawning a VM
request:
<iq from="lp1@gmail.com/LoPVillein/1234"
    to="lp2@gmail.com/LoPFarm/ABCD"
    type="get" id="xxxx">
  <spawn_vm xmlns="http://linkedprocess.org/2009/06/Farm#"
            vm_species="javascript" />
</iq>
response:
<iq from="lp2@gmail.com/LoPFarm/ABCD"
    to="lp1@gmail.com/LoPVillein/1234"
    type="result" id="xxxx">
  <spawn_vm xmlns="http://linkedprocess.org/2009/06/Farm#"
            vm_jid="lp2@gmail.com/LoPVM/EFGH"
            vm_password="abc123pass"
            vm_species="javascript" />
</iq>
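The spawn response carries what a client needs to address the new VM: its JID and password. A minimal sketch of reading those attributes with Python's standard library follows; `VmHandle` and `parse_spawn_response` are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

FARM_NS = "http://linkedprocess.org/2009/06/Farm#"


@dataclass(frozen=True)
class VmHandle:
    jid: str       # full JID to which jobs are submitted
    password: str  # sent as vm_password with each job
    species: str   # scripting language of the VM


def parse_spawn_response(stanza_xml: str) -> VmHandle:
    """Read the VM's address and password from a <spawn_vm/> result."""
    iq = ET.fromstring(stanza_xml)
    spawn = iq.find(f"{{{FARM_NS}}}spawn_vm")
    if iq.get("type") != "result" or spawn is None:
        raise ValueError("not a successful spawn_vm response")
    return VmHandle(spawn.attrib["vm_jid"],
                    spawn.attrib["vm_password"],
                    spawn.attrib["vm_species"])


spawn_response = (
    '<iq from="lp2@gmail.com/LoPFarm/ABCD" to="lp1@gmail.com/LoPVillein/1234" '
    'type="result" id="xxxx">'
    f'<spawn_vm xmlns="{FARM_NS}" vm_jid="lp2@gmail.com/LoPVM/EFGH" '
    'vm_password="abc123pass" vm_species="javascript" /></iq>'
)
vm = parse_spawn_response(spawn_response)
print(vm.jid)  # lp2@gmail.com/LoPVM/EFGH
```

The returned JID and password are the values that appear as `to` and `vm_password` when a job is submitted to this VM.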
16. Farms: LoP service providers
• farm is maintained by an XMPP client
• provides access to virtual machines
• farm allows you to:
• spawn a virtual machine -- <spawn_vm/>
• query for information about the environment
(e.g. language support, security restrictions,
etc.) -- disco#info
• multiple farms may share the same “bare” JID
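Since several farms may sit behind one bare JID, a client that sees many full JIDs may want to group them by account. A minimal sketch, assuming the farms are distinguished only by their resource part; apart from `lp2@gmail.com/LoPFarm/ABCD`, the JIDs below are made up for illustration:

```python
from collections import defaultdict


def group_by_bare_jid(full_jids):
    """Map each bare JID to the list of resources seen under it."""
    groups = defaultdict(list)
    for jid in full_jids:
        # Everything before the first "/" is the bare JID.
        bare, _, resource = jid.partition("/")
        groups[bare].append(resource)
    return dict(groups)


farms = [
    "lp2@gmail.com/LoPFarm/ABCD",
    "lp2@gmail.com/LoPFarm/WXYZ",  # hypothetical second farm, same account
    "lp3@gmail.com/LoPFarm/QRST",  # hypothetical farm on another account
]
for bare, resources in group_by_bare_jid(farms).items():
    print(bare, resources)
# lp2@gmail.com ['LoPFarm/ABCD', 'LoPFarm/WXYZ']
# lp3@gmail.com ['LoPFarm/QRST']
```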
18. Security
• jobs operate within a VM sandbox
• subject to named permissions, e.g.
• file I/O
• network I/O
• introspection
• password-protection of VMs
• can specify limits on VMs per farm, number of
jobs in a queue, VM and job timeouts, etc.
• XMPP supports channel encryption via TLS (Transport Layer
Security, the successor to SSL)
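The limits and permissions above could be checked along these lines. This is only a sketch: `FarmLimits`, its field names, and its default values are assumptions made for illustration and do not reflect the actual LoP configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FarmLimits:
    """Hypothetical limit settings; names and defaults are illustrative."""
    max_vms: int = 4               # VMs allowed per farm
    max_queued_jobs: int = 10      # jobs allowed in one VM's queue
    job_timeout_s: float = 30.0    # seconds before a job is aborted
    permissions: frozenset = frozenset()  # named permissions granted to jobs


def can_spawn_vm(active_vms: int, limits: FarmLimits) -> bool:
    return active_vms < limits.max_vms


def can_queue_job(queued_jobs: int, limits: FarmLimits) -> bool:
    return queued_jobs < limits.max_queued_jobs


def job_timed_out(started_at: float, now: float, limits: FarmLimits) -> bool:
    return now - started_at > limits.job_timeout_s


def is_permitted(permission: str, limits: FarmLimits) -> bool:
    return permission in limits.permissions


limits = FarmLimits(max_vms=2, permissions=frozenset({"file I/O"}))
print(can_spawn_vm(2, limits))              # False
print(is_permitted("network I/O", limits))  # False
```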
19. Implementation (to date)
• based on Java 1.6
• takes advantage of built-in Java security,
scripting framework
• job scheduler serves as a miniature operating
system
• supports a number of scripting languages
• JavaScript, Ruby, Python, Groovy
• support for additional languages is easy to add
• https://scripting.dev.java.net/
• deployed farms in New Mexico, New York, Sweden
21. Demos
• distributed primality testing
• Linked Data
• for the future:
• LoP API for Google’s MapReduce
• computational support for mobile devices
• distributed matrix operations, image
processing, etc.
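To give a feel for the primality demo, here is a minimal sketch of how such work might be divided. It is not the demo's actual code: each chunk of candidates stands for one job, and `run_job` is a local stand-in for a job executed on a remote VM.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for small illustrative inputs."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def chunk(items, n_chunks):
    """Split items into n_chunks round-robin slices, one per job."""
    return [items[i::n_chunks] for i in range(n_chunks)]


def run_job(candidates):
    """Stand-in for a job run on a remote VM (assumption: runs locally here)."""
    return [n for n in candidates if is_prime(n)]


candidates = list(range(1, 50))
results = [run_job(part) for part in chunk(candidates, 3)]
primes = sorted(p for part in results for p in part)
print(primes)
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```

In a real deployment each chunk would travel inside a `<submit_job/>` stanza and the partial results would be merged as responses arrive.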
22. See also
• http://linkedprocess.org/
• Rodriguez, M.A., “A Reflection on the Structure and Process of the
Web of Data,” Bulletin of the American Society for Information Science
and Technology, American Society for Information Science and
Technology, volume 35, number 6, ISSN: 1550-8366,
LA-UR-09-03724, pages 38-43, August 2009.
• XMPP Core spec: http://xmpp.org/rfcs/rfc3920.html
• XMPP Instant Messaging and Presence spec: http://xmpp.org/rfcs/rfc3921.html
• XMPP Extensions: http://xmpp.org/extensions/