



                      Parallel and Distributed Computing
                         BOINC Grid Implementation
                            Rodrigo Neves, Nuno Mestre, Francisco Machado, and João Lopes

      Abstract—With the development of communications and Internet, distributed computing became an everyday reality for everyone
      rather than just for a limited group of IT specialists and investors. This development allowed several new computational
      concepts to emerge, some of them still lacking consensus. This paper briefly reviews some of the current paradigms,
      such as cloud and grid computing, peer-to-peer and client-server methods. It then presents a detailed review of the Public
      Resource Computing (PRC) concept and the BOINC software implementation. Finally, a case study on the Extended BOINC System created
      by Søttrup and Pederson [18] presents a simple solution that integrates the concepts of PRC and Grid in order to provide simple,
      scalable, volunteer-based computing power to process QoS-dependent jobs.

      Index Terms—parallel computing, distributed computing, grid computing, cloud computing, client server, peer to peer, boinc, extended
      boinc, public resource computing, quality of service




• R. Neves is with MIEET, Departamento de Engenharia Electrónica e Informática, Faculdade de
  Ciência e Tecnologia, Universidade do Algarve. E-mail: a25067@ualg.pt
• N. Mestre is with LEI, Departamento de Engenharia Electrónica e Informática, Faculdade de
  Ciência e Tecnologia, Universidade do Algarve. E-mail: a28997@ualg.pt
• F. Machado is with LEI, Departamento de Engenharia Electrónica e Informática, Faculdade de
  Ciência e Tecnologia, Universidade do Algarve. E-mail: a28994@ualg.pt
• J. Lopes is with LEI, Departamento de Engenharia Electrónica e Informática, Faculdade de
  Ciência e Tecnologia, Universidade do Algarve. E-mail: a27981@ualg.pt

1     INTRODUCTION
Distributed systems have been given many definitions over the years, but these definitions
have not been consistent with one another.
  As Andrew S. Tanenbaum and Maarten van Steen suggested [1]:
     A distributed system is a collection of independent
     computers that appears to its users as a single
     coherent system.
The first part of this definition deals with hardware, while the second focuses specifically
on software.
  Distributed systems must present a transparent workflow to the user regardless of the
differences in hardware and communication methods among the connected computers. Other
important characteristics of these systems are ease of scalability and high availability.
  Such systems should easily connect users to resources while hiding the fact that these may
be distributed across a network. They should also be open and respect the available standards
in order to facilitate future development and scalability, and they should ensure data and
communication security. All these issues are addressed in further detail later in this paper.
  In order to create the illusion of a single system both at a high level, for users, and at a
low level, for network hardware, operating systems and programming languages, the term
middleware was coined. It is a software layer that provides abstraction, setting up a uniform
computational model for software developers to work on. One of the most widespread middleware
technologies available is the Common Object Request Broker Architecture (CORBA) [2].

2     IMPORTANT CONCEPTS
In order to better understand the remainder of this paper, it is useful to clarify some
technical nomenclature used in the Parallel and Distributed Systems field.
  • Servers: Typically high-powered workstations, minicomputers or mainframes that hold the
    information and provide it to clients through request handling;
  • Clients: Computers or mainframes that request services from the servers;
  • API: An Application Programming Interface is a tool used to provide the end-user with an
    abstract set of operations. Its main attraction is the ability to hide the implementation
    of those operations from the user;
  • SDK: The Software Development Kit is typically a set of development tools that allows a
    software engineer to create applications for a certain software package. Usually an SDK
    implements an API;
  • Middleware: A software layer that conceals the heterogeneity of a system so that software
    developers can work on it easily;
  • Query Languages: Techniques and protocols used to retrieve the sought information, the
    most famous being the Structured Query Language (SQL);
  • Virtual Organization: A dynamic set of individuals and/or institutions defined by a batch of
    resource-sharing rules. Those rules need to make perfectly clear just what is shared, who
    is allowed to share, and the conditions under which sharing occurs [9];
  • Public Resource Computing: Usually known as PRC, this concept describes the idea of
    getting anybody with an Internet connection and spare compute power to donate CPU cycles
    on their computer to a greater project.

3     DISTRIBUTED SYSTEMS CHALLENGES
3.1   Connecting Users and Resources / Concurrency
Resource sharing and distribution is always a main concern in any multi-user system. In the
case of distributed systems this task becomes of utmost importance.
  In a distributed system, both applications and services provide resources that can be
shared among clients. These usually allow multiple client requests to be accepted even though
they may be processed one at a time.
  If we consider each resource to be encapsulated as an object and each request to be treated
as a concurrent thread, it becomes clear that any application or service must carefully manage
this concurrency in order to avoid inconsistent results and/or deadlocks.
  Although at first glance this situation may require careful analysis and programming, its
cost-effectiveness becomes obvious when sharing expensive resources such as high-performance
processing and data structures.

3.2   Transparency / Heterogeneity
One of the major concepts and advantages of distributed systems is transparency. It allows a
user to access all resources and data that may be scattered around a network without having to
worry about where these are actually located. This may prove difficult to achieve when a
virtually infinite combination of hardware resources and operating systems is involved. The
ability to provide transparency across all the heterogeneity of the distributed system is
usually implemented by a middleware software layer, as described in the introduction.
  The concept of transparency may be applied to many aspects of distributed systems [1]:
  • Access: Hide differences in data representation and how a resource is accessed;
  • Location: Hide where a resource is located;
  • Migration: Hide that a resource may move to another location;
  • Relocation: Hide that a resource may be moved to another location while in use;
  • Replication: Hide that a resource is replicated;
  • Concurrency: Hide that a resource may be shared by several competitive users;
  • Failure: Hide the failure and recovery of a resource;
  • Persistence: Hide whether a (software) resource is in memory or on disk;
  • Security: Negotiation of secure access to resources must require a minimum of user
    intervention, or else users will eventually evade the security in favor of productivity.

3.3   Scalability
Scalability is one of the most important design goals for developers of distributed systems.
According to Clifford Neuman [5], a system’s scalability may be measured along three main
dimensions:
  • Size: Whether it is simple to add more users and resources to the system;
  • Geographical: Concerns the geographical location of users and resources;
  • Administration: How easy it is to manage the system even when it is shared by multiple
    administrative organizations.
  As a system scales up in any of these three dimensions, it may exhibit problems that could
affect performance.
  Size scalability problems are related to the increased number of users and their demands.
When this growth falls on centralized services, data and algorithms, it will eventually create
a bottleneck at the server.
  Geographical scaling problems are usually related to the connectivity limitations of the
communications. In wide-area networks, access to information usually goes through unreliable
connections that are virtually always point-to-point.
  Administrative scalability issues often relate to security certificates that must be trusted
across multiple domains. As more organizations join the network, strict administration and
management protocols must be defined, as all parties involved should play an active role in
these processes.

3.4   Security
Usually, the information resources available and maintained in a distributed system have a
high intrinsic value to the users; the security of that data is therefore considerably
important.
  There are three main security components that should be taken into consideration:
  • Confidentiality: Protection against unauthorized access;
  • Integrity: Protection against unauthorized alteration or corruption;
  • Availability: Protection against disruption of the communication with the resources.
  Preventive actions, such as the use of a firewall and careful code development, should be
taken in order to ensure these security parameters.

3.5   Openness
According to Kazi Farooqui, Luigi Logrippo and Jan de Meer [3], openness is the combination of
the previously stated characteristics. Sometimes, however, it is generically referred to as
the level of respect for standards that define the syntax and semantics of the provided
services.
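  As a minimal sketch of the standards-based reading of openness, the Python fragment below
lets independent components interoperate only because they respect one agreed interface. The
names Service, EchoService, UpperService and Registry are hypothetical, chosen purely for
illustration; they are not taken from any system discussed in this paper.

```python
from typing import Dict, Protocol


class Service(Protocol):
    """Hypothetical agreed interface: one method, text in, text out."""

    def handle(self, request: str) -> str: ...


class EchoService:
    """A component that returns the request unchanged."""

    def handle(self, request: str) -> str:
        return request


class UpperService:
    """A second, independently written component honouring the same interface."""

    def handle(self, request: str) -> str:
        return request.upper()


class Registry:
    """Holds services by name; callers depend only on the Service interface."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}

    def add(self, name: str, service: Service) -> None:
        self._services[name] = service

    def remove(self, name: str) -> None:
        # Removing an unknown name is a no-op rather than an error.
        self._services.pop(name, None)

    def call(self, name: str, request: str) -> str:
        return self._services[name].handle(request)


registry = Registry()
registry.add("echo", EchoService())
registry.add("upper", UpperService())    # added without touching EchoService
result_upper = registry.call("upper", "grid")
registry.remove("upper")                 # removed without affecting "echo"
result_echo = registry.call("echo", "grid")
```

  Because callers rely only on the agreed interface, UpperService can be added or removed
while EchoService keeps working unchanged.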



  Openness, whichever interpretation is adopted, provides developers with the flexibility
needed to add, configure and integrate new components and services. This flexibility also
ensures that components can be added and removed without affecting the stability of the
overall system.

4     DISTRIBUTED SYSTEM MODELS
    A Distributed System is a set of loosely coupled resources
    interconnected by a communication network. [4]

4.1   Client-Server
Client-Server is a particular type of distributed system design that clearly distinguishes
the roles of two computers. The server provides some kind of service, such as processing
database queries or sending out current stock prices. The client uses the services provided
by the server, for example displaying database query results to the user or making stock
purchase recommendations to an investor.

Fig. 1. Generic client-server environment

  The communication between the client and the server must be reliable. That is, no data can
be dropped, and data must arrive on the client side in the same order in which the server sent
it. In order to ensure this reliability, the communication uses the TCP/IP protocols. The
Internet Protocol (IP) suite is a set of communication protocols that regulate communication
on the Internet and most commercial networks. The Transmission Control Protocol (TCP) is one
of the core protocols of this suite. Using TCP, clients and servers can create connections to
one another, over which they can exchange data in packets.

4.2   Peer-To-Peer
The concept of Peer-To-Peer communication, also known as P2P, is based on the idea that each
individual node (peer) in the network is both client and server at the same time. Unlike the
Client-Server approach, P2P systems rely on an aggregation of applications and systems working
together in order to access distributed resources in a decentralized way. In terms of
maintenance costs, P2P has the edge over Client-Server, since the contents and systems are
maintained individually by each peer.
  P2P systems can be categorized into three main groups: centralized, decentralized and hybrid
implementations.
  The centralized P2P concept bases its implementation on a central server that executes
simple generic functions for the system, such as workload scheduling and result validation.
  Decentralized P2P systems rely completely on the peers themselves to perform all the
functions without the intervention of centralized servers. In these cases, all result
validation and communication must be handled by each node and coordinated with its peers.
  Hybrid systems are a mix of the previous two implementations. They introduce the concept of
super-nodes, made up of several regular nodes, which are responsible for the centralized work
while their constituents can still work in the regular decentralized way.

4.3   Cloud Computing
Nowadays, data storage and programs are being swept from desktop computers and corporate
server rooms and installed in the computing cloud. Cloud computing emerges from users’ reduced
need to have applications installed on their own machines, thanks to increased communication
speed and availability.
  Every operating system update cascades into a batch of time- and resource-consuming software
revisions. Outsourcing computation to Internet-based services significantly reduces these
costs while offering a whole new set of advantages such as mobility and cooperation.
  The number of services and applications provided in the cloud is growing every day, and they
can no longer be considered a bunch of simple tools. Companies are starting to acquire
cloud-based services for every kind of managerial and business-oriented task.
  A growing number of voices have expressed concerns about data privacy and confidentiality,
not considering the services’ privacy policies credible enough. One famous scenario used as an
argument is cited by Hayes [17]:
      (...) a government agency presents a subpoena or
      search warrant to the third party that has possession
      of your data. If you had retained the physical
      custody, you might still have been compelled to
      surrender the information, but at least you would
      have been able to decide for yourself whether or
      not to contest the order. The third-party service
      is presumably less likely to go to court on your
      behalf. In some circumstances you might not even
      be informed that your documents have been released.



  Privacy issues of this kind will probably never be solved in time to stop or control the
growth of the cloud, as major software developers and investors (Apache Foundation, Amazon,
Adobe, Google, IBM, etc.) try to keep up with the evolutionary pace.

4.4   Grid Computing
4.4.1 Definition
The term grid was first used as a metaphor for the electric power grid. The intended idea was
that access to computation and data should be as easy, pervasive and standard as plugging an
appliance into an outlet. Nowadays it is hard to find a consensual definition. Here are some
from the most frequently cited authors:

      A computational grid is a hardware and software
      infrastructure that provides dependable, consistent,
      pervasive and inexpensive access to high-end
      computational capabilities. [6]

      Computational grid is the technology that enables
      resource virtualization, on-demand provisioning and
      service (resource) sharing between organizations. [7]

      Grid computing has the ability, using a set of
      open standards and protocols, to gain access to
      applications and data, processing power, storage
      capacity and a vast array of other computing
      resources over the Internet. A grid is a type of
      parallel and distributed system that enables the
      sharing, selection and aggregation of resources
      distributed across "multiple" administrative
      domains based on their (resources) availability,
      capacity, performance, cost and users’ quality-of-
      service requirements. [8]

      The problem that underlies the Grid concept is
      coordinated resource sharing and problem solving in
      dynamic, multi-institutional virtual organizations.
      [9]

  Although we can find some common ground, like resource sharing and processing power, these
are features not only of a grid computing system but of any kind of distributed system. Due to
the lack of consensus and the use of the term "grid" as a marketing slogan (science grid,
access grid, knowledge grid, bio grid, campus grid, commodity grid, etc.), Ian Foster suggested
a checklist to define what is and is not a grid in his paper "What is the grid? A three point
checklist" [11]:
     A grid is a system that:
       1) coordinates resources that are not subject to
          centralized control (...)
       2) (...) using standard, open, general-purpose
          protocols and interfaces (...)
       3) (...) to deliver nontrivial quality of service.

     4.4.1.1 Applying the three point checklist: The following practical examples help make
the concept of grid clear according to Ian Foster’s three point checklist.
  • Sun Grid Engine:
        The Grid Engine project is an open source community
        effort to facilitate the adoption of distributed
        computing solutions. [12]
    This system delivers quality of service when installed on a parallel computer or local
    area network. However, its complete knowledge of system states and user requests, as well
    as its control over individual components, amounts to a centralized management system,
    which makes it fail the first point of Foster’s checklist.
  • The Web: The Web is open and its general-purpose protocols support access to distributed
    resources; however, it fails to coordinate those resources to deliver interesting
    qualities of service.
  • TeraGrid:
        TeraGrid is an open scientific discovery infrastructure
        combining leadership class resources at eleven
        partner sites to create an integrated, persistent
        computational resource. [13]
    This system integrates resources from multiple institutions, each with its own policies,
    uses open and general-purpose protocols to negotiate and manage sharing, and addresses
    multiple quality of service dimensions. It therefore fully satisfies Ian Foster’s
    three-point checklist.

4.4.2 Usual Features
In this section we strive to describe in detail some of the most important features usually
associated with the grid computing method.
     4.4.2.1 Volunteer Computing: Most grids use volunteer resources, that is, resources
contributed to the grid by anonymous individuals or organizations with no profit intended.
     4.4.2.2 Geographically dispersed: Due to its architecture and volunteer characteristic,
some grid resources can be spread throughout the globe.
     4.4.2.3 Idle Resources: One of the benefits of using grid computing is that you can
exploit resources with low usage rates, because in most organizations, there are large amounts
of underutilized computing resources. Most desktop machines are busy less than 5% of the time
over a business day [14].
     4.4.2.4 Inexpensive: Either through volunteering or by using idle resources within a
company, it is usually possible to reach considerable computational performance without major
investments in supercomputers or clusters.

4.4.3 Architecture
In an effort to standardize grid architectures, I. Foster, C. Kesselman and S. Tuecke [9]
presented an open five-layer
structure (Application, Collective, Resource, Connectivity and Fabric layers) based on the
"hourglass model" [10].
  In this architecture the narrow neck, represented by the Resource and Connectivity layers,
defines a small set of protocols onto which many high-level behaviors, used by the Application
and Collective layers, can be mapped (the top of the hourglass). In turn, the "neck" protocols
can themselves be mapped onto many different underlying technologies (the base of the
hourglass, the Fabric layer) (Figure 2).

Fig. 2. Layered Grid Architecture

     4.4.3.1 Fabric Layer: This layer provides the resources to which shared access is
mediated by Grid protocols. The Fabric layer works with resource-specific operations; there is
no abstraction at this level. These operations are usually the result of sharing operations at
higher layers. There is an interdependency between the functions implemented in this layer and
the sharing operations supported by the Grid. A richer and more complex set of Fabric
functionalities enables more sophisticated sharing operations. Conversely, fewer
functionalities and demands at this layer imply a simplified Grid structure.
     4.4.3.2 Connectivity Layer: The Connectivity layer defines the core communication and
authentication protocols. Communication protocols enable the exchange of data between Fabric
layer resources, requiring transport, routing and naming mechanisms. Authentication protocols
are used to ensure the security and identity of users and resources. Due to complex security
problems and wide usage, existing protocols and standards should be used to implement this
layer.
     4.4.3.3 Resource Layer: The Resource layer provides the means to share single resources
using information and management protocols. Information protocols are used to obtain details
about the structure and state of a single resource. Management protocols are used to negotiate
access to the shared resource, specifying its requirements and the operations to be performed.
These protocols should also ensure that requested operations respect the individual policies
of each resource. This layer is only concerned with individual resources. Global state and
atomic actions over multiple resources are issues of the next layer.
     4.4.3.4 Collective Layer: The Collective layer contains protocols and services, like APIs
and SDKs, which are not associated with specific resources but are global in nature and capture
interactions across collections of resources. This means that, at this layer, individual
resource architectures and functionalities are abstracted in order to provide collective
functions that can be implemented as persistent services with associated protocols, or as SDKs
and APIs designed to be linked to applications.
     4.4.3.5 Application layer: This is the final layer; it is therefore responsible for
providing the end user with an abstract interface that includes the user applications and
functionalities that operate within a Virtual Organization environment. These applications are
constructed using services defined at any layer.

4.5   Quality of Service
In the multimedia communities, Quality of Service (QoS) issues are geared towards providing a
client with an acceptable level of presentation quality when accessing content. Network QoS
deals specifically with providing certain quality levels for network link characteristics
between two points. These characteristics are expressed in terms of delay, jitter, packet loss
rate and throughput.
  Unlike multimedia and network QoS, Grid QoS requires a central information service for
up-to-date information on resources available for use by others. Such information can be
queried by an application user to determine which resources can be used to execute an
operation. In Grid computing, QoS management focuses on providing assurance of resource access
while maintaining the security level between domains [22].

4.5.1 QoS in Grid Computing
Since Grid applications submit their requirements to management services that schedule jobs as
resources become available, these services must include a resource manager or scheduler that
can receive requests from external applications. Nevertheless, there are several applications
that need to get results for their tasks within strict deadlines. Consequently, they cannot
wait for resources to become available, so it is necessary to reserve Grid resources and
services for a particular time. In order to handle complex scientific and business
applications, other features are highly desirable, sometimes even required.
  A Grid resource management system tries to address the following QoS issues [22]:
  • Advance Resource Reservation: This is important when dealing with scarce resources, as is
    often the case with end resources made available on the Grid. The system should support
    mechanisms for advance, immediate, or on-demand resource reservation.
  • Reservation Policy: The system should have mechanisms that give Grid resource owners ways
    of enforcing their policies by governing when, how, and by whom their resources can be
    used.
    •   Agreement Protocol: The system should inform
        the clients of their advance reservation status and of
        the resource quality they should expect during the
        service session.
    •   Security: The system should prevent malicious
        users from penetrating or altering data repositories that
        hold information about reservations, policies and
        agreement protocols.
    •   Simplicity: The QoS enhancement should be
        reasonably simple in design so that it requires
        minimal changes to existing computation,
        storage or network infrastructure.
    •   Scalability: The approach should be scalable to a
        large number of entities, since the Grid is a global-
        scale infrastructure.
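
The first three issues above (advance reservation, reservation policy and agreement) can be illustrated with a minimal sketch. It is not taken from [22] or from any Grid middleware: the names Resource, Reservation and reserve, as well as the use of integer hours as time slots, are assumptions made only for this example.

```python
from typing import Callable, List, Optional


class Reservation:
    """A confirmed booking of one resource for the slot [start, end)."""

    def __init__(self, user: str, start: int, end: int) -> None:
        self.user = user
        self.start = start  # hypothetical unit: hours since some epoch
        self.end = end      # exclusive end of the slot


def allow_all(user: str, start: int, end: int) -> bool:
    """Default owner policy: accept every request."""
    return True


class Resource:
    """A single Grid resource with an owner-defined reservation policy."""

    def __init__(self, name: str,
                 policy: Callable[[str, int, int], bool] = allow_all) -> None:
        self.name = name
        self.policy = policy
        self.reservations: List[Reservation] = []

    def reserve(self, user: str, start: int, end: int) -> Optional[Reservation]:
        """Return a confirmed Reservation, or None if the request is refused."""
        if start >= end:
            return None
        # Reservation policy: the owner decides who may use the resource and when.
        if not self.policy(user, start, end):
            return None
        # Advance reservation: refuse any slot that overlaps an existing one.
        # Two half-open intervals overlap when each starts before the other ends.
        for booked in self.reservations:
            if start < booked.end and booked.start < end:
                return None
        reservation = Reservation(user, start, end)
        self.reservations.append(reservation)
        # Agreement: the caller learns the status of its request from the result.
        return reservation


if __name__ == "__main__":
    # Owner policy assumed for the example: known users only, at most 8 hours.
    members = {"alice", "bob"}
    cluster = Resource("cluster",
                       policy=lambda u, s, e: u in members and e - s <= 8)
    print(cluster.reserve("alice", 10, 14) is not None)  # slot is free
    print(cluster.reserve("bob", 12, 16) is not None)    # overlaps alice's slot
```

A real resource manager would also have to persist and protect the reservation table (the Security issue) and negotiate the quality level promised for the session; both are left out of this sketch.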

5       BOINC
Berkeley Open Infrastructure for Network Computing (BOINC) is an open
middleware platform that supports scientific research by relying on
resources donated by ordinary personal computers around the world
(core clients). The objective is to pool this computing power into a
virtual supercomputer that helps researchers (clients) in several
projects [18].
   BOINC lets its users achieve high processing performance at low
cost. For example, having 100 TFlops available for one year costs
175 million dollars on Amazon's Elastic Compute Cloud and 12.4
million dollars when building a cluster, but with BOINC clients need,
on average, only 125,000 dollars [19]. For comparison, in the TOP500
supercomputer list (November 2008) [21], Roadrunner achieves 1105
TFlops, whereas BOINC has a daily average of 1,700 TFlops and its
most relevant project (SETI@Home) has a daily average of 615 TFlops.

5.1     Infrastructure
BOINC provides a set of tools, daemons, a scheduler and a database
(Figure 3).

Fig. 3. BOINC Infrastructure

5.1.1 Daemons
BOINC servers use daemons to manage and keep track of their jobs or,
in BOINC terms, work units (WUs).
  • The "work generator" generates WUs and their corresponding
     input files.
  • The "transitioner" controls and changes the state of each WU.
  • The "validator" validates the results of each WU and its
     redundant copies.
  • The "assimilator" daemon regroups the final results and
     processes them according to the administrator's specification.
     It could zip and e-mail the results, or automatically do post
     processing and store those results on a magnetic tape.
  • The "file deleter", as the name indicates, checks for completed
     and assimilated WUs and then deletes temporary input and output
     files on the server. Only the files are deleted; the database
     entries are kept, so it is possible to find information even
     after the project is completed.
  • The "feeder" improves the scheduler's performance and reduces
     the number of database queries. It does so by placing WUs from
     the database into shared memory.
  • The "database purger" removes work-related database entries
     when they are no longer needed, in order to keep the database
     from growing to an impractical size.

5.1.2 Scheduler
The scheduler is a CGI program that runs every time a client connects
to a project and asks for work. It compares the requirements of the
available WUs with the resources the client shares in order to match
them.

5.1.3 Database
The BOINC database is a MySQL database that stores information about
registered users and hosts, applications and their versions, WUs and
their results, and other relevant information.

5.2   PRC vs. Grid
According to David P. Anderson [23], both PRC and Grid Computing
methodologies share a common goal: to use the existing resources in
the best possible way. There are, however, some important differences
between the two.
   While a Grid is usually managed and controlled by a single
organization, a PRC network relies on separate individuals to share
their resources. While this particularity may allow huge growth in
terms of connected nodes, it brings other liabilities, such as the
unreliability of the processed results and uncertain processor time.
Since each user is a volunteer and therefore allowed to
control the amount of work done, nothing guarantees the project
management that this user will keep cooperating for a long time or
maintain a steady workflow.
   Another important difference between the two methods lies in the
Quality of Service. It is virtually impossible to ensure a strong QoS
on a PRC network due to slow connections and low availability.

5.3   Extended BOINC System
As we have seen, BOINC implements only two of the three points in
Foster's checklist. It has decentralized control over resources and
it uses open protocols and interfaces, but it fails to deliver
non-trivial quality of service because it does not fulfill the
information accessibility and connectivity requirements.
   With this in mind, Søttrup and Pedersen [18] suggested a bridge
between BOINC and a private Grid: instead of clients pulling jobs
from the BOINC server by connecting to it, the server connects to a
resource broker on the Grid and pushes jobs into it. The resource
broker is responsible for scheduling, submitting jobs to remote
machines, transferring files and logging. The Grid is then
responsible for providing quality of service, so together the two
systems would form a Grid in Foster's sense (Figure 4).

Fig. 4. Extended BOINC Infrastructure

5.3.1 BOINC to Grid architecture
In order to process the WUs directly and according to the
specifications of the Grid, a new daemon (the bridge daemon) has to
be created and some of the others have to be modified. The
transitioner daemon must be adapted so that it does not change the
state of the WUs sent into the Grid; manipulating and controlling the
states of those WUs is now the bridge daemon's responsibility. The
feeder daemon also needs to be modified so that it does not load the
WUs intended for the Grid into the shared memory. The bridge daemon
therefore has to:
   • Push WUs into the Grid;
   • Manage their states;
   • Pull the results from the Grid resource broker.

These results don't need validation because the resources from the
Grid are considered trustworthy.

6       CONCLUSION
Throughout the development of this paper, we surveyed the currently
available paradigms of distributed computing, together with their
main advantages and limitations. This process has taken us through
part of the history of computing, starting from the traditional
client-server model and following its development up to today's cloud
and grid computing concepts.
   The evolution of communications and commodity personal computers
brought distributed computation to a whole new level while bringing
people and science closer together through Public Resource Computing.
This new area of computational resource sharing brought the need to
rethink the Quality of Service requirements in distributed networks.
PRC proved that a huge number of volunteers can supply an unbeatable
system-wide throughput without the need for strict QoS policies. For
instance, the major BOINC-based project, and also the one that
encouraged its development, SETI@Home, has so far produced more than
3 million years of processing time [20].
   For the small number of research projects and processing jobs that
require strict deadlines, high communication speed and permanent
connectivity, there was a need to merge the benefits of PRC
globalization with the QoS that a Grid system can provide. In order
to address this need, the bridge connector model [18] was developed
to extend the regular BOINC system.

REFERENCES
[1] A. S. Tanenbaum and M. van Steen, Distributed Systems - Principles
    and Paradigms, International Edition, Pearson, U.S.A.: Prentice Hall,
    2002.
[2] G. Coulouris, J. Dollimore and T. Kindberg, Distributed Systems
    - Concepts and Design, Fourth Edition, Pearson, U.K.: Addison-Wesley,
    2005.
[3] K. Farooqui, L. Logrippo, J. de Meer, The ISO Reference Model for
    Open Distributed Processing - An Introduction, February 14, 1996
[4] A. Silberschatz, P. B. Galvin, and G. Gagne, Operating System
    Concepts, Seventh Edition, Wiley, U.S.A.: John Wiley & Sons, 2005,
    Page 611.
[5] B. C. Neuman, Scale in Distributed Systems, Readings in Distributed
    Computing Systems, IEEE Computer Society Press, 1994
[6] I. Foster, C. Kesselman, The Grid: Blueprint for a New Computing
    Infrastructure, University of Michigan, U.S.A.: Morgan Kaufmann
    Publishers, 1999
[7] P. Plaszczak, R. Wellner, Grid computing: The Savvy Manager's Guide,
    U.S.A.: Elsevier/Morgan Kaufmann, 2005
[8] IBM Solutions Grid for Business Partners: Helping IBM Business
    Partners to Grid-enable applications for the next phase of e-business on
    demand, U.S.A.: IBM, 2002
[9] I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid:
    Enabling Scalable Virtual Organizations, International J. Supercomputer
    Applications, 2001
[10] L. Kleinrock, Realizing the Information Future: The Internet and
    Beyond, National Research Council, U.S.A.: National Academy
    Press, 1994
[11] I. Foster, What is the Grid? A Three Point Checklist, GRIDToday, July
    20, 2002
[12] Grid Engine Project, http://gridengine.sunsource.net,
    SunSource.net, Sun, June 6, 2009
[13] About TeraGrid, http://www.teragrid.org/about, TeraGrid,
    National Science Foundation, June 6, 2009
[14] B. Jacob, M. Brown, K. Fukui, N. Trivedi, Introduction to Grid
    Computing, First Edition, International Business Machines Corporation,
    U.S.A.: IBM, International Technical Support Organization, 2005,
    Page 8
[15] P. Roy, Operating Systems: Internals and Design Principles, 6/E
    William Stallings, Manatee Community College, U.S.A.: Prentice
    Hall, 2008
[16] Introduction to Distributed System Design,
    http://code.google.com/intl/pt-PT/edu/parallel/dsd-tutorial.html,
    Google Code University, 2009
[17] B. Hayes, Cloud Computing, Communications of the ACM, ACM,
    July 2008, Pages 9-11
[18] C. U. Søttrup, J. G. Pedersen, Developing Distributed Computing
    Solutions: Combining Grid Computing and Public Computing, M. Sc.
    Thesis, Department of Computer Science, University of Copenhagen,
    March 1, 2005
[19] BOINC Documentation Project, Why Use BOINC?,
    http://boinc.berkeley.edu/trac/wiki/WhyUseBoinc, University
    of California, June 11, 2009
[20] J. Koulouris, The Big BOINC! Projects and Chronology Page,
    http://www.angelfire.com/jkoulouris-boinc, June 11, 2009
[21] Top500.org, Top500 List - November 2008,
    http://www.top500.org/list/2008/11/100, Top500 Supercomputing
    Sites, Top500.org, June 11, 2009
[22] R. J. Al-Ali, K. Amin, G. von Laszewski, O. F. Rana, D. W. Walker,
    M. Hategan, N. Zaluzec, Analysis and Provision of QoS for Distributed
    Grid Applications, Kluwer Academic Publishers, 2004
[23] D. P. Anderson, BOINC: A System for Public-Resource Computing
    and Storage, Proceedings of the Fifth IEEE/ACM International
    Workshop on Grid Computing, 2004

Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 

Dernier (20)

TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 

"Parallel and Distributed Computing: BOINC Grid Implementation" por Rodrigo Neves, Nuno Mestre, Francisco Machado e João Lopes

  • 1. 1 Parallel and Distributed Computing BOINC Grid Implementation ˜ Rodrigo Neves, Nuno Mestre, Francisco Machado, and Joao Lopes Abstract—With the development of communications and Internet, distributed computing became an everyday reality for everyone rather than just for a limited group of IT specialists and investors. This development allowed the emerging of several new computational concepts, some even at the cost of non-consensus. This paper intends to make a brief approach at some of the current paradigms like cloud and grid computing, peer-to-peer and client-server methods. Afterwards, it will go deeper into a detailed review of the Public Resource Computing concept and the BOINC software implementation. Finally, a case study on the Extended BOINC System created by Søttrup and Pederson [18] will present a simple solution to integrate the concepts of PRC and Grid in order to provide simple, scalable, volunteer based computer power to process QoS dependent jobs. Index Terms—parallel computing, distributed computing, grid computing, cloud computing, client server, peer to peer, boinc, extended boinc, public resource computing, quality of service ! 1 I NTRODUCTION work hardware, operating systems and programming languages, the term middleware has been created. It is D ISTRIBUTED systems have been given many defini- tions throughout the years but none of these has been consistent with each other. a software layer that provides abstraction, setting up a uniform computational model for software developers to As Andrew S. Tanenbaum and Maarten van Steen work on. One of the most widespread middleware soft- suggested [1]: wares available is the Common Object Request Broker (CORBA) [2]. A distributed system is a collection of independent computers that appears to its users as a single coherent system. 
2 I MPORTANT C ONCEPTS The first part of this definition deals with hardware In order to better understand the remaining of this while the second focus specially on software. paper, it might be interesting to clear up some technical Distributed systems must present a transparent work- nomenclature in the Parallel and Distributed Systems flow to the user regardless of the differences in hardware environment. and communication methods throughout the connected • Servers: Typically high-powered workstations, computers. Other important characteristics of these sys- minicomputers or mainframes that hold the infor- tems are the ease of scalability and the high availability. mation and provide it to clients through request Such systems should easily connect users to resources handling; while hiding the fact that these may be distributed across • Clients: Computers or mainframes that request a network. It should also be open and respect the avail- services from the servers; able standards in order to facilitate future development • API: An Application Programming Interface is a and scalability and ensure the data and communication tool used to provide the end-user with an abstract security. All these specific issues shall be addressed in set of operations. Its main attraction is the ability further detail later on this paper. to hide from the user the implementation of such In order to create the illusion of a single system operations; both at high-level, for users, and at low-level, for net- • SDK: The Software Development Kit is typically a set of development tools that allows a software • R. Neves is with MIEET, Departmento de Engenharia Electr´ nica e o Inform´ tica, Faculdade de Ciˆncia e Tecnologia, Universidade do Algarve. a e engineer to create applications for a certain software E-mail: a25067@ualg.pt package. Usually a SDK implements an API; • N. 
Mestre is with LEI, Departmento de Engenharia Electr´ nica e In- o • Middleware: A software layer that conceals all form´ tica, Faculdade de Ciˆncia e Tecnologia, Universidade do Algarve. a e E-mail: a28997@ualg.pt the heterogeneity in a system in order for software • F. Machado is with LEI, Departmento de Engenharia Electr´ nica e In- o developers to easily work on; form´ tica, Faculdade de Ciˆncia e Tecnologia, Universidade do Algarve. a e • Query Languages: Techniques and protocols to get E-mail: a28994@ualg.pt • J. Lopes is with LEI, Departmento de Engenharia Electr´ nica e Inform´ tica, o a the sough information being the most famous the Faculdade de Ciˆncia e Tecnologia, Universidade do Algarve. e Structured Query Language (SQL); E-mail: a27981@ualg.pt • Virtual Organization: A dynamic set of individ- uals and/or institutions defined by a batch of
  • 2. 2 resource-sharing rules. Those rules need to make else users will eventually evade the security in favor perfectly clear just what is shared, who is allowed of productivity. to share, and the conditions under which sharing occurs [9]; 3.3 Scalability • Public Resource Computing: Usually known as Scalability is one of the most important design goals for PRC, this concept describes the idea of getting any- developers in distributed systems. According to Clifford body with an Internet connection and spare com- Neuman [5], a system’s scalability may be weighted pute power to donate CPU cycles on their computer along three main dimensions: for a greater project. • Size: Whether it is simple to add more users and resources to the system or not; 3 D ISTRIBUTED S YSTEMS C HALLENGES • Geographical: Regarding the location of users and 3.1 Connecting Users and Resources / Concurrency resources; Resource sharing and distribution in any given multi- • Administration: Ease to manage the system even user system is always a main concern. In the case of dis- when it is shared by multiple administrative orga- tributed systems this task becomes of utter importance. nizations. In a distributed system, both applications and services As a system scales up, in any of these three dimen- provide resources that can be shared among clients. sions, it may exhibit problems that could affect perfor- These usually allow multiple client requests to be ac- mance. cepted even though they may be processed one at a time. Size scalability problems are related with the increased If we consider each resource being encapsulated as amount of users and their demands. Such situation an object and the requests being treated as concurrent occurring on centralized services, data and algorithms threads, it becomes clear that any application or service will eventually create a bottleneck at the server. 
must be carefully managing this concurrency in order to Geographical scaling problems are usually related to avoid inconsistent results and/or deadlocks. the connectivity limitations of the communications. In Though at a first glance this situation may require wide-area networks, access to information is usually careful analysis and programming, it’s cost effectiveness made through unreliable connections and virtually al- may become obvious when sharing expensive resources ways a point-to-point connection. like high performance processing and data structures. Administrative scalability issues often relate to mul- tiple domain trustworthy security certificates. As more 3.2 Transparency / Heterogeneity organizations join the network, strict administration and One of the major concepts and advantages of distributed management protocols must be defined as all included systems is the transparency. It allows a user to access parts should play an active role in this processes. all resources and data that may be scattered around a network, without having to worry where these are 3.4 Security actually located. This may prove to be a problem when Usually, the information resources available and main- an virtually infinite combination of hardware resources tained in a distributed system have a high intrinsic and operating systems are involved. This ability to pro- value to the users, therefore the security of that data is vide transparency within all the heterogeneity of the dis- considerably important. tributed system is usually implemented by a middleware There are three main security components that should software layer as described on the introduction. 
be taken into consideration: The concept of transparency may be applied to many • Confidentiality: Protection against unauthorized aspects of distributed systems [1]: accesses; • Access: Hide differences in data representation and • Integrity: Protection against unauthorized alter- how a resource is accessed; ation or corruption; • Location: Hide where a resource is located; • Availability: Protection against disruption of the • Migration: Hide that a resource may move to communication with the resources. another location; Preventive actions should be taken in order to ensure • Relocation: Hide that a resource may be moved to this security parameters, like the use of a firewall and another location while in use; careful code development. • Replication: Hide that a resource is replicated; • Concurrency: Hide that a resource may be shared by several competitive users; 3.5 Openness • Failure: Hide the failure and recovery of a resource; According to Kazi Farooqui, Luigi Logrippo and Jan de • Persistence: Hide whether a (software) resource is Meer [3], openness is the combination of the previous in memory or on disk; stated characteristics. Sometimes however, it is generi- • Security: Negotiation of secure access to resources cally referred to as the level of respect for standards that must require a minimum of user intervention, or define the syntax and semantics of the provided services.
  • 3. 3 time. Unlike the Client-Server approach, P2P systems rely on an agglomeration of applications and systems al- together in order to have access to distributed resources on a decentralized way. In terms of maintenance costs, P2P has the edge over Client-Server since the contents and systems are maintained individually by each peer. P2P systems can be categorized into three main groups: centralized, decentralized and hybrid implemen- tations. The centralized P2P concept bases its implementation on a central server that executes simple generic functions for the system, like work load scheduling and result validation. Decentralized P2P systems rely completely on the peers themselves to perform all the functions without Fig. 1. Generic client-server environment intervention of centralized servers. In these cases, all the result validation and communication must be handled by each node and coordinated with its peers. This important concept, despite of the interpretation, Hybrid systems are a bit of a mix between the pre- provides the developers with the needed flexibility to vious two implementations. These bring up the concept add, configure and integrate new components and ser- of super-nodes, made of several regular nodes, which vices. Also, this flexibility ensures that the addition and are responsible of the centralized work while it’s con- removal of components can be made without affecting stituents can still work on the regular decentralized way. the stability of the overall system. 4.3 Cloud Computing 4 D ISTRIBUTED S YSTEM M ODELS A Distributed System is a set of loosely coupled re- Nowadays, data storage and programs are being swept sources interconnected by a communication network. from the desktop computers and corporate server rooms [4] and installed in the computer cloud. 
Cloud computing emerges from the lesser need of the users to have applications installed on their machines due to increased 4.1 Client-Server communications speed and availability. Client-Server is a particular type of distributed systems Every operating system update cascades into a batch design that clearly distinguishes the relationship be- of time and resource consuming software revisions. Out- tween two computers. The Server provides some kind of sourcing computation through Internet based services service, such as processing database queries or sending significantly reduces these costs while offering a whole out current stock prices. The client uses the services that new set of advantages like mobility and cooperation. are provided by the server, either displaying database The amount of services and applications provided in query results to the user or making stock purchase the cloud is growing every day and cannot be considered recommendations to an investor. as a bunch of simple tools anymore. Companies are The communication that occurs between the client and starting to acquire cloud based services for every kind the server must be reliable. That is, no data can be of managerial and business oriented tasks. dropped and it must arrive on the client side in the Growing voices of worry have been expressing their same order in which the server sent it. In order to ensure concerns about data privacy and confidentiality not tak- the reliability between Server and Client the communi- ing the service’s privacy policies as creditable enough. cation uses the TCP/IP protocols. The Internet Protocol One famous scenario used as argument is cited by Hayes (IP) suite is a set of communication tools that regulate [17]: communication on the Internet and most commercial (...) a government agency presents a subpoena or networks. The Transmission Control Protocol (TCP) is search warrant to the third party that has posses- one of the core protocols of this suite. 
Using TCP, clients sion of your data. If you had retained the physical and servers can create connections to one another, over custody, you might still have been compelled to which they can exchange data in packets. surrender the information, but at least you would have been able to decide for yourself whether or 4.2 Peer-To-Peer not to contest the order. The third-party service The concept of Peer-To-Peer communication, also known is presumably less likely to go to court on your as P2P, is based on the idea that each individual node behalf. In some circumstances you might not even (peer) in the network is both client and server at the same be informed that your documents have been released.
  • 4. 4 These kind of issues will probably never be solved 4.4.1.1 Applying the three point checklist: Follow- in time to stop or control the growth of the cloud as ing there are some practical examples to help us make we see major software developers and investors (Apache the concept of grid clear, according to Ian Foster’s three Foundation, Amazon, Adobe, Google, IBM, etc.) trying point checklist. to keep up with the evolutionary pace. • Sun Grid Engine: The Grid Engine project is an open source commu- 4.4 Grid Computing nity effort to facilitate the adoption of distributed 4.4.1 Definition computing solutions. [12] The term grid was first used as a metaphor to the This system delivers quality of service when in- electric power grid. The intended idea was that access stalled on a parallel computer or local area network. to computation and data should be as easy, pervasive However, its complete knowledge of system states and standard as plugging in an appliance into an outlet. and user requests, as well as control over individual Nowadays it is hard to find a consensual definition. Here components, implements a centralized management are some from the most referred authors: system that makes this system fail the first point of Foster’s checklist. A computational grid is a hardware and software • The Web: The Web is open and its general-purpose infrastructure that provides dependable, consistent, protocols support access to distributed resources, pervasive and inexpensive access to high-end however it fails to coordinate those resources to computational capabilities. [6] deliver interesting quality of services. • TeraGrid: Computational grid is the technology that enables TeraGrid is an open scientific discovery infrastruc- resource virtualization, on-demand provisioning and ture combining leadership class resources at eleven service (resource) sharing between organizations. [7] partner sites to create an integrated, persistent com- putational resource. 
[13] Grid computing has the ability, using a set of This system integrates resources from multiple insti- open standards and protocols, to gain access to tutions, each with their own policies, uses open and applications and data, processing power, storage general-purpose protocols to negotiate and manage capacity and a vast array of other computing sharing and addresses multiple quality of service resources over the Internet. A grid is a type of dimensions, therefore fully fits Ian Foster’s three- parallel and distributed system that enables the point checklist. sharing, selection and aggregation of resources distributed across ”multiple” administrative 4.4.2 Usual Features domains based on their (resources) availability, In this section we thrive to describe in detail some of capacity, performance, cost and users’ quality-of- the most important features usually associated with the service requirements. [8] grid computing method. 4.4.2.1 Volunteer Computing: Most grids use vol- The problem that underlines the Grid concept is unteer resources, that is, resources contributed to the coordinated resource sharing and problem solving in grid by anonymous individuals or organizations with dynamic, multi-institutional virtual organizations. no profit intended. [9] 4.4.2.2 Geographically dispersed: Due to its archi- tecture and volunteer characteristic some grid resources Although we can find some common ground, like re- can be spread throughout the globe. source sharing and processing power, these are features 4.4.2.3 Idle Resources: One of the benefits of using not only of a grid computing system but of any kind grid computing is that you can exploit resources with of distributed system. Due to the lack of consensus and low usage rates because in most organizations, there are the use of the term ”grid” as a marketing slogan (science large amounts of under utilized computing resources. 
Most grid, access grid, knowledge grid, bio grid, campus grid, desktop machines are busy less than 5% of the time over a commodity grid, etc.), Ian Foster suggested a checklist to business day [14]. define what is and is not a grid in his paper ”What is 4.4.2.4 Inexpensive: Either through volunteering the grid? A three point checklist” [11]: or by using idle resources within a company, usually A grid is a system that: it’s possible to reach a considerable computational per- 1) coordinates resources that are not subject to formance without major investments in supercomputers centralized control (...) or clusters. 2) (...) using standard, open, general-purpose pro- tocols and interfaces (...) 4.4.3 Architecture 3) (...) to deliver nontrivial quality of service. In an effort to standardize grid architectures, I. Foster, C. Kesselman and S. Tuecke [9] presented an open five-layer
  • 5. 5 4.4.3.4 Collective Layer: The Collective layer con- tains protocols and services, like APIs and SDKs, which are not associated with specific resources but global in nature, and capture interactions across collections of resources. Meaning that, at this layer, individual resource architectures and functionalities are abstracted in order to provide collective functions that can be implemented as persistent services with associated protocols, or as SDKs and APIs, designed to be linked to applications. 4.4.3.5 Application layer: This is the final layer, therefore it is responsible to provide the end user with an Fig. 2. Layered Grid Architecture abstract interface that includes the user applications and functionalities that operate within a Virtual Organization environment. These applications are constructed using services defined at any layer. structure (Application, Collective, Resource, Connectiv- ity and Fabric layer) based on the ”hourglass model” [10]. 4.5 Quality of Service In this architecture the narrow neck, represented by In the multimedia communities, Quality of Service (QoS) the Resource and Connectivity layers, defines a small set issues are geared to provide a client with an acceptable of protocols onto which many high-level behaviors, used level of presentation quality when accessing content. by the Application and Collective layer, can be mapped Network QoS deals specifically with providing certain (the top of the hourglass). On the other hand, the ”neck” quality levels for network link characteristics between protocols can themselves be mapped onto many different two points. These characteristics are expressed in terms underlying technologies (the base of the hourglass, the of delay, jitter, packet loss rate and throughput. Fabric layer) (Figure 2). 
Unlike multimedia and network QoS, Grid QoS re- 4.4.3.1 Fabric Layer: This layer provides the re- quires a central information service for up-to-date infor- sources to which shared accesses are mediated by Grid mation on resources available for use by others. Such protocols. Fabric layer works with resource-specific op- information can be interrogated by an application user erations, there is no abstraction at this level. These to determine which resources can be used to execute an operations are usually a result of sharing operations at operation. In Grid computing, QoS management focus higher layers. There’s interdependency between func- on providing assurance on resource access while main- tions implemented in this layer and sharing operations taining the security level between domains [22]. supported by the Grid. A rich and more complex Fabric set of functionalities enables more sophisticated sharing 4.5.1 QoS in Grid Computing operations. As opposed, fewer functionalities and de- Once the Grid applications submit their requirements to mands at this layer imply a simplified Grid structure. the management services that schedule jobs as resources 4.4.3.2 Connectivity Layer: The connectivity layer become available, these must support a resource man- defines the core communication and authentication pro- ager or scheduler that can receive requests from external tocols. Communication protocols enable the exchange of applications. Nevertheless, there are several applications data between Fabric layer resources, requiring transport, that need to get results for their tasks within strict routing and naming mechanisms. Authentication proto- deadlines. Consequently they cannot wait for resources cols are used to ensure the security and identity of users to become available so, it is necessary to reserve Grid and resources. Due to complex security problems and resource and services in a particular time. 
In order wide usage, existing protocols and standards should be to handle complex scientific and business applications, used to implement this layer. other features are highly desirable, sometimes even re- 4.4.3.3 Resource Layer: Resource layer provides quired. the means to share single resources using information A Grid resource management system tries to address and management protocols. Information protocols are the following QoS issues [22]: used to obtain details about the structure and state of • Advanced Resource Reservation: It’s important a single resource. Management protocols are used to when dealing with scarce resources, as is often the negotiate access to the shared resource, specifying its case with end resources made available on the Grid. requirements and the operations to be performed. These Should support mechanisms for advance, immedi- protocols should also ensure that requested operations ate, or on-demand resource reservation. respect individual policies of each resource. This layer is • Reservation Policy: The system should have mech- only concerned with individual resources. Global state anisms that provide the Grid resource owners ways and atomic actions over multiple resources are issues of of enforcing their policies by governing when, how, the next layer. and who can use their resource.
• Agreement Protocol: The system should inform the clients of their advance reservation status and of the resource quality they should expect during the service session.
• Security: The system should prevent malicious users from penetrating or altering data repositories that hold information about reservations, policies and agreement protocols.
• Simplicity: The QoS enhancement should be reasonable and simple in design, so that it requires minimal changes to existing computation, storage or network infrastructure.
• Scalability: The approach should scale to a large number of entities, since the Grid is a global-scale infrastructure.

5 BOINC

Berkeley Open Infrastructure for Network Computing (BOINC) is an open middleware platform that supports scientific research by relying on resources donated by ordinary personal computers around the world (core clients). The objective is to pool this computing power into a virtual supercomputer that helps researchers (clients) in several projects [18].

BOINC allows users to achieve high processing performance at low cost. For example, to have 100 TFlops available for one year, Amazon's Elastic Compute Cloud costs 175 million dollars and building a cluster costs 12.4 million dollars, whereas with BOINC clients need, on average, only 125,000 dollars [19]. For comparison, in the Top500 supercomputer list (November 2008) [21], Roadrunner achieves 1,105 TFlops, whereas BOINC has a daily average of 1,700 TFlops and its most relevant project (SETI@Home) has a daily average of 615 TFlops.

5.1 Infrastructure

BOINC provides a set of tools, daemons, a scheduler and a database (Figure 3).

Fig. 3. BOINC Infrastructure

5.1.1 Daemons

BOINC servers use daemons to manage and keep track of their jobs or, in BOINC terms, work units (WUs).

• The "work generator" generates WUs and the corresponding input files.
• The "transitioner" controls and changes the state of each WU.
• The "validator" validates the results of each WU and its redundant copies.
• The "assimilator" collects the final results and processes them according to the administrator's specification. It could zip and e-mail the results, or automatically post-process them and store them on magnetic tape.
• The "file deleter", as the name indicates, checks for completed and assimilated WUs and then deletes the temporary input and output files on the server. Only the files are deleted; the database entries are kept, so information can still be found even after the project is completed.
• The "feeder" enhances the scheduler's performance and reduces the number of database queries. It does so by placing WUs from the database into shared memory.
• The "database purger" removes work-related database entries when they are no longer needed, in order to keep the database from growing to an impractical size.

5.1.2 Scheduler

The scheduler is a CGI program that runs every time a client connects to a project and asks for work. It compares the needs of the available WUs with the client's shared resources in order to match them.

5.1.3 Database

The BOINC database is a MySQL database that stores information about registered users and hosts, applications and their versions, WUs and their results, and other relevant information.

5.2 PRC vs. Grid

According to David P. Anderson [23], the PRC and Grid computing methodologies share a common goal: to use the existing resources in the best possible way. There are, however, some important differences between the two.

While a Grid is usually managed and controlled by a single organization, a PRC network relies on separate individuals to share their resources. While this particularity may allow a huge growth in the number of connected nodes, it brings other liabilities, such as unreliable results and uncertain processor time. Since each user is a volunteer and therefore allowed to
control the amount of work done, nothing assures the project management that this user will cooperate for a long time or keep a steady workflow.

Another important difference between the two methods lies in the Quality of Service. It is virtually impossible to ensure a strong QoS on a PRC network due to slow connections and low availability.

5.3 Extended BOINC System

As we have seen, BOINC only implements two of the three points in Foster's checklist. It has decentralized control over resources and it uses open protocols and interfaces, but it fails to deliver non-trivial quality of service, because it does not fulfill the information accessibility and connectivity requirements.

With this in mind, Søttrup and Pedersen [18] suggested a bridge between BOINC and a private Grid in which, instead of clients pulling jobs from the BOINC server by connecting to it, the server connects to a resource broker on the Grid (responsible for scheduling, submitting jobs to remote machines, transferring files and logging) and pushes jobs into it. This Grid is responsible for providing quality of service and therefore, together, they would build a Foster's Grid (Figure 4).

Fig. 4. Extended BOINC Infrastructure

5.3.1 BOINC to Grid architecture

In order to process the WUs directly and according to the specifications of the Grid, a new daemon (the bridge daemon) has to be created and some of the others have to be modified. The transitioner daemon must be adapted so that it won't change the state of the WUs sent into the Grid. Manipulating and controlling the several WU states is now the bridge daemon's responsibility. The feeder daemon also needs to be modified so that it won't load the WUs intended for the Grid into the shared memory. This way, the bridge daemon has to:

• Push WUs into the Grid;
• Manage their states;
• Pull the results from the Grid resource broker. These results don't need validation because the resources from the Grid are considered trustworthy.

6 CONCLUSION

In this paper, we surveyed the currently available distributed computing paradigms, together with their main advantages and limitations. This survey took us through part of the history of computing, from the traditional client-server model to today's cloud and grid computing concepts.

The evolution of communications and commodity personal computers brought distributed computation to a whole new level, while bringing people and science closer together through Public Resource Computing. This new area of computational resource sharing brought the need to rethink the Quality of Service requirements in distributed networks. PRC proved that a huge number of volunteers can supply an unbeatable system-wide throughput without the need for strict QoS policies. For instance, the major BOINC-based project, and also the one that encouraged its development, SETI@Home, has so far produced more than 3 million years of processing time [20].

For the small number of research projects and processing jobs that require strict deadlines, high communication speed and permanent connectivity, there was a need to merge the benefits of PRC globalization with the QoS that a Grid system could provide. In order to address this need, the bridge connector model [18] was developed to extend the regular BOINC System.

REFERENCES

[1] A. S. Tanenbaum and M. van Steen, Distributed Systems - Principles and Paradigms, International Edition, Pearson, U.S.A.: Prentice Hall, 2002.
[2] G. Coulouris, J. Dollimore and T. Kindberg, Distributed Systems - Concepts and Design, Fourth Edition, Pearson, U.K.: Addison-Wesley, 2005.
[3] K. Farooqui, L. Logrippo, J. de Meer, The ISO Reference Model for Open Distributed Processing - An Introduction, February 14, 1996.
[4] A. Silberschatz, P. B. Galvin, and G. Gagne, Operating System Concepts, Seventh Edition, Wiley, U.S.A.: John Wiley & Sons, 2005, Page 611.
[5] B. C. Neuman, Scale in Distributed Systems, Readings in Distributed Computing Systems, IEEE Computer Society Press, 1994.
[6] I. Foster, C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure, University of Michigan, U.S.A.: Morgan Kaufmann Publishers, 1999.
[7] P. Plaszczak, R. Wellner, Grid computing: The Savvy Manager's Guide, U.S.A.: Elsevier/Morgan Kaufmann, 2005.
[8] IBM Solutions Grid for Business Partners: Helping IBM Business Partners to Grid-enable applications for the next phase of e-business on demand, U.S.A.: IBM, 2002.
[9] I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International J. Supercomputer Applications, 2001.
[10] L. Kleinrock, Realizing the Information Future: The Internet and Beyond, National Research Council, U.S.A.: National Academy Press, 1994.
[11] I. Foster, What is the Grid? A Three Point Checklist, GRIDToday, July 20, 2002.
[12] Grid Engine Project, http://gridengine.sunsource.net, SunSource.net, Sun, June 6, 2009.
[13] About TeraGrid, http://www.teragrid.org/about, TeraGrid, National Science Foundation, June 6, 2009.
[14] B. Jacob, M. Brown, K. Fukui, N. Trivedi, Introduction to Grid Computing, First Edition, International Business Machines Corporation, U.S.A.: IBM, International Technical Support Organization, 2005, Page 8.
[15] P. Roy, Operating Systems: Internals and Design Principles, 6/E William Stallings, Manatee Community College, U.S.A.: Prentice Hall, 2008.
[16] Introduction to Distributed System Design, http://code.google.com/intl/pt-PT/edu/parallel/dsd-tutorial.html, Google Code University, 2009.
[17] B. Hayes, Cloud Computing, Communications of the ACM, ACM, July 2008, Pages 9-11.
[18] C. U. Søttrup, J. G. Pedersen, Developing Distributed Computing Solutions: Combining Grid Computing and Public Computing, M. Sc. Thesis, Department of Computer Science, University of Copenhagen, March 1, 2005.
[19] BOINC Documentation Project, Why Use BOINC?, http://boinc.berkeley.edu/trac/wiki/WhyUseBoinc, University of California, June 11, 2009.
[20] J. Koulouris, The Big BOINC! Projects and Chronology Page, http://www.angelfire.com/jkoulouris-boinc, June 11, 2009.
[21] Top500.org, Top500 List - November 2008, http://www.top500.org/list/2008/11/100, Top500 Supercomputing Sites, Top500.org, June 11, 2009.
[22] R. J. Al-Ali, K. Amin, G. von Laszewski, O. F. Rana, D. W. Walker, M. Hategan, N. Zaluzec, Analysis and Provision of QoS for Distributed Grid Applications, Kluwer Academic Publishers, 2004.
[23] D. P. Anderson, BOINC: A System for Public-Resource Computing and Storage, Proceedings of the Fifth IEEE/ACM International Workshop on Grid Computing, 2004.
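
As an illustration of the bridge daemon duties listed in Section 5.3.1 (push WUs into the Grid, manage their states, pull the results), the following is a minimal Python sketch. It is not taken from BOINC or from [18]: the names WorkUnit, State, FakeBroker, feeder_candidates and bridge_pass, as well as the for_grid flag, are hypothetical and exist only for this example, and the broker is an in-memory stand-in rather than a real Grid resource broker.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class State(Enum):
    UNSENT = "unsent"
    IN_GRID = "in_grid"
    DONE = "done"


@dataclass
class WorkUnit:
    name: str
    for_grid: bool                 # hypothetical flag: WU is intended for the Grid
    state: State = State.UNSENT
    result: Optional[str] = None


class FakeBroker:
    """In-memory stand-in for a Grid resource broker (assumption for the sketch)."""

    def __init__(self) -> None:
        self.jobs: Dict[str, str] = {}

    def submit(self, wu_name: str) -> None:
        # Pretend the job finishes immediately and its output is stored.
        self.jobs[wu_name] = f"output-of-{wu_name}"

    def fetch_result(self, wu_name: str) -> Optional[str]:
        # Return and forget the stored output, or None if there is none.
        return self.jobs.pop(wu_name, None)


def feeder_candidates(wus: List[WorkUnit]) -> List[WorkUnit]:
    """Modified feeder: WUs intended for the Grid never reach shared memory."""
    return [wu for wu in wus if not wu.for_grid and wu.state is State.UNSENT]


def bridge_pass(wus: List[WorkUnit], broker: FakeBroker) -> None:
    """One pass of the bridge daemon over the Grid-bound WUs."""
    for wu in wus:
        if not wu.for_grid:
            continue  # regular WUs stay with the ordinary BOINC daemons
        if wu.state is State.UNSENT:
            broker.submit(wu.name)          # push the WU into the Grid
            wu.state = State.IN_GRID        # state is managed here, not by the transitioner
        elif wu.state is State.IN_GRID:
            result = broker.fetch_result(wu.name)   # pull the result
            if result is not None:
                wu.result = result
                wu.state = State.DONE       # no validation step: Grid resources are trusted
```

Calling bridge_pass twice on a list that contains one Grid-bound WU first moves it to IN_GRID and then to DONE, while feeder_candidates keeps returning only the WUs meant for volunteer clients.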