2. The New Science: humans, clouds, sensors; beginner to expert; sharing, logins and access; click to code to workflow; personal storage, big data and replication; compute and scaling; software as component; interoperability; survey and event; control or autonomous
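4. Service Oriented Architecture. 1. publish: the service places its contract in the registry. 2. find: the client looks up the service contract in the registry. 3. bind: the client sends a request to the service and the service returns a response. Principle: Click or Code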
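
The publish, find, bind cycle on this slide can be sketched as a toy in-memory registry. This is only an illustration: the `Contract` and `Registry` classes, the keyword lookup, and the `toy_cone_service` function are hypothetical names invented here, not part of any real registry software.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Contract:
    """Hypothetical service contract: a name, search keywords, and an endpoint."""
    name: str
    keywords: List[str]
    endpoint: Callable[[dict], dict]


class Registry:
    """Toy registry holding contracts in memory."""

    def __init__(self) -> None:
        self._contracts: Dict[str, Contract] = {}

    def publish(self, contract: Contract) -> None:
        # Step 1: the service provider registers its contract.
        self._contracts[contract.name] = contract

    def find(self, keyword: str) -> List[Contract]:
        # Step 2: the client searches the registry by keyword.
        return [c for c in self._contracts.values() if keyword in c.keywords]


def toy_cone_service(request: dict) -> dict:
    # Stand-in for a real service: it simply echoes the request back.
    return {"status": "ok", "echo": request}


registry = Registry()
registry.publish(Contract("toy-cone", ["cone", "catalog"], toy_cone_service))

found = registry.find("cone")
# Step 3: bind, i.e. call the service the registry pointed us to.
response = found[0].endpoint({"ra": 10.0, "dec": -5.0, "radius": 0.1})
```

The client never needs to know the service in advance; it only needs the registry and a keyword.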
5. VO Data Services. Cone Search: radius + position in, list of objects out, encoded as VOTable. Simple Image Access Protocol. Simple Spectrum Access Protocol: spectra have subtleties, so the protocol is more complicated. Astronomical Data Query Language (ADQL): for database queries; core SQL functions plus astronomy-specific extensions (sky region, Xmatch). Table Access Protocol: exposes relational databases (what tables, what table schema, “here is a query in ADQL”).
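
A minimal sketch of the two request styles named on this slide, using only the Python standard library. The base URLs, the table name `example.objects`, and the sample VOTable are assumptions made up for illustration; the `RA`, `DEC`, `SR` parameters and the `REQUEST`, `LANG`, `QUERY` parameters follow the Cone Search and Table Access Protocol conventions. No network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

CONE_BASE = "https://example.org/cone"  # hypothetical service URL
TAP_BASE = "https://example.org/tap"    # hypothetical service URL


def cone_search_url(base, ra, dec, sr):
    """Build a Cone Search request: position (RA, DEC) and radius (SR), in degrees."""
    return base + "?" + urlencode({"RA": ra, "DEC": dec, "SR": sr})


def tap_sync_url(base, adql):
    """Build a synchronous Table Access Protocol request carrying an ADQL query."""
    return base + "/sync?" + urlencode(
        {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql})


def local_name(tag):
    # Drop any XML namespace prefix such as "{http://...}FIELD".
    return tag.rsplit("}", 1)[-1]


def parse_votable(xml_text):
    """Return the rows of a VOTable TABLEDATA section as a list of dicts."""
    root = ET.fromstring(xml_text)
    names = [el.get("name") for el in root.iter() if local_name(el.tag) == "FIELD"]
    rows = []
    for tr in root.iter():
        if local_name(tr.tag) == "TR":
            cells = [td.text for td in tr if local_name(td.tag) == "TD"]
            rows.append(dict(zip(names, cells)))
    return rows


# Invented sample response, much smaller than a real VOTable.
SAMPLE = """<VOTABLE version="1.1">
 <RESOURCE><TABLE>
  <FIELD name="id" datatype="char" arraysize="*"/>
  <FIELD name="ra" datatype="double"/>
  <FIELD name="dec" datatype="double"/>
  <DATA><TABLEDATA>
   <TR><TD>obj-1</TD><TD>180.01</TD><TD>-0.52</TD></TR>
   <TR><TD>obj-2</TD><TD>180.03</TD><TD>-0.49</TD></TR>
  </TABLEDATA></DATA>
 </TABLE></RESOURCE>
</VOTABLE>"""

# Illustrative ADQL: core SQL plus a sky-region extension.
ADQL = ("SELECT TOP 10 ra, dec FROM example.objects "
        "WHERE 1 = CONTAINS(POINT('ICRS', ra, dec), "
        "CIRCLE('ICRS', 180.0, -0.5, 0.1))")

cone_url = cone_search_url(CONE_BASE, 180.0, -0.5, 0.1)
tap_url = tap_sync_url(TAP_BASE, ADQL)
rows = parse_votable(SAMPLE)
```

Cell values come back as strings here; a fuller client would convert them using each FIELD's `datatype`.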
6. VO Compute Services. Asynchronous: may not get an immediate answer, just a place to check back. Security: expensive resources, big requests, sequestered data; strong or weak or none. Scalable: graduated path to powerful computation and big data. Cloud store: VOSpace, sharable.
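
The asynchronous pattern ("just get a place to check back") can be sketched with a toy in-memory service. Everything here is hypothetical: `ToyJobService`, its methods, and the way each poll advances the job by one phase. The phase names are borrowed from the IVOA Universal Worker Service pattern, which the slide does not name, so treat that mapping as an assumption.

```python
import itertools


class ToyJobService:
    """In-memory stand-in for an asynchronous compute service."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def submit(self, numbers):
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = {
            # Toy behaviour: each poll moves the job forward one phase.
            "pending": iter(["QUEUED", "EXECUTING", "COMPLETED"]),
            "phase": "PENDING",
            "result": sum(numbers),
        }
        return job_id  # the "place to check back"

    def phase(self, job_id):
        job = self._jobs[job_id]
        job["phase"] = next(job["pending"], job["phase"])
        return job["phase"]

    def result(self, job_id):
        job = self._jobs[job_id]
        if job["phase"] != "COMPLETED":
            raise RuntimeError("job has not finished")
        return job["result"]


def wait_for(service, job_id, max_polls=10):
    """Poll until the job reaches a terminal phase; return the phases seen."""
    seen = []
    for _ in range(max_polls):
        phase = service.phase(job_id)
        seen.append(phase)
        if phase in ("COMPLETED", "ERROR", "ABORTED"):
            return seen
        # A real client would sleep between polls here.
    raise TimeoutError(f"{job_id} did not finish in {max_polls} polls")


service = ToyJobService()
job_id = service.submit([1, 2, 3])
phases = wait_for(service, job_id)
answer = service.result(job_id)
```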
7. VO Registry: publish -- find -- bind. Registry Metadata: descriptions of data collections, data delivery services, organizations, etc. Based on Dublin Core with astronomy-specific extensions. Represented as XML schema; extensible. Contents are stored in Resource Registries, which exchange metadata records through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
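
A rough sketch of how one registry might harvest records from another over OAI-PMH. The base URL, the sample response, and the identifier inside it are invented for illustration. The `ListRecords` verb, the `metadataPrefix`, `from`, and `resumptionToken` arguments, and the `oai_dc` prefix come from the OAI-PMH protocol itself. No network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base, metadata_prefix="oai_dc", from_date=None, token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if token is not None:
        # A resumption token is sent on its own, without the other arguments.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            params["from"] = from_date
    return base + "?" + urlencode(params)


def harvested_identifiers(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Invented, heavily trimmed sample response.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
 <ListRecords>
  <record><header>
   <identifier>ivo://example.authority/resource-1</identifier>
  </header></record>
  <resumptionToken>page-2</resumptionToken>
 </ListRecords>
</OAI-PMH>"""

first_url = list_records_url("https://example.org/oai", from_date="2010-01-01")
ids, token = harvested_identifiers(SAMPLE)
next_url = list_records_url("https://example.org/oai", token=token)
```

The resumption token is what lets a harvester page through a large registry in several requests.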
9. Semantics & Search. Identifiers: ivo://nasa.gsfc.gcn/SWIFT#BAT_GRB_Pos_374875-722. Free tags: beard, Fred, pudding. Controlled Vocab (UCD): phot.flux;em.ir. Controlled Vocab interop (SKOS). Ontology: Greek isA Man, Socrates isA Greek, so Socrates isA Man. Data Models: “Each sky position will have a circular positional error estimate ...”. Text markup: “Outflows from <object>NGC 666</object> are irregular ...”. Schema: Columns are Magnitude, Position, Identifier, ... Metadata (registry) forms: Full Registry: true; ManagedAuthorities: authority, nasa.heasarc. Formal service description.
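
Three of the items on this slide are structured enough to take apart in a few lines. The sketch below splits the identifier and the UCD string shown above and follows isA links transitively. The function names and the returned dictionary keys are hypothetical, chosen only for this illustration.

```python
from urllib.parse import urlsplit


def parse_ivo_id(identifier):
    """Split an ivo:// identifier into authority, resource key, and fragment."""
    parts = urlsplit(identifier)
    if parts.scheme != "ivo":
        raise ValueError(f"not an ivo:// identifier: {identifier}")
    return {
        "authority": parts.netloc,
        "resource_key": parts.path.lstrip("/"),
        "fragment": parts.fragment,
    }


def parse_ucd(ucd):
    """Split a UCD into words (separated by ';') and atoms (separated by '.')."""
    return [word.strip().split(".") for word in ucd.split(";")]


def is_a(facts, thing, kind):
    """Follow (child, parent) isA facts transitively from thing towards kind."""
    seen = set()
    stack = [thing]
    while stack:
        current = stack.pop()
        for child, parent in facts:
            if child == current and parent not in seen:
                if parent == kind:
                    return True
                seen.add(parent)
                stack.append(parent)
    return False


ident = parse_ivo_id("ivo://nasa.gsfc.gcn/SWIFT#BAT_GRB_Pos_374875-722")
ucd = parse_ucd("phot.flux;em.ir")
facts = [("Greek", "Man"), ("Socrates", "Greek")]
socrates_is_man = is_a(facts, "Socrates", "Man")
```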
16. Skyalert: push-based workflow; can be cyclic; portfolio aggregation by citation; annotation as software components; stream owner builds template. Django, Python, jQuery; now 4 developers via SVN.
18. Roles (skyalert.org). human or robot 1. browse: query, human computing, WWT/Google. human or robot 2. subscribe. human or robot 3. author. 4. annotate: contrib software components; archive, mining. Diagram labels: push, inject; web, portfolios, db; IM/tweet/email/TCP; triggers, actions.
19. skyalert.org: cyclic workflow graph. Trigger: CRTS["Geometry"]["Moon angle"] > 30 and SDSS["Photoprimary"]["g-magnitude"] < 18. Action (annotator, followup request): dynamically loads module; run(triggerEvent, portfolio): <business logic>; can build event and inject recursively; send message. Alerts and event cascade.
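
A minimal sketch of the trigger and action on this slide, assuming a portfolio is a nested dictionary keyed by stream name. The trigger expression and the `run(triggerEvent, portfolio)` signature come from the slide. The event layout, the `handle` helper, and the `followup-request` stream name are assumptions made for illustration, not Skyalert's actual code.

```python
def trigger(portfolio):
    """The slide's trigger: moon far enough away and the SDSS source bright enough."""
    try:
        moon_angle = portfolio["CRTS"]["Geometry"]["Moon angle"]
        g_mag = portfolio["SDSS"]["Photoprimary"]["g-magnitude"]
    except KeyError:
        # A portfolio missing either stream cannot satisfy the trigger.
        return False
    return moon_angle > 30 and g_mag < 18


def run(triggerEvent, portfolio):
    """Hypothetical business logic: build a follow-up request citing the event."""
    return {
        "stream": "followup-request",
        "cites": triggerEvent["id"],
        "g-magnitude": portfolio["SDSS"]["Photoprimary"]["g-magnitude"],
    }


def handle(event, portfolio, outbox):
    """Evaluate the trigger; if it fires, run the action and queue its event."""
    if trigger(portfolio):
        # Re-injecting this new event is what allows a cascade of alerts.
        outbox.append(run(event, portfolio))


event = {"id": "crts-0001"}  # invented identifier
portfolio = {
    "CRTS": {"Geometry": {"Moon angle": 45.0}},
    "SDSS": {"Photoprimary": {"g-magnitude": 17.2}},
}
outbox = []
handle(event, portfolio, outbox)

too_close_to_moon = {
    "CRTS": {"Geometry": {"Moon angle": 20.0}},
    "SDSS": {"Photoprimary": {"g-magnitude": 17.2}},
}
quiet_outbox = []
handle(event, too_close_to_moon, quiet_outbox)
```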
35. Pan-STARRS PS1 compute. User facing: SQL/CasJobs workbench; privacy/share; stored queries. Data valet: load/validate, merge, crawl, replicate, log, workflow. Workflow data: head/slice; hot/warm/cold. Fault tolerance: multiple replication, fault workflow. Cost and energy carefully considered. Future: Hadoop/MapReduce.
36. Cloud Supercomputing? TeraGrid/Globus vs Cloud/Amazon MI. Both are ways to get wholesale computing. Both provide IaaS, Infrastructure as a Service. Virtual Machine more popular than CTSS stack. What about parallelism? I/O speed? GPUs? etc. Watch 3Leaf and ScaleMP for these.
37. Science and Web 2.0. Easy for groups to form and collaborate. Integrates with user workspace: iGoogle and OpenSocial, alongside other aspects of their lives. Use existing tools: SlideShare, blogs, Google gadgets, Facebook, Gwave, Flickr, YouTube. Sharing workspace. Electronic log. Provenance. Virtual Data as “equivalent script”.
38. Science and Web 2.0. Server delivers only code; browser makes presentation. Ajax and Ajaj and HTTP “long poll”. jQuery and Google toolkit; see WWT and GSky in Skyalert. “Everything is a wiki”, or a wave? Visible/editable by group/s.
45. Arroyo Gateway Architecture. 1. User uses HTML/JS from webserver to create job definition. 2. Daemon is polling & sees new job, makes local space for it. 3. Start job on compute resource & update job status. 4. Fetch & update status of running job. Repeat. 5. Output to remote space. 6. Daemon copies output from remote to local, updates job status. 7. User fetches results from webserver. Components: webserver (Django, MySQL: job definitions and status); daemon; wholesale computing; local space for results (retail); remote space for results (wholesale). RW and J. Bunn
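
The daemon's polling loop can be sketched with in-memory dictionaries standing in for the job table, the local space, and the remote space. This is an illustration under stated assumptions, not the Arroyo implementation: `ToyGateway`, its status strings, and the instant-finish compute function are all hypothetical.

```python
class ToyGateway:
    """In-memory sketch of the webserver + daemon + compute resource."""

    def __init__(self, compute):
        self.jobs = {}          # stands in for the job definitions and status table
        self.local_space = {}   # results the webserver can hand to the user
        self.remote_space = {}  # results written on the compute resource
        self.compute = compute  # toy stand-in for the wholesale compute resource

    def create_job(self, job_id, definition):
        # The user creates a job definition through the web front end.
        self.jobs[job_id] = {"definition": definition, "status": "new"}

    def poll_once(self):
        """One pass of the daemon over all jobs, advancing each by one stage."""
        for job_id, job in self.jobs.items():
            if job["status"] == "new":
                # Daemon sees the new job, makes local space, and starts it.
                self.local_space[job_id] = None
                job["status"] = "running"
            elif job["status"] == "running":
                # Toy behaviour: the job finishes at once and writes remotely.
                self.remote_space[job_id] = self.compute(job["definition"])
                job["status"] = "output-ready"
            elif job["status"] == "output-ready":
                # Daemon copies output from remote to local and updates status.
                self.local_space[job_id] = self.remote_space[job_id]
                job["status"] = "done"

    def fetch_results(self, job_id):
        # The user fetches results from the webserver once the job is done.
        if self.jobs[job_id]["status"] != "done":
            return None
        return self.local_space[job_id]


gateway = ToyGateway(compute=lambda definition: sum(definition["numbers"]))
gateway.create_job("job-1", {"numbers": [2, 3, 5]})
early = gateway.fetch_results("job-1")
for _ in range(3):
    gateway.poll_once()
final = gateway.fetch_results("job-1")
```

Because the daemon pulls work by polling, the webserver never has to reach into the compute resource directly.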
55. Human Volunteers. Science Layer: describe what you see in image; each person has level of expertise; how to use results most effectively; Galaxyzoo.org, citizensky.org good models. Game Layer: makes people come back; Top 10 ranking etc; anonymous partner a la gwap.com.
56. Human Volunteer Evidence. Donalek et al, arXiv:0810.4945 [astro-ph]. 4 of 10 say artifact.
60. Classic Machine Learning: Metric in “Feature Space”. Relevance Vector Machine (Tipping). Feature Vectors. Learning from Training set. Picking relevant lessons. RW and J. Beck
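
To make "metric in feature space" and "learning from training set" concrete, here is a deliberately simple nearest-neighbour classifier. It is not the Relevance Vector Machine named on the slide; it is a much simpler stand-in that shares only the idea of comparing feature vectors with a distance. The feature values and the labels are invented.

```python
import math


def distance(a, b):
    """Euclidean metric between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(training, features):
    """Return the label of the training example closest to `features`."""
    best_label, _ = min(
        ((label, distance(vector, features)) for vector, label in training),
        key=lambda pair: pair[1],
    )
    return best_label


# Invented two-dimensional training set: (feature vector, label).
training = [
    ((0.0, 0.0), "artifact"),
    ((0.1, 0.2), "artifact"),
    ((1.0, 1.0), "transient"),
    ((0.9, 1.1), "transient"),
]

label = classify(training, (0.9, 0.8))
```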
68. User Interface (wrong). Steps, in order: Register; wait for account; read about XML; read about web services; learn to use VO Registry; translate VOTable format; ask for help; finally get some help; and now do some science....
69. User interface (right). In Darwinian evolution every small change must give benefit. Steps, in order: Anonymous; Web form; some science....; Register; more science....; Run bigger job; hey this is interesting ....; Learn the VO structure; Power user. Be careful with complex authentication!
70. Steering the Ship. Short term, Pragmatism: useful tools now; simple protocols (eg cone search); “just use RA and Dec”. vs Long term, Architecture: modular suite of interoperable tools; sophisticated protocols (eg SkyNode); sophisticated Space-Time coordinates.
86. What is a Data Center? Machines; services; doesn’t matter where or how; testing, testing, testing; do we have enough power and HVAC?
87. Complex science, complex machines. Separate science user from complexity. Must have domain science context. Making simple things simple, but power to scale up. Drill-down if wanted. Machines are not the objective: science through data, compute, sharing.
88. eScience is for People, right? Getting Started; Help Desk; Forum; Documentation; Knowledge Base; Calendar; Contact Us; Social Media; Blog/newsfeed; Campus Champions; Summer Schools; Advanced Support for Developers; Education