A Taste Of InfoGrid
1. A Taste of InfoGrid™
This presentation contains 6 slides (with build-ups).
Please listen to audio as you go through the slides, or read through the notes.
July 2009
infogrid.org
2. How Do You Build Your Web Applications?
Ruby on Rails
J2EE
Python
PHP
.NET …
6. RDBMS-Centric Architecture
[Diagram: Browser (x9), Application Server (x3), RDBMS]
Major problems:
1. RDBMS joins don’t scale.
2. Tables are not web native at all.
3. Most of the data important to applications lives elsewhere.
13. High-Level InfoGrid Features
Problem: RDBMS joins don’t scale
• (Blob) Store abstraction w/ a variety of implementations (e.g. files, MySQL, Hadoop, S3…) and automatic mapping
• Conceptual model/schema w/ run-time enforcement
• Graph traversal instead of joins
Problem: Tables are not web native at all
• All data objects automatically have a URL (REST-ful); multiple output representations for each URL
• Objects can change type(s) at run-time
• Cached in memory until not needed any more
Problem: Most of the data important to applications lives elsewhere
• XPRISO and Probe framework automatically make external data appear as local/native
• Automatic/adaptive, incremental updates w/ events
• Library of Probes
Plus: GUI template framework, complex traversals and events, ability to distribute P2P-style, modules, identity awareness and much more…
15. Why InfoGrid Matters (We Think)
Makes a lot of development work unnecessary
➡ faster development
➡ higher application quality
➡ higher deployment flexibility
➡ lower development cost
No more:
Database sharding in the application layer, O/R mapping, custom data import, code for event detection and generation, spaghetti code mixing data access/storage/refresh/GUI, total rewrites for S3/Hadoop/...
16. This concludes: A Taste of InfoGrid™
For more information:
infogrid.org
Editor's notes
How can InfoGrid be explained in 6 slides? Well, we will try anyway.
And if you’d like more than a taste, go get seconds at infogrid.org.
There are many ways in which developers build web applications today. Some are shown on this slide.
I’m sure you know some developers who are almost religiously convinced that their language of choice is vastly better than some other languages on this chart.
But the truth is that all of these approaches are much closer to each other than it might look, because they all use the same architecture.
This architecture could be called a relational database-centric architecture.
The RDBMS-centric architecture assumes that there is a single relational database for the application. This database is accessed by one or more application servers written in the languages we discussed, and ultimately serves many concurrent users with web browsers.
The RDBMS-centric architecture was well established long before the web, and has been essentially the same for the past 40 years. But in the age of Web 2.0, it is showing its strains:
Ask any developer of a high-volume website, and they will tell you that joins in relational databases don’t scale, and that they have to spend substantial amounts of precious developer time and budget to avoid them. It’s difficult to avoid joins because they are the very reason relational databases were invented in the first place! Some sites go as far as eBay, which has banned the use of joins in some of its applications. Imagine! There is not much reason left for using a relational database if we aren’t allowed to do joins.
Secondly, even if we don’t do joins, relational database tables are just a very bad way of storing information for the web. The web is all about distributed URLs at which data can be found, about hyperlinks, about rich media types and so forth. If you set out to design a way of storing information that is as mismatched as possible for the requirements of the web, it’s hard to come up with a worse idea than tables. Object-relational mapping, anybody? And why, again, do developers have to deal with it? Going forward, we need to do better than that, and InfoGrid does.
The last, and perhaps most important and under-appreciated point about relational databases and the web, is that more and more of today’s web applications don’t actually own much of their own data. Before the web, application developers could assume that they could simply put all the data in a database, and that was that.
But today, with RSS feeds, web services, social networking, internet identity like OpenID, outsourcing relationships, dashboards and the like, much of the data in our applications comes from elsewhere and is managed elsewhere. Often the maintainers of that data don’t even know that our application exists. Nevertheless that data must be accessed by our application, processed, cached, updated, integrated and reconciled with data from other sources, and much more; the dirty secret is that an RDBMS-centric architecture cannot help us with that at all.
These are some of the reasons why we designed the InfoGrid platform. Applications built on InfoGrid don’t run into these problems.
In InfoGrid, data is stored not in rows or columns as in a relational database, but as it is on the web: URLs identify serialized chunks of data. They are accessed through the Store abstraction, which you can think of like a gigantic hash table. InfoGrid allows data to be stored in a variety of ways, such as in a file system, in a relational database, or on a grid like Hadoop or S3. Best of all, application code usually does not depend on how data is stored, so developers can write the same application, and decide only after the fact where and how data gets stored.
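To make the "gigantic hash table" picture concrete, here is a minimal Python sketch. It is not InfoGrid's actual API: the names `Store`, `MemoryStore`, `FileStore`, `save_customer` and `load_customer` are our own assumptions for illustration. What it shows is the idea from the paragraph above: application code talks to one small put/get interface keyed by URL, and the storage backend can be swapped without touching that code.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from urllib.parse import quote


class Store(ABC):
    """Hypothetical interface: maps an identifier (a URL) to serialized bytes."""

    @abstractmethod
    def put(self, key, data): ...

    @abstractmethod
    def get(self, key): ...


class MemoryStore(Store):
    """Keeps everything in a plain dictionary."""

    def __init__(self):
        self._items = {}

    def put(self, key, data):
        self._items[key] = data

    def get(self, key):
        return self._items[key]


class FileStore(Store):
    """Keeps one file per key inside a directory."""

    def __init__(self, directory):
        self._dir = Path(directory)

    def _path(self, key):
        # Percent-encode the URL so it is usable as a file name.
        return self._dir / quote(key, safe="")

    def put(self, key, data):
        self._path(key).write_bytes(data)

    def get(self, key):
        return self._path(key).read_bytes()


# Application code depends only on the Store interface, not on a backend.
def save_customer(store, url, name):
    store.put(url, json.dumps({"name": name}).encode("utf-8"))


def load_customer(store, url):
    return json.loads(store.get(url).decode("utf-8"))


if __name__ == "__main__":
    url = "http://example.org/customers/1"
    with tempfile.TemporaryDirectory() as directory:
        for store in (MemoryStore(), FileStore(directory)):
            save_customer(store, url, "Alice")
            print(type(store).__name__, load_customer(store, url))
```

The same two application functions run unchanged against both backends, which is the "decide only after the fact where data gets stored" property in miniature.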
What the developer sees is objects that are instances of a conceptual model or schema. These instances may be typed and may be related to each other, so together they form a graph with nodes and edges. If you are familiar with conceptual information modeling or object modeling like the UML, or semantic modeling with OWL or the semantic web, you will feel right at home. The model or schema of any InfoGrid application is automatically enforced by InfoGrid at run-time.
For example, if you defined a type called Customer for your InfoGrid application, and one called Order, and a relationship that expresses that Customers Place Orders, InfoGrid makes sure that only customers can place orders. If you wanted somebody else to also be able to place orders, you would create another allowed relationship. If you said that a customer can be in exactly one of three states (good standing, late or delinquent), that’s all they can be unless you change the model. You get the picture.
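The Customer/Order example can be sketched in a few lines of Python. This is an illustration of run-time model enforcement in general, not InfoGrid's modeling language or API; `Node`, `SchemaViolation`, `ALLOWED_RELATIONSHIPS` and `ALLOWED_VALUES` are hypothetical names we made up for the sketch.

```python
class SchemaViolation(Exception):
    """Raised when an operation is not permitted by the model."""


# Hypothetical model: which (source type, relationship, target type) are allowed.
ALLOWED_RELATIONSHIPS = {("Customer", "Places", "Order")}

# Hypothetical model: allowed values for a given (type, property).
ALLOWED_VALUES = {
    ("Customer", "standing"): {"good standing", "late", "delinquent"},
}


class Node:
    def __init__(self, type_name):
        self.type_name = type_name
        self.properties = {}
        self.edges = []  # list of (relationship name, target Node)

    def set_property(self, name, value):
        allowed = ALLOWED_VALUES.get((self.type_name, name))
        if allowed is not None and value not in allowed:
            raise SchemaViolation(f"{value!r} is not allowed for {self.type_name}.{name}")
        self.properties[name] = value

    def relate(self, relationship, target):
        triple = (self.type_name, relationship, target.type_name)
        if triple not in ALLOWED_RELATIONSHIPS:
            raise SchemaViolation(f"{triple} is not in the model")
        self.edges.append((relationship, target))


if __name__ == "__main__":
    customer, order, supplier = Node("Customer"), Node("Order"), Node("Supplier")
    customer.relate("Places", order)           # allowed by the model
    customer.set_property("standing", "late")  # allowed by the model
    try:
        supplier.relate("Places", order)       # not in the model
    except SchemaViolation as error:
        print("rejected:", error)
```

Letting suppliers place orders would mean adding one more triple to the model, not changing application code.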
Instead of the infamous joins of the relational database, InfoGrid applications traverse that node-edge graph of objects. This is the programmatic equivalent of the web, where we navigate by hyperlinks instead of by SELECT * FROM WEB.
It’s important to understand that because of this, the performance characteristics of InfoGrid applications are different from those of traditional web applications. Many InfoGrid graph traversal operations are exceedingly fast, and can be parallelized much more effectively for high-volume applications than they could with a relational model. However, some operations are slower, such as, say, ranking all customers by sales volume. That tradeoff is worth keeping in mind.
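A small Python sketch may help illustrate that tradeoff. It is a toy in-memory graph, not InfoGrid's traversal API; `Graph`, `sales_volume` and `rank_customers_by_sales` are names we assume for illustration. Following edges from one node only touches that node's edges, whereas a global ranking has to visit every customer.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory node-edge graph; all names here are illustrative."""

    def __init__(self):
        # node id -> list of (relationship name, neighbor id)
        self._edges = defaultdict(list)

    def relate(self, source, relationship, target):
        self._edges[source].append((relationship, target))

    def traverse(self, start, relationship):
        # Local operation: looks only at the edges leaving one node.
        return [target for rel, target in self._edges.get(start, []) if rel == relationship]


def sales_volume(graph, customer, order_amounts):
    # One traversal from the customer to the orders it placed.
    return sum(order_amounts[order] for order in graph.traverse(customer, "Places"))


def rank_customers_by_sales(graph, customers, order_amounts):
    # Global operation: has to visit every customer before it can sort them.
    return sorted(
        customers,
        key=lambda customer: sales_volume(graph, customer, order_amounts),
        reverse=True,
    )


if __name__ == "__main__":
    graph = Graph()
    graph.relate("alice", "Places", "order-1")
    graph.relate("alice", "Places", "order-2")
    graph.relate("bob", "Places", "order-3")
    amounts = {"order-1": 10, "order-2": 5, "order-3": 40}
    print(graph.traverse("alice", "Places"))
    print(rank_customers_by_sales(graph, ["alice", "bob"], amounts))
```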
Instead of residing in tables with keys, all objects in InfoGrid automatically have a URL. That makes all InfoGrid applications automatically REST-ful, which means that all information in an InfoGrid application can be easily bookmarked in web browsers, tagged on services such as Delicious, e-mailed, texted and twittered, and so forth. No additional effort is required from the application developer. RDBMS-centric applications are often very bad at that, to the utmost frustration of their customers.
Going slightly beyond REST, InfoGrid supports multiple output representations per URL. For example, all objects can be edited with a PropertySheet that comes with InfoGrid. Application developers can add as many representations as they like, whether using different MIME types like HTML, XML or JSON, or different formatting within the same type like an overview, or detailed form. One of their customers doesn’t like the way some object is formatted? Give them their own version.
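Here is one way to picture "multiple output representations per URL" in Python. It is only a sketch under our own assumptions: `REPRESENTATIONS`, `represent` and the renderer functions are hypothetical names, and InfoGrid's real mechanism may look quite different. The point is that one object at one URL can be rendered by MIME type and, within a MIME type, by a variant such as overview or detail.

```python
import json
from html import escape


def as_json(obj):
    return json.dumps(obj, sort_keys=True)


def as_html_detail(obj):
    rows = "".join(
        f"<dt>{escape(str(key))}</dt><dd>{escape(str(value))}</dd>"
        for key, value in sorted(obj.items())
    )
    return f"<dl>{rows}</dl>"


def as_html_overview(obj):
    return f"<p>{escape(str(obj['name']))}</p>"


# Hypothetical registry: (MIME type, variant) -> rendering function.
REPRESENTATIONS = {
    ("application/json", "detail"): as_json,
    ("text/html", "detail"): as_html_detail,
    ("text/html", "overview"): as_html_overview,
}


def represent(objects, url, mime_type, variant="detail"):
    # Same URL, same object; only the chosen representation differs.
    render = REPRESENTATIONS[(mime_type, variant)]
    return render(objects[url])


if __name__ == "__main__":
    objects = {"http://example.org/customers/1": {"name": "Alice", "standing": "late"}}
    url = "http://example.org/customers/1"
    print(represent(objects, url, "application/json"))
    print(represent(objects, url, "text/html"))
    print(represent(objects, url, "text/html", variant="overview"))
```

Giving one customer "their own version" would then amount to registering one more entry in the table.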
Unlike most mainstream programming frameworks, objects in InfoGrid can change their types at run-time. Any object can carry several types at the same time, too, and change them at will.
Again, this is a very web-centric view of the world: given a URL, over some period of time, all kinds of data might be served at that same URL. Conceptually, in InfoGrid, a single object resides at that URL, whose types may change over time.
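As a rough illustration, an object whose types can change while its URL stays the same might be modeled like this in Python. `TypedObject`, `add_type`, `remove_type` and `is_a` are hypothetical names for this sketch and should not be read as InfoGrid's API.

```python
class TypedObject:
    """One object at one URL that can carry several types and change them over time."""

    def __init__(self, url):
        self.url = url
        self._types = set()

    def add_type(self, type_name):
        self._types.add(type_name)

    def remove_type(self, type_name):
        self._types.discard(type_name)

    def is_a(self, type_name):
        return type_name in self._types

    @property
    def types(self):
        # Read-only view, so callers cannot bypass add_type/remove_type.
        return frozenset(self._types)


if __name__ == "__main__":
    person = TypedObject("http://example.org/people/1")
    person.add_type("Prospect")
    person.add_type("NewsletterSubscriber")  # several types at the same time
    person.remove_type("Prospect")
    person.add_type("Customer")              # same URL, different type
    print(person.url, sorted(person.types))
```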
You probably already guessed that InfoGrid caches objects in memory for efficiency reasons, and again without the developer having to do anything special about it.
Finally, InfoGrid contains two technologies called XPRISO and the Probe Framework that were created to deal with the problem that so much of an application’s data is often owned by someone other than the application. These technologies enable data external to InfoGrid to appear in the same address space as all other objects, as if it were local. For example, if an InfoGrid application accesses an RSS feed, the items in that RSS feed automatically show up and act just like any other InfoGrid objects to the application developer.
You might say: many people have done that. True. But what they usually have not done is to keep the local copy in sync with the external copy after the initial import. When the external information changes, InfoGrid detects that and incrementally applies the detected changes on the local objects, just as if those changes had been made locally. Application developers don’t need to write code to figure out what to do when external information changes, which is a major expense and quality issue in many applications built on a traditional architecture.
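The "detect changes, then apply them incrementally" idea can be sketched in Python as follows. This is a generic snapshot diff, not the XPRISO protocol or the Probe Framework; `detect_changes` and `apply_changes` are names we assume for illustration, and the feed items are made-up sample data.

```python
def detect_changes(local, external):
    """Compare the local copy with a fresh external snapshot and list the differences."""
    events = []
    for key in sorted(local.keys() | external.keys()):
        if key not in local:
            events.append(("added", key, external[key]))
        elif key not in external:
            events.append(("removed", key, None))
        elif local[key] != external[key]:
            events.append(("changed", key, external[key]))
    return events


def apply_changes(local, events):
    """Apply only the detected differences, as if they had been made locally."""
    for kind, key, value in events:
        if kind == "removed":
            del local[key]
        else:
            local[key] = value


if __name__ == "__main__":
    # Hypothetical local copy of an RSS feed and a newer external snapshot.
    local_items = {"item-1": "Hello", "item-2": "Old title"}
    external_items = {"item-2": "New title", "item-3": "Fresh"}
    events = detect_changes(local_items, external_items)
    for event in events:
        print(event)  # an application could react to each event here
    apply_changes(local_items, events)
    print(local_items == external_items)
```

Because the result is a list of events rather than a wholesale replacement, application code can subscribe to "added", "changed" and "removed" instead of re-importing everything.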
Naturally, InfoGrid developers have a great amount of freedom in defining when updates occur. Timing is very configurable, just like everything in InfoGrid. And before you ask, yes, information can also be automatically written back to the external data source.
So any kind of external information can be treated as if it were a graph of local objects in InfoGrid. InfoGrid provides a library of Probes for common formats, but it is straightforward to build new Probes. We should mention that InfoGrid also caches all external information, so an InfoGrid application doesn’t need to go down just because some web service invocation failed, which is a scourge and major expense for applications built on less powerful foundations.
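The caching behavior described above can be pictured with a short Python sketch. It is a generic "serve the last good copy when the fetch fails" wrapper under our own assumptions, not InfoGrid code; `CachedSource` and the `fetch` callable are hypothetical.

```python
class CachedSource:
    """Wraps a fetch function and falls back to the last good copy on failure."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable(url) -> data; may raise OSError
        self._last_good = {}     # url -> most recent successfully fetched data

    def get(self, url):
        try:
            data = self._fetch(url)
        except OSError:
            if url not in self._last_good:
                raise            # nothing cached yet, so the failure is real
            return self._last_good[url]
        self._last_good[url] = data
        return data


if __name__ == "__main__":
    state = {"down": False}

    def fetch(url):
        # Stand-in for a web service call; ConnectionError is a subclass of OSError.
        if state["down"]:
            raise ConnectionError("service unavailable")
        return "feed contents"

    source = CachedSource(fetch)
    print(source.get("http://example.org/feed"))  # fetched from the service
    state["down"] = True
    print(source.get("http://example.org/feed"))  # served from the cache
```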