*** Developed in the early 1970s by IBM
*** Computers were large, centralised things
*** According to IBM's 1974 paper in Communications of the ACM, SQL was designed both for programmers and to make databases accessible to "accountants, engineers, architects, and urban planners"[0]
**** Ha ha ha!
*** However, it is lovely to be able to throw together a one-line query that does some quite complex operations.
*** Once upon a time, programmers processed files directly
*** Then there were simple libraries of file-handling routines
*** Then there were some early databases, all with their different interfaces
*** Then SQL emerged as a standard; it was a good fit for the problems of the time, it was nicer to use, and before long there were many implementations
**** Although compatibility was poor, it was a matter of adjusting for dialects rather than re-learning the whole thing
*** But NoSQL lingered in a few places
**** Embedded systems didn't want the overhead of SQL, so DBM/GDBM/BDB etc. were popular there
**** SQL engines were work to set up (until SQLite appeared, anyway), so apps often had to use something simple for their internal storage, unless they were 'enterprise apps'
**** Sometimes it was just history (the UNIX password database)
**** Sometimes more specialised interfaces were made for specific purposes, too (LDAP comes to mind)
*** The two often go hand in hand
*** For a while, we just put in bigger, better SQL servers, hot backups, and the like
*** Rarely-modified data was easily manually replicated to a fault-tolerant pool of otherwise read-only servers
*** But Web sites were migrating towards more interactive models, creating pressure for databases to handle updates while scaling and staying highly available
*** Replication without a central point of failure (e.g., a master server) is hard in SQL due to query ordering issues
**** UPDATE table SET foo = foo + 1 WHERE pk = 1;
**** UPDATE table SET foo = foo * 2 WHERE pk = 1;
**** ...order matters, so queries need to be processed on all servers in the same order
**** Global ordering requires global synchronisation, which leads to bottlenecks, harming scaling and availability
*** But there was some slumbering frustration with SQL, too; it can be limiting
**** SQL is a declarative language: one specifies what one wants, not how to get it
**** The SQL server has to come up with an efficient query plan
**** Sometimes it fails, and performance is disappointing
**** Then you need to learn how to second-guess it and write queries designed to trick the server into producing the query plan you wanted all along
**** It would be much easier to just bypass it!
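The two UPDATEs above don't commute, which is the whole problem. A minimal sketch (Python standing in for two replicas; the starting value of 10 is an arbitrary assumption) shows replicas diverging when they apply the same pair of updates in different orders:

```python
# Sketch: why replica update order matters.
# Two replicas receive the same two updates but apply them
# in a different order, and end up with different values of foo.

def inc(foo):        # UPDATE table SET foo = foo + 1 WHERE pk = 1;
    return foo + 1

def double(foo):     # UPDATE table SET foo = foo * 2 WHERE pk = 1;
    return foo * 2

start = 10
replica_a = double(inc(start))   # applies +1 first, then *2  -> 22
replica_b = inc(double(start))   # applies *2 first, then +1  -> 21

print(replica_a, replica_b)      # 22 21 -- the replicas have diverged
```

Hence the need for every server to agree on a single global order, and hence the synchronisation bottleneck.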
*** Re-inventing the database for modern demands
*** Various ways of doing it, not all of which are mutually exclusive
*** Sometimes simpler data models, sometimes more complex
*** Generally, more flexible data models, as the Web introduces a desire for less painful live upgrades
*** Most importantly, the application makes fewer assumptions about the behaviour of the database
**** Eventual consistency: updates may 'take a while' - to allow for asynchronous replication
**** Perhaps updates might be duplicated, delayed, or re-ordered - to allow for operation in the face of network partitions
**** Generally, the relaxation of ACID properties
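One common way to survive duplicated, delayed, or re-ordered updates is to make the updates themselves commutative and idempotent. A minimal sketch (a grow-only counter, one of the simplest such structures; the class and replica names are illustrative, not from any particular NoSQL product):

```python
# Sketch: a grow-only counter whose merge operation is commutative
# and idempotent, so replicas converge on the same value even if
# replication messages are duplicated, delayed, or re-ordered.

class GCounter:
    def __init__(self):
        self.counts = {}              # per-replica increment totals

    def increment(self, replica_id, n=1):
        self.counts[replica_id] = self.counts.get(replica_id, 0) + n

    def merge(self, other):
        # Element-wise max: merging the same state twice, or merging
        # states in any order, always yields the same result.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self):
        return sum(self.counts.values())

a, b = GCounter(), GCounter()
a.increment("a", 3)
b.increment("b", 2)
a.merge(b); a.merge(b)        # duplicated delivery is harmless
b.merge(a)
print(a.value(), b.value())   # both replicas converge to 5
```

The trade-off is exactly the one on this slide: the application must live with counters that are temporarily stale, in exchange for replication that needs no global ordering.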