SaaS and Multi-Tenancy – Foundational Concepts
Version 1.0
Document Revision History
Author           Version    Date          Comments
Jeelani Shaik    1.0        06/20/2012    First version
1. Fundamentals
1.1 What is multi-tenancy?
Multi-tenancy is the ability to serve multiple tenants. It is a critical feature in a SaaS world, where an
application is deployed as a hosted service and accessed over the Internet by many tenants. A
well-designed SaaS application is scalable, multi-tenant-efficient and configurable. In software
development, the term multi-tenancy generally denotes an architecture in which a single running
instance of an application simultaneously serves multiple clients (tenants). Isolating the information
(data, branding and theming, customizations, etc.) that belongs to each tenant is a particular challenge in
these systems. The key component is the data owned by each tenant and stored in the database; this
data is referred to as multi-tenant data.
1.2 SaaS maturity levels
Writing custom code to tailor the end-user experience for each financial institute (FI) in the traditional
way brings a new set of challenges in a SaaS world: it is not economical, and it makes it harder to
bring the product to market on time. Instead of customizing the application the traditional way, the
preferred approach is to leverage metadata to configure how the product appears and behaves for its users.
The design and architecture challenge is to ensure that the task of configuring the product is simple and
easy for the customers, without incurring extra development and operation cost for each configuration.
The degree to which data and configuration are separated is described by the SaaS maturity levels below,
which are based on a “spectrum of isolation”.
1.2.1 Maturity Level 1 - Nothing Shared
In this architecture, each FI has its own customized version of the hosted instance deployed on the
host’s server.
1.2.2 Maturity Level 2 - Configurable
In this architecture, the SaaS provider hosts a separate instance for each FI (tenant). Whereas in
‘Nothing Shared’ each instance is individually customized for the FI, in ‘Configurable’ all instances use
the same code implementation, and the vendor meets each customer’s needs by providing detailed
configuration options that allow the customer to change how the application looks and behaves to its
users. Despite being identical to one another at the code level, each instance remains wholly isolated
from all the others.
1.2.3 Maturity Level 3 - Configurable, Multi-Tenant-Efficient
At the third level of maturity, the processor runs a single instance that serves every customer (financial
institute), with configurable metadata providing a unique user experience and feature set for each one.
Authorization and security policies ensure that each customer's data is kept separate from that of other
customers; and, from the end user's perspective, there is no indication that the application instance is
being shared among multiple tenants.
This approach eliminates the need to provide server space for as many instances as the processor has
customers (financial institutes), allowing for much more efficient use of computing resources than the
second level, which translates directly to lower costs. A significant disadvantage of this approach is that
the scalability of the application is limited. Unless partitioning is used to manage database performance,
the application can be scaled only by moving it to a more powerful server (scaling up), until diminishing
returns make it impossible to add more power cost-effectively.
1.2.4 Maturity Level 4 - Scalable, Configurable, Multi-Tenant-Efficient
At the fourth and final level of maturity, the processor hosts multiple customers (financial institutes) on
a load-balanced farm of identical instances, with each customer's data kept separate, and with
configurable metadata providing a unique user experience and feature set for each customer. Such a
system is scalable to an arbitrarily large number of customers, because the number of servers and
instances on the back end can be increased or decreased as necessary to match demand without
re-architecting the application, and changes or fixes can be rolled out to
thousands of tenants as easily as to a single tenant.
1.3 Choosing a Maturity Level
One might expect the fourth level to be the ultimate goal for any SaaS application, but this isn't always
the case. It may be more helpful to think of SaaS maturity as a continuum between isolated data and
code on one end, and shared data and code on the other (see the diagram below).
Where the SaaS architecture should fall along this continuum depends on the architectural, deployment and
operational needs of the SaaS provider (processor), as explained below. All of these
considerations are interrelated to some degree.
§ Business – The maturity level should be decided with the financial aspects taken into
consideration. The degree of isolation is governed by the customer’s SLA and may often conflict with the
benefits of a shared approach.
§ Architecture – The architecture should allow customization and extension through simple
configuration.
§ Regulations – Regulatory guidance, such as that from the FFIEC, might dictate the architecture and the
isolation of data.
§ Operations – The feasibility of satisfying SLAs without compromising on isolation, and the impact on
customers when scheduled maintenance is performed.
2. Multi-tenant Data
Multi-tenant data is critical in designing and developing SaaS services. There are three main approaches
to isolating information (data) in multi-tenant systems, which go hand-in-hand with different
database schema definitions and JDBC setups. Each approach has pros and cons as well as specific
techniques and considerations. The following diagram depicts these approaches along the spectrum from
isolated to shared. In addition to these three, there is a hybrid approach that combines the second and
third approaches.
2.1 Separate Database
Each tenant's data is kept in a physically separate database instance. Database (JDBC) Connections
would point specifically to each database, so any pooling would be per-tenant. A general application
approach here would be to define a database Connection pool per-tenant and to select the pool to use
based on the “tenant identifier” associated with the currently logged in user.
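The pool-per-tenant selection described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class name TenantDataSourceRouter and its methods are hypothetical, and it assumes each tenant's pool is exposed as a standard javax.sql.DataSource that has been created elsewhere (for example by whatever pooling library is in use).

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.sql.DataSource;

/** Hypothetical helper: routes each request to the pool owned by the caller's tenant. */
final class TenantDataSourceRouter {
    // One pooled DataSource per tenant identifier (assumed to be configured elsewhere).
    private final Map<String, DataSource> poolsByTenant = new ConcurrentHashMap<>();

    /** Registers the pool for a tenant, e.g. at tenant onboarding or application start-up. */
    void register(String tenantId, DataSource pool) {
        poolsByTenant.put(tenantId, pool);
    }

    /** Returns the tenant's pool and fails loudly for unknown tenants instead of falling back. */
    DataSource resolve(String tenantId) {
        DataSource pool = poolsByTenant.get(tenantId);
        if (pool == null) {
            throw new IllegalArgumentException("Unknown tenant: " + tenantId);
        }
        return pool;
    }

    /** Borrows a connection from the pool that belongs to the given tenant. */
    Connection connectionFor(String tenantId) throws SQLException {
        return resolve(tenantId).getConnection();
    }
}
```

In this sketch the tenant identifier would be taken from the currently logged in user's session. Rejecting unknown tenants, rather than falling back to a default pool, is a deliberate choice so that a routing mistake cannot expose another tenant's database.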
Computing resources and application code are generally shared between all the tenants on a server, but
each tenant has its own set of data that remains logically isolated from data that belongs to all other
tenants. Metadata associates each database with the correct tenant, and database security prevents any
tenant from accidentally or maliciously accessing other tenants' data.
Giving each tenant its own database makes it easy to extend the application's data model to meet
tenants' individual needs, and restoring a tenant's data from backups in the event of a failure is a
relatively simple procedure. Unfortunately, this approach tends to lead to higher costs for maintaining
equipment and backing up tenant data. Hardware costs are also higher than they are under alternative
approaches, as the number of tenants that can be housed on a given database server is limited by the
number of databases that the server can support.
Separating tenant data into individual databases is the "premium" approach. Its relatively high
hardware and maintenance requirements and costs make it appropriate for customers that are willing
to pay extra for added security and customization, since it provides very strong data isolation.
Such customers may not even consider an application that does not supply each tenant with its own
individual database.
2.2 Separate Schema
Like the separate-database approach, the separate-schema approach is relatively easy to implement, and
tenants can extend the data model just as easily. (Tables are created from a
standard default set, but once they are created they no longer need to conform to the default set, and
tenants may add or modify columns and even tables as desired.) This approach offers a moderate
degree of logical data isolation for security-conscious tenants, though not as much as a completely
isolated system would, and can support a larger number of tenants per database server.
A significant drawback of the separate-schema approach is that tenant data is harder to restore in the
event of a failure. If each tenant has its own database, restoring a single tenant's data means simply
restoring the database from the most recent backup. With a separate-schema application, restoring the entire
database would mean overwriting the data of every tenant on the same database with backup data,
regardless of whether each one has experienced any loss or not. Therefore, to restore a single
customer's data, the database administrator may have to restore the database to a temporary server
and then import the customer's tables into the production server, which is a complicated and potentially
time-consuming task.
The separate schema approach is appropriate for applications that use a relatively small number of
database tables, on the order of about 100 tables per tenant or fewer. This approach can typically
accommodate more tenants per server than the separate-database approach can, so you can offer the
application at a lower cost, as long as your customers will accept having their data co-located with that
of other tenants.
Each tenant's data is kept in a distinct database schema on a single database instance. There are two
different ways to define JDBC Connections here:
§ Connections could point specifically to each schema, as we saw with the separate database
approach. This is an option provided when the driver supports naming the default schema in the
connection URL or if the pooling mechanism supports naming a schema to use for its
Connections. Using this approach, we would have a distinct JDBC Connection pool per-tenant
where the pool to use would be selected based on the “tenant identifier” associated with the
currently logged in user.
§ Connections could point to the database itself (using some default schema), but each Connection
would be altered using the SQL SET SCHEMA (or similar) command. Using this approach, we
would have a single JDBC Connection pool to service all tenants, but before a
Connection is used it would be altered to reference the schema named by the “tenant
identifier” associated with the currently logged in user.
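The second option, a single shared pool whose Connections are switched to the tenant's schema before use, might look like the sketch below. It uses the standard JDBC 4.1 method Connection.setSchema, which plays the role of the SET SCHEMA command. The class name, the TENANT_ schema prefix and the identifier check are assumptions made for illustration only. Note that a driver which does not support schemas may silently ignore setSchema, so the behavior should be verified against the actual database.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.regex.Pattern;
import javax.sql.DataSource;

/** Hypothetical helper: one shared pool, with the schema switched per tenant before use. */
final class SchemaPerTenantConnections {
    // Only plain identifiers are accepted, so a tenant id can never inject SQL.
    private static final Pattern SAFE_IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private final DataSource sharedPool;

    SchemaPerTenantConnections(DataSource sharedPool) {
        this.sharedPool = sharedPool;
    }

    /** Maps a tenant identifier to its schema name (the TENANT_ prefix is an assumed convention). */
    static String schemaFor(String tenantId) {
        String schema = "TENANT_" + tenantId;
        if (!SAFE_IDENTIFIER.matcher(schema).matches()) {
            throw new IllegalArgumentException("Invalid tenant identifier: " + tenantId);
        }
        return schema;
    }

    /** Borrows a pooled connection and points it at the tenant's schema. */
    Connection connectionFor(String tenantId) throws SQLException {
        String schema = schemaFor(tenantId);
        Connection connection = sharedPool.getConnection();
        try {
            connection.setSchema(schema); // JDBC 4.1 counterpart of SET SCHEMA
            return connection;
        } catch (SQLException | RuntimeException e) {
            connection.close(); // never hand out a connection left on the wrong schema
            throw e;
        }
    }
}
```

Because pooled Connections are reused, the schema has to be set on every borrow; a Connection returned to the pool may still reference the previous tenant's schema.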
2.3 Partitioned (discriminator) Data
All data is kept in a single database schema. The data for each tenant is partitioned by a
partition value, or discriminator, known as the “tenant identifier”. The complexity of this discriminator
might range from a simple column value to a complex SQL formula. Again, this approach would use a single
Connection pool to service all tenants. However, the application needs to alter every
SQL statement sent to the database so that it references the “tenant identifier” discriminator. This
approach is also referred to as the shared-schema approach.
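One way to make sure every statement references the discriminator is to build all queries through a single helper that always adds the tenant predicate and binds it as a parameter. The sketch below assumes a simple column discriminator named TENANT_ID and an ACCOUNT table with an ACCOUNT_NUMBER column; these names and the class TenantScopedQueries are hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: every query it builds is filtered by the tenant discriminator. */
final class TenantScopedQueries {

    /** Builds a SELECT whose first bind parameter is always the tenant identifier. */
    static String scopedSelect(String columns, String table, String extraPredicate) {
        String sql = "SELECT " + columns + " FROM " + table + " WHERE TENANT_ID = ?";
        if (extraPredicate != null && !extraPredicate.trim().isEmpty()) {
            // Parentheses keep an OR in the extra predicate from escaping the tenant filter.
            sql += " AND (" + extraPredicate + ")";
        }
        return sql;
    }

    /** Reads the account numbers that belong to one tenant only. */
    static List<String> accountNumbers(Connection connection, long tenantId) throws SQLException {
        String sql = scopedSelect("ACCOUNT_NUMBER", "ACCOUNT", null);
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setLong(1, tenantId); // the discriminator is bound, never concatenated
            try (ResultSet rows = statement.executeQuery()) {
                List<String> result = new ArrayList<>();
                while (rows.next()) {
                    result.add(rows.getString(1));
                }
                return result;
            }
        }
    }
}
```

Frameworks can apply the same filter automatically; the hand-written version is shown only to make the mechanism visible.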
Of the three approaches explained here, the shared schema approach has the lowest hardware and
backup costs, because it allows you to serve the largest number of tenants per database server.
However, because multiple tenants share the same database tables, this approach may incur additional
development effort in the area of security, to ensure that tenants can never access other tenants' data,
even in the event of unexpected bugs or attacks.
The procedure for restoring data for a tenant is similar to that for the separate-schema approach, with the
additional complication that individual rows in the production database must be deleted and then
reinserted from the temporary database. If there are a very large number of rows in the affected tables,
this can cause performance to suffer noticeably for all the tenants. The shared-schema approach is
appropriate when it is important that the application be capable of serving a large number of tenants
with a small number of servers, and prospective customers are willing to surrender data isolation in
exchange for the lower costs that this approach makes possible.
Tenant Id    Account Number    Date
100          11111111          05-06-2012
225          2222222222        05-12-2012
456          33333333333       06-14-2012
2.4 Hybrid Approach
This is a hybrid of the separate-schema and partitioned-data approaches. In this design, all the common,
static and non-sensitive data is placed in a common schema, and all the tenant-specific, sensitive and
transactional data is placed in a tenant-specific schema.
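As a rough illustration of the hybrid layout, the sketch below decides which schema a table lives in: tables registered as common resolve to the shared schema, and everything else resolves to the tenant's own schema. The class name, the COMMON schema, the TENANT_ prefix and the table names in the usage note are all assumptions, not part of any particular product.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: qualifies table names for a common-plus-tenant schema layout. */
final class HybridTableResolver {
    private final String commonSchema;
    private final Set<String> commonTables;

    /** commonTables lists the static, non-sensitive tables shared by all tenants. */
    HybridTableResolver(String commonSchema, String... commonTables) {
        this.commonSchema = commonSchema;
        Set<String> names = new HashSet<>();
        for (String table : commonTables) {
            names.add(table.toUpperCase(Locale.ROOT));
        }
        this.commonTables = Collections.unmodifiableSet(names);
    }

    /** Returns SCHEMA.TABLE, using the tenant's schema unless the table is a common one. */
    String qualify(String table, String tenantId) {
        String name = table.toUpperCase(Locale.ROOT);
        String schema = commonTables.contains(name) ? commonSchema : "TENANT_" + tenantId;
        return schema + "." + name;
    }
}
```

For example, with new HybridTableResolver("COMMON", "COUNTRY", "CURRENCY"), the table COUNTRY resolves to the shared schema for every tenant, while ACCOUNT resolves to the schema of the tenant making the request.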
3. Choosing an Approach
Each of the three approaches described above offers its own set of benefits and tradeoffs that make it
an appropriate model to follow in some cases and not in others, as determined by a number of business
and technical considerations. Some of these considerations are listed below.
3.1 Economic Considerations
Applications optimized for a shared approach tend to require a larger development effort than
applications designed using a more isolated approach (because of the relative complexity of developing
a shared architecture), resulting in higher initial costs. Because they can support more tenants per
server, however, their ongoing operational costs tend to be lower.
The diagram below shows the shift in priorities from On-premises to SaaS.
The shared schema approach can end up saving you money over the long run, but it does require a
larger initial development effort before it can start producing revenue. If you are unable to fund a
development effort of the size necessary to build a shared schema application, or if you need to bring
your application to market more quickly than a large-scale development effort would allow, you may
have to consider a more isolated approach.
3.2 Security Considerations
As your application will store sensitive tenant data, prospective customers will have high expectations
about security and your service level agreements (SLAs) will need to provide strong data safety
guarantees. A common misconception holds that only physical isolation can provide an appropriate level
of security. In fact, data stored using a shared approach can also provide strong data safety, but requires
the use of more sophisticated design patterns.
3.3 Tenant Considerations
The number, nature, and needs of the tenants you expect to serve all affect your data architecture
decision in different ways. Some of the following questions may bias you toward a more isolated
approach, while others may bias you toward a more shared approach.
§ How many prospective tenants do you expect to target? You may be nowhere near being able to
estimate prospective use with authority, but think in terms of orders of magnitude: are you
building an application for hundreds of tenants? Thousands? Tens of thousands? More? The
larger you expect your tenant base to be, the more likely you will want to consider a more
shared approach.
§ How much storage space do you expect the average tenant's data to occupy? If you expect
some or all tenants to store very large amounts of data, the separate-database approach is
probably best. (Indeed, data storage requirements may force you to adopt a separate-database
model anyway. If so, it will be much easier to design the application that way from the beginning
than to move to a separate-database approach later on.)
§ How many concurrent end users do you expect the average tenant to support? The larger the
number, the more appropriate a more isolated approach will be to meet end-user requirements.
§ Do you expect to offer any per-tenant value-added services, such as per-tenant backup and
restore capability? Such services are easier to offer through a more isolated approach.
Tenant-related factors and how they affect "isolated versus shared" data architecture decisions
3.4 Regulatory Considerations
Processors and financial institutes are subject to regulatory requirements, such as those of the FFIEC,
that can affect their security and record storage needs. Investigate the regulatory environments that your
prospective customers occupy in the markets in which you expect to operate, and determine whether
they present any considerations that will affect your decision.
3.5 Realizing Multi-Tenant Data Architecture
A well-designed SaaS application is distinguished by three qualities: scalability, configurability,
and multi-tenant efficiency. The table below lists the patterns appropriate for each of the three
approaches, divided into sections representing these three qualities.
Optimizing for multi-tenant efficiency in a shared environment must not compromise the level of
security safeguarding data access. The security patterns listed below demonstrate how you can design
an application with "virtual isolation" through mechanisms such as permissions, SQL views, and
encryption.
Configurability allows SaaS tenants to alter the way the application appears and behaves without
requiring a separate application instance for each individual tenant. The extensibility patterns describe
possible ways you can implement a data model that tenants can extend and configure individually to
meet their needs.
The approach you choose for your SaaS application's data architecture will affect the options available
to you for scaling it to accommodate more tenants or heavier usage. The scalability patterns address the
different challenges posed by scaling shared as well as dedicated databases.
3.6 Patterns for Multi-Tenant Applications
Approach             Security Patterns                 Extensibility Patterns    Scalability Patterns
Separate Databases   § Trusted Database Connections    § Custom Columns          § Single Tenant Scale-out
                     § Secure Database Tables
                     § Tenant Data Encryption
Shared Database,     § Trusted Database Connections    § Custom Columns          § Tenant-Based Horizontal
Separate Schemas     § Secure Database Tables                                      Partitioning
                     § Tenant Data Encryption
Shared Database,     § Trusted Database Connections    § Pre-allocated Fields    § Tenant-Based Horizontal
Shared Schema        § Tenant View Filter              § Name-Value Pairs          Partitioning
                     § Tenant Data Encryption
4. Architecture Considerations
Multi-tenancy imposes more challenges than a white-labeled product does, both at the
front end and at the back end, mainly in the database. When an application is deployed as a SaaS service in
a hosted environment and accessed over the Internet by various customers (financial institutes), the
non-functional requirements (NFRs) are as important and critical as the business requirements.
4.1 Key Attributes of Multi-Tenant Architecture
From an architectural point of view, three key attributes separate a well-designed SaaS
application from a poorly designed one: it is scalable, multi-tenant-efficient, and configurable.
Scaling the application means maximizing concurrency and using application resources more efficiently -
for example, optimizing locking duration, statelessness, sharing pooled resources such as threads and
network connections, caching reference data, and partitioning large databases.
Multi-tenancy may be the most significant paradigm shift that an architect accustomed to designing
isolated, single-tenant applications has to make. For example, when a user at a financial institute
accesses information, the application instance that the user connects to may be accommodating users
from dozens, or even hundreds, of other financial institutes - all completely unknown to any of the
users. This requires an architecture that maximizes the sharing of resources across tenants, but that is
still able to differentiate data belonging to different customers.
Of course, if a single application instance on a single server has to accommodate users from several
different companies at once, you can't simply write custom code to customize the end-user
experience—anything you do to customize the application for one customer will change the application
for other customers as well. Instead of customizing the application in the traditional sense, the
architecture should leverage metadata to configure the way the application appears and behaves for its
users. The challenge is to ensure that the task of configuring applications is simple and easy for the
customers, without incurring extra development or operation costs for each configuration.
4.2 Configuration using Metadata
Metadata, at the service level as well as in the database, is used for managing application configuration
for individual tenants. Business Services interact with the metadata (services) in order to retrieve
information that describes configurations and extensions that are specific to each tenant.
In a mature SaaS application, the metadata service provides customers with the primary means of
customizing and configuring the application to meet their needs. Typically, customers can make
configuration changes in four broad areas:
§ User interface and branding - Customers typically want to modify the user interface to reflect their
corporate branding, so SaaS applications typically offer features that allow
customers to change UI aspects such as graphics, colors, fonts, and so on.
§ Workflow and business rules - To be of use to a wide range of potential customers, a
business-critical SaaS application has to be able to accommodate differences in workflow.
§ Extensions to the data model - An extensible data model gives customers the freedom to
enhance the data model to meet their specific data needs.
§ Access control - Typically, each customer is responsible for creating individual accounts for end
users, and for determining which resources and functions each user should be allowed to
access. Access rights and restrictions for each user are tracked by using security policies, which
should be configurable by each tenant.
To provide customers with flexibility in configuring the software as necessary, these options are
organized into hierarchical configuration units known as scopes, each of which contains options for
making changes in each of the four areas listed above. Every customer has a top-level scope that it can
configure as needed, and the customer may establish one or more scopes underneath the top level in an
arbitrary hierarchy. A relationship strategy determines how and whether child nodes inherit and
override configuration settings from parent nodes.
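The scope hierarchy can be illustrated with a small sketch in which each scope holds its own settings and falls back to its parent for anything it does not override. This models only one possible relationship strategy (the nearest scope wins and children inherit everything else); the class name ConfigScope and the setting keys in the usage note are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical configuration scope: settings are inherited from the parent unless overridden. */
final class ConfigScope {
    private final String name;
    private final ConfigScope parent; // null for a top-level scope
    private final Map<String, String> settings = new HashMap<>();

    ConfigScope(String name, ConfigScope parent) {
        this.name = name;
        this.parent = parent;
    }

    String name() {
        return name;
    }

    /** Sets or overrides a value in this scope only; returns this scope for chaining. */
    ConfigScope set(String key, String value) {
        settings.put(key, value);
        return this;
    }

    /** Walks from this scope up to the root and returns the first value found. */
    Optional<String> lookup(String key) {
        for (ConfigScope scope = this; scope != null; scope = scope.parent) {
            String value = scope.settings.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }
}
```

A top-level scope might set theme.color to blue and session.timeoutMinutes to 15; a child scope that overrides only theme.color with green would then resolve theme.color to green and still inherit the timeout of 15 from its parent. A stricter strategy, such as settings that a parent marks as non-overridable, would need an extra check in set.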
For example, a typical mega-processor may have several financial institutes with distinct needs, all of
which must follow certain processor-wide standards, but also must be able to configure some aspects of
the application individually. Within each financial institute as well, there may be organizational groups
that have their own special configuration needs. For each of these identified organizational units, the
customer can establish a scope that gives the group access to the configuration options that it may set
or change.
Unlike traditional vendor-customized line-of-business applications, SaaS applications are much more
likely to be configured by customers themselves. Designing the configuration interface is therefore
almost as important as designing the interface for end users. Ideally, customers should be able to
configure the application through a wizard, or through simple, intuitive screens that present all available
options without causing information overload and that clearly distinguish between options that can and
cannot be changed within a given scope.
Appendix:
Multi-tenancy in Hibernate: http://docs.jboss.org/hibernate/orm/4.1/devguide/en-US/html/ch07.html