2. Overview
Anametrix is a distributed data acquisition, processing and visualization platform that allows structured and unstructured
data to be made available for reporting, visualization and data federation. To meet the extreme demands of its clients,
Anametrix operates a cloud-based multi-tenant analytics platform that allows clients to gain analytical capabilities without
upfront costs and investments in server and processing infrastructure.
This white paper explains the patent-pending technology that makes the Anametrix platform fast, scalable and secure for any
type of application.
INTRODUCTION
A change in the way organizations access and manage data has created a major shift in the way software applications are
designed, built, and accessed. Today, advances in technologies such as broadband Internet access and service-oriented
architectures (SOAs) have created an environment better suited to handling and processing large amounts of data. At the same time,
the cost inefficiencies surrounding the management of on-premises applications are driving a transition toward the
delivery of Web-based services, or software as a service (SaaS). Anametrix utilizes a SaaS platform to deliver its robust
solution to clients around the world.
THE MULTI-TENANT ARCHITECTURE
To reduce the delivery cost of providing the same application to many different clients, a number of applications are
multi-tenant rather than single-tenant. A multi-tenant application can satisfy the needs of multiple tenants (companies or
departments within a company, etc.) using the hardware resources and staff needed to manage just a single software
instance. This allows for a dedicated set of resources to fulfill the needs of many organizations.
This unique architecture is structured in such a way that tenants using multi-tenant services operate in virtual isolation from
one another. This allows organizations to use and customize an application as though they each have a separate instance.
However, their data and customizations remain secure and insulated from the activity of all other tenants. The single
application instance effectively morphs at runtime for any particular tenant at any given time.
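The virtual isolation described above can be illustrated with a minimal sketch. It assumes a single shared store in which every row and every customization is tagged with a tenant identifier; the class name MultiTenantStore and its methods are hypothetical and are not taken from the Anametrix platform.

```python
from dataclasses import dataclass, field


@dataclass
class MultiTenantStore:
    """One shared store instance; every row is tagged with its owning tenant.

    Hypothetical illustration only, not the Anametrix implementation.
    """
    _rows: list = field(default_factory=list)
    _settings: dict = field(default_factory=dict)

    def insert(self, tenant_id: str, row: dict) -> None:
        # The store, not the caller, stamps the tenant on each row.
        self._rows.append({**row, "tenant_id": tenant_id})

    def customize(self, tenant_id: str, key: str, value) -> None:
        # Customizations are kept per tenant, so one tenant never
        # changes the behavior seen by another.
        self._settings.setdefault(tenant_id, {})[key] = value

    def setting(self, tenant_id: str, key: str, default=None):
        return self._settings.get(tenant_id, {}).get(key, default)

    def query(self, tenant_id: str) -> list:
        # Every read is filtered by tenant before results are returned.
        return [
            {k: v for k, v in row.items() if k != "tenant_id"}
            for row in self._rows
            if row["tenant_id"] == tenant_id
        ]


store = MultiTenantStore()
store.insert("acme", {"page": "/home"})
store.insert("globex", {"page": "/pricing"})
store.customize("acme", "timezone", "US/Pacific")
```

In this sketch both tenants share one instance, yet a query for "acme" never returns rows belonging to "globex", and the time zone customization applies to "acme" only.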
Multitenancy is a win-win situation for both application providers and users. Economies of scale are leveraged and the cost of
hardware resources is much less than that required by on-premises applications. As a result, a relatively small, experienced
administrative staff can efficiently manage only one stack of software and hardware, and developers can build and support
a single code base on just one platform (operating system, database, etc.) rather than many. Also, because a multi-tenant
application is a single large community hosted by the provider itself, operational information from a collective user population
(which queries respond slowly, what errors happen, etc.) can be more easily obtained. This information can then be used to
make frequent improvements to the services that benefit the entire user community.
The above advantages of multitenancy allow the application provider to offer a service to end users at a much lower cost.
Some additional benefits of multitenancy include a higher degree of quality, user satisfaction, and customer retention.
3. DATA ACQUISITION
Anametrix utilizes several complementary techniques for acquiring data from the various data sources that combine into a
multi-channel data repository. (Figure A)
Primarily, three methods are used for data acquisition:
API-based connections: Anametrix uses a third-party API to download and integrate report data into the Anametrix data
warehouse. This typically happens on a set schedule that is determined in accordance with recommendations from the API
provider.
Batched data uploads from various sources: This approach is often used for client-specific data uploads from
internal databases, end-user uploads from Anametrix tools (such as the Excel Client) or uploads from third parties. Batch uploads can
happen on demand or on a schedule.
Live web-based data acquisition: This is the preferred mode for web analytics and the method with the least delay between data being created and becoming available for reporting. In the web analytics scenario, client-side data collection techniques (also known as
page tags, web beacons, pixel technology and “web bugs”) are used to send real-time data to the Anametrix cloud for
direct integration into the Anametrix data warehouse.
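As a rough illustration of the page-tag method, the sketch below shows how a beacon request might carry a page view as query-string parameters and how a collector might decode it. The endpoint URL, the parameter names (t, v, p, ts) and both function names are assumptions made for this example; they do not describe the actual Anametrix tag format.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical collection endpoint, used for illustration only.
COLLECTOR_URL = "https://collect.example.com/pixel.gif"


def build_beacon_url(tenant_id: str, visitor_id: str, page: str, timestamp: int) -> str:
    """Encode one page view as the URL a page tag would request."""
    params = {"t": tenant_id, "v": visitor_id, "p": page, "ts": str(timestamp)}
    return COLLECTOR_URL + "?" + urlencode(params)


def parse_beacon_url(url: str) -> dict:
    """Decode the request on the collector side into a structured event."""
    query = parse_qs(urlsplit(url).query)
    return {
        "tenant_id": query["t"][0],
        "visitor_id": query["v"][0],
        "page": query["p"][0],
        "timestamp": int(query["ts"][0]),
    }


beacon = build_beacon_url("acme", "visitor-42", "/products?id=7", 1700000000)
event = parse_beacon_url(beacon)
```

Because the event travels in a single lightweight request sent as the page is viewed, it can be integrated with very little delay, which is why this method suits real-time web analytics.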
4. DATA TIMELINESS
The Anametrix cloud makes acquired data available in real time. Anametrix is always “as real-time as the source data”, meaning that data will be integrated as quickly as possible within the constraints placed by third parties. In particular, certain data
sets may be finalized only once a day and will therefore be available to the Anametrix interface only on that same schedule, while others will be queryable as soon as the underlying events happen.
SESSIONIZATION, DATA CLEANSING AND STRUCTURING
Sessionization is the process by which the Anametrix solution orders the sequence of actions or requests made by an individual during the course of an interaction, or “session”, within the series of transactions made available to the Anametrix cloud.
Sessionization capabilities allow Anametrix to extract and visualize essential information contained within data streams. With
sessionization for Web Analytics, you can determine where visitors get lost or frustrated, how deeply they go into content, and
where the opportunities are for site organizational improvements. Without a sessionization method, log files and page tags
have no reliable way of determining that the individual who viewed page one is the same person who viewed page two.
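One common way to sessionize events is sketched below. It assumes that each event carries a visitor identifier and a timestamp in seconds, and that a session ends after 30 minutes of inactivity. The timeout, the field names and the function name are assumptions made for this example, not a description of the Anametrix method.

```python
from collections import defaultdict

# Assumed inactivity timeout: a gap longer than this starts a new session.
SESSION_TIMEOUT_SECONDS = 30 * 60


def sessionize(events, timeout=SESSION_TIMEOUT_SECONDS):
    """Group events into per-visitor sessions, ordered by time.

    Each event is a dict with "visitor_id", "timestamp" (seconds) and "page".
    Returns a list of sessions; each session is a list of events.
    """
    by_visitor = defaultdict(list)
    for event in events:
        by_visitor[event["visitor_id"]].append(event)

    sessions = []
    for visitor_id in sorted(by_visitor):
        ordered = sorted(by_visitor[visitor_id], key=lambda e: e["timestamp"])
        current = [ordered[0]]
        for event in ordered[1:]:
            if event["timestamp"] - current[-1]["timestamp"] > timeout:
                # Too long since the previous action: close the session.
                sessions.append(current)
                current = [event]
            else:
                current.append(event)
        sessions.append(current)
    return sessions


events = [
    {"visitor_id": "a", "timestamp": 600, "page": "/pricing"},
    {"visitor_id": "b", "timestamp": 100, "page": "/home"},
    {"visitor_id": "a", "timestamp": 0, "page": "/home"},
    {"visitor_id": "a", "timestamp": 4000, "page": "/contact"},
]
paths = [[e["page"] for e in session] for session in sessionize(events)]
```

Here paths holds the ordered page sequence of each session, which is the raw material for the navigation analysis described above.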
To ensure that acquired data is actionable and report-ready, Anametrix also applies a layer of data-appropriate cleansing
and restructuring to data that is provided for integration. The amount of transformation needed varies by data source.
It ranges from large amounts of pre-processing for data with low entropy (in other words, a low amount of actionable
information per transaction) to direct imports for data that is already report-ready.
QUERIES, DATA VISUALIZATION, AND EXTRACTION
The Anametrix distributed query engine is a comprehensive, real-time, cloud-based data storage and retrieval
service that enables all products to provide real-time query capability for clients while leveraging a multi-tenant processing architecture.
Anametrix receives billions of rows of client-supplied data each month and continuously integrates all acquired data in its data
centers. The system is responsible for handling, structuring and processing incoming data and for making it available to the query
engine so that it is instantly available to the end user.
All data that is made available is replicated across a shared distributed query system. Data integrity and safety are ensured by
an intelligent software layer that takes logical and physical parameters into account when storing data. In particular, the system is aware of the physical characteristics of each Anametrix storage system. Because data is replicated, there is no single point of
failure, and data is spread evenly across servers, switches, server cabinets and data centers to guard against logical, physical,
and geographical failures.
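The idea of placement that is aware of physical layout can be sketched as follows. The example assumes a simple greedy rule: each additional copy goes to the server that shares the fewest data centers and cabinets with the copies already placed, with ties broken by current load. The Server type, the place_replicas function and the rule itself are hypothetical illustrations, not the Anametrix algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    data_center: str
    cabinet: str


def place_replicas(servers, load, copies):
    """Pick `copies` distinct servers, spreading them across failure domains.

    `load` maps server name to its current load (missing names count as 0).
    """
    if copies > len(servers):
        raise ValueError("not enough servers for the requested number of copies")

    chosen = []
    for _ in range(copies):
        def cost(server):
            # Count how many already-chosen replicas share each failure domain.
            same_dc = sum(c.data_center == server.data_center for c in chosen)
            same_cabinet = sum(
                (c.data_center, c.cabinet) == (server.data_center, server.cabinet)
                for c in chosen
            )
            # Prefer a new data center, then a new cabinet, then low load.
            return (same_dc, same_cabinet, load.get(server.name, 0), server.name)

        candidates = [s for s in servers if s not in chosen]
        chosen.append(min(candidates, key=cost))
    return chosen


servers = [
    Server("s1", "dc1", "cab1"),
    Server("s2", "dc1", "cab1"),
    Server("s3", "dc1", "cab2"),
    Server("s4", "dc2", "cab1"),
]
load = {"s1": 5, "s2": 1, "s3": 2, "s4": 9}
placement = [s.name for s in place_replicas(servers, load, copies=3)]
```

With this rule the first copy lands on the least-loaded server, the second goes to the other data center even though that server is busier, and the third goes to a different cabinet in the first data center, so no single server, cabinet or data center holds every copy.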
5. Conclusions
The Anametrix approach to managed data acquisition, processing, visualization and reporting provides significant cost savings. Internet-based, shared computing platforms are attractive because they let businesses quickly access hosted, managed
software assets on demand and altogether avoid the costs and complexity associated with the purchase, installation, configuration, and ongoing maintenance of an on-premises data center. Dedicated hardware, software, and accompanying administrative staff are not needed, which results in additional cost savings for businesses.
The Anametrix platform provides world-class security, proven scalability, performance and high availability.
Anametrix continually monitors and gathers operational information from the Anametrix cloud. This information is used to help drive
incremental improvements and new features that benefit existing and new clients.
ABOUT ANAMETRIX
Anametrix transforms businesses with marketing analytics. We collect, analyze and make sense out of data across all
marketing channels in real time to enable marketers to discover new truths about customers, prospects and the market at
large. Anametrix delivers 360-degree visibility into business data to uncover new trends and hidden correlations, explore new
relationships and deliver a bigger and more predictable impact on revenue. Founded in 2010 by the trailblazing web analytics
team behind WebSideStory, Anametrix has headquarters in San Diego, Calif.
For more information, visit our Website, Twitter, Facebook, Google+, and our Blog.