This paper, by cloud security provider PerspecSys, offers a high-level overview of tokenization as a data protection and obfuscation technique in the cloud. It also discusses the PCI Security Standards Council’s tokenization guidelines.
[Read More: http://www.perspecsys.com/resources/resource-center/knowledge-series/tokenization-for-cloud-data-protection/]
1. TOKENIZATION FOR CLOUD DATA PROTECTION
Tokenization is a process in which a sensitive data field is replaced with a surrogate value called a token.
De-tokenization is the reverse process of replacing a token with its associated clear text value.
Depending on the particular implementation of a tokenization solution, tokens can be used to achieve
compliance with requirements that stipulate how sensitive data must be treated and secured by
companies in order to adhere to guidelines such as PCI DSS, HITECH & HIPAA, ITAR, and Gramm–
Leach–Bliley. Whether sensitive data resides in on-premises systems or in the cloud, transmitting,
storing, and processing tokens instead of the original data is an acknowledged industry-standard
method for securing sensitive information.
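The tokenize/de-tokenize cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not a product implementation: the `TokenVault` class and its in-memory dictionaries stand in for the secured look-up database the paper describes, and `secrets.token_hex` simply generates a random surrogate with no mathematical link to the original value.

```python
import secrets

class TokenVault:
    """Minimal sketch of a tokenization vault. In practice the look-up
    table lives in a hardened database behind the company's firewall;
    here two in-memory dicts stand in for it."""

    def __init__(self):
        self._token_to_value = {}  # the "look-up" table: token -> clear text
        self._value_to_token = {}  # reuse the same token for repeated values

    def tokenize(self, value: str) -> str:
        """Replace a sensitive value with a random surrogate token."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = secrets.token_hex(16)  # random: no mathematical link to value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Reverse process: look the token up to recover the clear text."""
        return self._token_to_value[token]
```

Because the token is random rather than derived from the value, recovering the clear text is impossible without access to the vault itself, which is the property the rest of this section builds on.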
How Does Tokenization Differ From Encryption?
Encryption is an obfuscation approach that uses a cipher algorithm to mathematically transform data.
The resulting encrypted value can be transformed back to the original value with the corresponding key.
So while encryption can be used to obfuscate a value, a mathematical link back to its true form always remains.
Tokenization is unique in that it completely removes the original data from the systems in which the
tokens reside. As such, depending on an enterprise’s objectives, tokenization offers some advantages:
• Tokens cannot be returned to their corresponding clear text values without access to the
“look-up” table that maps each token to its original value. These tables are typically kept in a
database in a secure location inside a company’s firewall.
• Tokens can be made to maintain the same structure and data type as their original values. While
format-preserving encryption can retain the structure and data type, it’s still reversible to the
original given the key and algorithm.
• Unlike encrypted values, whose length typically reveals the relative length of the clear text, tokens
can be generated such that they bear no relationship to the length of the original value.
• Because tokens cannot be mathematically reversed back to their original values, tokenization is
frequently the de facto approach to addressing data residency. Depending on the countries in
which they operate, companies often face strict regulatory guidelines governing their treatment
of sensitive customer and employee information. These data residency laws mandate that
certain types of information must remain within a defined geographic jurisdiction. In cloud
environments, where data centers can be located in various parts of the world, tokenization can
be used to keep sensitive data local (resident) while tokens are stored in the cloud.
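The structure-preserving property noted in the bullets above can also be sketched. The helper below is illustrative only (the function name and rules are assumptions, not PerspecSys's algorithm): it emits a random token with the same length and character classes as the input, so the token can occupy the same database column or form field. Real products additionally handle checksums, reserved ranges, and collision checks.

```python
import secrets
import string

def format_preserving_token(value: str) -> str:
    """Sketch: generate a random token that keeps the original's length
    and structure (digits stay digits, letters stay letters, separators
    are kept as-is). Unlike format-preserving encryption, nothing here
    is derived from the value, so the token cannot be reversed."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(secrets.choice(string.digits))
        elif ch.isalpha():
            out.append(secrets.choice(string.ascii_letters))
        else:
            out.append(ch)  # keep separators like '-' or ' '
    return "".join(out)
```

For example, tokenizing a card number such as `4111-1111-1111-1111` yields another 19-character string of digits and dashes, which downstream systems can store and display without ever holding the real number.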