Since spectral data is significantly higher-dimensional than colorimetric data, operating in a spectral domain carries costs in memory, storage and computational throughput. While spectral compression techniques exist, e.g., on the basis of Multivariate Analysis (mainly Principal Component Analysis and related methods), they result in representations of spectra that no longer have a direct physical meaning, in that their individual values no longer directly express properties at a specific wavelength interval. As a result, such compressed spectral data is not suitable for direct application of physically meaningful computation and analysis. The framework presented here is an evolution and extension of the previously published spectral correlation profile. It is a simple model, driven by a few adjustable parameters, that allows for the generation of nearly arbitrary, but physically realistic, spectra that can be computed efficiently and are useful over a wide range of conditions. A practical application of its principles is a spectral compression approach that relies on discarding the spectral wavelengths that are most redundant, given their correlation to their neighbors. The goodness of representing realistic spectra is evaluated using the MIPE metric as applied to the SOCS and other databases as a reference. The end result is an efficient, yet physically meaningful, compressed spectral representation that benefits computation, transmission and storage of spectral content.
Analysis and Compression of Reflectance Data Using An Evolved Spectral Correlation Profile
1. ANALYSIS AND COMPRESSION OF REFLECTANCE DATA USING AN EVOLVED SPECTRAL CORRELATION PROFILE
Peter Morovič*, Ján Morovič*, Michael Brillº, Eric Walowit
*Hewlett Packard Company, ºDatacolor Inc
2. OUTLINE
• Motivation
• How far did we get for last year’s CIC21?
• Spectral analysis using neighborhood profiles
• Relative/Absolute Spectral Neighborhood Range (R/ASN)
• Spectral Neighborhood Distribution (SND)
• Spectral Neighborhood Correlation (SNC)
• Spectral compression using SNC profile
• Evaluating spectral compression
• Conclusions
3. MOTIVATION
• Spectral data is the basis of all color and imaging research and applications
• Variety of data processed, but stimuli have spectral properties among their causes
• Needed to provide solutions (e.g. multiple illuminants, observers (human/machine),
materials)
• Challenge: spectral data is higher dimensional than colorimetric data → requires more
storage, may also require more computation and operating memory
• But: spectra don’t have dimensionality matching number of samples
• linear combinations of 3-8 basic “spectra” give very close approximations of measurements
taken at 16 or even 31 wavelengths
• PCA typically used to reduce dimensionality and compress spectral data, but:
• PCA weights strip data of range
• PCA weights have no physical meaning → not suitable for physical analysis or computation
4. MEANINGS OF “PHYSICAL MEANING”
• Allows a wavelength-by-wavelength analysis such as Kubelka-Munk in the coded-and-decoded (codec) state: compression saves storage but not CPU time.
• Example: wavelength-derivative encoding, but not PCA
• Allows a wavelength-by-wavelength analysis such as Kubelka-Munk in the coded state: saves storage and CPU time.
• Example: neither wavelength-derivative encoding nor PCA
• An option allowing both “physical meanings” (our way) is to drop certain wavelengths with redundant information.
7. SPECTRAL NEIGHBORHOOD PROFILES
• CIC21 “spectral correlation profile” - intuitive, first attempt
• Now: formal exposition + new domains:
• Relative Spectral Neighborhood Range (same as CIC21)
• Absolute Spectral Neighborhood Range
• Spectral Neighborhood Distribution
• Spectral Neighborhood Correlation
8. RELATIVE SPECTRAL NEIGHBORHOOD RANGE PROFILE
• M reflectances S, with N equal-interval spectral samples (i.e., S is an M x N
matrix)
• The Relative Spectral Neighborhood Range (rSNR) Profile is defined as a pair of (N-1)-vectors cmin and cmax, where at each wavelength λi:
$$c_{\min}(i) = \min_{j=1,\dots,M} \left[ S(j,i) - S(j,i+1) \right]$$
$$c_{\max}(i) = \max_{j=1,\dots,M} \left[ S(j,i) - S(j,i+1) \right]$$
• I.e., cmin and cmax are the lower and upper bounds of neighboring wavelength sample differences
• if cmin(i) is negative → at least one reflectance is increasing between λi and λi+1
• if cmin(i) is positive → all reflectances are decreasing between λi and λi+1
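The definition above can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function name `rsnr_profile` and the toy data are hypothetical, and S is assumed to be a list of M reflectances with N samples each.

```python
def rsnr_profile(S):
    """Return (cmin, cmax), each of length N-1, for M reflectances of N samples."""
    n = len(S[0])
    cmin, cmax = [], []
    for i in range(n - 1):
        # Differences between neighboring samples i and i+1 over all M reflectances
        diffs = [s[i] - s[i + 1] for s in S]
        cmin.append(min(diffs))
        cmax.append(max(diffs))
    return cmin, cmax


# Hypothetical toy data set: M = 3 reflectances, N = 4 samples each
S = [
    [0.25, 0.5, 0.75, 0.75],
    [0.5, 0.5, 0.25, 0.125],
    [0.25, 0.125, 0.125, 0.5],
]
cmin, cmax = rsnr_profile(S)
```

In this toy set every cmin(i) is negative, so between each pair of neighboring wavelengths at least one reflectance increases.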
9. CHECKING & GENERATING REFLECTANCES
• A 1 x N reflectance vector s satisfies an rSNR profile if, for all neighboring wavelengths λi and λi+1, the following inequalities hold:
• cmin(i) ≤ s(i) - s(i+1) ≤ cmax(i)
• Synthetic reflectances can be generated from an rSNR profile progressively (where superscript
1 refers to the upper and 2 to the lower limit branch at each step):
• Note: this scheme samples the extremes of the spectral “envelope”; any value within its limits can be sampled by weighting cmax(i) and cmin(i)
• Generated spectra are clipped to valid range [0,1]
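A minimal sketch of the check and of progressive generation follows. It rests on one inferred step: because the profile bounds s(i) - s(i+1), the next sample is s(i+1) = s(i) - c(i), so the upper branch uses cmin(i) and the lower branch uses cmax(i). The function names, the per-step `weights` parameter and the toy profile values are hypothetical.

```python
def satisfies_rsnr(s, cmin, cmax):
    """True if every neighboring difference s(i) - s(i+1) lies within [cmin(i), cmax(i)]."""
    return all(cmin[i] <= s[i] - s[i + 1] <= cmax[i] for i in range(len(s) - 1))


def generate_reflectance(start, cmin, cmax, weights):
    """Generate one reflectance progressively from a starting value.

    weights[i] in [0, 1] picks a point inside the envelope at step i:
    0 uses cmin(i) (upper branch), 1 uses cmax(i) (lower branch).
    """
    s = [start]
    for i, w in enumerate(weights):
        c = (1 - w) * cmin[i] + w * cmax[i]
        # Clip to the valid reflectance range [0, 1]; a clipped step may
        # no longer lie strictly inside the profile limits.
        s.append(min(1.0, max(0.0, s[-1] - c)))
    return s


# Hypothetical toy profile with N - 1 = 3 entries
cmin = [-0.25, -0.25, -0.375]
cmax = [0.125, 0.25, 0.125]

upper = generate_reflectance(0.5, cmin, cmax, [0, 0, 0])
lower = generate_reflectance(0.5, cmin, cmax, [1, 1, 1])
middle = generate_reflectance(0.5, cmin, cmax, [0.5, 0.5, 0.5])
```

Here `upper` and `lower` trace the two extremes of the envelope from the same starting value, and `middle` lies between them.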
10. ABSOLUTE SPECTRAL NEIGHBORHOOD RANGE PROFILE
• rSNR does not consider the offset of actual values at any one wavelength
• Range of data represented by two (N-1)-vectors vmin and vmax
• For rSNR these are implicitly at 0 and 1 respectively
• They are the minimum and maximum reflectance values at any
wavelength λi over the whole S:
$$v_{\min}(i) = \min_{j=1,\dots,M} S(j,i)$$
$$v_{\max}(i) = \max_{j=1,\dots,M} S(j,i)$$
• Generating reflectances: vmin and vmax used as starting point and limit for
progressive process described for rSNR
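One possible reading of this step is sketched below: the starting value is taken between vmin and vmax at the first wavelength, and each generated sample is clipped to the range at its own wavelength rather than to [0, 1]. This interpretation, the function names and the toy values are assumptions; the sketch also keeps one minimum and maximum per wavelength sample (N values each).

```python
def asnr_range(S):
    """Per-wavelength minimum and maximum reflectance over all M reflectances."""
    n = len(S[0])
    vmin = [min(s[i] for s in S) for i in range(n)]
    vmax = [max(s[i] for s in S) for i in range(n)]
    return vmin, vmax


def generate_within_range(start_weight, step_weights, cmin, cmax, vmin, vmax):
    """Progressive generation with vmin/vmax as starting point and limit."""
    # Starting value: a weighted position between vmin(1) and vmax(1)
    s = [vmin[0] + start_weight * (vmax[0] - vmin[0])]
    for i, w in enumerate(step_weights):
        c = (1 - w) * cmin[i] + w * cmax[i]
        candidate = s[-1] - c
        # Limit the new sample to the observed range at wavelength i+1
        s.append(min(vmax[i + 1], max(vmin[i + 1], candidate)))
    return s


# Hypothetical toy data and its rSNR profile
S = [
    [0.25, 0.5, 0.75, 0.75],
    [0.5, 0.5, 0.25, 0.125],
    [0.25, 0.125, 0.125, 0.5],
]
cmin = [-0.25, -0.25, -0.375]
cmax = [0.125, 0.25, 0.125]

vmin, vmax = asnr_range(S)
s = generate_within_range(1.0, [0, 0, 0], cmin, cmax, vmin, vmax)
```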
11. SPECTRAL NEIGHBORHOOD DISTRIBUTION PROFILE
• Distributions, instead of only ranges, of absolute differences across
wavelengths
• N-1 distributions computed from S - e.g., approximated by normal
distribution
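• At each wavelength: mean and standard deviation of the differences between the pair of neighboring wavelengths under scrutiny:
$$d_{\mu}(i) = \operatorname{mean}_{j=1,\dots,M} \left[ S(j,i) - S(j,i+1) \right]$$
$$d_{\sigma}(i) = \operatorname{std}_{j=1,\dots,M} \left[ S(j,i) - S(j,i+1) \right]$$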
• Statistical synthesis: new data consistent in probability of values at λ.
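The distribution profile and a simple statistical synthesis could look roughly like this. Several choices here are assumptions rather than anything stated on the slide: the sample standard deviation (`statistics.stdev`) is used, each step is drawn independently from a normal distribution, and results are clipped to [0, 1] as in the rSNR case. The function names and toy data are hypothetical.

```python
import random
import statistics


def snd_profile(S):
    """Mean and standard deviation of neighboring-sample differences, each of length N-1."""
    n = len(S[0])
    d_mu, d_sigma = [], []
    for i in range(n - 1):
        diffs = [s[i] - s[i + 1] for s in S]
        d_mu.append(statistics.mean(diffs))
        # Sample standard deviation; statistics.pstdev is the population alternative
        d_sigma.append(statistics.stdev(diffs))
    return d_mu, d_sigma


def synthesize(start, d_mu, d_sigma, rng):
    """Draw each neighboring difference from a normal distribution and accumulate."""
    s = [start]
    for mu, sigma in zip(d_mu, d_sigma):
        step = rng.gauss(mu, sigma)
        s.append(min(1.0, max(0.0, s[-1] - step)))
    return s


# Hypothetical toy data set: M = 3 reflectances, N = 4 samples each
S = [
    [0.25, 0.5, 0.75, 0.75],
    [0.5, 0.5, 0.25, 0.125],
    [0.25, 0.125, 0.125, 0.5],
]
d_mu, d_sigma = snd_profile(S)
rng = random.Random(0)  # seeded so that runs are repeatable
synthetic = synthesize(0.5, d_mu, d_sigma, rng)
```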
13. SPECTRAL NEIGHBORHOOD CORRELATION PROFILE
• Small per-neighboring-wavelength ranges yield high correlation coefficients
• BUT: converse not true for large ranges:
• correlation coefficients can be small if distribution is narrow and the range
is wide due to outliers, or
• large if distribution has spread in the data
• (N-1)-vector r computed to express degree to which sets A and B of m
neighboring wavelengths are correlated:
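As an illustration, the sketch below computes such a vector using the Pearson correlation coefficient between the values at wavelength i (set A) and at wavelength i+1 (set B) over all reflectances. The choice of Pearson correlation is an assumption, as are the function names, the toy data and the convention of returning 0.0 when one set has zero variance.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined here; 0.0 is a convention of this sketch
    return cov / math.sqrt(var_a * var_b)


def snc_profile(S):
    """(N-1)-vector r: correlation between samples i and i+1 across all reflectances."""
    n = len(S[0])
    r = []
    for i in range(n - 1):
        a = [s[i] for s in S]      # set A: values at wavelength i
        b = [s[i + 1] for s in S]  # set B: values at wavelength i+1
        r.append(pearson(a, b))
    return r


# Hypothetical toy data: M = 3 reflectances, N = 3 samples each
S = [
    [0.25, 0.5, 0.75],
    [0.5, 0.75, 0.5],
    [0.75, 1.0, 0.25],
]
r = snc_profile(S)
```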
17. APPLICATION: OPTIMAL SPECTRAL SAMPLING
• Aim: how to select the optimal set of non-uniform spectral bands for
representing a data set → SNCP
• Given a data set with N equidistant wavelength samples:
• Compute Spectral Neighborhood Correlation Profile (SNCP)
• For i=1:N-1, progressively pick i wavelengths with lowest correlation
• For each of N-1 sets of i wavelengths
• Interpolate values at wavelengths i from full data set
• Compare against original spectral data using MIPE metric (∆E00s under
173 illuminants)
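The selection and interpolation steps above can be sketched as follows. Several details are assumptions made only for illustration: both end samples are always kept so that decoding never extrapolates, each interior sample is scored by its correlation with the preceding neighbor, and decoding uses linear interpolation. The function names and toy values are hypothetical, and the MIPE evaluation itself is not implemented here.

```python
def select_samples(r, k):
    """Indices to keep: both end samples plus the k interior samples with the lowest score."""
    n = len(r) + 1
    # Score interior sample i by r[i-1], its correlation with the preceding neighbor
    interior = sorted(range(1, n - 1), key=lambda i: r[i - 1])
    return sorted([0, n - 1] + interior[:k])


def encode(s, kept):
    """Code: keep only the selected samples."""
    return [s[i] for i in kept]


def decode(values, kept, n):
    """Decode: linearly interpolate the dropped samples between the kept ones."""
    out = [0.0] * n
    pairs = list(zip(kept, values))
    for (i0, v0), (i1, v1) in zip(pairs, pairs[1:]):
        for i in range(i0, i1 + 1):
            t = (i - i0) / (i1 - i0)
            out[i] = v0 + t * (v1 - v0)
    return out


# Hypothetical correlation profile for N = 5 equidistant samples
r = [0.99, 0.5, 0.9, 0.99]
kept = select_samples(r, 1)

s = [0.0, 0.25, 0.5, 0.5, 0.5]  # toy reflectance
coded = encode(s, kept)
restored = decode(coded, kept, len(s))
```

For this toy reflectance the three retained samples reproduce all five originals. In a real evaluation the decoded spectra would be compared against the originals with the MIPE metric for each candidate number of retained samples.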
18. SNCP CODEC SUMMARY
• Code: drop samples where the correlation between
wavelengths is greatest
• Decode: interpolate among remaining samples
• Properties: physically-meaningful, compressed
representation that can also be suitable for further
computation performed in same way as on canonical,
equidistant wavelength sample representations (e.g.,
Kubelka-Munk).
24. CONCLUSIONS
• The relationships between neighboring wavelength intervals are an inherent characteristic of single reflectances or sets of them
• New approaches presented for their characterization that allow for:
• gamut-aware synthesis
• synthesis that preserves original data set’s difference distributions
• dimensionality reduction that takes advantage of highly correlated neighbors
• SNCP enables a physically meaningful, compressed representation that can also be suitable for further computation
• Spectral neighborhood based techniques are a useful extension to existing methods of
spectral analysis and processing
• Good starting point for revisiting various applications in the future: e.g., choice of spectral data used for camera characterization, which the authors will explore next