Assessing product-image quality is important in the context of online shopping. A high-quality image that
conveys more information about a product can boost the buyer’s confidence and attract more attention.
However, the notion of image quality for product-images is not the same as that in other domains. The
perception of quality of product-images depends not only on various photographic quality features but also
on high-level features such as the clarity of the foreground or the goodness of the background. In this
paper, we define a notion of product-image quality based on various such features. We conduct a crowdsourced
experiment to collect user judgments on thousands of eBay’s images. We formulate a multi-class
classification problem for modeling image quality by classifying images into good, fair and poor quality based
on the guided perceptual notions from the judges. We also conduct experiments with regression using the average
crowd-sourced human judgment as the target. We compute a pseudo-regression score as the expected value of
the predicted classes and also compute a score from the regression technique. We design many experiments with
various sampling and voting schemes on the crowd-sourced data and construct various experimental image
quality models. Most of our models have reasonable accuracies (greater than or equal to 70%) on the test data set.
We observe that our computed image quality score has a high (0.66) rank correlation with the average votes
from the crowd-sourced human judgments.
6. What is product image quality?
• We care mainly about product images.
• Product images have specific characteristics.
eBay Confidential
7. Computing image quality
• Machine learning problem.
• Extract factors from images to construct feature
vectors.
• Label the data points as one of the classes.
• Build a classifier to get the class probabilities.
• Alternatively, make a regression model from
human judgment data.
8. What factors? How do we compute them?
• Size factors such as area, aspect ratio.
• Image attributes such as brightness,
saturation, colorfulness, contrast, dynamic
range.
• Factors based on background and foreground
segmentation.
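As a rough illustration of the size and attribute factors listed above, the sketch below computes area, aspect ratio, mean brightness and mean saturation for an image given as rows of (R, G, B) tuples. The function name `basic_factors`, the 0 to 255 channel range, and the choice of HSV value and saturation (via the standard `colorsys` module) as the definitions of brightness and saturation are assumptions made for illustration; the slides do not state the exact definitions used.

```python
import colorsys


def basic_factors(image):
    """Compute a few simple factors (hypothetical helper, not from the slides).

    image: list of rows, each row a list of (R, G, B) tuples in 0..255.
    """
    height = len(image)
    width = len(image[0])
    pixels = [p for row in image for p in row]

    # Convert every pixel to HSV with channels scaled to 0..1.
    hsv = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
           for r, g, b in pixels]
    n = len(hsv)

    return {
        "area": width * height,
        "aspect_ratio": width / height,
        # Assumption: brightness = mean HSV value, saturation = mean HSV saturation.
        "brightness": sum(v for _, _, v in hsv) / n,
        "saturation": sum(s for _, s, _ in hsv) / n,
    }
```

Each returned value would become one entry of the feature vector for the image.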
10. Colorfulness
• Difference between a color and gray.
• Many empirical notions.
• We use an empirical expression designed in
the Natural Color Space (NCS).
• This space has a concept of rg and yb
coordinates.
• Colorfulness = stdev(rg, yb) + 0.3 × mean(rg, yb)
• We compute this globally and for foreground.
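The expression above can be sketched in a few lines. This is a minimal sketch, assuming the slide's formula follows the widely used opponent-color formulation in which rg = R − G and yb = ½(R + G) − B, and in which stdev(rg, yb) and mean(rg, yb) combine the per-axis values as a Euclidean norm. Those definitions and the function name `colorfulness` are assumptions; the slides give only the top-level expression.

```python
import math
from statistics import mean, pstdev


def colorfulness(pixels):
    """Colorfulness of an iterable of (R, G, B) tuples (illustrative sketch)."""
    pixels = list(pixels)

    # Assumed opponent-color coordinates.
    rg = [r - g for r, g, b in pixels]
    yb = [0.5 * (r + g) - b for r, g, b in pixels]

    # Combine the two axes: norm of the standard deviations and of the means.
    std_rgyb = math.hypot(pstdev(rg), pstdev(yb))
    mean_rgyb = math.hypot(mean(rg), mean(yb))

    return std_rgyb + 0.3 * mean_rgyb
```

Passing all pixels gives the global value; passing only the pixels labeled as foreground by the segmentation gives the foreground value.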
14. Dynamic range
• Variants of expressions in photography
literature and in computer vision.
• We are using a simpler definition used in
photography based on range of gray scale
intensity.
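One simple reading of "range of gray scale intensity" is the difference between the brightest and darkest gray levels, normalized to 0..1. The sketch below assumes that reading, 8-bit channels, and the common luma weights 0.299, 0.587 and 0.114 for the gray conversion; the slides do not give the exact expression, and `dynamic_range` is a hypothetical name.

```python
def dynamic_range(pixels):
    """Normalized gray-level range of an iterable of (R, G, B) tuples."""
    # Assumed gray conversion: standard luma weights on 0..255 channels.
    gray = [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in pixels]
    return (max(gray) - min(gray)) / 255.0
```

A percentile-based range (for example 1st to 99th percentile) would be less sensitive to a few outlier pixels, at the cost of one more parameter.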
20. Background and foreground area ratio
• Use segmented image.
• We approximate it by the ratio of the pixel
counts in the foreground and in the
background.
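Given a segmentation mask, the pixel-count approximation is a one-liner. The sketch below assumes a boolean mask with True marking foreground and reports foreground over background; the direction of the ratio and the name `foreground_background_ratio` are assumptions, since the slides do not state which count is the numerator.

```python
def foreground_background_ratio(mask):
    """Ratio of foreground to background pixel counts.

    mask: list of rows of booleans, True = foreground (assumed convention).
    """
    foreground = sum(1 for row in mask for is_fg in row if is_fg)
    total = sum(len(row) for row in mask)
    background = total - foreground

    # An image with no background pixels has an unbounded ratio.
    if background == 0:
        return float("inf")
    return foreground / background
```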
23. Properties of background
• Standard deviation of lightness (distance from
white in RGB).
• Mean of lightness (RGB).
• A score on uniformity of background intensity
that approximates texture properties.
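The first two background properties can be sketched as follows, assuming "distance from white" means the Euclidean distance to (255, 255, 255) in RGB and that the mask marks foreground with True. Both conventions and the name `background_lightness_stats` are assumptions; the uniformity score is not sketched because the slides do not define it.

```python
import math
from statistics import mean, pstdev

WHITE = (255, 255, 255)


def background_lightness_stats(image, mask):
    """Return (mean, standard deviation) of the distance from white
    over background pixels.

    image: rows of (R, G, B) tuples; mask: rows of booleans, True = foreground.
    """
    distances = [
        math.dist(pixel, WHITE)
        for pixel_row, mask_row in zip(image, mask)
        for pixel, is_fg in zip(pixel_row, mask_row)
        if not is_fg
    ]
    if not distances:
        raise ValueError("mask contains no background pixels")
    return mean(distances), pstdev(distances)
```

A clean white background gives a mean and standard deviation near zero, while a dark or cluttered one raises both.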
26. Crowdsourcing
• Has its own challenges.
• Requires thought in framing the questions.
• Requires thought in conducting the
experiment.
• Low-paid labelers may attempt to cheat.
• Classifier results can differ depending on the
voting technique used to determine the label.
• More judgments are better.
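To see why the voting technique matters, the sketch below compares two plausible schemes on the same set of judgments: a strict majority vote and rounding the average score to the nearest class. The mapping of poor, fair and good to 1, 2, 3 is the one used later in the deck; the two schemes and the function names are illustrative assumptions, not necessarily the schemes used in the experiments.

```python
from collections import Counter
from statistics import mean

LABEL_TO_SCORE = {"poor": 1, "fair": 2, "good": 3}


def majority_label(votes):
    """Most frequent label, or None when the top count is tied."""
    counts = Counter(votes)
    top = max(counts.values())
    winners = [label for label, count in counts.items() if count == top]
    return winners[0] if len(winners) == 1 else None


def average_label(votes):
    """Label whose score is closest to the mean score of the votes."""
    avg = mean([LABEL_TO_SCORE[v] for v in votes])
    return min(LABEL_TO_SCORE, key=lambda label: abs(LABEL_TO_SCORE[label] - avg))
```

For the votes `["good", "good", "poor"]`, the majority scheme yields "good" while the averaging scheme yields "fair", so the same image would enter training with different labels.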
28. Professional Quality Images
• Mostly white, light or uniform background.
• Image is free from compression artifacts such
as blurring.
• Professionally photographed under proper lighting
conditions.
• Subject has a reasonable size and is in focus.
• Examples of such images can be seen on
branded retail websites.
30. Poor images
• Poor or dark background.
• Can have incomprehensible texture.
• Subject small.
• Subject unclear.
• Bad aspect ratio.
• Poor resolution and photography.
32. Fair images
• These are images that are not poor. However,
they are not as clean as professional-looking
photos. (Add examples)
33. How do we develop the model?
• Multi-class classification with two data
sources.
• Direct regression with crowd-sourced data.
• We use gradient-boosted trees.
34. Factor importance in Quality Classifier
(Classifier)
• Background lightness
• Brightness
• Aspect ratio
• Dynamic range
• Background foreground area ratio
• Michelson contrast
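Michelson contrast, the last factor in the list above, is conventionally defined as (Imax − Imin) / (Imax + Imin) over the image intensities. The sketch below applies that definition to gray levels obtained with the common luma weights; the gray conversion and the name `michelson_contrast` are assumptions, since the slides do not show how the factor is computed.

```python
def michelson_contrast(pixels):
    """Michelson contrast of an iterable of (R, G, B) tuples."""
    # Assumed gray conversion: standard luma weights.
    gray = [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in pixels]
    i_max, i_min = max(gray), min(gray)

    # An all-black image has no contrast; avoid dividing by zero.
    if i_max + i_min == 0:
        return 0.0
    return (i_max - i_min) / (i_max + i_min)
```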
35. Error Rates
• Poor misclassification 10%
• Fair misclassification 50% [BAD]
• Good misclassification 7%
• However, our training data is not yet
perfect.
36. Quality Score for Classification
• The quality score is the expected value of
the class weights under the class probabilities.
• Currently, the class weights are a simple linear
function that maps poor, fair and good to
1, 2, 3.
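The score described above is a probability-weighted sum of the class weights. The sketch below is a minimal version, assuming the classifier returns one probability per class in a dictionary keyed by class name; the dictionary layout and the name `quality_score` are assumptions, while the weights 1, 2, 3 come from the slide.

```python
CLASS_WEIGHTS = {"poor": 1.0, "fair": 2.0, "good": 3.0}


def quality_score(class_probabilities):
    """Expected class weight under the predicted class probabilities.

    class_probabilities: dict mapping "poor", "fair", "good" to probabilities.
    """
    total = sum(class_probabilities.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("class probabilities must sum to 1")

    return sum(CLASS_WEIGHTS[label] * p
               for label, p in class_probabilities.items())
```

With probabilities 0.1, 0.3 and 0.6 for poor, fair and good, the score is 0.1 × 1 + 0.3 × 2 + 0.6 × 3 = 2.5, so the score always lies between 1 and 3.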