1. Recommendations
@ American Express
Abhijit Bose, Henry H Yuan and Huiming Qu
Data Science and Engineering
American Express Company
MLConf, San Francisco, CA
November 15, 2013
3. Our closed loop gives us direct relationships with millions
of buyers and sellers
and a wealth of information
about buyers and sellers
4. Our products must adhere to
the highest standards
Trust and security have been the hallmarks of the American
Express brand for more than 160 years.
Turning good data into more tailored and targeted commerce
does not change our privacy policies and principles.
We know customers need transparency and clear
explanations.
We use data to better serve our customers. We do not sell
personally identifiable information in any context.
5. Data Scientists @ American Express
Diverse backgrounds (MS, MBA, PhD):
computer science
electrical engineering
physics
mechanical engineering
statistics
economics
operations research
A mix of American Express talent and alumni of:
Google, Yahoo, M6D, IBM Research and others
11. What a Typical Transaction Looks Like
Merchant Name
Merchant Street Address
Merchant Zip Code
Total Amount
Amex card used
Transaction ID (useful for history, e.g. returns, tips, etc.)
Transaction Timestamp
13. General Approaches
Collaborative Filtering
- Recommend what similar users like explicitly or implicitly.
Content based
- Recommend similar items solely based on the content of items.
Hybrid
- Combines the above two.
14. Input to MyOffers
Find the most relevant merchant offers for our cardmembers,
with closed loop data and “real time” context.
Transactional History
Lifestyle Attributes
16. Lessons Learnt
•Agile development for shorter cycle
•Platform and software challenges
•Noisy signals, e.g. taxicabs
•Cold-start issue
•Local vs. Online
17. Lesson Learnt – Geo-Fencing is Critical
18. Current Focus is to build out an end-to-end
platform and a rich experimentation layer
•Centralization of data
•Better algorithms
•Better incorporation of customer feedback
20. We are Hiring!
Build the next generation of:
- Recommendation systems
- Graph Algorithms
- Machine Learning algorithms for Marketing,
Fraud and a variety of problems
- Data products
- Experiments
21. Please Contact us at:
Abhijit Bose
VP, Data Science & Engineering
http://www.linkedin.com/in/abose
abhijit.bose@aexp.com
Henry Yuan
Director, Data Science
http://www.linkedin.com/pub/henry-yuan/4/29b/9bb
henry.h.yuan@aexp.com
Huiming Qu
Sr. Data Scientist, Data Science & Engineering
http://www.linkedin.com/pub/huiming-qu/4/400/b82
huiming.qu@aexp.com
Editor's notes
We are a global financial products and services company with a rich history of innovation, best-in-class analytics and industry-leading customer experience. Two numbers I would like to draw your attention to are the cards-in-force (102 million) and the $888 billion in annual purchase volume on the Amex network. That gives you an idea of the millions of transactions that flow through our network every day. What makes us unique and sets us apart from every other financial services company is our closed loop, which is the next slide.
We are a closed loop because we acquire both card members (consumers as well as businesses) and merchants into our network. This gives us a deeper understanding of how our card members make their purchases, and allows us to deploy best-in-class solutions for marketing and fraud protection. This is different from other card-issuing banks and networks. For them, the issuing bank acquires the customer and the network acquires the merchants, so neither has complete end-to-end insight into every transaction, customer and merchant. This wealth of information about our buyers and sellers makes our data extremely valuable.
We also treat our customers’ trust and security with the utmost care. Seven years in a row, we have received the highest award in the J.D. Power customer satisfaction surveys for financial services companies. At American Express, we’ve always had strong privacy policies in place to inform customers of how and when we use their information, as well as to offer them the ability to opt out of or opt in to information sharing. This hasn’t changed in the digital age. We take the trust that customers have in our brand very seriously. We use data to better serve our customers. We do not sell personally identifiable information in any context.
Before going into our recommender platform, I want to give you a brief introduction to our team. We are a mix of Amex internal talent and recent hires from Google, Yahoo, M6D, IBM Research and other places. We also have a very diverse set of skills, which as you know is critical when setting up a data science team. Some of us are great with math and statistics, some with building systems, some with algorithm development, some with analytics. Together we have been very successful in building out a number of platforms within a very short time: most of what we will talk about today has been designed and deployed in the past 12 months. Next, Huiming will describe the different recommendation products that we have built at American Express.
We use recommender systems in many areas to better serve our customers and merchant partners. I will list just a few as examples.
In early 2012, Amex launched an industry-first mobile offer engine that recommends and ranks relevant merchant offers in real time for US Cardmembers based on their spending history and location. You can find the "My Offers" feature in the American Express iPhone App, where millions of Cardmembers currently manage their Card accounts in a convenient and secure environment. My top offer is Texas de Brazil, a Brazilian barbecue restaurant franchise that has over 30 locations in 11 states. I always claim to be a salad person, but I guess my spend discloses my secret craving. We developed our mobile offer engine with three key points of differentiation in mind: relevance, convenience and value. My Offers leverages the company's closed loop network to connect Cardmembers and merchants in order to deliver value to both.
Opt-in for location: http://about.americanexpress.com/news/pr/2012/gosocial.aspx
Mobile is not the only channel where Amex smart offer technology is being used. Amex cardmembers can find relevant offers when they sync with TripAdvisor, Facebook, Foursquare, and Twitter. The sync process goes through a secure channel and your card number will never be shared. Sync functionality not only provides more relevant offers but also enables a new social commerce experience. Cardmembers who sync their eligible cards with Twitter can now tweet special hashtags to buy American Express Gift Cards and products from Amazon, Sony, and Xbox 360. There are three simple steps after you’ve synced your card: tweet the special hashtag, watch for the confirmation hashtag, and tweet the confirmation hashtag within 15 minutes to confirm. Your order will be shipped to your billing address in 2 days. For services that our customers opt into (such as My Offers, Foursquare, TripAdvisor), we provide specific information regarding that platform at registration. Card Member data is never passed to partners. Additionally, our online privacy statement describes how we collect, use, share or retain online data.
Besides mobile and social media, we also use the insights gained from our spend information to personalize our website. For each cardmember who logs into their account, we personalize the page in real time. We extract customer interests, such as tennis or Greek restaurants, from the spend graph and use that information for more relevant, better targeted and higher converting offers.
MBT: We also create a personalized experience on our merchant website. When merchants log in, they can not only track their current customer activity and sales performance (such as average transaction size and average transactions per customer) but also compare themselves with other businesses where their customers, or customers like them, use their American Express cards. Whenever a comparison is shown, each merchant’s private information is protected through aggregation and generalization.
Typical transaction information includes the merchant name and address, the card used for the transaction, and the transaction ID, time, and amount: basically all the information on your receipt and more.
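As a rough illustration of the fields listed on the slide, a transaction record might be modeled as below. This is only a sketch: the class name, field names, the `card_id` token, and the `history_for` helper are all hypothetical and do not describe the actual American Express schema.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    """One card transaction, limited to the fields named on the slide."""
    transaction_id: str          # ties later events (returns, tips) to a purchase
    card_id: str                 # hypothetical token for the Amex card used
    merchant_name: str
    merchant_street_address: str
    merchant_zip_code: str
    total_amount: Decimal        # Decimal avoids float rounding on currency
    timestamp: datetime


def history_for(transactions, transaction_id):
    """Return all records that share a transaction ID, oldest first."""
    matches = [t for t in transactions if t.transaction_id == transaction_id]
    return sorted(matches, key=lambda t: t.timestamp)
```

Grouping by transaction ID is what makes the "useful for history" remark concrete: a purchase and a later tip adjustment would appear as two records with the same ID.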
We use our unique closed loop data to personalize experiences and recommend offers. Our input includes customer purchase history, customer profiles, merchant profiles, and some contextual information, such as geolocation if the customer has opted in. We leverage both traditional and digital channels, such as email, online, and social media, to deploy personalized messages to our customers and merchants, provided that this is compliant with our privacy guidelines and the customer's explicit preferences.
Collaborative filtering: adopted by many industry leaders such as Netflix, Amazon, etc. Cons: cold start, sparsity, popularity bias, etc.
Content based: used by Amazon book recommendation, Pandora, etc. Cons: requires high-quality, meaningful features of items, and users' tastes need to be learnable functions of those features.
Hybrid: we used a hybrid approach. We will go through an application example to illustrate how we deal with the problems of collaborative filtering.
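To make the three approaches concrete, here is a minimal hybrid sketch. It is not the My Offers algorithm: the function names, the data layout, the use of cosine similarity, and the `alpha` blending weight are all assumptions chosen for illustration. The collaborative part describes each merchant by who shops there, the content part describes it by its features, and the two scores are blended.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_scores(user, spend, features, alpha=0.5):
    """Score merchants the user has not visited yet.

    spend:    {user: {merchant: visit count}}  (implicit feedback)
    features: {merchant: {feature: weight}}    (content description)
    alpha:    weight on the collaborative score; 1 - alpha goes to content
    """
    seen = spend[user]

    # Collaborative view: represent each merchant by its visitors.
    by_merchant = {}
    for u, visits in spend.items():
        for m, count in visits.items():
            by_merchant.setdefault(m, {})[u] = count

    scores = {}
    for candidate in features:
        if candidate in seen:
            continue
        collab = sum(
            cosine(by_merchant.get(candidate, {}), by_merchant[m]) * count
            for m, count in seen.items()
        )
        content = sum(
            cosine(features[candidate], features.get(m, {})) * count
            for m, count in seen.items()
        )
        scores[candidate] = alpha * collab + (1 - alpha) * content
    return scores
```

A merchant with no transactions yet gets a collaborative score of zero but can still be ranked through its content score, which is one way a hybrid softens the cold-start problem listed among the cons.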
When we launched My Offers, we talked about the spend graph… We leverage transactions, as shown on the previous page, as inputs. We also have other lifestyle attributes, such as interests gleaned from transaction information, some of them machine learned separately. For people who opt in, we also leverage their geolocation, the time, and feedback such as thumbs up and thumbs down for real-time adjustment. The goal is to find the most relevant offers for our cardmembers.
This is a high-level view of the My Offers ecosystem. Integrating all the existing and new platforms is a great engineering and algorithmic challenge for us. It starts with the offer depository system; offers can come from both the self-service website and the client manager system. Offer content, together with transactions and other lifestyle attributes, is fed into the batch system, which runs on a Hadoop cluster. Hadoop does the heavy-lifting calculations, such as the similarity matrix and other weighting and rules. Our real-time system, Solr, gets contextual information from channels such as mobile, queries the precomputed batch results, and then produces a real-time recommendation to be sent to the different channels. What is unique about us is the smart card syncing system. The user does not need any coupons: the system recognizes the synced card and the qualifying transaction. We are able to track the results both for merchant reporting and as feedback to the recommender system.
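The split between a batch layer and a real-time layer can be sketched as follows. Plain Python dicts stand in for the Hadoop output and the Solr index here; none of this is Hadoop or Solr API code, and the function names, the `zip_codes` field, and the ZIP-based location match are hypothetical simplifications.

```python
def batch_precompute(scores, top_n=3):
    """Offline step: keep the top_n offers per cardmember.

    scores: {cardmember: {offer_id: relevance}}, e.g. from a similarity job.
    """
    return {
        member: sorted(offers, key=offers.get, reverse=True)[:top_n]
        for member, offers in scores.items()
    }


def realtime_recommend(member, candidates, active_offers, context):
    """Online step: filter the precomputed list with request-time context.

    candidates:    output of batch_precompute
    active_offers: {offer_id: {"zip_codes": set of ZIPs, or None if online}}
    context:       {"zip_code": str or None}; None means no location opt-in
    """
    results = []
    for offer_id in candidates.get(member, []):
        offer = active_offers.get(offer_id)
        if offer is None:
            continue  # offer expired since the last batch run
        zips = offer["zip_codes"]
        # Online offers always qualify; local ones need a matching location.
        if zips is None or context.get("zip_code") in zips:
            results.append(offer_id)
    return results
```

The design point is that the expensive scoring happens offline, while the online step only does cheap lookups and filters, so it can respond within a mobile request.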
1) New product development cycles used to be relatively long for a large financial services company. We realized early on that we needed a new format. We have the data science team and the software engineering team sit in the same location, and we set up a war room for daily meetings. For example, one thing we were constantly testing was what looks good in the lab vs. what can be put into production.
2) Platform and software challenges: early on we were using a combination of Hive and Mahout. The batch jobs ran over 20 hours a day, almost around the clock. Very quickly we moved to running MapReduce jobs, and today the batch runs in a fraction of the original time.
3) There are a lot of noisy signals in our transactional data, for example taxicabs. We have developed some logic to reduce or eliminate them.
4) There are cold-start issues on both the cardmember and the merchant side. We use a profile cross-reference matrix and dithering logic to address them.
5) Unlike some online-only recommender applications, Amex has both storefront and online merchants, and some merchants do both. We have run some experiments to find the right balance for ranking them, which brings us to the next page.
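The notes do not spell out the dithering logic, so the following is only a sketch of one common formulation, not the production method. Gaussian noise is added to the logarithm of each item's rank and the list is re-sorted; the function name, the `epsilon` parameter, and its default are assumptions. Top items mostly stay near the top, while items further down occasionally surface and can collect feedback, which helps with cold start.

```python
import math
import random


def dither(ranked_items, epsilon=2.0, rng=None):
    """Re-order a ranked list by adding noise to log(rank).

    epsilon >= 1 controls the noise level; epsilon == 1 leaves the
    order unchanged because the noise standard deviation is zero.
    """
    rng = rng or random.Random()
    sigma = math.sqrt(math.log(epsilon))
    noisy = [
        (math.log(rank) + rng.gauss(0.0, sigma), item)
        for rank, item in enumerate(ranked_items, start=1)
    ]
    noisy.sort(key=lambda pair: pair[0])
    return [item for _, item in noisy]
```

Because log(rank) grows slowly, positions 50 and 60 are far easier to swap than positions 1 and 2, so the shuffle is gentle at the top of the list and stronger in the tail.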
For merchants with a storefront, we found it is important to balance distance and recommender scores. The right balance differs between urban and rural areas, and between cities such as New York and LA. Within New York, some people will never cross the river to shop on the New Jersey side, even if geographically it might be closer.
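One simple way to express the balance between distance and recommender score is to discount relevance by distance with a region-specific scale. This is a sketch under stated assumptions: the exponential decay, the `scale_km` parameter, and the function names are illustrative choices, not the geo-fencing logic used in production.

```python
from math import asin, cos, exp, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def geo_adjusted_score(relevance, distance_km, scale_km):
    """Discount relevance by distance.

    scale_km would be small in a dense city and larger in a rural area,
    where people are used to travelling further to shop.
    """
    return relevance * exp(-distance_km / scale_km)
```

Straight-line distance alone cannot capture the river example above: a merchant across the Hudson may be only a few kilometers away yet effectively out of reach, so a real system would also need boundaries or learned regions on top of a distance discount.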
Amex has a lot of data sources spread across different systems. We are working to create a centralized data depository. We are also working on experimentation, both in the lab and in the production system, for better algorithms. We have made some progress but would love to do more on incorporating customer feedback.
This page shows some of the technologies we used either for production or for prototyping. Abhijit will talk through some of these in case of questions.