This document provides an overview of Floyd Morgan's role at Intuit and the Live Community search capabilities at TurboTax. It includes:
- Floyd Morgan is a Principal Software Engineer at Intuit who works on the TurboTax tax engine and Live Community platform.
- The TurboTax Live Community is a large Q&A system that sees high volume during tax season, with over 23 million users, 32 million page views, and 750,000 answered questions.
- Live Community Search utilizes Apache Solr for capabilities like auto-suggest, in-product search, and presenting similar questions/answers to aid users.
4. Intuit QuickBase
Intuit Inc. is a leading provider of business and financial management solutions
for small and mid-sized businesses; financial institutions, including banks and
credit unions; consumers and accounting professionals.
More than 200 applications and 7700 employees worldwide.
6. TurboTax is the nation’s No. 1 rated, best-selling, do-it-yourself tax
preparation software. TurboTax helps more than 20 million people a
year.
$1 billion in revenue
10. About Me
• Principal Software Engineer at Intuit
• TurboTax Engineering
– Core tax engine
– TurboTax Online
– TurboTax Live Community
• Central Technology Organization
– Live Community Platform
17. About Live Community
• It’s a user contribution system
– Q&A
• It can be integrated into an application, contextually
– Page-to-page relevance
• We use social, technology and data
– To create our value proposition…assisting users
• We launched our Beta in 2007
– TurboTax Online Home & Business
• We use open source…primarily open source
– Apache HTTP, Ruby on Rails, MySQL, memcached ...
• It’s a platform
– APIs, skinning, dynamic provisioning (AWS in progress)
33. About TurboTax Live Community
• Largest community
– 150+ servers, 200 thousand concurrent users
• Over 23 million users have used the service
– Over 8 million last tax season alone
• Over 32 million page views last tax season
– In-product views in the billions
• Over 750 thousand answered questions
– 10 thousand questions asked on peak day
• Our contributors answer thousands of questions
– Top contributor – 70 thousand answers
42. Why Solr?
• Lots of features/functionality
• Ease of integration
• We can scale it independently
• You’ll need some search expertise…that’s
ok
– Community and Lucid Imagination!
• Search is really important
– Search everywhere…
52. Auto suggest
• Provides a glimpse of our vast content
• facet query (Solr 1.2)
• We use NLP…
• It’s used on every search touch point
• Second most frequent request
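One way a facet-based auto suggest can be expressed against Solr is a prefix facet request. The sketch below only builds the request URL. The field name `suggest_terms`, the helper name `suggest_url`, and the choice of `facet.prefix` are assumptions for illustration; the talk does not say which facet parameters were actually used.

```ruby
require 'uri'

# Hypothetical helper: builds a Solr select URL asking for facet values that
# start with what the user has typed so far. No documents are requested
# (rows=0); only the facet counts are of interest.
def suggest_url(base_url, typed, limit = 10)
  params = {
    'q'            => '*:*',
    'rows'         => 0,
    'facet'        => 'true',
    'facet.field'  => 'suggest_terms',        # assumed field name
    'facet.prefix' => typed.downcase.strip,   # match on the typed prefix
    'facet.limit'  => limit
  }
  "#{base_url}/select?#{URI.encode_www_form(params)}"
end

url = suggest_url('http://localhost:8090/solr/posts', 'How do I ')
puts url
```

Because the request returns counts rather than documents, the same index can serve suggestions and full search without a separate suggestion store.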
60. In-product “mini” search
• Primary search interface for consumers
• It appears integrated
• Now the most utilized search interface
• It makes all content available
• Over 3 million users last tax season
61. # using Solr is easy!
require 'solr'
c = Solr::Connection.new("http://localhost:8090/solr/posts")
c.search("how do i input 1099",
  :filter_queries => "post_status: #{Post::ANSWERED}")
68. Web-site “full” search
• Primary search interface for contributors
and employees
• More real estate, more facets, more
suggestions ...
• Faceted search empowers development
teams to narrow on issues
• 200+ TurboTax issues discovered last tax
season
71. # using Solr is easy!
require 'solr'
c = Solr::Connection.new("http://localhost:8090/solr/posts")
c.search("bug",
  :filter_queries => "post_status: #{Post::OPEN}")
78. Instant answer
• Present similar answered question
• Search with the terms of the new question
• Narrow the focus to the subject
• Show snippet of a recommended answer
• Accidental A/B test
87. Instant question
• Present similar unanswered questions
• Answer reuse
• Search with the terms of the answered
question
• Narrow the focus to the subject
• We also use a date filter
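The bullets above amount to one query plus a few filters. A minimal sketch of how those filters might be assembled follows. The field names (`post_status`, `subject_id`, `created_at`), the status value, and the 30-day window are assumptions; the talk does not show the real schema.

```ruby
# Hypothetical sketch: assemble Solr parameters for "similar unanswered
# questions". Every field name and default here is an assumption.
def similar_unanswered_params(answered_question_text, subject_id, max_age_days = 30)
  {
    :q  => answered_question_text,           # search with the question's own terms
    :fq => [
      'post_status:unanswered',              # only questions still open
      "subject_id:#{subject_id}",            # narrow the focus to the subject
      "created_at:[NOW-#{max_age_days}DAYS TO NOW]"  # date filter
    ],
    :rows => 5
  }
end

params = similar_unanswered_params('where do i enter a 1099', 42)
params[:fq].each { |filter| puts filter }
```

Keeping the narrowing conditions in filter queries rather than in `q` leaves relevance scoring to the question terms alone. Real question text would also need Solr's query-syntax special characters escaped before being sent.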
97. Answer bot
• We continue to search for you
– The day after you ask
• Send an email
• Runs for 7 days
• We only send another email if the results
have changed
• From our explicit feedback
– 39% answered question
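The re-search behavior described above can be sketched as a small class. This is an illustration only: the class name `AnswerBot`, its method names, and the decision to skip empty result sets are assumptions, and the search and mailer are passed in as callables so the logic runs without Solr or a mail server.

```ruby
# Hypothetical sketch of the daily re-search loop.
class AnswerBot
  MAX_DAYS = 7

  def initialize(search, mailer)
    @search = search      # ->(question) { array of matching post ids }
    @mailer = mailer      # ->(question, post_ids) { send the email }
    @last_results = {}    # question => result set most recently emailed
  end

  # Called once per day per question; day 1 is the day after the question
  # was asked. Returns true when an email was sent.
  def run(question, day)
    return false if day < 1 || day > MAX_DAYS
    results = @search.call(question)
    # Assumption: nothing is sent for empty results or unchanged results.
    return false if results.empty? || results == @last_results[question]
    @last_results[question] = results
    @mailer.call(question, results)
    true
  end
end

sent = []
fake_results = [%w[a b], %w[a b], %w[a b c]]
bot = AnswerBot.new(->(_question) { fake_results.shift || [] },
                    ->(_question, ids) { sent << ids })
(1..3).each { |day| bot.run('how do i input 1099', day) }
puts sent.length   # 2 emails: day 1 and day 3 (day 2 results were unchanged)
```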
104. Advertising
• We use our user generated content in
advertising
• Has 300% higher click through rate than
static banner ads
• Ads displayed throughout the tax season
on many ad networks
• Content selection is automated and
continuous
107. <?xml version="1.0" encoding="UTF-8"?>
<lc_trending end_date="2011-05-21" include_popular="true" type="queries" duration="day">
<topic>
<rank>1</rank>
<text>Ptp</text>
<post>
<post_id>aBHMBWxzar4lKMacfArRo0</post_id>
<subject>Final K-1 Disposition of PTP Units</subject>
<detail>I bought units in a PTP in five separate transactions in 2008; I sold all my
units in five separate transactions in 2010. TT does not allow me to report all 5
transactions while stepping through the K-1 form -- these transactions are reported on
Schedule D, but also need to be on Form 4797, Part II, Box 10. I can't seem to make the
linkage work. I would appreciate some guidance on how to make this happen.</detail>
<response>OK, several steps needed for your situation:
1) on the K-1 on the screen entitled Describe the Partnership Disposal, choose "Disposition
was not via a sale"
2) Then search for the topic "sale of business property" - you will be taked to a topic
entitled "Any Other Property Sales?" - select the first option. Ove rthe next few screens
here you will have the opportunityut to enter the sale amounts associated witht he Form
4797.
3) then choose the topic on the income landing table for "Stocke, Mutual Funds, Bonds,
other - here you will enter the rest of the sale, that portion attributable to capital
gains.
Hope this helps you,
</response>
<viewsCount>60</viewsCount>
<answersCount>2</answersCount>
<asker>Xuxan</asker>
<display_post_url>https://ttlc.intuit.com/post/show_full/aBHMBWxzar4lKMacfArRo0?rmode=ad</display_post_url>
</post>
117. Search everywhere
• Search first, ask second
– Used to be ask first, search later or never!
• Auto complete everywhere too
– 64 bit Linux, 10 (8 core) slaves, 300 req/s
• Search requests
– 900% increase
• Questions asked
– 50% decrease…is that good?
• Increased consumption
– 38% users, 43% content…very good!
131. regular expressions (many)
# \1 and \2 in the replacements refer to the captured groups
if text =~ / any/
  text.gsub!(/ any where /, ' anywhere ')
  text.gsub!(/ any(body| body| one) /, ' anyone ')
  text.gsub!(/ any( thing| things|things) /, ' anything ')
  text.gsub!(/ any(one|thing|where) else /, ' any\1 ')
end
if text =~ / don /
  text.gsub!(/ don i /, ' do not i ')
  text.gsub!(/ don (have|know|see|want) /, ' do not \1 ')
  text.gsub!(/ (are|be|have|is|was|were) don /, ' \1 done ')
  text.gsub!(/ don (not|nt|t) /, ' do not ')
end
text.gsub!(/ (do|can) (ai|ii) /, ' \1 i ')
text.gsub!(/ d (oyou|you) /, ' do you ')
text.gsub!(/ (1|ai|ii|my) (did|do|had|have|was) /, ' i \2 ')
text.gsub!(/ crap{1,10} /, ' crap ')
text.gsub!(/ gr{1,} /, ' ')
132. Spell Checker
Stemmer (Porter)
Word Collocation
Stop Phrase Correction
Stop Word Removal
Synonyms Substitution
Tax Domain Correction
Phrase Encoding
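The stages listed above suggest a pipeline in which each step takes text and returns cleaned text. The sketch below shows only that chaining idea with three toy stages. The word lists are invented for the example, and the real stages (spell checking, Porter stemming, collocation, tax domain correction, phrase encoding) are not reproduced here.

```ruby
# Toy data, made up for this example only.
stop_words = %w[a the please help me]
synonyms   = { 'input' => 'enter', 'put' => 'enter' }

# Each stage is a callable from String to String, applied in order.
stages = [
  ->(t) { t.downcase.gsub(/[^a-z0-9\s]/, ' ') },                       # normalize case and punctuation
  ->(t) { t.split.reject { |w| stop_words.include?(w) }.join(' ') },   # stop word removal
  ->(t) { t.split.map { |w| synonyms.fetch(w, w) }.join(' ') }         # synonym substitution
]

normalize = ->(text) { stages.reduce(text) { |acc, stage| stage.call(acc) } }

puts normalize.call('Please help me input a 1099!')
```

Keeping every stage to the same String-in, String-out shape means stages can be reordered or added without touching the others.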
133. # NLP is not easy!
# this class wraps our NLP
sf = SemanticFilter.new
# does it work?
sf.act_on_post("HwO do iput 10 99 i don,t know what to do need help help me.")
# => [" wheretoent 1099 "]
sf.act_on_post("Where do I enter a 1099?")
# => [" wheretoent 1099 "]
136. NLP
• Search is not enough…unfortunately
• Our domain is noisy…ugly at times
• How it works…
• It works well, but it’s not perfect
• Not just for search…
142. Recommendations
• Deliver unanswered questions to
contributors
• Too much content to scan manually
• Based on past answering behavior
• Recommend a question to multiple
contributors
• Uses Mahout machine learning library
143. Diagram: Answered → NLP → User vectors; Unanswered → NLP → Post vectors; both → Mahout → Heuristics
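The matching of unanswered posts to contributors can be illustrated with plain vector similarity. This sketch does not use Mahout and is not the talk's implementation: cosine similarity, the helper names, and the sample vectors are all assumptions chosen to show how a post vector might be compared against user vectors built from past answers.

```ruby
# Cosine similarity between two sparse term-weight vectors (Hash: term => weight).
def cosine(a, b)
  dot   = a.sum { |term, weight| weight * b.fetch(term, 0.0) }
  norm  = ->(v) { Math.sqrt(v.values.sum { |w| w * w }) }
  denom = norm.call(a) * norm.call(b)
  denom.zero? ? 0.0 : dot / denom
end

# Names of the top_n contributors whose vectors best match the post.
# Recommending to several contributors mirrors the slide's last bullet.
def recommend_to(post_vector, user_vectors, top_n = 2)
  user_vectors
    .map { |user, vector| [user, cosine(post_vector, vector)] }
    .select { |_, score| score > 0 }
    .sort_by { |_, score| -score }
    .first(top_n)
    .map(&:first)
end

# Invented sample data: weights stand in for past answering behavior.
users = {
  'alice' => { '1099' => 3.0, 'k1' => 1.0 },
  'bob'   => { 'deduct' => 2.0, 'mortgage' => 2.0 },
  'carol' => { '1099' => 1.0, 'deduct' => 1.0 }
}
post = { '1099' => 1.0, 'wheretoent' => 1.0 }
p recommend_to(post, users)
```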
151. Next Steps
• We’re going to rewrite it! … most of it ;)
• Real-time indexing
• Question vs. Query
• Social feedback
– Page ranking
• Social dictionaries
– Content classification
• Beer?!