Thoth: Real-time Solr Monitoring & Analysis at Trulia


Downtown SF Lucene/Solr Meetup - September 17. "Thoth: Real-time Solr Monitoring & Analysis at Trulia", presented by Damiano Braga and Praneet Mhatre, Trulia.


  1. Overview: Thoth API & Dashboard, Thoth Predictor, Thoth Core, Thoth Monitor, What Is Thoth. Thoth: Real-time Solr Monitoring & Analysis @ Trulia
  2. Thoth: Real-time Solr Monitoring & Analysis @ Trulia
  3. Damiano Braga: Sr. Software Engineer on Trulia's search team. "I also help manage the search infrastructure and create internal tools to help the scaling process." Praneet Mhatre: Data Mining Engineer on Trulia's algorithms team. "I work on market trends and stats, comparable homes, home value estimates etc."
  4. What Is Thoth (section overview; its content is repeated in full on slides 5 to 9)
  5. The Beginning - Innovation project at Trulia - To help us understand our search infrastructure without touching logs - To help troubleshoot search performance issues
  6. Aspirations - Not just a single tool but a modular system - A set of tools that can help gather info, monitor, and understand a search infrastructure - Open source project
  7. Definitions - Search infrastructure - Solr request = Solr query + additional info (e.g. QTime, Hits)
  8. Problem: Understanding a Search Infrastructure. Let's just use logs! - Good info, sometimes partial - Decentralized data (at least 1 log per search server) - Log rotation - A lot of logs: how to quickly search what we need? First attempt: crunch data with Hadoop
  9. Thoth: Use Solr to Understand Solr - Solr is amazing. We all know that - We can search on it - We have some handy features for free: facets, stats etc. - It's scalable
  10. Basics & Architecture (section overview; diagram text not recoverable)
  11. Basic Concepts: 1 Solr request = 1 Thoth Solr document - Server info: hostname, port number, core name, pool name - Query info: timestamp, actual query, QTime, hits, exception?
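
As a minimal sketch of the "1 Solr request = 1 Thoth Solr document" idea on the Basic Concepts slide, the record below groups the server info and query info fields the slide lists. The class name, field names, types and sample values are assumptions for illustration; the slides do not give Thoth's actual schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ThothDocument:
    """One intercepted Solr request (hypothetical field names)."""
    # Server info
    hostname: str
    port: int
    core: str
    pool: str
    # Query info
    timestamp: str                   # ISO 8601 string
    query: str                       # the actual query string
    qtime_ms: int                    # Solr QTime, in milliseconds
    hits: int
    exception: Optional[str] = None  # None when the request succeeded


# Made-up sample values, only to show the shape of one document.
doc = ThothDocument("search01", 8983, "listings", "poolA",
                    "2014-09-16T18:00:02Z", "q=*:*&start=0", 12, 95)
print(asdict(doc))
```

Keeping server and query fields in one flat record is what lets the same index answer both per-server and per-pool questions.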
  12. Basic Concepts - Data collection should be smooth. No slowing down of traffic - We care about real time data - We care about historical data - Dataset is fast growing
  13. [Architecture diagram labels: Search servers; bus; Thoth core indexing; Thoth Index; Thoth API; Thoth dashboard; Thoth predictor; Thoth Monitor]
  14. Thoth Core (section overview: The Collection, Data, Thoth Index; repeated in full on slides 15 to 19)
  15. The Collection - Interceptor on each search server - We use a SolrComponent attached to a request handler - We can collect requests and exceptions without tailing logs - We use a queue system (e.g. ActiveMQ) to facilitate and temporarily store messages - Why not a log collector? - Importance of performance & stability - Solr2ActiveMQ - Each search server has a manifest in solrconfig.xml
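
The collection slide stresses performance and stability: intercepting requests must not slow search traffic. A minimal sketch of that idea follows, assuming an in-process bounded queue as a stand-in for the real queue system (ActiveMQ in the talk). The function names and the drop-when-full policy are assumptions, not Thoth's documented behaviour.

```python
import queue

# Bounded buffer standing in for the message queue (assumption).
messages = queue.Queue(maxsize=2)
dropped = 0


def intercept(request_info: dict) -> bool:
    """Hand a request record to the queue without ever blocking the caller."""
    global dropped
    try:
        messages.put_nowait(request_info)
        return True
    except queue.Full:
        dropped += 1  # in this sketch, losing a sample beats slowing a search
        return False


def drain() -> list:
    """Consumer side: pull everything currently queued (the indexer's job)."""
    batch = []
    while True:
        try:
            batch.append(messages.get_nowait())
        except queue.Empty:
            return batch


# Three requests arrive while the queue only has room for two.
for i in range(3):
    intercept({"query": f"q=test{i}", "qtime_ms": 10 + i})

batch = drain()
print(len(batch), "queued,", dropped, "dropped")
```

Decoupling the interceptor from indexing this way means a slow or unavailable indexer costs monitoring samples rather than search latency.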
  16. [Configuration snippet for solr2activemq; text not recoverable]
  17. [Diagram labels: Solr search server; request; Queue System; Thoth indexing; Thoth document; Thoth index]
  18. - We need granular information for real-time data - Less granularity for historical data - Too much data = slow search, space problem - Shrinking feature: every X minutes, information extraction and real time core cleanup - Data newer than last shrinking time: real time core - Data older than last shrinking time: shrank core [Diagram: Thoth index]
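
The shrinking step described above can be sketched as follows: documents newer than a cutoff stay granular, older ones are summarised per host and time bucket. The function name, the 15-minute bucket size, the dictionary keys and the chosen aggregates are all assumptions; the slide only says that information is extracted and the real time core is cleaned up every X minutes.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def shrink(docs, cutoff, bucket_minutes=15):
    """Split docs at `cutoff`; aggregate the older ones into time buckets.

    Returns (realtime_docs, shrunk_docs).
    """
    realtime, buckets = [], defaultdict(list)
    for d in docs:
        ts = d["timestamp"]
        if ts >= cutoff:
            realtime.append(d)  # stays granular in the real time core
        else:
            # Round the timestamp down to the start of its bucket.
            start = ts - timedelta(minutes=ts.minute % bucket_minutes,
                                   seconds=ts.second,
                                   microseconds=ts.microsecond)
            buckets[(d["hostname"], start)].append(d)
    shrunk = [
        {"hostname": host,
         "bucket_start": start,
         "nqueries": len(group),
         "avg_qtime_ms": sum(g["qtime_ms"] for g in group) / len(group),
         "zero_hits": sum(1 for g in group if g["hits"] == 0)}
        for (host, start), group in sorted(buckets.items())
    ]
    return realtime, shrunk


t0 = datetime(2014, 9, 16, 18, 0, 0)
docs = [
    {"hostname": "a", "timestamp": t0 + timedelta(minutes=1), "qtime_ms": 10, "hits": 5},
    {"hostname": "a", "timestamp": t0 + timedelta(minutes=2), "qtime_ms": 30, "hits": 0},
    {"hostname": "a", "timestamp": t0 + timedelta(minutes=40), "qtime_ms": 20, "hits": 3},
]
realtime, shrunk = shrink(docs, cutoff=t0 + timedelta(minutes=30))
print(len(realtime), shrunk)
```

Here the two older requests collapse into one summary record while the most recent request keeps its full detail.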
  19. Thoth Index - Solr 4.7 - Soft commit on real time core - maxTime set to 1s - Auto commit set to 15s - Update chain set to enforce UUID as PKID - SolrJ client
  20. Thoth API & Dashboard (section overview; repeated in full on slides 21 and 22)
  21. Thoth API - Provide abstraction for Thoth index and Thoth data - Read-only REST-like API - JSON response - Node.js. Example: curl thoth:3001/api/server/foo/core/bar/port/portbar/start/NOW-1DAY/end/NOW/count/nqueries {"numFound":95,"values":[{"timestamp":"2014-09-16T18:00:02Z","value":453 37}, {"timestamp":"2014-09-16T18:15:02Z","value":778 25}, {"timestamp":"2014-09-16T18:30:02Z","value":109 523}, {"timestamp":"2014-09-16T18:45:02Z","value":112 279}, {"timestamp":"2014-09-16T19:00:02Z","value":115 334}
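
A small client-side sketch of the API slide: it assembles a URL following the path layout of the curl example and reads a response of the same shape (`numFound` plus a list of `timestamp`/`value` points). The helper name, the `http://` scheme and the sample payload values are assumptions; only the path segments and the JSON keys come from the slide.

```python
import json


def build_count_url(base, server, core, port, start, end, metric):
    """Assemble a path following the layout of the slide's curl example."""
    parts = ["api", "server", server, "core", core, "port", str(port),
             "start", start, "end", end, "count", metric]
    return base.rstrip("/") + "/" + "/".join(parts)


url = build_count_url("http://thoth:3001", "foo", "bar", "portbar",
                      "NOW-1DAY", "NOW", "nqueries")
print(url)

# Illustrative payload in the same shape as the slide's response;
# the numbers here are made up.
sample = ('{"numFound": 2, "values": ['
          '{"timestamp": "2014-09-16T18:00:02Z", "value": 10},'
          '{"timestamp": "2014-09-16T18:15:02Z", "value": 20}]}')
payload = json.loads(sample)
total = sum(point["value"] for point in payload["values"])
print(payload["numFound"], total)
```

No request is sent here; the URL would be passed to any HTTP client in real use.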
  22. Thoth Dashboard - Visual insight on Thoth data - Useful graphs by server or by pool - Handy list of slow queries and exceptions - Selecting views based on time - Sharable URLs (OPS team, QA team)
  23. [Dashboard screenshot: From/To time range, Server, Port and Core selectors; panels "Avg number of queries" and "Query count"]
  24. [Dashboard screenshot: panels "Zero Hits count", "Avg query time - sec", "Distribution of query times (sec)", "Avg number of queries 'on deck'"]
  25. [Dashboard screenshot: From/To time range, Server, Port and Core selectors with a chart and a table of timestamps; text not recoverable]
  26. [Dashboard screenshot: slow queries list with query times of 11.082 s, 10.324 s, 10.168 s, 9.985 s, 9.981 s, 9.811 s, 9.692 s, 9.654 s and 9.504 s]
  27. [Dashboard screenshot; text not recoverable]
  28. What Can We Do With All This Data? (section overview; repeated in full on slides 29 to 33)
  29. What Can We Do With All This Data? - Rich source of information - Can we turn it into knowledge? - How about machine learning?
  30. Learning on Collected Data - Query execution time prediction - Analysis of zero hit queries - Analysis of query exceptions - Server sizing
  31. Query Time Prediction - Look at query attributes: query text, start parameter, facets, rangeQueries, geoSpatial searches etc. - Train a supervised learning model - Use learned model to predict if a query will be slow vs. fast
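
To make "look at query attributes" concrete, here is a hedged sketch that turns a Solr query string into a small feature dictionary covering the attribute families the slide names (query text, start parameter, facets, range queries, geospatial searches). The function name, the exact features and the sample query are assumptions; the talk does not list the features Thoth actually uses. `q`, `start`, `fq`, `facet.field`, `{!geofilt}` and `{!bbox}` are standard Solr request syntax.

```python
from urllib.parse import parse_qs


def extract_features(query_string: str) -> dict:
    """Derive simple numeric/boolean features from a Solr query string."""
    params = parse_qs(query_string)
    q = params.get("q", [""])[0]
    filters = params.get("fq", [])
    return {
        "n_terms": len(q.split()),                         # query text size
        "start": int(params.get("start", ["0"])[0]),       # paging depth
        "n_facet_fields": len(params.get("facet.field", [])),
        "has_range": any(" TO " in f for f in [q] + filters),
        "has_geo": any("{!geofilt" in f or "{!bbox" in f for f in filters),
    }


features = extract_features(
    "q=condo+pool&start=20&facet=true&facet.field=city&facet.field=zip"
    "&fq=price:[100000+TO+500000]"
    "&fq={!geofilt+sfield=loc+pt=37.77,-122.41+d=5}"
)
print(features)
```

A dictionary like this, paired with a slow/fast label derived from the recorded QTime, is the kind of row a supervised model could be trained on.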
  32. Query Time Prediction - Imbalanced dataset - Frequency of model training - Type of model
  33. GBM Based Predictor - 500 gradient boosted trees - Experimental results: AUC: 0.98103, Accuracy: 0.95360. Confusion matrix (rows: actual class, columns: predicted class):
      Actual    Pred. 0   Pred. 1   Error
      0         76,056    2,518     0.03205 = 2,518 / 78,574
      1         1,912     15,007    0.11301 = 1,912 / 16,919
      Totals    77,968    17,525    0.04639 = 4,430 / 95,493
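
The error rates on the predictor slide follow directly from the four cell counts of its confusion matrix. The sketch below recomputes the row totals, column totals, per-class error and overall error from those cells alone; variable names are illustrative.

```python
# Cell counts from the slide: rows are the actual class, columns the predicted class.
matrix = [[76_056, 2_518],
          [1_912, 15_007]]

row_totals = [sum(row) for row in matrix]
col_totals = [sum(col) for col in zip(*matrix)]
total = sum(row_totals)

# Per-class error: share of each actual class that was misclassified.
class_errors = [(row_totals[i] - matrix[i][i]) / row_totals[i] for i in range(2)]

# Overall error: all off-diagonal cells over all samples.
overall_error = (total - matrix[0][0] - matrix[1][1]) / total
accuracy = 1 - overall_error

print(row_totals, col_totals, total)
print([round(e, 5) for e in class_errors], round(overall_error, 5))
```

The per-class figures show the imbalance the talk mentions: the smaller class (actual 1) has an error rate several times that of the larger one, even though overall accuracy is about 95%.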
  34. Thoth Monitor - Continuous monitor of health - Mainly using Solr StatsComponent [http://wiki.apache.org/solr/] - Monitor specific server/core using real time and historical data against its past behaviour and/or other members of same pool - Alert through email - Basic alerting through Nagios (paging system) [Diagram labels: Nagios; QTime monitor; # Zero hits queries; You name it; Pool A; server A; Thoth QTime Monitor; ALERT]
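
One way to read "monitor a server against its past behaviour" is a baseline check: compare the recent mean QTime with the historical mean plus a multiple of the historical standard deviation. This is a sketch under that assumption; the function name, the threshold rule, the factor `k` and the sample numbers are illustrative and not taken from Thoth.

```python
from statistics import mean, stdev


def qtime_alert(history_ms, recent_ms, k=3.0):
    """Return True when the recent mean QTime exceeds mean + k * stdev of history."""
    baseline = mean(history_ms)
    spread = stdev(history_ms)
    return mean(recent_ms) > baseline + k * spread


# Made-up QTime samples in milliseconds.
history = [20, 22, 19, 21, 23, 20, 18, 22]
calm = qtime_alert(history, [21, 24, 20])     # close to the usual range
spike = qtime_alert(history, [80, 95, 110])   # far above the baseline
print(calm, spike)
```

The same comparison could use the other members of a pool as the baseline instead of the server's own history, which is the second option the slide mentions.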
  35. Future Direction (section overview; repeated in full on slides 36 and 37)
  36. Future Direction - Predicting query time buckets: regression vs. classification - Exceptions and zero hit query analysis - Sizing and resource allocation
  37. Future Direction - Dashboard additions: real-time - Dashboard integration with SolrCloud - More metrics to monitor with Thoth - More data collection (load, GC)
  38. "Yo dawg, I heard you like Solr, so I put Solr inside Solr." Damiano: dbraga@trulia.com - Praneet: pmhatre@trulia.com - Special thanks: JD Cantrell, Giulio Grillanda, Ying Wang