Pinot
Kishore Gopalakrishna
Tuesday, August 18, 15
Agenda
• Pinot @ LinkedIn - Current
• Pinot - Architecture
• Pinot Operations
• Pinot @ LinkedIn - Future
Tuesday, August ...
WVMP
Tuesday, August 18, 15
Slice and Dice Metrics
Tuesday, August 18, 15
Pinot @ LinkedIn
Customers Members Internal tools
Tuesday, August 18, 15
• 100B documents
• 1B documents ingested per day
• 100M queries per day
• 10’s of ms latency
• 30 tables in prod, 250 * 3 ...
Key features
SQL-like
interface
Columnar
storage and
indexing
Real-time
data load
Tuesday, August 18, 15
(S)QL: Filters and Aggs
SELECT count(*)
FROM companyFollowHistoricalEvents
WHERE entityId = 121011 AND
'day' >= 15949 AND ...
(S)QL: Group By
SELECT count(*)
FROM companyFollowHistoricalEvents
WHERE entityId = 121011 AND
'day' >= 15949 AND 'day' <=...
(S)QL: ORDER BY and LIMIT
SELECT *
FROM companyFollowHistoricalEvents
WHERE entityId = 121011 AND
entityId = 1000 AND
acti...
Whats not supported
• JOIN: unpredictable performance
• NOT A SOURCE OF TRUTH
• Mutation
Tuesday, August 18, 15
Pinot
• Data flow
• Query Execution
• How to use/operate
• Pinot @ LinkedIn - Future
Tuesday, August 18, 15
Broker Helix
Real
time
Historical
Kafka Hadoop
Pinot
Architecture
Queries
Raw
Data
Tuesday, August 18, 15
Pinot
• Pinot segments
Tuesday, August 18, 15
Pinot Segment layout: Columnar storage
Tuesday, August 18, 15
Pinot Segment layout: Sorted Forward Index
Tuesday, August 18, 15
Pinot Segment layout: Other techniques
• Indexes: Inverted index, Bitmap, RoaringBitmap
• Compression: Dictionary Encoding...
Data aware
pre-computation
Star tree Index
Tuesday, August 18, 15
Pinot
• Query Execution
Tuesday, August 18, 15
Pinot Query Execution: Distributed
Servers
S1
S3
S2
S1
S3
S2
Helix
Brokers
Tuesday, August 18, 15
Pinot Query Execution: Distributed
Servers
1.Query
S1
S3
S2
S1
S3
S2
Helix
Brokers
Tuesday, August 18, 15
Pinot Query Execution: Distributed
Servers
1.Query
S1
S3
S2
S1
S3
S2
Helix
2. Fetch routing table from HelixBrokers
Tuesda...
Pinot Query Execution: Distributed
Servers
1.Query
S1
S3
S2
S1
S3
S2
Helix
2. Fetch routing table from HelixBrokers
3. Sca...
Pinot Query Execution: Distributed
Servers
1.Query
S1
S3
S2
S1
S3
S2
Helix
2. Fetch routing table from HelixBrokers
3. Sca...
Pinot Query Execution: Distributed
Servers
1.Query
S1
S3
S2
S1
S3
S2
Helix
2. Fetch routing table from HelixBrokers
3. Sca...
Pinot Query Execution: Distributed
Servers
1.Query
S1
S3
S2
S1
S3
S2
Helix
2. Fetch routing table from HelixBrokers
3. Sca...
Pinot Query Execution: Single Node Architecture
EXECUTION ENGINE
INVERTED
INDEX
BITMAP
INDEX
COLUMN FORMAT
PLANNER
Tuesday...
Pinot Query Execution: Single Node Architecture
SELECT
campaignId,
sum(clicks)
FROM Table A
WHERE
accountId = 121011
AND
'...
Pinot
• Operations
Tuesday, August 18, 15
Cluster Management: Deployment
Helix
Brokers
Servers
• Brokers and Servers register themselves in Helix
• All servers star...
On boarding new use case
Helix
Brokers
Servers
XLNT XLNT
XLNT
Create Table
command
Controller
XLNT
XLNTTag
Servers
TableNa...
Segment Assignment
Servers
S3
S2
S1
Upload Segment S2
S1
S3
S2
S1
S3
Helix
Brokers
Copies
TableName
2
XLNT_T1
Controller
T...
• AUTO recovery mode: Automatically redistribute
segments on failure/addition of new nodes
• Custom mode: Run in degraded ...
Pinot vs Druid
Druid Pinot
Architecture
Realtime + Offline,
Realtime only
Realtime + Offline
Realtime only -> consistency is...
• Documentation & tooling
• In progress - consistency among real time replicas.
• Improve cost to serve - leverage SSD, pa...
Thank You
30
Tuesday, August 18, 15
Prochain SlideShare
Chargement dans…5
×

Pinot: Realtime Distributed OLAP datastore

300 881 vues

Publié le

Pinot is a realtime distributed OLAP datastore, which is used at LinkedIn to deliver scalable real time analytics with low latency. It can ingest data from offline data sources (such as Hadoop and flat files) as well as online sources (such as Kafka). Pinot is designed to scale horizontally.

Publié dans : Technologie
70 commentaires
672 j’aime
Statistiques
Remarques
  • Hello Mr. and Mrs. I am a particular financial. I make loans between individuals to people in difficulties financial and able to repay the amount borrowed. My loan terms are simple and well-defined.Here are the areas in which I can help you:Financial home Loan investment Loan Debt consolidation line of credit Second mortgage Redemption of credit personal Loan You are stuck, prohibited bank and you do not have the favor of banks or better you have a project and need financing, a bad file of credit or need money to pay bills, funds to invest on business. I offer loan from 2000€ to 50,000,000€, with a guarantee to be well studied. My interest rate on the loan is 3%. If you have projects to do, or if you are in financial difficulties you can contact me by mail as quickly as possible, stating the amount you want to borrow.For more information, please contact to my e-mail address: xaviiermusca@gmail.com
       Répondre 
    Voulez-vous vraiment ?  Oui  Non
    Votre message apparaîtra ici
  • Hello, I am Brenda waters After being in relationship with Jeff for years, he broke up with me, I did everything possible to bring him back but all was in vain, I wanted him back so much because of the love I have for him, I begged him with everything, I made promises but he refused. I explained my problem to my friend and she suggested that I should rather contact a spell caster that could help me cast a spell to bring him back but I am the type that never believed in spell, I had no choice than to try it, I mailed the spell caster, and he told me there was no problem that everything will be okay before three days, that my ex will return to me before three days, he cast the spell and surprisingly in the second day, it was around 4pm. My ex called me, I was so surprised, I answered the call and all he said was that he was so sorry for everything that happened that he wanted me to return to him, that he loves me so much. I was so happy and went to him that was how we started living together happily again. Since then, I have made promise that anybody I know that have a relationship problem, I would be of help to such person by referring him or her to the only real and powerful spell caster who helped me with my own problem he used his Africa Black Power. His email: greatosemudiagaspelltemple@gmail.com you can email him if you need his assistance in your relationship or any other Cases. 1) Love Spells 2) Lost Love Spells 3) Divorce Spells 4) Marriage Spells 5) Binding Spells 6) Breakup Spells 7) Banish a past Lover. 8.) You want to be promoted in your office 9) want to satisfy your lover Contact this great man if you are having any problem for a lasting solution via: (greatosemudiagaspelltemple@gmail.com) Africa Black Power
       Répondre 
    Voulez-vous vraiment ?  Oui  Non
    Votre message apparaîtra ici
  • Advice for heart broken and how to get your Ex lover back and he will apologize for his mistake and the pain he caused you!..I highly recommends this real spell caster Dr.Unity to anyone in need of help! I am so excited i got my husband back after he left me for another woman. After 12 years of marriage, me and my husband has been into one quarrel or the other until he finally left me and moved to California to be with another woman. I felt my life was over and my kids thought they would never see their father again. i tried to be strong just for the kids but i could not control the pains that torments my heart, my heart was filled with sorrows and pains because i was really in love with my husband. Every day and night i think of him and always wish he would come back to me, I was really upset and i needed help, so i searched for help online and I came across a website that suggested that Dr Unity can help get ex back fast. So, I felt I should give him a try. I contacted him and he told me what to do and i did it then he did a (Love spell) for me. 28 hours later, my husband really called me and told me that he miss me and the kids so much, So Amazing!! So that was how he came back that same day,with lots of love and joy,and he apologized for his mistake,and for the pain he caused me and the kids. Then from that day,our Marriage was now stronger than how it were before, All thanks to Dr Unity. he is so powerful and i decided to share my story on the internet that Dr.Unity real and powerful spell caster who i will always pray to live long to help his children in the time of trouble, if you are here and you need your Ex back or your husband moved to another woman, do not cry anymore, contact this powerful spell caster now. Here’s his contact: Email him at: Unityspelltemple@gmail.com , you can also call him or add him on Whats-app: +2348071622464 , you can also visit his website:https://unitysspelltemple.blogspot.com .
       Répondre 
    Voulez-vous vraiment ?  Oui  Non
    Votre message apparaîtra ici
  • Get to know about how to check your cibiil score for free - http://andhrapradesh.locanto.net/ID_1733907216/How-to-check-cibil-score-online-for-free.html
       Répondre 
    Voulez-vous vraiment ?  Oui  Non
    Votre message apparaîtra ici
  • Cool!
       Répondre 
    Voulez-vous vraiment ?  Oui  Non
    Votre message apparaîtra ici
Aucun téléchargement
Vues
Nombre de vues
300 881
Sur SlideShare
0
Issues des intégrations
0
Intégrations
2 023
Actions
Partages
0
Téléchargements
1 354
Commentaires
70
J’aime
672
Intégrations 0
Aucune incorporation

Aucune remarque pour cette diapositive

Pinot: Realtime Distributed OLAP datastore

  1. Pinot Kishore Gopalakrishna Tuesday, August 18, 15
  2. Agenda • Pinot @ LinkedIn - Current • Pinot - Architecture • Pinot Operations • Pinot @ LinkedIn - Future Tuesday, August 18, 15
  3. WVMP Tuesday, August 18, 15
  4. Slice and Dice Metrics Tuesday, August 18, 15
  5. Pinot @ LinkedIn Customers Members Internal tools Tuesday, August 18, 15
  6. • 100B documents • 1B documents ingested per day • 100M queries per day • 10’s of ms latency • 30 tables in prod, 250 * 3 std app nodes Pinot @ LinkedIn Tuesday, August 18, 15
  7. Key features SQL-like interface Columnar storage and indexing Real-time data load Tuesday, August 18, 15
  8. (S)QL: Filters and Aggs SELECT count(*) FROM companyFollowHistoricalEvents WHERE entityId = 121011 AND 'day' >= 15949 AND 'day' <= 15963 AND paid = 'y’ AND action = 'stop' Tuesday, August 18, 15
  9. (S)QL: Group By SELECT count(*) FROM companyFollowHistoricalEvents WHERE entityId = 121011 AND 'day' >= 15949 AND 'day' <= 15963 AND paid = 'y’ GROUP BY action Tuesday, August 18, 15
  10. (S)QL: ORDER BY and LIMIT SELECT * FROM companyFollowHistoricalEvents WHERE entityId = 121011 AND entityId = 1000 AND action = 'start' ORDER BY creationTime DESC LIMIT 1 Tuesday, August 18, 15
  11. Whats not supported • JOIN: unpredictable performance • NOT A SOURCE OF TRUTH • Mutation Tuesday, August 18, 15
  12. Pinot • Data flow • Query Execution • How to use/operate • Pinot @ LinkedIn - Future Tuesday, August 18, 15
  13. Broker Helix Real time Historical Kafka Hadoop Pinot Architecture Queries Raw Data Tuesday, August 18, 15
  14. Pinot • Pinot segments Tuesday, August 18, 15
  15. Pinot Segment layout: Columnar storage Tuesday, August 18, 15
  16. Pinot Segment layout: Sorted Forward Index Tuesday, August 18, 15
  17. Pinot Segment layout: Other techniques • Indexes: Inverted index, Bitmap, RoaringBitmap • Compression: Dictionary Encoding, P4Delta • Multi Valued columns, skip lists, • Hyperloglog for unique • T-digest for Percentile, Quantile Tuesday, August 18, 15
  18. Data aware pre-computation Star tree Index Tuesday, August 18, 15
  19. Pinot • Query Execution Tuesday, August 18, 15
  20. Pinot Query Execution: Distributed Servers S1 S3 S2 S1 S3 S2 Helix Brokers Tuesday, August 18, 15
  21. Pinot Query Execution: Distributed Servers 1.Query S1 S3 S2 S1 S3 S2 Helix Brokers Tuesday, August 18, 15
  22. Pinot Query Execution: Distributed Servers 1.Query S1 S3 S2 S1 S3 S2 Helix 2. Fetch routing table from HelixBrokers Tuesday, August 18, 15
  23. Pinot Query Execution: Distributed Servers 1.Query S1 S3 S2 S1 S3 S2 Helix 2. Fetch routing table from HelixBrokers 3. Scatter Request Tuesday, August 18, 15
  24. Pinot Query Execution: Distributed Servers 1.Query S1 S3 S2 S1 S3 S2 Helix 2. Fetch routing table from HelixBrokers 3. Scatter Request 4. Process Request & send response Tuesday, August 18, 15
  25. Pinot Query Execution: Distributed Servers 1.Query S1 S3 S2 S1 S3 S2 Helix 2. Fetch routing table from HelixBrokers 3. Scatter Request 4. Process Request & send response 5. Gather Response Tuesday, August 18, 15
  26. Pinot Query Execution: Distributed Servers 1.Query S1 S3 S2 S1 S3 S2 Helix 2. Fetch routing table from HelixBrokers 3. Scatter Request 4. Process Request & send response 5. Gather Response 6. Return Response Tuesday, August 18, 15
  27. Pinot Query Execution: Single Node Architecture EXECUTION ENGINE INVERTED INDEX BITMAP INDEX COLUMN FORMAT PLANNER Tuesday, August 18, 15
  28. Pinot Query Execution: Single Node Architecture SELECT campaignId, sum(clicks) FROM Table A WHERE accountId = 121011 AND 'day' >= 15949 GROUP BY campaignId account Id daycampaign Id click Filter Operator Projection Operator Aggregation Group by Operator Combine Operator Pinot Segments Data sources Matching doc ids campaignId,Click tuple Tuesday, August 18, 15
  29. Pinot • Operations Tuesday, August 18, 15
  30. Cluster Management: Deployment Helix Brokers Servers • Brokers and Servers register themselves in Helix • All servers start with no use case specific configuration Controller Tuesday, August 18, 15
  31. On boarding new use case Helix Brokers Servers XLNT XLNT XLNT Create Table command Controller XLNT XLNTTag Servers TableName Brokers 3 XLNT_T1 1 Tuesday, August 18, 15
  32. Segment Assignment Servers S3 S2 S1 Upload Segment S2 S1 S3 S2 S1 S3 Helix Brokers Copies TableName 2 XLNT_T1 Controller Tuesday, August 18, 15
  33. • AUTO recovery mode: Automatically redistribute segments on failure/addition of new nodes • Custom mode: Run in degraded mode until node is restarted/replaced. Pinot - Fault tolerance/Elasticity Tuesday, August 18, 15
  34. Pinot vs Druid Druid Pinot Architecture Realtime + Offline, Realtime only Realtime + Offline Realtime only -> consistency is hard and schema evolution/Bootstrap is hard Inverted Index Always On all columns, Fixed Configurable on per column basis Allows trade off between scanning v/s inverted index + scanning. More data can be fit in given memory size Data organization N/A Sorts data Organizing data provides speed/better compression and removes the need for inverted index Smart pre- materialization N/A star-tree Allows trade off between latency and space Query Execution Layer Fixed Plan Split into Planning and execution Smart choices can be made at runtime based on metadata/query. Tuesday, August 18, 15
  35. • Documentation & tooling • In progress - consistency among real time replicas. • Improve cost to serve - leverage SSD, partial pre materialization • ThirdEye - Business Metrics Monitoring Pinot - Future Tuesday, August 18, 15
  36. Thank You 30 Tuesday, August 18, 15

×