Getting Actionable, Reactive and Historic insights on large volumes of data
2. Actionable, Reactive, Historical Insights on Large Volumes of Data
Ashish Tadose
Senior Data Architect @ PubMatic
Note: opinions expressed in these slides are the author's and not necessarily those of PubMatic
3. Who am I?
• Senior BigData Architect
• Working on large volume & fast processing infrastructure
• Built and managed data products at petabyte scale at VeriSign & PubMatic
• Apache committer and FOSS lover
5. Reporting categories
Timing is important in advertising
Right message at the right moment wins customers
• Faster real-time reports on a smaller set of critical dimensions & metrics
(availability within 1 min of ad-serving)
• Detailed historical insight reports with higher-cardinality dimensions, for most customers
(availability within a few hours of ad-serving)
• Lazily evaluated reports
(availability within a few hours of ad-serving)
• AdHoc reports
(availability within days of request)
7. Traditional Historical Reporting
Historical reporting dashboards display metrics from any specified time point
[Diagram: multiple Access Points → Kafka Producer → Kafka Consumer → HDFS → MapReduce Jobs → MySQL]
Pipeline stages shown: Collect, Merge, Clean/Dedupe, Transform & Process
Kafka ingestion framework: Camus & Gobblin
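The Merge, Clean/Dedupe and Transform & Process stages of the batch pipeline can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual MapReduce jobs: the record fields (`event_id`, `publisher`, `hour`, `revenue`) and the function names are assumptions made for the example.

```python
from collections import defaultdict

def dedupe(events):
    """Drop repeated events, keeping the first record seen per event_id."""
    seen = set()
    for event in events:
        if event["event_id"] not in seen:
            seen.add(event["event_id"])
            yield event

def aggregate(events):
    """Sum impressions and revenue per (publisher, hour), like a reduce step."""
    totals = defaultdict(lambda: {"impressions": 0, "revenue": 0.0})
    for event in events:
        key = (event["publisher"], event["hour"])
        totals[key]["impressions"] += 1
        totals[key]["revenue"] += event["revenue"]
    return dict(totals)

def run_batch(*batches):
    """Merge batches from several access points, then dedupe and aggregate."""
    merged = (event for batch in batches for event in batch)
    return aggregate(dedupe(merged))
```

Each call to `run_batch` corresponds to one batch run over collected logs; the returned totals are what would be written to the reporting store.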
8. Advanced Historical Reporting
• Reporting needed for all attributes impacting monetization
• Demand Insights report
• Which parameters are driving the greatest demand, such as lat/long?
• Revenue / Campaign Pacing
• Who are the advertisers who are expected to show the biggest growth next
quarter for my "sport" section?
• User-based analytics:
• The eCPM for males reading America's Cup content is 50% above my average
and therefore I should create more articles against this content.
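The eCPM comparison in the last bullet can be made concrete. eCPM is revenue per 1000 impressions; the sketch below computes it for a segment and for the overall inventory and reports the relative lift. The function names and the input numbers are made up purely for illustration.

```python
def ecpm(revenue, impressions):
    """Effective cost per mille: revenue earned per 1000 impressions."""
    if impressions == 0:
        return 0.0
    return revenue * 1000 / impressions

def lift_over_average(segment_ecpm, average_ecpm):
    """Relative lift of a segment's eCPM over the overall average."""
    return (segment_ecpm - average_ecpm) / average_ecpm

# Made-up numbers, for illustration only.
segment = ecpm(revenue=3.0, impressions=1000)
overall = ecpm(revenue=20.0, impressions=10000)
print(lift_over_average(segment, overall))  # 0.5, i.e. 50% above average
```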
9. Historical Reporting
Cross dimensional query on 50+ dimensions & 80+ metrics
Non-functional requirements
• Less than 3 secs of response time for slice & dice
• 250+ analytical queries
• Highly available
• Linearly scalable
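A cross-dimensional "slice & dice" query boils down to filtering on some dimensions and grouping by others while summing metrics. The sketch below shows only the shape of such a query over in-memory rows; the dimension and metric names are hypothetical, and meeting a sub-3-second target over 50+ dimensions would need a purpose-built analytical store rather than this loop.

```python
from collections import defaultdict

def slice_and_dice(rows, group_by, metrics, filters=None):
    """Filter rows (slice), then group by dimensions and sum metrics (dice)."""
    filters = filters or {}
    result = defaultdict(lambda: dict.fromkeys(metrics, 0))
    for row in rows:
        # Skip rows that do not match every requested filter value.
        if any(row[dim] != value for dim, value in filters.items()):
            continue
        key = tuple(row[dim] for dim in group_by)
        for metric in metrics:
            result[key][metric] += row[metric]
    return dict(result)
```

For example, `slice_and_dice(rows, ["country"], ["impressions", "clicks"], {"device": "mobile"})` would return per-country totals for mobile traffic only.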
10. Reactive Analytics
Reactive analytics is the process of using information to respond to events after they
occur.
• Real-time reporting
• Reporting of critical metrics around campaign monetization
• Revenue, impression & click info
• Aggregate counters & reporting on top N metrics
• Latency – within 2 mins of ad-serving
• Floor price optimization
• Publisher can set new floor on specific demand to increase the revenue
• Create & modify complex floors
• Change the priority of a deal
• Update the blocklist for a specific advertiser
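The "aggregate counters & reporting on top N metrics" item can be sketched as per-minute tumbling windows with a top-N lookup. This is a single-process illustration under assumed names (`WindowedCounters`, campaign keys, revenue values), not the streaming application itself.

```python
import heapq
from collections import Counter, defaultdict

class WindowedCounters:
    """Aggregate revenue per campaign in fixed (tumbling) time windows."""

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self.windows = defaultdict(Counter)

    def add(self, timestamp, campaign, revenue):
        """Assign the event to its window and add to the campaign's counter."""
        window_start = int(timestamp // self.window_seconds) * self.window_seconds
        self.windows[window_start][campaign] += revenue

    def top_n(self, window_start, n):
        """Return the n campaigns with the highest revenue in one window."""
        counts = self.windows.get(window_start, {})
        return heapq.nlargest(n, counts.items(), key=lambda item: item[1])
```

Because each window is independent, a closed window can be published to the real-time report as soon as its minute has passed.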
17. Need for Streaming - Latency with Batch jobs
[Diagram: multiple Access Points → Kafka Producer → Kafka Consumer → HDFS → MapReduce Jobs]
Pipeline stages shown: Collect, Merge, Clean/Dedupe, Transform & Process
Latency annotations: minutes to transfer; hours to clean & bucket; hours to run jobs & update store
19. Actionable Analytics
Actionable analytics provides information that can be acted upon, or that gives
enough insight into the future that the actions to take become
clear to decision makers.
Feedback to ad serving for guaranteed delivery & line item pacing
[Diagram components: Access Points, AdServer, Apex Streaming App, Store]
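One way to picture the feedback loop for line item pacing is a multiplier that the streaming application computes and the ad server applies to a line item's serving rate. The sketch below assumes simple even (linear) delivery over the flight; the function name, parameters and clamping bounds are hypothetical and not taken from the slides.

```python
def pacing_factor(goal, delivered, elapsed_fraction, min_factor=0.0, max_factor=2.0):
    """Return a serving-rate multiplier for a line item.

    Above 1 means the line item is behind schedule and should speed up;
    below 1 means it is ahead of schedule and should slow down.
    """
    if delivered >= goal:
        return min_factor  # goal already met, stop serving
    if elapsed_fraction >= 1.0:
        return max_factor  # flight is over but the goal is not met
    remaining_goal = goal - delivered
    remaining_time = 1.0 - elapsed_fraction
    # Rate needed from now on, relative to the even-delivery rate.
    needed = (remaining_goal / remaining_time) / goal
    return max(min_factor, min(max_factor, needed))
```

With a goal of 1000 impressions and half the flight elapsed, 500 delivered yields a factor of 1.0 (on pace), 250 delivered yields 1.5, and 750 delivered yields 0.5.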