Image analysis for malicious advertisement detection


Jithendranath Joijoide



1. Image analysis for fraudulent advertisements Jithendranath J V

2. Awful Awesome Advertisement (12/4/2013)

3. Image Analyzer for Creative Tester
User Issue / Yahoo! Challenge; Roadmap Theme & Goal
• Creative Tester receives roughly a million creatives per day to be tested for malicious content. Of these, 2% to 5% are Windows-mimic adverts. They need to be detected and banned as early as possible, with minimal human intervention.
• Need to validate brand safety and ensure quality impressions for advertisers.
• The Trust and Safety team, in collaboration with Sciences, came up with an Image Analyzer module that can detect malicious advertisements, such as Windows mimics or fake brands with phony downloads, and tag them appropriately to be banned.
Value Proposition/Positioning
– To reduce the manual effort in recognizing and banning malicious advertisements that can be visually identified as fraudulent.

4. IONIX / CT Ecosystem
[Diagram; component labels:]
• Cqueuer (RMX Apps)
• Primary/Secondary Creative/Click_URL Review
• IONIX
• Min-bar / Technical Tags (creatives/line items get banned with Min-Bar)
• Creative Tester (CT), with a creative feed based on the advertiser's profile
• Downloaders (Chrome, Firefox, IE)
• Domain Lookup Service
• Virus Checker (ClamAv / Trend Micro)
• Flash Checker
• Classifiers
• Image Analyzer
• Media Trust (3rd Party)
• Media Guard
• TRF_PROD DB
• Creatives Banned
• Manual Audit Queue

5. IA Internals - Modeler

6. IA Internals - Classifier

7. Performance – Precision and Recall

Precision   Recall    Threshold
0.81818     0.00402    3.64068
0.81818     0.06827    2.45085
0.81818     0.13253    1.85615
0.8125      0.19679    1.24538
0.8125      0.26104    0.91759
0.76522     0.3253     0.29167
0.74638     0.38956    0.06556
0.72327     0.45382   -0.18092
0.67895     0.51807   -0.52049
0.65198     0.58233   -0.78885
0.55102     0.64659   -1.18095
0.41163     0.71084   -1.75574
0.34281     0.7751    -2.07244
0.27941     0.83936   -2.52684
0.22636     0.90361   -3.13143
0.1649      1         -4.45694

[Charts: Precision, Recall and Threshold plotted for operating points 1 to 16; PrecisionVsRecall curve]
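The precision/recall/threshold values on slide 7 can be used to choose an operating point. The sketch below is an illustration only, not something shown in the deck: the function names are hypothetical, and the two selection rules (maximize F1, or maximize recall subject to a minimum precision) are common conventions assumed here.

```python
# Operating points copied from slide 7: (precision, recall, threshold).
POINTS = [
    (0.81818, 0.00402, 3.64068),
    (0.81818, 0.06827, 2.45085),
    (0.81818, 0.13253, 1.85615),
    (0.8125, 0.19679, 1.24538),
    (0.8125, 0.26104, 0.91759),
    (0.76522, 0.3253, 0.29167),
    (0.74638, 0.38956, 0.06556),
    (0.72327, 0.45382, -0.18092),
    (0.67895, 0.51807, -0.52049),
    (0.65198, 0.58233, -0.78885),
    (0.55102, 0.64659, -1.18095),
    (0.41163, 0.71084, -1.75574),
    (0.34281, 0.7751, -2.07244),
    (0.27941, 0.83936, -2.52684),
    (0.22636, 0.90361, -3.13143),
    (0.1649, 1.0, -4.45694),
]


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def best_f1_point(points):
    """Return the (precision, recall, threshold) row with the highest F1."""
    return max(points, key=lambda p: f1(p[0], p[1]))


def best_recall_at_precision(points, min_precision):
    """Return the row with the highest recall whose precision is at least
    min_precision, or None if no row qualifies."""
    eligible = [p for p in points if p[0] >= min_precision]
    if not eligible:
        return None
    return max(eligible, key=lambda p: p[1])


if __name__ == "__main__":
    p, r, t = best_f1_point(POINTS)
    print(f"best F1 = {f1(p, r):.3f} at threshold {t}")
    print("best recall with precision >= 0.80:", best_recall_at_precision(POINTS, 0.80))
```

With the slide's numbers, F1 peaks at the row with threshold -0.78885 (F1 of about 0.615). Requiring precision of at least 0.80 gives the row with threshold 0.91759, which catches only about 26% of the positives.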

8. IA Integration with CT
[Diagram; component labels: Creative Tester; HTTP; Image Analyzer; Servlet; Feature Extractor; K Means; Histogram; Classifier]

9. IA - API Example

Request:
    {
      "requests":[
        {
          "imgid":"1",
          "imgurl":"http://ionix.zenfs.com/ct/dev2/screenshots/5d079b5de50f6b30602e4a00b84a6e49e9443af7.jpg",
          "run_wnddlg":true
        }
      ]
    }

Response:
    {
      "responses":[
        {
          "imgid":"1",
          "imgurl":"http://ionix.zenfs.com/ct/dev2/screenshots/5d079b5de50f6b30602e4a00b84a6e49e9443af7.jpg",
          "classifiers":[
            {
              "classifier":"wnddlg",
              "status":true,
              "result":true,
              "conf":0.40216639639794
            }
          ]
        }
      ]
    }
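The request and response on slide 9 are plain JSON, so a caller could build and read them with a few lines of code. The following is a minimal sketch, not the Creative Tester client: the function names are hypothetical, the endpoint is not given in the deck so no HTTP call is made, and the reading of `status` as "the classifier ran" and `result` as "the image was flagged" is an assumption.

```python
import json


def build_request(images, run_wnddlg=True):
    """Serialize a request body. `images` is an iterable of (imgid, imgurl) pairs."""
    return json.dumps({
        "requests": [
            {"imgid": imgid, "imgurl": imgurl, "run_wnddlg": run_wnddlg}
            for imgid, imgurl in images
        ]
    })


def flagged_images(response_text, classifier="wnddlg", min_conf=0.0):
    """Return (imgid, conf) pairs for images the named classifier flagged.

    Assumption: status == true means the classifier ran successfully and
    result == true means the image was classified as positive.
    """
    flagged = []
    for item in json.loads(response_text).get("responses", []):
        for c in item.get("classifiers", []):
            conf = c.get("conf", 0.0)
            if (c.get("classifier") == classifier and c.get("status")
                    and c.get("result") and conf >= min_conf):
                flagged.append((item.get("imgid"), conf))
    return flagged


# Response body taken from slide 9.
SAMPLE_RESPONSE = """
{"responses": [{"imgid": "1",
  "imgurl": "http://ionix.zenfs.com/ct/dev2/screenshots/5d079b5de50f6b30602e4a00b84a6e49e9443af7.jpg",
  "classifiers": [{"classifier": "wnddlg", "status": true,
                   "result": true, "conf": 0.40216639639794}]}]}
"""

if __name__ == "__main__":
    print(build_request([("1", "http://ionix.zenfs.com/ct/dev2/screenshots/"
                               "5d079b5de50f6b30602e4a00b84a6e49e9443af7.jpg")]))
    print(flagged_images(SAMPLE_RESPONSE))
```

The `min_conf` parameter is an added convenience for filtering on the confidence score; the deck does not say which cut-off, if any, the production system applies.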

10. Sample – Classified images. Yahoo! Confidential & Proprietary.

11. What Does Success Look Like
• Who are the customers?
– RMX and APT creative serving systems.
– Moneyball (going forward).
• Success metrics
– Reducing the manual effort needed to identify Windows-mimic advertisements.
– This would be measured by the confidence score generated by the system, which would eventually help us automate the whole process.
– Reduction in customer complaints.
• Key business stakeholders who have validated or will validate success
– Serving systems
– Business teams
– Manual review teams

12. Competitive Landscape
• 3rd party ad verification companies, etc.
• What differentiates our product/solution?
– Avoiding the need to expose and send out demand inventory.
– Flexibility to keep improving the algorithms for higher precision/recall.
– Quick turnaround time for validation.
– Building highly targeted models (for example, fake Facebook or fake Adobe).

Editor's notes

Cequer -> Ionix -> can send to DLS or to CT -> a set of tests such as the downloader, virus checker and flash checker, a classifier for landing page detection, and 3rd party validations. The results are tagged and sent back. The business logic for which action to take is handled by Cequer.
Feature extraction: In pattern recognition and image processing, feature extraction is a special form of dimensionality reduction. When the input data to an algorithm is too large to be processed and is suspected to be highly redundant (e.g. the same measurement in both feet and meters), the input data is transformed into a reduced representation set of features (also called a feature vector). A feature can be the result of applying a local neighborhood operation to the image. A neighborhood operation visits every point and applies a function to it based on its neighbors:

    Visit each point p in the image data and do {
        N = a neighborhood or region of the image data around the point p
        result(p) = f(N)
    }

Edge detection: Edge detection is the name for a set of mathematical methods that aim to identify points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities.

Corner detection: A corner (interest point) can be defined as the intersection of two edges. A corner can also be defined as a point for which there are two dominant and different edge directions in a local neighborhood of the point.

Blob detection: Informally, a blob is a region of a digital image in which some properties are constant or vary within a prescribed range of values; all the points in a blob can be considered in some sense to be similar to each other.

SIFT (scale-invariant feature transform): The SIFT algorithm computes features that are scale invariant, which means the image can still be recognized when it is rotated, scaled, or viewed from a different viewpoint.

CBOW (Contextual Bag of Words): In computer vision, the bag-of-words model (BoW model) can be applied to image classification by treating image features as words. In document classification, a bag of words is a sparse vector of occurrence counts of words; that is, a sparse histogram over the vocabulary.
In computer vision, a bag of visual words is a sparse vector of occurrence counts of a vocabulary of local image features.

SVM: A support vector machine constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks.
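The bag-of-visual-words idea in these notes can be illustrated with a small sketch. It assumes local descriptors have already been extracted (for example by SIFT) and that a vocabulary of cluster centers already exists (for example from k-means, which slide 8 lists next to a histogram step). The function names and toy 2-D descriptors are hypothetical, and the deck does not give the actual implementation.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to `descriptor`
    (squared Euclidean distance; ties go to the lowest index)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocabulary)),
               key=lambda i: dist2(descriptor, vocabulary[i]))


def bow_histogram(descriptors, vocabulary):
    """Count how many descriptors fall on each visual word.

    The result has one entry per vocabulary word and is the occurrence-count
    vector that would be passed on to a classifier such as an SVM.
    """
    counts = [0] * len(vocabulary)
    for d in descriptors:
        counts[nearest_word(d, vocabulary)] += 1
    return counts


if __name__ == "__main__":
    # Toy data: two visual words and four 2-D descriptors (real SIFT
    # descriptors are 128-dimensional).
    vocabulary = [(0.0, 0.0), (10.0, 10.0)]
    descriptors = [(1.0, 1.0), (2.0, 0.0), (0.0, 2.0), (9.0, 10.0)]
    print(bow_histogram(descriptors, vocabulary))
```

Every image is mapped to a vector of the same length regardless of how many local features it contains, which is what lets a fixed-input classifier consume it.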