App behavioral analysis using system calls
Prajit Kumar Das
Advisor: Anupam Joshi, PhD
Co-Advisor: Tim Finin, PhD
Problem statement
We present Heimdall, a framework for studying system calls made by mobile apps in order to determine an app’s behavior class and match that behavior to the app’s stated purpose.
UMBC Ebiquity Research Group | MobiSec 2017 | Tuesday, May 2, 2017
Motivation: App issues
Image courtesy: Android App Market
Motivation: Permission inadequacy
[Figure panels: Pre-Marshmallow, Marshmallow]
Motivation: Software Limitation
Do you read privacy policies, and do you understand them?
Source: lifehacker
Related Work
• System calls used for software analysis (Kosoresow’97)
• Three areas of research for mobile app analysis
  • Malware classification (Zhou’12)
  • NLP techniques (Pandita’13, Gorla’14)
  • Taint tracking (Enck’10)
• Google PHA taxonomy (Google Android Security’16)
• PrivacyGrade: Grading The Privacy Of Smartphone Apps
• Expectation and purpose: understanding users’ mental models of mobile app privacy through crowdsourcing (Lin’12)
• Modeling users’ mobile app privacy preferences (Lin’14)
Architecture
• Download module
• Annotations module
• System call capture module
• Mobile device emulator
• Feature generation module
• Classification module
• Classifier output
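As a rough illustration of how a capture module could hand data to feature generation, the sketch below counts system call names in strace-style log lines. The slides do not show Heimdall's actual parsing code; the function name `count_syscalls`, the regular expression, and the sample log lines are assumptions made for this example.

```python
import re
from collections import Counter

# Assumed line shapes (typical strace output, optionally prefixed by a PID):
#   1234  openat(AT_FDCWD, "/data/x", O_RDONLY) = 5
#   [pid  1240] sendmsg(7, {...}, 0) = 12
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")

def count_syscalls(strace_lines):
    """Return a Counter mapping system call name -> number of invocations."""
    counts = Counter()
    for line in strace_lines:
        match = SYSCALL_RE.match(line.strip())
        if match:  # signal lines ("--- SIG... ---") and exit lines do not match
            counts[match.group(1)] += 1
    return counts

# Hypothetical log excerpt, for illustration only.
sample_log = [
    '1234  openat(AT_FDCWD, "/data/app/notes.db", O_RDONLY) = 5',
    '1234  pwrite64(5, "abc", 3, 0) = 3',
    '1234  --- SIGCHLD {si_signo=SIGCHLD} ---',
    '[pid  1240] sendmsg(7, {msg_name=NULL}, 0) = 12',
    '1234  pwrite64(5, "def", 3, 3) = 3',
]
print(count_syscalls(sample_log))
```

The per-app counters produced this way are the kind of input a feature generation step could turn into vectors.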
Dataset distribution
10 annotated categories and 20 Google Play categories; 75% of apps are in the tool and productivity categories
Experimental setup
• 1560 apps
• 534 successfully executed
• strace can only be used on an emulator
• UI/Application exerciser tool used
• Android 6.0.1 (December 2015 build) used
• Classifiers used: SVM, MLP, J48 and NB
• Best F1-score: 0.44, from MLP
• 1-hot vectors: call present or absent
• TF-IDF weight vectors: uniqueness and significance of system calls for an app
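The two feature encodings named above can be sketched as follows. This is a minimal illustration, assuming each app is summarized as a dictionary of system call counts and that TF-IDF is the plain variant (term frequency times log(N / document frequency)); the slides do not say which TF-IDF weighting was actually used, and the function names and sample counts are hypothetical.

```python
import math

def one_hot(app_counts, vocabulary):
    """1 if the app made the call at least once, else 0."""
    return [1 if app_counts.get(call, 0) > 0 else 0 for call in vocabulary]

def tf_idf(all_app_counts, vocabulary):
    """One TF-IDF vector per app; each app plays the role of a document."""
    n_apps = len(all_app_counts)
    # Document frequency: in how many apps does each call appear?
    df = {call: sum(1 for c in all_app_counts if c.get(call, 0) > 0)
          for call in vocabulary}
    vectors = []
    for counts in all_app_counts:
        total = sum(counts.values())
        vec = []
        for call in vocabulary:
            tf = counts.get(call, 0) / total if total else 0.0
            idf = math.log(n_apps / df[call]) if df[call] else 0.0
            vec.append(tf * idf)
        vectors.append(vec)
    return vectors

# Hypothetical counts for two apps, for illustration only.
apps = [{"read": 8, "sendmsg": 2}, {"read": 5, "ftruncate": 5}]
vocab = sorted({call for counts in apps for call in counts})

print(vocab)
print(one_hot(apps[0], vocab))
print(tf_idf(apps, vocab))
```

In this toy input, `read` appears in both apps, so its TF-IDF weight is zero, while `sendmsg` and `ftruncate` each appear in only one app and receive positive weight. That is the "uniqueness" effect the bullet refers to.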
Similar TF-IDF scores - Scientific calculator vs To-do list
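One way to quantify how similar two apps' TF-IDF profiles are is cosine similarity. The slide does not say which comparison was used, so this is only a sketch under that assumption, and the two weight vectors below are made-up values, not measurements from the study.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for an all-zero vector
    return dot / (norm_u * norm_v)

# Hypothetical TF-IDF weights over the same four system calls.
calculator = [0.12, 0.05, 0.00, 0.08]
todo_list = [0.11, 0.06, 0.01, 0.07]

print(round(cosine_similarity(calculator, todo_list), 3))
```

A value close to 1 would indicate that the two apps are hard to separate using these features alone.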
Results: Google categories
• Google’s categories are developer-provided
  • Could be misleading
• System calls could classify behavior better
• Additional features required
Results: Annotated classes
• System calls not totally useless
• 1-hot vs TF-IDF (F1-score)

Classifier   TF-IDF   1-hot
MLP          0.44     0.26
SVM-Poly     0.31     0.23
SVM-RBF      0.32     0.21
J48          0.27     0.31
NB           0.27     0.27
Challenges faced
• Annotating app behavior class
• Emulator instabilities
• App limitations/bugs
• User credentials required
• Multi-behavior apps
Conclusions
Can system calls be used to distinguish between how an app “behaves” and its perceived/stated purpose?
• System calls: insufficient as features
• Emulator: better ones required
• Coarser behavior classes required
Future work
• System call bi-grams, tri-grams: for capturing call sequences
• Better emulator: Genymotion
• Malware classification: less scope (state-of-the-art is 96.7%)
• Generating contextual policies from behavior estimation
• Features planned for future:
  • App descriptions
  • App ratings
• PHA app analysis: requires samples for Mobile Unwanted Software
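The bi-gram and tri-gram idea in the first bullet can be sketched as counting consecutive runs of system calls in a trace. This is an illustration only; the function name `syscall_ngrams` and the sample trace are assumptions, not part of Heimdall.

```python
from collections import Counter

def syscall_ngrams(trace, n):
    """Count every run of n consecutive system calls in an ordered trace."""
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))

# Hypothetical ordered trace for one app run.
trace = ["openat", "read", "read", "close"]

print(syscall_ngrams(trace, 2))  # bi-grams
print(syscall_ngrams(trace, 3))  # tri-grams
```

Unlike presence or frequency features, these counts keep some ordering information, such as whether a `read` is followed by another `read` or by a `close`.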
Questions?


Editor's notes

  1. RQ1: Can system calls be used to distinguish between how an app “behaves” and its perceived/stated purpose?
  2. Why should we care? Apps collect user data: emails, messages, documents, sensor data. This is highly personal data. Can’t app permissions handle the privacy and security of that data? App permissions are “take it or leave it”. A user may be okay with sharing location in a public place but not in a private place, and there is no way to control that. Use the Privacy and Security module to implement context-dependent rules.
  3. App frequency per category:
     play cat                freq
     alarm_clock             128
     battery_saver           93
     drink_recipes           15
     file_explorer           72
     lunar_calendar          12
     pdf_reader              22
     scientific_calculator   61
     to_do_list              102
     video_playback          5
     wifi_analyzer           24
  4. We had 61 scientific calculators and 102 to-do list apps among the 534, a combined share of about 31% (163 of 534). The two categories show very similar system calls despite being different app categories:
     • fstatat: get file status relative to a directory file descriptor
     • ftruncate: truncate a file to a specified length
     • pwrite: write to a file descriptor at a given offset
     • sigreturn: return from signal handler and clean up stack frame
     • sendmsg: send a message on a socket
     • sigaction: examine and change a signal action
  5. System calls aren’t useless: random choice among 10 classes gives a probability of 0.1, while with TF-IDF features we get 0.44 at best and 0.27 at worst. Most classifiers perform better with TF-IDF features. This is understandable because, just as in document classification, the presence of a system call is significant, but the presence of a distinctive system call is a better feature than the presence of a common system call or group of system calls.