Crime Analysis using Data Analysis

Published in: Education

  1. CRIME ANALYSIS AND PREDICTION USING DATA MINING • CHETAN HIREHOLI, M.TECH, SOFTWARE ENGINEERING
  2. Data Mining, what is it? Data mining is about finding new information in a lot of data. • Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase revenue, cut costs, or both. • Data mining software is one of a number of analytical tools for analyzing data.
  3. Timeline • 1962: John W. Tukey, “The Future of Data Analysis” • 1989: Gregory Piatetsky-Shapiro organizes and chairs the first Knowledge Discovery in Databases (KDD) workshop • 1994: BusinessWeek publishes a cover story on “Database Marketing” • 1996: For the first time, the term “data science” is included in the title of a conference (“Data science, classification, and related methods”), held by the IFCS • 2009: “The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it - that’s going to be a hugely important skill in the next decades…” (Hal Varian, Google’s Chief Economist)
  4. Application and Trends… • Financial Data Analysis • Retail Industry • Telecommunication Industry • Biological Data Analysis • Other Scientific Applications • Intrusion Detection
  5. Feel Good, Do Good! “Crime Analysis and Prediction Using Data Mining”, Shiju Sathyadevan, Devan M.S and Surya Gangadharan S, IEEE, 2014
  6. Abstract • What is crime analysis? Crime analysis is a law enforcement function that involves systematic analysis for identifying and analyzing patterns and trends in crime and disorder. • The proposed system takes an approach that bridges computer science and criminal justice to develop a data mining procedure that can help solve crimes faster.
  7. Introduction • Only within the last few decades has technology made spatial data mining a practical, affordable and available solution for a wide audience of law enforcement officials. • Huge chunks of data have to be collected from web sites, news sites, blogs, social media, RSS feeds etc. • So the main challenge in front of us is developing a better, more efficient tool to detect crime patterns effectively.
  8. Doing analysis is a hard job! • The reasons for choosing clustering: • Only known data is available to us • A classification technique will not predict well • Also, the nature of crimes changes over time • So, in order to detect newer and unknown patterns in the future, clustering techniques work better.
  9. Steps in doing Crime Analysis • Data Collection • Classification • Pattern Identification • Prediction • Visualization
  10. Related Work • Series Finder is used for finding patterns in burglary. • To achieve this, its authors used the offender’s modus operandi to extract the crime patterns that the offender followed. • The algorithm constructs the modus operandi of the offender. (Image captions: “Using Series Finder will get me more Films!” / “In your dreams… You can’t catch me! I’m KRISHH!”)
  11. Methodology • Data Collection • Collecting data from various sources like news sites, blogs, social media, RSS feeds etc. • But the data we got is ‘VERY UNSTRUCTURED’! How do we store it?! • The advantage of a NoSQL database over a SQL database is that it allows insertion of data without a predefined schema. • Object-oriented programming model, hence easy to use and flexible. • Unlike a SQL database, it does not need to know in advance what we are storing, its size etc. Okay! Enough of humor, come, let’s get serious and look into how it actually works!
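To make the "no predefined schema" point concrete, here is a minimal sketch in plain Python. It is an in-memory stand-in, not a real NoSQL client: the `DocumentStore` class, its methods, and the sample documents are all hypothetical and invented for illustration.

```python
import json


class DocumentStore:
    """Tiny in-memory stand-in for a schema-less (NoSQL-style) collection."""

    def __init__(self):
        self._docs = []

    def insert(self, doc):
        # No schema check: any JSON-serializable dict is accepted as-is.
        json.dumps(doc)  # raises TypeError if the document is not serializable
        self._docs.append(dict(doc))
        return len(self._docs) - 1  # list position used as a simple document id

    def find(self, **criteria):
        # Return the documents whose fields match every given key/value pair.
        return [d for d in self._docs
                if all(d.get(k) == v for k, v in criteria.items())]


store = DocumentStore()
# Two made-up documents with different fields: no table definition is needed.
store.insert({"source": "news", "title": "Chain snatching reported", "location": "Kharghar"})
store.insert({"source": "blog", "text": "Burglary reported", "tags": ["burglary"]})
news_docs = store.find(source="news")
```

A relational table would have required declaring the `title`, `text`, `tags` and `location` columns up front; here each document simply carries whatever fields its source provided.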
  12. Methodology • Classification • Naïve Bayes: a supervised learning method as well as a statistical method • The algorithm classifies a news article into the crime type it fits best, e.g. “What is the probability that a crime document D belongs to a given class C?” (Image: Thomas Bayes)
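The question on this slide is what Bayes' theorem answers. A sketch in the slide's own symbols D and C; the word symbols w_1 … w_n (the words of D) are an assumption added here for illustration:

```latex
P(C \mid D) = \frac{P(D \mid C)\, P(C)}{P(D)},
\qquad
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(w_i \mid C)
```

The second expression uses the "naïve" assumption that the words are conditionally independent given the class, and drops P(D) because it is the same for every class.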
  13. Methodology • Classification • Naïve Bayes has its advantages: • Simple, and converges quicker than logistic regression. • Compared to SVM (Support Vector Machine), it is easy to implement and comes with high performance. Also, in the case of SVM, execution slows down as the size of the training set increases. • Works well with a small amount of training data to estimate the classification parameters. • Also, the zero-frequency problem is easy to fix, e.g. with add-one (Laplace) smoothing!
  14. Methodology • Classification • Using the Naive Bayes algorithm we create a model by training on crime data related to vandalism, murder, robbery, burglary, sex abuse, gang rape, arson, armed robbery, highway robbery, snatching etc. • Test results show that Naive Bayes achieves more than 90% accuracy!
  15. Pseudo code for Naïve Bayes
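The pseudo code on this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. It is not the authors' code: the function names and the four training sentences are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """labeled_docs: list of (text, crime_type) pairs."""
    class_counts = Counter()            # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def classify(text, model):
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in class_counts.items():
        # Work in log space: log P(C) + sum of log P(w | C).
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing avoids zero probabilities (zero-frequency problem).
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training sentences, for illustration only.
training = [
    ("masked men broke into the house at night", "burglary"),
    ("thieves broke a window and stole jewellery from the house", "burglary"),
    ("bike rider snatched a gold chain from a pedestrian", "snatching"),
    ("pillion rider snatched the handbag of a woman", "snatching"),
]
model = train_naive_bayes(training)
predicted = classify("a rider snatched her chain", model)
```

With this toy data, `predicted` comes out as `"snatching"`, because "rider", "snatched" and "chain" occur only in the snatching examples.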
  16. Methodology • Classification • Named Entity Recognition (NER), also known as Entity Extraction, finds and classifies elements in text into predefined categories such as person names, organizations, locations, dates, times etc. (Image: sample NER)
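Real NER systems use trained statistical models. Purely to illustrate the input and output shape, here is a toy rule-based sketch: the gazetteer, the regular expression, and the sample sentence are assumptions made for this example, and the slides do not say which NER tool was used.

```python
import re

# Hypothetical gazetteers; a real system would learn these from annotated text.
LOCATIONS = {"Kharghar"}
WEEKDAYS = {"Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"}


def extract_entities(text):
    """Return a list of (surface_text, category) pairs found in text."""
    entities = []
    # Times such as "9.40am" or "10:15 pm".
    for match in re.finditer(r"\b\d{1,2}[.:]\d{2}\s?(?:am|pm)\b", text):
        entities.append((match.group(), "TIME"))
    # Capitalized tokens are looked up in the gazetteers.
    for token in re.findall(r"\b[A-Z][a-z]+\b", text):
        if token in LOCATIONS:
            entities.append((token, "LOCATION"))
        elif token in WEEKDAYS:
            entities.append((token, "DATE"))
    return entities


sample = "The victim was walking in Kharghar on Friday around 9.40am."
entities = extract_entities(sample)
```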
  17. Methodology • Classification • Coreference Resolution: finds the referenced entities in a text. • Example input: “A pillion bike rider snatched away a gold mangalsutra worth Rs 85,000 of a 60-year-old woman pedestrian in sector 19, Kharghar on Friday. The victim, Shakuntala Mande, was walking towards a vegetable outlet around 9.40am, when a bike came close to her and the pillion rider snatched her mangalsutra. A robbery case has been registered at Kharghar police station.”
  18. Methodology • Pattern Identification • Apriori algorithm: used to determine association rules which highlight general trends • The result of this phase is the crime pattern for a particular place. • Once we have a general crime pattern for a place, if a new case arrives and follows the same pattern, we can say that the area has a chance of crime occurrence. • Information about patterns helps police officials deploy resources effectively.
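A minimal sketch of the Apriori idea: count itemsets level by level, keep only those that meet a minimum support, and build larger candidates from the survivors. The attribute names and the five reports below are made up for illustration and are not data from the paper.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: the candidates are the single items.
    items = {item for t in transactions for item in t}
    candidates = {frozenset([item]) for item in items}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        candidates = {c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    return frequent[both] / frequent[frozenset(antecedent)]


# Made-up crime reports, each reduced to a set of attributes.
reports = [
    {"night", "bus station", "snatching"},
    {"night", "bus station", "snatching"},
    {"night", "ATM", "robbery"},
    {"day", "bus station", "snatching"},
    {"night", "bus station", "robbery"},
]
frequent = apriori(reports, min_support=3)
rule_conf = confidence(frequent, {"bus station"}, {"snatching"})
```

On this toy data the rule "bus station → snatching" has confidence 0.75: of the four reports at a bus station, three are snatchings.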
  19. Methodology • Prediction • Decision tree: it is simple to understand and interpret! • It is robust and works well with large data sets. (Figure labels: root node, splitting, leaf node)
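The "splitting" step of a decision tree can be sketched as choosing the attribute that best separates the classes. The sketch below uses information gain (as in ID3), which is one common criterion; the slides do not say which criterion was used, and the attribute names and rows are made up.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting rows on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_split(rows, attributes, target):
    """Attribute with the highest information gain: the next node to split on."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Made-up records: was a place/time combination crime-prone?
rows = [
    {"time": "night", "place": "bus station", "crime_prone": "yes"},
    {"time": "night", "place": "ATM", "crime_prone": "yes"},
    {"time": "night", "place": "temple", "crime_prone": "yes"},
    {"time": "day", "place": "bus station", "crime_prone": "no"},
    {"time": "day", "place": "ATM", "crime_prone": "no"},
    {"time": "day", "place": "temple", "crime_prone": "no"},
]
root_attribute = best_split(rows, ["time", "place"], "crime_prone")
```

Here `time` separates the two classes perfectly while `place` tells us nothing, so `time` becomes the root node.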
  20. Methodology • Visualization • A heat map indicates the level of activity, usually with darker colors for low activity and brighter colors for high activity.
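Underneath, a heat map is a grid of counts. A minimal sketch, assuming incidents are given as hypothetical (x, y) coordinates: it bins them into cells and renders the grid as text, with denser characters standing in for the brighter colors of a real heat map.

```python
from collections import Counter


def heat_grid(points, cell_size, rows, cols):
    """Count how many (x, y) points fall into each grid cell."""
    counts = Counter()
    for x, y in points:
        r, c = int(y // cell_size), int(x // cell_size)
        if 0 <= r < rows and 0 <= c < cols:  # drop points outside the grid
            counts[(r, c)] += 1
    return [[counts[(r, c)] for c in range(cols)] for r in range(rows)]


def render(grid, shades=" .:*#"):
    """Map each count to a character; denser characters mean more activity."""
    peak = max(max(row) for row in grid) or 1
    lines = []
    for row in grid:
        lines.append("".join(shades[n * (len(shades) - 1) // peak] for n in row))
    return "\n".join(lines)


# Made-up incident coordinates, for illustration only.
incidents = [(0.5, 0.5), (0.7, 0.2), (0.1, 0.9), (1.5, 0.5), (2.5, 1.5)]
grid = heat_grid(incidents, cell_size=1.0, rows=2, cols=3)
picture = render(grid)
```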
  21. Methodology • Visualization • On the x-axis all main locations in India are plotted, whereas on the y-axis the crime rate is plotted. • The graph shows the regions which have the maximum crime rate. • The data plotted here is based on historical records.
  22. Methodology • Visualization • Shows the rate/percentage of crime occurrence in places like airports, temples, bus stations, railway stations, banks, casinos, jewelry shops, bars, ATMs, highways etc. • On the x-axis the main spots like temple, bank, bus station, railway station, ATM etc. are plotted, while on the y-axis the rate of crime is plotted.
  23. Future Work • Criminal Profiling • Helps crime investigators record the characteristics of criminals. • The main goals of criminal profiling are: • To provide crime investigators with a social and psychological assessment of the offender • To evaluate belongings found in the possession of the offender. • To do this, the maximum details of each criminal are collected from criminal records and the modus operandi is determined.
  24. Future Work • Criminal Profiling • Sifting through each crime record after a particular crime occurrence is a tedious task. • So instead we can use some visualization mechanisms to represent the criminal details in a human-understandable form.
  25. Future Work • Criminal Profiling
  26. Conclusion • Data Collection: web sites, news channels, blogs, etc. • Classification: using Naïve Bayes, a predictor is created • Pattern Identification: Apriori algorithm • Prediction: decision tree • Visualization: Neo4j, GraphDB
