Kick start graph visualization projects

Create and use graph visualizations efficiently in your projects.

  1. Kick-start Graph Visualization Projects. By Sébastien Heymann, seb@linkurio.us
  2. A few words about me. I democratise graph thinking (with pink titles) ...with software. Co-founder of the Gephi project (2008). Co-founder of the Linkurious startup (2013). PhD in computer science, UPMC LIP6 (2013). Logo: Gephi, "makes graphs handy".
  3. A few words about me / Gephi. Open source project started in 2008. Built to solve large graph visualization problems. Latest version downloaded ~400,000 times. http://gephi.org Logo: Gephi, "makes graphs handy".
  4. A few words about me / Gephi
  5. A few words about me / Linkurious. Started in 2012 as a collaboration with Stanford (Mapping the Republic of Letters) and DensityDesign. Now a French startup of 3 people. Linkurious helps companies make sense of data with user-friendly visualization software. We help business analysts, R&D teams, developers and scientists.
  6. A few words about me / Linkurious
  7. Beautiful but unreadable pictures? Let’s make graph visualization useful.
  8. How to create and use graph visualization successfully? Agenda: 0. Why? 1. Key takeaways: a. The 5 questions (PRACTICE); b. User stories (PRACTICE); c. Design visualization + interaction. 2. Fraud detection use case. 3. Q&A.
  9. 0. Why graph visualization? Huh...
  10. What is a graph? This is a graph. (Diagram edge labels: Father Of, Father Of, Siblings.)
  11. What is a graph? / Nodes & relationships. A graph is a set of nodes linked by relationships. (Diagram edge labels: Father Of, Father Of, Siblings. Callouts: "This is a node", "This is a relationship".)
  12. Graphs can be used to model many domains. Supply chains: suppliers, roads, warehouses, products... Social networks: people, objects, movies, restaurants, music... Communications: antennas, servers, phones, people... (Caption: Different domains where graphs are important.)
  13. Graph visualization can help you in many ways. Do you have a graph project?
  14. Why? “The greatest value of a picture is when it forces us to notice what we never expected to see.” John Tukey (1962)
  15. How to create and use graph visualization successfully? 1. Key takeaways to kick-start your projects. a. Ask 5 questions. b. Write user stories. c. Design visualization and interaction.
  16. Ask 5 questions / Q1: Data, tadaa? You need data. sourcing - cleaning - update
  17. Ask 5 questions / Q1: Data, tadaa? Can you model data as graphs? sensemaking - scale - complexity. image: Martin Grandjean
  18. Ask 5 questions / Q2: Why use graph visualization in your project? Set up your goal. Administrate: data modelling, database administration. Understand: hypothesis discovery, evidence finding. Monitor: impact analysis, reporting. images: XKCD & the web
  19. Ask 5 questions / Q3: Who will use it? Define personas: data scientist, business analyst, developer, public audience. images: PhdComics & Despicable Me
  20. Ask 5 questions / Q4: What are the constraints? Acknowledge human limits. Short-term memory: max 7 items, otherwise the ability to make decisions drops. Vision: more than 10 000 nodes is generally useless.
  21. Ask 5 questions / Q4: What are the constraints? Acknowledge technical limits. Graph size: 50 nodes to 1B nodes. Machine performance. Server-side vs client-side rendering. Interactive vs print.
  22. Ask 5 questions / Q5: How is it used? Define scope. Individual use vs collaborative work. Artwork vs integrated into an application.
  23. Ask 5 questions / Summary. The 5 questions: 1. What are the data? 2. What is your goal? 3. Who is your end-user? 4. What are the constraints? 5. How is it used?
  24. Ask 5 questions / Your turn! PRACTICE: Answer the 5 questions for your project.
  25. How to create and use graph visualization successfully? 1. Key takeaways to kick-start your projects. a. Ask 5 questions. b. Write user stories. c. Design visualization and interaction.
  26. Write user story / The developer story. “I am creating a Neo4j graph database for my application.” I define a data model. I generate a significant graph sample. I create a business query with Cypher. I visualize the query result. I iterate on the data model until it is satisfactory.
  27. Write user story / Your turn! PRACTICE: Write your own user story.
  28. How to create and use graph visualization successfully? 1. Key takeaways to kick-start your projects. a. Ask 5 questions. b. Write user stories. c. Design visualization and interaction.
  29. Graph visualization in practice
  30. Design visualization. How to represent graphs?
  31. Design visualization / Common graph representations. Matrices. (a) Nodes are ordered as rows and columns; connections are indicated as filled cells. (b) A matrix representation of a typical biological pathway. In (Gehlenborg 2012).
  32. Design visualization / Common graph representations. Node-link diagrams. (a) A directed graph typical of a biological pathway. (b) An undirected graph with nodes arranged in a circle. (c) A spring-embedded layout of data from b. In (Gehlenborg 2012).
  33. Design visualization. Let’s choose node-link diagrams because they are more common.
  34. Design visualization. Map data to visual variables: proximity, hierarchy, group.
  35. Design interaction. Add interactivity: Search, Expand, Details on demand, Filter.
  36. Design visualization and interaction / Graph Viz 101. Learn more at http://linkurio.us/graph-viz-101
  37. How to create and use graph visualization successfully? 1. Key takeaways to kick-start your projects. a. Ask 5 questions. b. Write user stories. c. Design visualization and interaction.
  38. Use case. 2. Bank loan fraud detection use case.
  39. Use case / The cost of fraud. $28.6B: AITE Group estimates that first-party fraud will cost $28.6 billion in credit card losses a year by 2016. Sources: http://news.alaric.com/industry-news/fraud/a-new-approach-to-first-party-fraud-reducing-bad-debt/ and http://bankinganalyticsblog.fico.com/2013/02/first-party-fraud-it-was-me.html
  40. Use case / A common fraud scenario. A look at a common fraud scenario banks face. (1) Create a fake identity: a criminal or a group of criminals mixes pieces of information (addresses, phone numbers, social security numbers) to create a “synthetic identity”. (2) Go to the bank, ask for a loan: a criminal uses the fake identity to register a bank account. He acts like a normal customer and tries to secure a loan. (3) Disappear with the money: once the criminal feels he cannot get access to more money, he carefully prepares his exit: in a short amount of time he empties all of his accounts and disappears.
  41. Use case / How do we set up a graph-based fraud detection system? Let’s ask our 5 questions. 1. What are the data? 2. What is your goal? 3. Who is your end-user? 4. What are the constraints? 5. How is it used?
  42. Use case / Q1: What are the data? We model customer data as a graph. Diagram nodes: Customer name (J. Smith), Loan ($25k), Home address (58, Eisenhower Square), Phone number (+33 5 68 98 25 74), Credit card (1 234$), ID (J. Smith). Caption: A graph showing a legitimate customer and the information she is linked to.
  43. Use case / Q1: What are the data? In a fraud ring people share the same information. Diagram labels: addresses (58, Eisenhower Square; 14, Roses Street), phone numbers (+33 6 75 89 22 14; +331 42 58 66 00), customers (P. Martin, J. Smith, E. Selmati, P. Smith), SSNs (17873897893, 1787576553, 1787579953, 1267576553), amounts ($7k, $12.5k, $20k, $45k), other identifiers (31195855, 31184274).
  44. Use case / Q2: What is your goal? We want to detect fake customer identities.
  45. Use case / Q3: Who is your end-user? Our user is a fraud analyst. She is a fraud expert but has limited data and computer skills. She works with a team of data analysts for a large bank. When an alert is triggered, she checks if the customer account belongs to a potential fraud ring. image: PhdComics
  46. Use case / Q4: What are the constraints? We have a large graph on a single database. Thousands of new loans per month. Time: a few days. Investigate before transferring more money. Interaction: detect fraud rings by exploring the graph gradually.
  47. Use case / Q5: How is it used? The visualization is embedded in a business process. Lifecycle events trigger security checks: a new customer opens an account; an existing customer asks for a loan; a customer skips a loan payment. A Neo4j Cypher query runs to detect patterns. An analyst visualizes the connections to make an informed decision.
  48. Use case / The user story. Treat false positives - Investigate serious cases - Save money. If something suspicious comes up, the analysts can use Linkurious to quickly assess the situation. Linkurious allows you to control the alerts and make sure your customers are not treated like criminals. Linkurious allows the fraud teams to go deep in the data and build cases against fraud rings. The fraud team acts faster and more fraud cases can be avoided.
  49. Use case / Visualization and interaction design. Design: max 200 nodes visualized; relationship information is important; multiple node categories (address, phone, ...) -> node-link diagram -> icons or node colors by category. Interactivity: yes. Display node and relationship information on demand. Expand node connections on demand.
  50. Use case. Proof-of-concept demo with Linkurious.
  51. Conclusion. Graph visualization can add great value to your project; learn to leverage it.
  52. Contact us to discuss your projects: contact@linkurio.us Case studies: http://linkurio.us/blog Follow @linkurious #GraphViz101
  53. Q&A. 3. Q&A
  54. Resources
      Detailed use case on our blog:
      ● Part 1: http://linkurio.us/how-to-detect-bank-loan-fraud-with-graphs-part-1/
      ● Part 2: http://linkurio.us/how-to-detect-bank-loan-fraud-with-graphs-part-2/
      ● Neo4j data set: https://www.dropbox.com/s/wk8k5r23syp6kbx/fraud%20detection.zip
      GraphGist by Kenny Bastani: http://gist.neo4j.org/?github-neo4j-contrib%2Fgists%2F%2Fother%2FBankFraudDetection.adoc
      Video demonstration: https://vimeo.com/76891393 (around the 12-minute mark)
      Graph Visualization 101: http://linkurio.us/graph-viz-101/
  55. References
      Research papers
      ● Visual Analysis of Complex Networks for Business Intelligence with Gephi. Sébastien Heymann and Bénédicte Le Grand. To appear in the Proceedings of the 1st International Symposium on Visualisation and Business Intelligence, in conjunction with the 17th International Conference Information Visualisation (IV 2013 - VBI).
      ● Gephi: an open source software for exploring and manipulating networks. Mathieu Bastian, Sébastien Heymann and Mathieu Jacomy. In Proceedings of the Third International AAAI Conference on Weblogs and Social Media (ICWSM'09), in American Journal of Sociology (2009), pp. 361-362.
      ● Points of View: Bar Charts and Box Plots. M Streit and N Gehlenborg. Nature Methods 11(2):117 (2014).
      Book chapters
      ● Exploratory Network Analysis: Visualization and Interaction. Sébastien Heymann and Bénédicte Le Grand. To appear in Hocine Cherifi (editor), Complex Networks and their Applications, Cambridge University Press.
      ● Gephi. Sébastien Heymann. To appear in the Encyclopedia of Social Networks and Mining (ESNAM), Springer.
      Books
      ● Exploratory data analysis. Tukey, J. W. (1977).
  56. Linkurious technology. Cloud ready and open source based.
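
As a rough illustration of the fraud-ring idea from the use case (slides 42 to 47), where people in a ring share the same pieces of information, the sketch below treats each identifier as a shared node and groups the customers connected through one. It is only a minimal sketch in plain Python: the deck relies on a Neo4j Cypher query that is not shown, so the union-find grouping used here is a swapped-in stand-in, and the sample records, the `find_rings` name and the `min_size` parameter are all hypothetical.

```python
from collections import defaultdict


def find_rings(customers, min_size=2):
    """Group customers that are linked through shared identifier values.

    customers: dict mapping a customer name to a set of (kind, value) pairs.
    Returns a sorted list of sorted name lists, one per group of at least
    min_size customers.
    """
    parent = {name: name for name in customers}

    def find(name):
        # Follow parent links to the group representative, halving the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Index customers by identifier: each identifier acts as a shared node.
    holders = defaultdict(list)
    for name, identifiers in customers.items():
        for identifier in identifiers:
            holders[identifier].append(name)

    # Customers holding the same identifier end up in the same group.
    for names in holders.values():
        for other in names[1:]:
            parent[find(other)] = find(names[0])

    groups = defaultdict(list)
    for name in customers:
        groups[find(name)].append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) >= min_size)


# Hypothetical sample data (not the links drawn on the slides).
sample = {
    "customer_a": {("address", "addr_1"), ("phone", "phone_1")},
    "customer_b": {("address", "addr_1"), ("phone", "phone_2")},
    "customer_c": {("phone", "phone_2"), ("ssn", "ssn_3")},
    "customer_d": {("address", "addr_9"), ("phone", "phone_9")},
}

rings = find_rings(sample)
print(rings)
```

Here `customer_a` and `customer_b` share an address, and `customer_b` and `customer_c` share a phone number, so the three fall into one candidate ring while `customer_d` stays alone. A group found this way is only a lead for the analyst to explore visually, not proof of fraud.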

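
The design notes on slide 49 (at most 200 nodes visualized, node connections expanded on demand) could be enforced with a small helper that adds a node's neighbours to the visible set only while a node budget allows. This is an illustrative assumption about how such a rule might be applied, not a description of how Linkurious implements it; `expand`, `max_nodes` and the adjacency dictionary are hypothetical names.

```python
def expand(adjacency, visible, node, max_nodes=200):
    """Return the visible node set after expanding one node's neighbours.

    adjacency: dict mapping a node to an iterable of neighbouring nodes.
    visible: set of nodes currently displayed.
    Neighbours are added in sorted order until max_nodes is reached.
    """
    result = set(visible)
    # The expanded node itself is shown if there is room for it.
    if node not in result and len(result) < max_nodes:
        result.add(node)
    for neighbour in sorted(adjacency.get(node, ())):
        if len(result) >= max_nodes:
            break  # Budget exhausted: leave remaining neighbours hidden.
        result.add(neighbour)
    return result


# Hypothetical star graph: one hub address shared by 300 customers.
adjacency = {"hub": [f"customer_{i:03d}" for i in range(300)]}

shown = expand(adjacency, {"hub"}, "hub")
print(len(shown))
```

Capping the expansion keeps the picture within the human limits raised in Q4; a real interface would probably also tell the analyst how many neighbours remain hidden.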