ACDC: An Algorithm for
Comprehension-Driven Clustering
         By: Jimmy Carlos
              ejemplo
Highest cohesion clustering
 [Figure: subsystems SS1 to SS12]
Contents of SS11
Essential comprehension features
 Effective cluster naming
 Bounded cluster cardinality
 Familiarity
   Comprehension as pattern recognition
   Certain subsystem patterns emerge often in
   manual decompositions of software systems
Source file pattern
 [Figure: two source files, File1 and File2, containing procedures Proc1 to Proc6 and variables Var1 to Var3]
Directory structure pattern
 [Figure: two directories, Dir1 and Dir2, containing files File1 to File9]
Body-header pattern
 [Figure: alice.c with alice.h, bob.c with bob.h]
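
The body-header pattern can be illustrated with a minimal sketch. It assumes that a body and its header are matched purely by sharing a base name (alice.c with alice.h); the function name `body_header_clusters` and that matching rule are illustrative assumptions, not something the slides specify.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def body_header_clusters(files):
    """Group each .c file with the .h file that shares its base name.

    Assumption: matching is by base name only, so files with the same
    name in different directories would collide in this sketch.
    """
    by_stem = defaultdict(dict)
    for name in files:
        path = PurePosixPath(name)
        if path.suffix in (".c", ".h"):
            by_stem[path.stem][path.suffix] = name
    # Keep only stems for which both a body and a header were found.
    return {
        stem: sorted(pair.values())
        for stem, pair in by_stem.items()
        if len(pair) == 2
    }


clusters = body_header_clusters(
    ["alice.c", "alice.h", "bob.c", "bob.h", "sin.c"]
)
```

Here `sin.c` has no header and is left unclustered, so a later step would still have to place it.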
Leaf collection pattern
 [Figure: sin.c, cos.c, tan.c]
Support library pattern
 [Figure: busy.c, tired.c, weary.c]
Central dispatcher pattern
 [Figure: dispatcher.c]
Subgraph dominator pattern
 [Figure: dominator.c at the top; a.c, b.c, c.c below it; d.c, e.c, f.c, g.c, z.c on the bottom row]
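
One way to read the subgraph dominator pattern is sketched below: a node dominates every node that can be reached from it and that has no dependency coming in from outside that set. This is a simplified interpretation, not necessarily the exact rule ACDC applies. The edges in the example graph are hypothetical, since the figure's edges did not survive in the text; only the file names come from the slide.

```python
def dominated_set(graph, root):
    """Return root plus the nodes that are only reachable through root.

    graph maps each node to the set of nodes it depends on.
    """
    # Build the reverse mapping: node -> set of nodes that point to it.
    preds = {}
    for src, dsts in graph.items():
        for dst in dsts:
            preds.setdefault(dst, set()).add(src)

    # Collect everything reachable from root.
    reach = {root}
    stack = [root]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in reach:
                reach.add(nxt)
                stack.append(nxt)

    # Repeatedly drop nodes that something outside the set points to.
    changed = True
    while changed:
        changed = False
        for node in list(reach):
            if node != root and not preds.get(node, set()) <= reach:
                reach.discard(node)
                changed = True
    return reach


# Hypothetical edges: z.c also uses g.c, so g.c is not dominated.
graph = {
    "dominator.c": {"a.c", "b.c", "c.c"},
    "a.c": {"d.c", "e.c"},
    "b.c": {"e.c", "f.c"},
    "c.c": {"g.c"},
    "z.c": {"g.c"},
}
cluster = dominated_set(graph, "dominator.c")
```

With these assumed edges, `g.c` stays out of the cluster because `z.c` reaches it without passing through `dominator.c`.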
The ACDC algorithm
 Two stages:
   Using a pattern-driven approach, a “skeleton”
   of the final decomposition is created.
   Subsystems are named appropriately.
   The decomposition is completed by applying an
   extended version of the Orphan Adoption
   algorithm
Skeleton construction
 Source file clusters
 Body-header conglomeration
 Leaf collection and support library
 identification
 Ordered and limited subgraph domination
 Creation of “support.ss”
Orphan Adoption
 Incremental clustering technique
 Orphan: a newly introduced resource to a
 software system
 Orphans are adopted by the subsystem that
 interacts with them the most
 Assuming that a substantial skeleton has
 been constructed in the first stage, the same
 technique can be applied here
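
The adoption rule above can be sketched as follows. This is a simplified illustration, not the extended Orphan Adoption algorithm itself: it assumes that "interaction" is counted as the number of dependency edges between an orphan and the members of a subsystem, and it breaks ties by subsystem name. The names `adopt_orphans`, `skeleton`, and `edges` are hypothetical.

```python
from collections import Counter


def adopt_orphans(skeleton, edges, orphans):
    """Assign each orphan to the subsystem it shares the most edges with.

    skeleton: dict mapping an already clustered resource to its subsystem.
    edges:    iterable of (source, target) dependency pairs.
    orphans:  resources not yet assigned to any subsystem.
    Returns the completed assignment and the orphans left unadopted.
    """
    edges = list(edges)
    owner = dict(skeleton)
    pending = list(orphans)
    progress = True
    # Repeat passes so orphans can follow other orphans adopted earlier.
    while pending and progress:
        progress = False
        remaining = []
        for orphan in pending:
            votes = Counter()
            for src, dst in edges:
                if src == orphan and dst in owner:
                    votes[owner[dst]] += 1
                elif dst == orphan and src in owner:
                    votes[owner[src]] += 1
            if votes:
                # Most edges wins; ties go to the alphabetically first name.
                owner[orphan] = min(votes, key=lambda s: (-votes[s], s))
                progress = True
            else:
                remaining.append(orphan)
        pending = remaining
    return owner, pending


skeleton = {"a.c": "SS1", "b.c": "SS1", "c.c": "SS2"}
edges = [
    ("x.c", "a.c"), ("x.c", "b.c"), ("x.c", "c.c"),
    ("y.c", "x.c"), ("w.c", "w2.c"),
]
owner, unadopted = adopt_orphans(skeleton, edges, ["y.c", "x.c", "w.c"])
```

In this example `x.c` joins SS1 (two edges against one), `y.c` follows it on the next pass, and `w.c` stays unadopted because it touches no clustered resource.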
ACDC properties
 Subsystems have familiar or intuitive names
 The cardinality of the subsystems is
 bounded
 The final decomposition is nested and
 unbalanced
 Limited use of the directory pattern
 Magic numbers not important
Algorithm validation
 We experimented with two different
 software systems, TOBEY and Linux.
 We measured the following:
                   TOBEY    Linux
   Performance     54 sec   84 sec
   Stability       81.3%    69.4%
   Skeleton size   64.3%    51.1%
   Quality         64.2%    55.7%
Conclusions
 Clustering approaches should focus on
 comprehension
 Pattern-driven approach appears to perform
 satisfactorily
 Impact of ACDC’s features on
 comprehension remains to be determined
