Welcome to Software Clustering Encyclopedia

Software clustering methods group the entities of a software system, such as classes or source files, into subsystems, and produce views that uncover otherwise hidden facts about the system.

Such methods have been successfully applied to many reverse engineering problems. The volume of published research on software clustering reflects the importance that the research community places on developing effective software clustering techniques.

Research on software clustering has taken three directions:

Development of better software clustering algorithms or the improvement of existing ones
Evaluation of existing software clustering methods
Investigation of further applications of clustering methodologies in a software context

Cluster Analysis

Cluster analysis is a family of multivariate techniques whose primary purpose is to group entities based on their attributes. Entities are classified according to predetermined selection criteria, so that similar entities are placed in the same cluster. The objective of any clustering algorithm is to sort entities into groups so that the variation between clusters is maximized relative to the variation within clusters.
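The objective stated above can be made concrete with a small numeric illustration. The following Python sketch is not taken from any particular clustering tool. It assumes each entity is described by a single numeric attribute, and the function name `variation_ratio` and the sample data are hypothetical. It compares the between-cluster sum of squares with the within-cluster sum of squares, so a better grouping yields a larger ratio.

```python
from statistics import mean


def variation_ratio(clusters):
    """Return between-cluster variation divided by within-cluster variation.

    `clusters` is a list of clusters, each a non-empty list of numeric
    attribute values (one value per entity; a simplifying assumption).
    """
    all_values = [v for cluster in clusters for v in cluster]
    grand_mean = mean(all_values)
    # Between-cluster sum of squares: how far each cluster mean lies
    # from the overall mean, weighted by cluster size.
    between = sum(len(c) * (mean(c) - grand_mean) ** 2 for c in clusters)
    # Within-cluster sum of squares: how far each entity lies from
    # the mean of its own cluster.
    within = sum((v - mean(c)) ** 2 for c in clusters for v in c)
    return between / within if within else float("inf")


# Hypothetical attribute values for four entities, grouped two ways.
good_grouping = [[1.0, 2.0], [9.0, 10.0]]   # similar entities together
poor_grouping = [[1.0, 9.0], [2.0, 10.0]]   # dissimilar entities together

good_ratio = variation_ratio(good_grouping)
poor_ratio = variation_ratio(poor_grouping)
print(good_ratio > poor_ratio)
```

Real software entities usually carry many attributes, in which case the squared differences would be replaced by squared distances between attribute vectors.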

The typical stages of cluster analysis techniques are as follows:

Fact Extraction and Filtering
Analysis
  1. Compute resemblance coefficients
  2. Cluster Creation
Results Visualization
User Feedback Collection

The process may then be repeated until satisfactory results are obtained.
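The two steps of the Analysis stage can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive implementation. The extracted facts are assumed to be sets of identifiers referenced by each source file. The Jaccard coefficient is chosen here as one possible resemblance coefficient, and single-linkage agglomerative merging with a similarity threshold as one possible way to create clusters. The file names, the feature sets, and the names `jaccard` and `create_clusters` are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard resemblance coefficient between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def create_clusters(facts, threshold):
    """Group entities by single-linkage agglomerative clustering.

    `facts` maps an entity name to its set of features. Merging stops
    once no two clusters have a similarity of at least `threshold`.
    """
    # Step 1: compute resemblance coefficients for every pair of entities.
    sim = {
        frozenset((x, y)): jaccard(facts[x], facts[y])
        for x, y in combinations(facts, 2)
    }
    # Step 2: cluster creation, starting from one cluster per entity.
    clusters = [{name} for name in facts]
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            # Single linkage: similarity of the closest pair of members.
            s = max(sim[frozenset((x, y))]
                    for x in clusters[i] for y in clusters[j])
            if s > best:
                best, best_pair = s, (i, j)
        if best < threshold:
            break
        i, j = best_pair
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Hypothetical facts: identifiers referenced by each source file.
facts = {
    "parser.c": {"token", "ast", "lexer"},
    "lexer.c": {"token", "lexer", "buffer"},
    "socket.c": {"connect", "buffer", "packet"},
    "http.c": {"connect", "packet", "header"},
}
result = create_clusters(facts, threshold=0.4)
print(sorted(sorted(c) for c in result))
```

With these sample facts the parsing files and the networking files end up in separate subsystems. In an iterative process, user feedback would typically lead to adjusting the threshold, the extracted features, or the choice of coefficient before the next run.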
