 An interesting study was conducted by Kniesel and Binun \cite{Kniesel2009}. The study combined five design pattern detection tools and proposed an approach called \textit{Data Fusion} for combining their results in order to identify patterns not detected by individual tools. 
They evaluated the similarity scoring tool, DP-Miner, PINOT, PTIDEJ, and FUJABA and reached the following findings. Due to property relaxation, only a weaker set of the constraints of a design pattern holds. Therefore, tools identify general patterns more often than the actual patterns. They proposed notions of subpattern and superpattern relationships that are purely technical and not related to the intent. All subpattern instances are also instances of the superpattern, making the superpattern a more general version. Moreover, they call a pattern smaller if it is defined by a weaker set of constraints. Smaller patterns are therefore identified more reliably, and with higher confidence, by more than one tool. On the other hand, big patterns are rarely classified identically by the tools, and if they are classified identically then they have a very high chance of being true positives. They also noticed that if tools do not agree on big patterns, they sometimes agree on their superpatterns. A superpattern is a witness of a subpattern. EDPs are also considered witnesses. They presented a set of rules for combining the results of the five studied tools. Their approach succeeded in increasing the recall and precision rates for some design patterns, while no improvement was achieved for other patterns due to the high number of false positives reported by the tools.\\
  
 +
 +Example
 +\section{Stage 2: Field Type Probing}
 + \label{field_type_probing}
 +For this stage, we developed a tool called VariableExplorer, available at \url{http://www.cse.yorku.ca/~haneend/FINDER/scripts/VariableExplorer.jar}\footnote{VariableExplorer source code is available at \url{http://www.cse.yorku.ca/~haneend/VariableExplorer}}. VariableExplorer aims to find additional information about the types of the objects contained in collections, as Javex does not provide such information. It collects information about the methods invoked on fields and the parameters passed to those methods. It also collects information about field types. This information is later used to estimate the types of the objects in collections.\\
 +VariableExplorer requires the Java source files of the system being examined. The tool consists of two parts: the first parses the program into an AST (discussed in Section \ref{ast}), and the second traverses the AST to extract facts about method invocations. The traversal is performed by a visitor class called VariableExplorerVisitor. VariableExplorerVisitor extends the org.eclipse.jdt.core.dom.ASTVisitor class, which provides a visitor for abstract syntax trees, and visits the nodes of the provided AST. For this stage, VariableExplorerVisitor visits MethodInvocation nodes. A MethodInvocation node represents a method invocation expression. Below is an example of a MethodInvocation node:
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +FieldA.MethodA(FieldB)
 +\end{Verbatim}
 +
 + FieldA represents the node expression, MethodA represents the method invoked on FieldA, and FieldB represents the node argument.
 +In total, VariableExplorerVisitor handles four kinds of nodes: FieldDeclaration, VariableDeclarationStatement, SingleVariableDeclaration, and MethodInvocation. Below is a description of each:
 +\begin{itemize}
 + \item FieldDeclaration: represents a field declaration statement. Below is an example of a FieldDeclaration node:
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +ClassA FieldA
 +\end{Verbatim}
 + ClassA represents the node type. FieldA  represents the node name.
 + \item VariableDeclarationStatement: represents a local variable declaration statement. Below is an example of a VariableDeclarationStatement node:
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +Method(){
 + ClassA VariableA
 +}
 +\end{Verbatim}
 +
 + ClassA represents the node type. VariableA represents the node name.
 + \item SingleVariableDeclaration: Single variable declaration nodes are used in a limited number of places, including formal parameter lists and catch clauses. They are not used for field declarations and regular variable declaration statements. Below is an example of a SingleVariableDeclaration node:
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +MethodA(ClassA VariableA)
 +\end{Verbatim}
 + ClassA represents the node type. VariableA represents the node name.
 + \item MethodInvocation: represents a method invocation expression. Below is an example of a MethodInvocation node:
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +FieldA.MethodA(FieldB)
 +\end{Verbatim}
 +
 + FieldA represents the node expression, MethodA represents the method invoked on FieldA, and FieldB represents the node argument.
 +\end{itemize}
 +VariableExplorer produces a factbase specifying information about fields and their invoked methods. The output is a ternary relation called MethodInvocation.  Below is an example of a MethodInvocation relation:
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +MethodInvocation ClassC FieldA FieldB
 +\end{Verbatim}
 +
 +This means that ClassC invokes a method on FieldA, passing FieldB as a parameter.
 +
 +
 +\section{Stage 3: Static Facts Refinement and Integration}
 + \label{static_facts_integration}
 +For this stage, we use QL to integrate the static facts produced in the first stage with the facts produced in the second stage by VariableExplorer. The corresponding script is available at \url{http://www.cse.yorku.ca/~haneend/FINDER/scripts/fact_integrate.ql}. \\Assume that we have the following Java code: 
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +public class ClassA{
 +    ClassB variable1;
 +    List<ClassB> variable2;
 +}
 +\end{Verbatim}
 +This example represents a class ClassA that contains two variables: variable1 and variable2. variable1 is of type ClassB. variable2 is a strongly-typed list of type ClassB. The first stage will produce the following facts about ClassA:
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +class_variables ClassA variable1
 +class_variables ClassA variable2
 +variable_descriptors variable1 ClassB
 +variable_descriptors variable2 Collection
 +\end{Verbatim}
 +Note that the first stage is not able to capture the type of the objects in variable2. Instead, variable2 is expressed as a generic collection.\\
 +Next, we investigate the method invocations on generic collections to estimate the type of the objects in the collection. This is done by using the MethodInvocation relation provided by VariableExplorer. Assume that we have the following relations produced by VariableExplorer:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +ClassVariableType ClassA variable1 ClassB
 +ClassVariableType ClassA variable2 Collection
 +MethodInvocation ClassA variable2 variable1
 +\end{Verbatim}
 +
 +This example represents a method invocation on the collection variable2 in ClassA, passing variable1 as a parameter. Method invocations on collections are calls to add, remove, and the other methods supported by the Collection interface in Java, and the passed parameter usually represents an element of the collection. Based on the type of the passed element, we can estimate the type of the objects in the generic collection. Since variable1 is of type ClassB, the generic-typed collection variable2 is changed to a strongly-typed collection of type ClassB. \\
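The inference step described above can be illustrated with a minimal Java sketch. It is not taken from the FINDER scripts, which are written in QL. The names CollectionTypeInference, Invocation, and inferElementTypes are hypothetical, and the sketch assumes, for simplicity, that variable names are unique keys and that the first typed argument seen decides the element type.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CollectionTypeInference {
    /** One row of the MethodInvocation relation: owner class, receiver field, argument field. */
    record Invocation(String owner, String receiver, String argument) {}

    /**
     * For every receiver declared as the generic placeholder "Collection",
     * guess its element type from the declared type of an argument passed to it.
     * Assumption of this sketch: the first typed argument wins.
     */
    static Map<String, String> inferElementTypes(Map<String, String> declaredTypes,
                                                 List<Invocation> invocations) {
        Map<String, String> inferred = new HashMap<>();
        for (Invocation inv : invocations) {
            String receiverType = declaredTypes.get(inv.receiver());
            String argumentType = declaredTypes.get(inv.argument());
            if ("Collection".equals(receiverType)
                    && argumentType != null
                    && !"Collection".equals(argumentType)) {
                inferred.putIfAbsent(inv.receiver(), argumentType);
            }
        }
        return inferred;
    }

    public static void main(String[] args) {
        // Facts from the running example: variable1 is a ClassB, variable2 is a generic collection.
        Map<String, String> declared = Map.of("variable1", "ClassB", "variable2", "Collection");
        List<Invocation> facts = List.of(new Invocation("ClassA", "variable2", "variable1"));
        System.out.println(inferElementTypes(declared, facts));
    }
}
```

A receiver that never receives a typed argument is absent from the result and therefore stays a generic-typed collection.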
 +
 +One limitation of VariableExplorer is that it is not able to find the full type name of fields and method parameters. VariableExplorer gathers information from AST nodes, which represent field and variable types as plain strings. Assume that the full name of ClassB is ClassC.ClassB, and that we have the following FieldDeclaration node in ClassA:
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +ClassB variable1
 +\end{Verbatim}
 +
 +VariableExplorer will produce the following relation:
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +ClassVariableType ClassA variable1 ClassB
 +\end{Verbatim}
 +
 +VariableExplorer will not obtain the full name of ClassB, which is ClassC.ClassB.\\
 +In this stage we address this limitation by using the uses relation described in the first stage. Assume that we have the following factbase:
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +uses ClassA ClassC.ClassB
 +ClassVariableType ClassA variable1 ClassB
 +\end{Verbatim}
 +
 +The first fact was created in the first stage, and the second fact was created in the second stage. The facts represent a class ClassA that uses a class ClassC.ClassB, where ClassA contains a field of type ClassB. Since ClassA uses ClassC.ClassB, we infer that the full type of variable1 is ClassC.ClassB, and we update strongly-typed collections with the full type name.\\
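As a rough illustration of this name resolution, the Java sketch below matches a simple type name against the qualified names that a class uses. It is a sketch under stated assumptions, not the QL script itself: FullTypeResolver and resolve are hypothetical names, and returning no result when several used classes share the same simple name is a design choice of this sketch that the text above does not specify.

```java
import java.util.List;
import java.util.Optional;

public class FullTypeResolver {
    /**
     * Given the qualified names that a class "uses" and a simple type name,
     * returns the qualified name whose last segment equals the simple name.
     * Assumption of this sketch: an ambiguous match yields Optional.empty().
     */
    static Optional<String> resolve(List<String> usedQualifiedNames, String simpleName) {
        List<String> matches = usedQualifiedNames.stream()
                .filter(q -> q.equals(simpleName) || q.endsWith("." + simpleName))
                .distinct()
                .toList();
        return matches.size() == 1 ? Optional.of(matches.get(0)) : Optional.empty();
    }

    public static void main(String[] args) {
        // ClassA uses ClassC.ClassB and declares a field of simple type ClassB.
        System.out.println(resolve(List.of("ClassC.ClassB"), "ClassB"));
    }
}
```

Matching on the suffix after a dot avoids confusing ClassB with a different class such as MyClassB.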
 +The final factbase consists of all the relations produced by the second stage, in addition to the relations shown in Table \ref{integrate_relations}:
 +
 +\begin{table}
 +\caption{Relations produced by Stage 3 \label{integrate_relations}}
 +\begin{center}
 +{\scriptsize
 +
 +\begin{tabular}{|l|l|p{5cm}|}
 + \hline
 + \textbf{Relation Name} & \textbf{Example}  & \textbf{Explanation}\\
 + \hline
 + class\_typed\_collections & class\_typed\_collections ClassA CollectionC ClassB &  ClassA contains a strongly-typed collection CollectionC of type ClassB. Moreover, if a class contains multiple separate references of another class, it is also considered a strongly-typed collection of the latter class type. \\
 + \hline
 + generic\_typed\_collections & generic\_typed\_collections ClassA CollectionC &  ClassA contains a generic-typed collection CollectionC.\\
 + \hline
 +\end{tabular}
 +}
 +\end{center}
 +\end{table}
 +
 +\section{Stage 4: Design Pattern Instance Recovery}
 + \label{instance_recovery}
 +In the last stage, we use QL to filter the factbase for instance candidates that satisfy the static definitions of the design patterns that we want to detect. For each design pattern, we define its participating roles, and for each role, we define the relations that it must satisfy. These static definitions were created by studying each design pattern's structure, intent, and implementation. They are denoted using the FINDER notation discussed in Section \ref{model} and translated to rules written as QL scripts. These QL scripts are available at \url{http://www.cse.yorku.ca/~haneend/FINDER/scripts/ql/}. The following is an example of a QL script for a design pattern that consists of three roles: 
 +
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +DP[ClassA,ClassB,ClassC] = {
 + //ClassA uses ClassB
 + uses[ClassA,ClassB]; 
 + //ClassB uses ClassC
 + uses[ClassB,ClassC]; 
 + //ClassA contains a method that returns ClassB
 + class_methods[ClassA, method_id];
 + method_return[method_id, ClassB];
 + //ClassB contains a static method that returns ClassC
 + class_methods[ClassB, method2_id];
 + method_static[method2_id];
 + method_return[method2_id, ClassC];
 +}
 +\end{Verbatim}
 +
 +
 +The output of this script is a list of instance candidates that satisfy these rules. A single design pattern instance candidate consists of role candidate class names corresponding to the design pattern roles. The following shows an example of two candidates for the previous script:\\
 +
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +//DP RoleA, RoleB, RoleC
 +  DP ClassD ClassE ClassF
 +  DP ClassG ClassH ClassI
 +\end{Verbatim}
 +
 +The first line defines the design pattern roles: RoleA, RoleB, and RoleC. The other two lines represent two instance candidates. The number of classes in each line has to match the number of roles. The first instance candidate consists of ClassD (a candidate for RoleA), ClassE (a candidate for RoleB), and ClassF (a candidate for RoleC).\\ In addition to design pattern roles, design pattern instance candidates may contain a method or a field that has an important function in the design pattern, such as the factory method in the FactoryMethod design pattern and the Collection object in the Composite design pattern.
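To make the semantics of the three-role script concrete, the following Java sketch evaluates the same conditions over in-memory relations. It only illustrates the intended meaning and is not how QL evaluates the script. ThreeRoleQuery, Pair, Triple, and the method names are hypothetical, and each relation is assumed to be available as a set of tuples.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ThreeRoleQuery {
    record Pair(String left, String right) {}
    record Triple(String roleA, String roleB, String roleC) {}

    /** True if owner has a method returning returnType (and static, if required). */
    static boolean hasMethodReturning(String owner, String returnType, boolean mustBeStatic,
                                      Set<Pair> classMethods, Set<Pair> methodReturn,
                                      Set<String> methodStatic) {
        for (Pair cm : classMethods) {
            if (cm.left().equals(owner)
                    && methodReturn.contains(new Pair(cm.right(), returnType))
                    && (!mustBeStatic || methodStatic.contains(cm.right()))) {
                return true;
            }
        }
        return false;
    }

    /** Collects every (A, B, C) that satisfies all rules of the example script. */
    static Set<Triple> query(Set<Pair> uses, Set<Pair> classMethods,
                             Set<Pair> methodReturn, Set<String> methodStatic) {
        Set<Triple> result = new LinkedHashSet<>();
        for (Pair ab : uses) {
            for (Pair bc : uses) {
                if (!ab.right().equals(bc.left())) {
                    continue; // A uses B and B uses C must share the same B
                }
                String a = ab.left(), b = ab.right(), c = bc.right();
                if (hasMethodReturning(a, b, false, classMethods, methodReturn, methodStatic)
                        && hasMethodReturning(b, c, true, classMethods, methodReturn, methodStatic)) {
                    result.add(new Triple(a, b, c));
                }
            }
        }
        return result;
    }
}
```

Each resulting triple corresponds to one line of the DP relation, with one class per role.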
 +\section{Example}
 +\begin{figure}[ht]
 + \caption{State \label{example:model:state}}
 + \begin{center}
 + \includegraphics[height=3.3in]{state}
 + \end{center}
 + \end{figure}
 +In this section, we show how the FINDER model of the State design pattern discussed in Section \ref{FINDER_state} is translated to a set of rules written in QL. The State design pattern detection script is available at \url{http://www.cse.yorku.ca/~haneend/FINDER/scripts/ql/state.ql}.\\
 +Below are the rules represented by this model followed by the QL script that they are translated to:
 + \begin{enumerate}
 + \item  ConcreteState must inherit from State:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +inherits[concreteState,state];
 +\end{Verbatim}
 + \item  ConcreteState must be concrete:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +class_concrete[concreteState];
 +\end{Verbatim}
 + \item Context must contain a field of type State:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +class_variables[context, variable_id];
 +variable_descriptors[variable_id,state];
 +\end{Verbatim}
 + \item  Context must contain a method that calls a method in State:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +class_methods[concreteState, method_id];
 +class_methods[context, method2_id];
 +method_invoke[method2_id, method_id];
 +\end{Verbatim}
 + \item Variation1:
 + \begin{itemize}
 + \item CreationInContext:\\ Context must contain a method that creates a new ConcreteState:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +class_methods[context, method_id];
 +method_new[method_id, concreteState];
 +\end{Verbatim}
 + \item  CreationInConcreteState:\\ ConcreteState must contain a method that creates a new ConcreteState: 
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +class_methods[concreteState, method_id];
 +method_new[method_id, concreteState];
 +\end{Verbatim}
 + Also, the Context gets State passed to it through method parameters:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +class_methods[context, method_id];
 +method_parameters[method_id, state];
 +\end{Verbatim}
 + Or by calling a method that returns a State:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +class_methods[context, method_id];
 +method_invoke_direct[method_id,method2_id];
 +method_return[method2_id, state];
 +\end{Verbatim}
 + \end{itemize}
 + \item Option1 (\textit{ContextUsage}):\\ ConcreteState must contain a reference to Context, either by having a field of type Context:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +class_variables[concreteState, variable_id];
 +variable_descriptors[variable_id,context];
 +\end{Verbatim}
 + Or having Context passed to it through method parameters:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +class_methods[concreteState, method_id];
 +method_parameters[method_id, context];
 +\end{Verbatim}
 + Or by calling a method that returns Context:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +class_methods[concreteState, method_id];
 +method_invoke_direct[method_id,method2_id];
 +method_return[method2_id, context];
 +\end{Verbatim}
 +
 + 
 +\end{enumerate}
 +The result of the previous scripts is a relation called DP that consists of three roles: Context, State, and ConcreteState. This relation contains instance candidates for the State design pattern that satisfy the FINDER rules specified in the State FINDER definition. Each line represents a single instance candidate. Below is a sample output for the DP relation:
 +
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +//DP Context State ConcreteState
 +  DP Class1 Class2 Class3
 +  DP Class1 Class2 Class4
 +  DP Class1 Class2 Class5
 +  DP Class6 Class7 Class8
 +  DP Class6 Class8 Class9
 +\end{Verbatim}
 +
 +
 +\begin{enumerate}
 +\setcounter{enumi}{6}
 + \item Post-processing rule: There must be at least two ConcreteState role candidates in the same State anchor instance:
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +contextCandidates = dom DP
 +for context in contextCandidates
 +{
 +  state_concreteState = {context} . DP
 +  stateCandidates = dom state_concreteState
 +  for state in stateCandidates
 +  {
 +    concreteStates = {state} . state_concreteState
 +    if(#concreteStates[&0] < 2)
 +    {
 +      DP = DP - DP[&0 =~ context && &1 =~ state]
 +    }
 +  }
 +}
 +\end{Verbatim}
 +
 +Running this post-processing rule eliminates the instance candidates that do not satisfy it. The last two instance candidates in DP will be deleted, and the final output will be as follows:
 +\pagebreak[1]
 +\begin{Verbatim}[baselinestretch=1,frame=single]
 +//DP Context State ConcreteState
 + DP Class1 Class2 Class3
 + DP Class1 Class2 Class4
 + DP Class1 Class2 Class5
 +\end{Verbatim}
 +
 +The results show that there are three instance candidates of the State design pattern. Class1 plays the role of Context, Class2 plays the role of State, and Class3, Class4, and Class5 play the role of ConcreteState.
 +\end{enumerate}
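The effect of the post-processing rule can also be sketched in Java. This is a hypothetical re-expression of the QL loop above, not the FINDER implementation. StatePostProcessing, Candidate, and keepAnchorsWithAtLeastTwo are illustrative names, and the anchor is assumed to be the (Context, State) pair.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StatePostProcessing {
    /** One row of the DP relation: Context, State, ConcreteState. */
    record Candidate(String context, String state, String concreteState) {}

    /** Keeps only candidates whose (Context, State) anchor has at least two ConcreteStates. */
    static List<Candidate> keepAnchorsWithAtLeastTwo(List<Candidate> candidates) {
        // Group the distinct ConcreteState names under each (Context, State) anchor.
        Map<List<String>, Set<String>> byAnchor = new HashMap<>();
        for (Candidate c : candidates) {
            byAnchor.computeIfAbsent(List.of(c.context(), c.state()), k -> new HashSet<>())
                    .add(c.concreteState());
        }
        return candidates.stream()
                .filter(c -> byAnchor.get(List.of(c.context(), c.state())).size() >= 2)
                .toList();
    }

    public static void main(String[] args) {
        List<Candidate> dp = List.of(
                new Candidate("Class1", "Class2", "Class3"),
                new Candidate("Class1", "Class2", "Class4"),
                new Candidate("Class1", "Class2", "Class5"),
                new Candidate("Class6", "Class7", "Class8"),
                new Candidate("Class6", "Class8", "Class9"));
        keepAnchorsWithAtLeastTwo(dp).forEach(System.out::println);
    }
}
```

On the sample DP relation, the anchors (Class6, Class7) and (Class6, Class8) each have a single ConcreteState, so only the three candidates anchored at (Class1, Class2) remain.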
  
  
  