DragonBreath
An Optimization Engine based on Constraint Programming and Local Search


[HOMEPAGE] -> [Documentation] -> [Interfaces to the Engine] -> [Java] -> [Structural Constraints] -> [Internal Structure]

  [General Approach]   [Docking Graph]   [Test Graph]   [Graph Repair]


Docking Graph

In this section the search for structural constraint instances in the CSP graph is described.

Subgraph Isomorphism
Features of the Problem
Structural Induction
Splitting the Graph
Class Structure DockingGraph
Finding One Part
Saving Matches
Combining the Parts
One Step in the Search Process

Subgraph Isomorphisms

The problem of finding a docking in the CSP graph is a subproblem of the general problem of finding a subgraph isomorphism. Unfortunately the subgraph isomorphism problem is np-complete. To solve such problems in a decent time frame, as is required for DragonBreath (Requirement 12), there are two general approaches:

We can not choose the first possibility, as we need all matches of the docking graph into the CSP graph, and none less. So we have to try and find special features of the given problem that could speed up the search process.

Features of the Problem

What is typical for our problem?

The graphs can take any form. They can be cyclic, or in tree form. They can be big or consist of only one single node. So the form of the graphs can not be used to speed up search. And the application conditions follow other matching rules then the rest of the graph.

Nevertheless there are some features that can be used:
In contrast to the general subgraph isomorphism problem we only need those isomorphisms of the docking graph that map the whole graph to the CSP graph. Furthermore we have graphs with attributes which also speeds up search.

In addition the CSP graph is created step by step. This is due to the used local search. The repair heuristics of the structural and the conventional constraints change the graph locally and step by step a correct graph is created. The graph is only finalized, when a solution to the SCSP is found. The steady changes require a constant update of the matches. We can use this fact and spread the costs of match search throughout the whole creational process -- with little work after each step.

Structural Induction

So we want to use the fact that the SCP-Graph is constructed in a incremental way. The idea is similar to a structural induction. From knowledge of a part of the graph and a special step we infer knowledge of the whole graph.

The induction hypothesis: If we know all the matches in the whole graph, and a new node n is added then we only have to test for matches containing n, and have again all the matches in the graph.

The induction start: For the the empty graph we obviously know all the matches.

The induction step: If we know each match with the nodes already in the graph, then a new match has to use the new node. If a new match does not use it, it would exist independent of the addition. Hence the match existed already before $n$ was added. But that contradicts the hypothesis that all the matches in the old graph were known. So if we found all the new matches containing n we again know all the matches of the graph.

This proceeding guarantees that with every new node only very local search for graph matches has to take place. Of course that is only true as long as there are no new structural constraints added after nodes were already build, as nothing would be known about matches of the new dockinggraph with the existing graph. Fortunately that should not be a problem, as normally all the structural constraints are already known before the search for a solution is started.

When a node is removed we can analogous remove all matches that contained the deleted node and again have all existing matches in the whole graph. No match that did not contain said node could have been destroyed.

But there are exceptions for the application conditions:
PAC If a node of a PAC is removed that does not automatically mean that also all instances that contained the PAC are now invalid. There might allways exist a substitute for the PAC that keeps the docking graph match valid.
NAC NACs work the other way around: When a NAC is found the match gets invalid. So with each extension of the CSP graph it has to be tested if new NACs were created. And if a node was removed that belonged to a NAC match all docking graph matches that were invalid due to the NAC have to be created.

Splitting the Graph

The docking graph has some hyphenation points namely where the application conditions meet the normal graph, and where parts of the docking graph are not connected at all.

Each contiguous section of the graph that is neither PAC nor NAC is called a mainpart.

Wherever a mainpart and an application condition are connected the node of the mainpart that establishes the connection is included in both sections. When the graph is combined both nodes must be guaranteed to have the same interpretation. The main parts are allways searched first, so they have their interpretation before the ACs. So the connecting nodes of the ACs get a link to their counterpart in the main parts. Whenever such an AC node gets a new interpretation assigned it can test via the link if it is compatible with the interpretation of it's sibling. The next figure shows the decomposition of a docking graph into parts.
Split Graph

Now we can search for the parts separately. Why would we want to do that? Because as every part is contiguous it is easy to find each match that contains the new node. We simply have to follow the edges. An other reason is that the parts are different, a positive application condition has to be handled in a different way then a mainpart or a negative application condition.

We no longer know every docking graph match in the existing CSP-Graph, but every match of every part. With these part matches we can construct a match for the whole graph.

The next figure shows a combination of the graph parts. Matches in the CSP graph have a gray background matches of PACs a lighter matches of NACs a darker one. In the beginning at the left the graph parts a) and d) are known twice. Then the node x is added. So the mainpart b) can be found twice. Now all matches of the docking graph that contain the new b) matches have to be found. At the right of the graphic the two newly found matches are shown. One of them is invalid as a NAC can also be found.

Graph Composition

The algorithm implements the principle of devidea and conquer. The spliting of our graph in multiple pieces rises at first sight the complexity, as the sum of search of all pieces has the same cost as the search of one big piece. Plus the pieces have to be combined. But with every new node only some pieces, and not all of them have to be searched, as normally not all parts will contain a node of the type of the new node. Only when a new piece was found the whole graph has to be constructed. And during construction some found part matches can be used for multiple docking graph matches, and do not have to be found for each one again.

Once more the whole algorithm:

For each new node all the parts containing it are searched and stored. For each new found part match all the whole docking graph matches are constructed, using the new part and the parts that were found in the rest of the graph. When a whole match for the docking graph was found a new instance of the Structural Constraint is created.

Class Structure DockingGraph

Docking Class Diagramm

The splitting of the graph is reflected in the class structure of the DockingGraph. A StructuralConstraintGraph consists of mainParts and NACs and PACs. The application condition matches may be moved, i.e. it is only important whether or not a match of an AC exists, not where. Therefore ApplicationConditions are of the type MovableSubgraph. And each MovableSubgraph can consist of several parts that can be freely moved until a match is found.

The docking graph that is used to find new matches is the SCTDockingGraph, the StructuralConstraintTypeDockingGraph. The found matches are represented in the SCIDockingGraph, the StructuralConstraintInstanceDockingGraph.

StructuralConstraintGraph Class Diagramm

Each StructuralConstraintGraphPart consists of nodes and edges. Whenever a new node is added to the CSP graph the searchedNodes are used to find out if a given StructuralConstraintGraphPart needs to check for new matches.

The searchedNodes container is not allways identical with the nodes container. PACs do not allways have to be checked (see PAC), so sometimes the searchedNodes container is empty. Also leaf nodes do not allways have to be searched. We want to recognize, when a CSP graph change changed the matchings. So all the nodes that are affected are tested. But if a leaf node is added, it only leads to a new match, if it is connected to the rest of a valid match. In that case the one neighbor of the leaf node also gets notified, as it's edges are changed. So the match can be found via the neighbor of the leaf. Therefore leaf nodes are not included in the NodeContainer.

The NodeContainer is also used to retieve nodes based on their Type as defined in the Registry.

Finding One Part

Our algorithm profits from the fact that contiguous graphs can easily be found if one node that should be contained in the match is allready known.

So how do we find all matches that contain a certain node?

Each SCNode nd in the tested part, that has the same type as the new node np is taken once as starting point for the following algorithm. These nodes are retrieved via NodeContainer.containsNode(NodeType n)

First np is used as interpretation of nd, then a simple breadth first algorithm assigns all nodes and edges in the docking graph to matching nodes and edges in the problem graph, traveling by the edges through the whole docking graph.

This is done in the findOneEmbedding() function of StructuralConstraintPart. If findOneEmbedding() returns true all the nodes and edges of the tested graph part have nodes in the CSP graph assigned. The found match can be used to create a whole dockinggraph match.

After the first embedding is found there might be more matches with the pair (nd, np). They can be retrieved using the function findNextEmbedding(). As long as findNextEmbedding() returns true another embedding was found. The different matches are found via backtracking.

Saving Matches

We save found matches of graphparts, so we can reuse them for later dockingGraph matches. The easiest way of storing a match is by storing all edge and node assignments. But that requires a lot of storage. For our purpose it would be sufficiente to store only the position of the found matches in the CSP graph by storing one node-interpretation pair. Then we could find the whole match from the pair with the algorithm above.

To be able to combine the stored matches to a preliminary dockingGraph match, we can store little more then just the position. If we store not only one node-interpretation pair, but all the nodes that connect to other graph parts we can easily do a first check whether two parts are compatible or not. The data saved for a match is stored in a GraphPartFrame.

Combining the Parts

When a mainPart was found the match has to be saved and all docking graph matches that contain the mainpart match have to be created.

The saving of a mainPart is done as a GraphPartFrame. Each mainPart contains a list of all the found matches of the type. The frames are used to save memory requirements. They extract the most important features of a match, it's position and the information how it can be combined with matches of neighboring graph parts.

The full matches matches are created in SCTDockingGraph.findAllMatchesWithPartFrame(StructuralConstraintGraphPart part). The algorithm is the same as before. One match is found by combining all fitting matches of all parts. Further matches are found via backtracking. The matches that are combined are the mainParts and all PAC parts that connect more then one mainPart.

It would be possible to combine first all mainparts and then test the application conditions. But this way we can stop the search process as soon as one of the connecting PAC parts can not be found. So search is faster. NACs can only be tested as soon as all mainParts were assigned, as they may not use the same nodes as mainParts. And the nodes the mainParts use are not determined until all mainParts were found. So searching for NACs can not result in an interrupt of the search.

But first of all not part matches are combined but GraphPartFrames. When a correct combination of the frames was found all the frames are recursively filled with real matches. There might be more then one real match with a given combination of frames. During the filling of the frames it has to be guaranteed that no node that is assigned to a mainPart is used by an other graph part. Except for the nodes that establish a link between parts.

Intersections

When an AC is made of several parts it is necessary to find a combination of the matches of the single parts that has no intersections with each other and with the mainParts. In the following figure such an intersection is displayed.

Subgraph Intersections

The same problem exists for TestGraphs where also only one match and not each match has to be found. The mechanisms of sorting a given set of matches of subgraphs so that they do not intersect is therefore extracted into a special superclass, that can be used by both, ACs and the TestGraphs.

Movable Subgraph Class Diagramm

There are two ways of resolving such an intersection. One is by systematically trying all possible combinations of the matches of the parts, until one is found that has no intersections. The other one by only moving parts that are in the way. Both are implemented in the MovableSubgraph class.

One Step in the Search Process

So the primary feature we use is the step by step manner the graph is build in. What change can occur in every step of the search process?

Nodes can be added, removed, connected and disconnected. And thats it! A value change does not affect the structural constraints, and every complex change of structure, like substituting a whole subtree, can be divided into a series of the four elementary changes. As described in "Applying Local Search to Structural Constraint Satisfaction" each docking graph contains normal nodes and edges and PACs and NACs. Each type of subgraph is affected differently by the different operations on the graph.

MainPart

First there is the normal graph, with nodes and edges that have to match for the constraint to match. When a node is added it might be the last missing node in the match of such a mainGraph, so all isomorphisms that contain this node have to be found, as shown in the next figure.

New MainPart Match
With each node or edge added to the constraint graph all mainParts of the dockingGraph are tested for new matches. In the example the node e was added, and in the dockingGraph the node :3 has the same type as the new node. These two nodes are unified, and the dockingGraph is searched with the new interpretation for :3. If that is the case all the embeddings containing the new element are searched via the algorithm presented in section Finding One Part.

The connection of a node does the same, the last missing connection could have been added to a full match, so all isomorphisms that contain the new connection have to be found.

When an existing node is deleted it might have been one of an already found embedding, so all existing embeddings that contain the node have to be removed. The same thing happens if a connection is removed: every docking graph containing it has to be removed. In the next figure node a is removed and one mainPart match and with it one docking graph match is removed.

MainPart Match Node removed

Positive Application Condition

The second type of subgraph is the positive application condition. All of them have to match for the docking graph to match.

If an existing PAC is destroyed the docking graph instance might still be valid, as there could exist an other match for the same PAC. In the next figure first the node d is destroyed, but as there is an other match for the pac via node b the instance stays valid.

PAC Destroyed, Found Again

We allways want to know all the matches in the current CSP-graph, for MainParts it is clear that we have to find all these parts. But we are not interested in all matches of PACs in the graph. If we know that one match exists for a PAC part then we do not have to search for more (as a new match would not change any found graph instance). This fact is used to search only the PAC parts necessary.

Application condition parts can be connected to no main part, to one main part and to multiple main parts (a, b and c in the next figure).

AC Parts

These parts all have to be handled in different ways. Therefore each type of AC part is implemented in a special class, that represents the diverging behaviour.

PAC Class Diagramm

Connected to only one mainPart

They are only searched whenever the connected mainPart was found. This is done to spare computing time. As long as the associated mainPart does not exist the PAC would not contribute to any match. So it is not needed. If the PAC part did not exist in the problem graph at the moment the mainPart was found, then from now on the part is searched with each new node or edge (just as the mainParts are searched.

It is a necessary condition that for a dockingGraph match all the PACs are fullfilled. Therefore if there is a PAC or even a part of one that is not fullfilled with the given interpretation for the mainPart, it is clear that the mainPart can not be used in a whole match.

But once the PAC is found it is well possible, that the mainPart can now be used for a match. Therefore a reference to the mainPart match is stored for the search of the PAC part, so that it can be tried again, when the PAC part is found.

The whole process is illustrated in th next figure. In 2) the pac is created by a new edge, but as the PAC part is not searched it is not found. Then it is destroyed again. In 5) the mainPart is found, but the PAC is missing. The mainPart is stored, and search for the PAC starts. Then in 6) a new node edge creates the PAC, and the mainPart match is checked again, and a match for the dockingGraph is found.

Connected PAC Part

Unconnected

If a part of a PAC is not connected to any mainPart it can not be found easily once the mainParts are found, as there is no fixpoint given as with the connected parts (there all the nodes connected to the mainPart are allready known). They allways have to be searched, so that at each moment all matches in the whole constraint graph are known. It could be reasoned that as only one match for a PAC has to exist, only one match for unconnected parts has to be stored. But this single stored match might use the same nodes as an other part of the same PAC or as a mainPart. Therefore we have to know all the instances of the unconnected parts. This enables us to choose one that is really valid, i.e. not overlapping, with the rest of the dockingGraph.

When a dockingGraph match is not found due to a missing unconnected PAC part that part is marked as missing. Information to restore the rest of the match is stored with the part, so that when it is found the match can be reconstructed.

The process is illustrated in the following figure. There it also becomes clear why more then one match for unconnected parts has to be stored.

Unconnected PAC Part

Connected to multiple mainParts

They also have to be searched allways as each new match for such a part might be the one missing to connect its mainParts. In the following figure first the two mainParts do not exist. The found match is not used. Then a connecting PAC match is created which is compatible with matches for both connected mainParts. The dockingGraph can be constructed.

Connecting PAC Part

For all PAC Parts

After construction of the whole docking graph it is necessary to check again all the PACs. The included mainParts might have used the same nodes, and rendered the PACs invalid. In the next figure that happenes. But there is a substitute for the match used first.

PAC got invalid

Negative Application Condition

The last type of subgraph is the negative application condition. If a match of a NAC is found the whole docking graph does not match any more. The following figure illustrates how an existing docking graph match is destroyed by the new node c wich completes the NAC.

NAC found Match invalid

And, as shown in the next figure, a disconnected or deleted node can do the opposite and destroy an NAC and thereby create a match.

NAC removed => Match found

In contrast to PACs NACs can only be validated once all the mainParts were assigned. Any mainPart could use a node needed by a NAC, and thereby destroy it again. So all NACs connected to mainParts only have to be checked at the end of the matching procedure.

If an embedding could not created due to a NAC, the found dockingGraph match is stored with the NAC embedding. Whenever a node used in the NAC embedding is destroyed the dockingGraph match is tried again. For unconnected NAC parts the argumentation is the same as for unconnected PAC parts: as the part can not be found easily from a mainPart all the instances have to be found in advance.

The class structure for NACs differenciates only between NACs that are connected to any mainPart and those that are not connected at all.

NAC Class Diagramm

Sequence

The constraintType calls the SCTDockingGraph to check the changed nodes. Then first the pending NAC parts have to be found. That are the unconnected ones and the parts that were found and are impeding matches. This has to be done before the mainParts are searched, because when a mainPart match is found and combined to a whole match the NACs should allready be known, to avert the creation of a match that will be removed again anyway, when the NAC was tested. When a mainPart is tested it finds all matches with the given nodes. When a connecting PAC part was not created due to the missing of this mainPart it is created now.

If the PACs connected exclusively to the mainPart were all found all dockingGraph matches with the new match are searched. When a combination of the stored GraphPartFrames was found with tryNextBranch() they are filled in checkAllGraphParts() and then all the ACs are checked again. When the ACs were fulfilled too, an instance of the dockingGraph is created.


[HOMEPAGE] -> [Documentation] -> [Interfaces to the Engine] -> [Java] -> [Structural Constraints] -> [Internal Structure]

  [General Approach]   [Docking Graph]   [Test Graph]   [Graph Repair]



For questions, comments or suggestions, please visit our feedback forum.

Last update:
May 21, 2003 by Tim Baermann