# Matching like Patterns

We consider procedures which amount to performing a complete match between two structures

Matching like Patterns

We consider procedures which amount to performing a complete match between two structures

The match will be accomplished by comparing the two structures and testing for equality among the corresponding parts

Pattern variables will be used for instantiations of some parts subject to restrictions Matching Substrings

A basic function required in many match algorithms is to determine if a substring S2 consisting of m characters occurs somewhere in a string S1 of n characters, m ≤ n

A direct approach to this problem is to compare the two strings character-by-character, starting with the first characters of both S1 and S2

If any two characters disagree, the process is repeated, starting with the second character of S1 and matching again against S2 character-by-character until a match is found or disagreements occurs again

This process continues until a match occurs or S1 has no more characters

Let i and j be position indices for string S1 and a k position index for S2

We can perform the substring match with the following algorithm

:= 0

While i ≤ (n – m + 1) do

begin

i := i + 1;

j := i;

k :=1;

while S1(j) = S2(k) do

begin

if k = m

writeln(“success”)

else do

begin

j := j +1;

k := k + 1;

end

end

end

writeln(“fail”)

end

This algorithm requires m(n - m) comparisons in the worst case

A more efficient algorithm will not repeat the same comparisons over and over again

One such algorithm uses two indices I and j, where I indexes the character positions in S1 and j is set to a “match state” value ranging from 0 to m

The state 0 corresponds to no matched characters between the strings, while state 1 corresponds to the first letter in S2 matching character i in S2

State 2 corresponds to the first two consecutive letters in S2 matching letters i and i+1 in S1 respectively, and so on, with state m corresponding to a successful match

Whenever consecutive letters fail to match, the state index is reduced accordingly Matching Graphs

Two graphs G1 and G2 match if they have the same labeled nodes and same labeled arcs and all node-to-node arcs are the same

If G2 with m nodes is a subgraph of G1 with n nodes, where n ≥ m

In a worst cas match, this will require n!/(n - m)! node comparison and O(m2) arc comparisons

Finding subgraph isomorphisms is also an important matching problem

An isomorphism between the graphs G1 and G2 with vertices V1, V2 and edges E1, E2, that is, (V1, E1) and (V2, E2) respectively, is a one-to-one mapping to f between V1 and V2, such that for all v1 € V1, f(v1) = v2, and for each arc e1 € E1 connecting v1 and v1 , there is a corresponding arc e2 € E2 connecting f(v1) and f(v1)

Matching Sets and Bags

An exact match of two sets having the same number of elements requires that their intersection also have the number of elements

Partial matches of two sets can also be determined by taking their intersections

If the two sets have the same number of elements and all elements are of equal importance, the degree of match can be the proportion of the total members which match

If the number of elements differ between the sets, the proportion of matched elements to the minimum of the total number of members can be used as a measure of likeness

When the elements are not of equal importance, weighting factors can be used to score the matched elements

For example, a measure such as

s(S1,S2)                    =   (∑ wi N(ai))/m

could be used, where wi =1 and N(ai) = 1 if ai is in the intersection; otherwise it is 0

An efficient way to find the intersection of two sets of symbolic elements in LISP is to work through one set marking each element on the elements property list and then saving all elements from the other list that have been marked

The resultant list of saved elements is the required intersection

Matching two bags is similar to matching two sets except that counts of the number of occurrences of each element must also be made

For this, a count of the number of occurrences can be used as the property mark for elements found in one set. This count can then be used to compare against a count of elements found in the second set

Matching to Unify Literals

One of the best examples of nontrivial pattern matching is in the unification of two FOPL literals

For example, to unify P(f(a,x), y ,y) and P(x, b, z) we first rename variables so that the two predicates have no variables in common

This can be done by replacing the x in the second predicate with u to give P(u, b, z)

Compare the two symbol-by-symbol from left to right until a disagreement is found

Disagreements can be between two different variables, a nonvariable term and a variable, or two nonvariable terms. If no disagreements is found, the two are identical and we have succeeded

If disagreements is found and both are nonvariable terms, unification is impossible; so we have failed

If both are variables, one is replaced throughout by the other. Finally, if disagreement is a variable and a nonvariable term, the variable is replaced by the entire term

In this last step, replacement is possible only if the term does not contain the variable that is being replaced. This matching process is repeated until the two are unified or until a failure occurs

For the two predicates P, above, a disagreement is first found between the term f(a,x) and variable u. Since f(a,x) does not contain the variable u, we replace u with f(a,x) everywhere it occurs in the literal

This gives a substitution set of {f(a,x)/u} and the partially matched predicates P(f(a,x),y ,y) and P(f(a,x), b, z)

Proceeding with the match, we find the next disagreements pair y and b, a variable and term, respectively

Again we replace the variable y with the terms b and update the substitution list to get {f(a,x)/u, b/y}

The final disagreements pair is two variables. Replacing the variable in the second literal with the first we get the substitution set {f(a,x)/u, b/y, y/z}or {f(a,x), b/y, b/z}

For example, a LISP program which uses both the open and the segment pattern matching variables to find a match between a pattern and a clause

(defun match (pattern clause)

(cond ((equal pattern clause) t )                      ; return t if

((or (null pattern) (null clause)) nil)                ; equal, nil

; if not

((or (equal (car pattern) (car clause))     ;not, ?x

; binds

(equal (car pattern) ‘?x))                                           ; to single

(match (cdr pattern) (cdr clause)))                            ; term, *y

; binds

((equal (car pattern) ‘*y)                                           ; to several

(or (match (cdr pattern) (cdr clause))                        ;contiguous

(match pattern (cdr clause))))))                                 ; terms

When a segment variable is encountered (the *y), match is recursively executed on the cdrs of both pattern and clause or on the cdr of clause and pattern as *y matches one or more than one item respectively

Study Material, Lecturing Notes, Assignment, Reference, Wiki description explanation, brief detail

Related Topics