Carleton University
Technical Report TR-234
March 1994

String Taxonomy Using Learning Automata

B. John Oommen & Edward V. de St. Croix

Abstract

A typical syntactic pattern recognition (PR) problem involves comparing a noisy string with every element of a dictionary, H. The problem of classification can be greatly simplified if the dictionary is partitioned into a set of sub-dictionaries. In this case, the classification can be hierarchical — the noisy string is first compared to a representative element of each sub-dictionary and the closest match within the sub-dictionary is subsequently located. Indeed, the entire problem of sub-dividing a set of strings into subsets where each subset contains “similar” strings has been referred to as the “String Taxonomy Problem”. To our knowledge there is no reported solution to this problem (see footnote on Page 2). In this paper we shall present a learning-automaton based solution to string taxonomy. The solution utilizes the Object Migrating Automaton (OMA) whose power in clustering objects and images [33,35] has been reported. The power of the scheme for string taxonomy has been demonstrated using random strings and garbled versions of string representations of fragments of macromolecules.

TR-234.pdf