Carleton University
Technical Report TR-182
October 1990
Breaking Substitution Cyphers using Stochastic Automata
Abstract
Let A be a finite plaintext alphabet, and V be a cypher alphabet with the same cardinality as A. In all one-to-one substitution cyphers there exists the property that each element in A maps onto exactly one element in V. This mapping of V onto A is represented by a function T”‘ which maps any v e V onto some A. e A (i.e., T*(v) = A). In this paper we consider the problem of learning the mapping T”‘ (or its inverse (T”‘)-1) by processing a sequence of cypher text. Traditional methods which break the cypher utilize the statistical information contained in the unigrams, and bigrams of the language from which the plaintext has been derived. A fast and more elegant method which uses trigrams of the language from which the plaintext has been derived is the one due to Peleg et. al. which utilizes the concept of relaxation [8,9]. In this paper we present a new learning automaton approach to break the cypher. A new finite state learning machine called the Cypher Learning Automaton (CLA) has been proposed which sequentially processes the cyphertext and a finite dictionary which is used as a model for the language from which the plaintext has been derived. This method is fast and the advantages of the scheme in terms of time and space requirements over the relaxation method [8,9] have been listed. The paper contains various simulation results comparing both the cypher breaking techniques.