{"id":14590,"date":"2022-05-10T22:09:15","date_gmt":"2022-05-11T02:09:15","guid":{"rendered":"https:\/\/carleton.ca\/scs\/?page_id=14590"},"modified":"2026-06-02T14:59:22","modified_gmt":"2026-06-02T18:59:22","slug":"tr-16-the-noisy-substring-matching-problem","status":"publish","type":"page","link":"https:\/\/carleton.ca\/scs\/research\/scs-technical-reports\/technical-reports-1983\/tr-16-the-noisy-substring-matching-problem\/","title":{"rendered":"TR-16: The Noisy Substring Matching Problem"},"content":{"rendered":"\n<section class=\"w-screen px-6 cu-section cu-section--white ml-offset-center md:px-8 lg:px-14\">\n    <div class=\"space-y-6 cu-max-w-child-5xl  md:space-y-10 cu-prose-first-last\">\n\n            <div class=\"cu-textmedia flex flex-col lg:flex-row mx-auto gap-6 md:gap-10 my-6 md:my-12 first:mt-0 max-w-5xl\">\n        <div class=\"justify-start cu-textmedia-content cu-prose-first-last\" style=\"flex: 0 0 100%;\">\n            <header class=\"font-light prose-xl cu-pageheader md:prose-2xl cu-component-updated cu-prose-first-last\">\n                                    <h1 class=\"cu-prose-first-last font-semibold !mt-2 mb-4 md:mb-6 relative after:absolute after:h-px after:bottom-0 after:bg-cu-red after:left-px text-3xl md:text-4xl lg:text-5xl lg:leading-[3.5rem] pb-5 after:w-10 text-cu-black-700 not-prose\">\n                        TR-16: The Noisy Substring Matching Problem\n                    <\/h1>\n                \n                                \n                            <\/header>\n\n                    <\/div>\n\n            <\/div>\n\n    <\/div>\n<\/section>\n\n<p>Carleton University<br>\n<a href=\"https:\/\/carleton.ca\/scs\/research\/scs-technical-reports\/technical-reports-1983\/\">Technical Report<\/a> <strong>TR-16<\/strong><br>\nJanuary 1983<\/p>\n\n\n\n<h2 id=\"the-noisy-substring-matching-problem\" class=\"wp-block-heading tr_t1\">The Noisy Substring Matching Problem<\/h2>\n\n\n\n<p>R.L. Kashyap &amp; B.J. Oommen<\/p>\n\n\n\n<h3 id=\"abstract\" class=\"wp-block-heading\">Abstract<\/h3>\n\n\n\n<p>Let T(U) be the set of words in the dictionary H which contains U as a substring. The problem considered here is the estimation of the set T(U) when U is not known, . but Y, a noisy version of U is available. The suggested set estimate S*(Y) of T(U) is a proper subset of H such that its every element contains at least one substring which resembles Y most according to the Levenshtein metric. The proposed algorithm for the computation of S*(Y) requires cubic time. The algorithm uses the recursively comput\u00adable dissimilarity measure Dk(X,Y), termed as the k-th distance between two strings X and Y which is a dissimilarity measure between Y and a certain subset of the set of contiguous susbtrings of X. Another estimate of T(U), namely SM(Y) is also suggested. The accuracy of SM(Y) is only slightly less than that of s\u2022(Y), but the computation time of SM(Y) is substantially less than that of S*(Y). Experimental results involving 1,900 noisy substrings and dictionaries which are subsets of 1,023 most common English words [HJ indicate that the accuracy of the estimate S*(Y) is around 99 percent and that of sM(Y) is about 98 percent.<\/p>\n\n\n\n<h3 id=\"download\" class=\"wp-block-heading\">Download<\/h3>\n\n\n\n<p><a href=\"https:\/\/carleton.ca\/scs\/wp-content\/uploads\/sites\/260\/TR-16.pdf\">TR-16.pdf<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Carleton University Technical Report TR-16 January 1983 The Noisy Substring Matching Problem R.L. Kashyap &amp; B.J. Oommen Abstract Let T(U) be the set of words in the dictionary H which contains U as a substring. The problem considered here is the estimation of the set T(U) when U is not known, . but Y, a [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":11785,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"_cu_dining_location_slug":"","footnotes":"","_links_to":"","_links_to_target":""},"cu_page_type":[88],"class_list":["post-14590","page","type-page","status-publish","hentry","cu_page_type-technical-report"],"acf":{"cu_post_thumbnail":false},"_links":{"self":[{"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/pages\/14590","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/comments?post=14590"}],"version-history":[{"count":1,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/pages\/14590\/revisions"}],"predecessor-version":[{"id":14592,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/pages\/14590\/revisions\/14592"}],"up":[{"embeddable":true,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/pages\/11785"}],"wp:attachment":[{"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/media?parent=14590"}],"wp:term":[{"taxonomy":"cu_page_type","embeddable":true,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/cu_page_type?post=14590"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}