{"id":13295,"date":"2021-12-09T20:18:12","date_gmt":"2021-12-10T01:18:12","guid":{"rendered":"https:\/\/carleton.ca\/scs\/?page_id=13295"},"modified":"2026-06-02T14:59:23","modified_gmt":"2026-06-02T18:59:23","slug":"tr-09-03-lightweight-hierarchical-clustering-of-network-packets-using-pn-grams","status":"publish","type":"page","link":"https:\/\/carleton.ca\/scs\/research\/scs-technical-reports\/technical-reports-2009\/tr-09-03-lightweight-hierarchical-clustering-of-network-packets-using-pn-grams\/","title":{"rendered":"TR-09-03: Lightweight Hierarchical Clustering of Network Packets Using (p,n)-grams"},"content":{"rendered":"\n<section class=\"w-screen px-6 cu-section cu-section--white ml-offset-center md:px-8 lg:px-14\">\n    <div class=\"space-y-6 cu-max-w-child-5xl  md:space-y-10 cu-prose-first-last\">\n\n            <div class=\"cu-textmedia flex flex-col lg:flex-row mx-auto gap-6 md:gap-10 my-6 md:my-12 first:mt-0 max-w-5xl\">\n        <div class=\"justify-start cu-textmedia-content cu-prose-first-last\" style=\"flex: 0 0 100%;\">\n            <header class=\"font-light prose-xl cu-pageheader md:prose-2xl cu-component-updated cu-prose-first-last\">\n                                    <h1 class=\"cu-prose-first-last font-semibold !mt-2 mb-4 md:mb-6 relative after:absolute after:h-px after:bottom-0 after:bg-cu-red after:left-px text-3xl md:text-4xl lg:text-5xl lg:leading-[3.5rem] pb-5 after:w-10 text-cu-black-700 not-prose\">\n                        TR-09-03: Lightweight Hierarchical Clustering of Network Packets Using (p,n)-grams\n                    <\/h1>\n                \n                                \n                            <\/header>\n\n                    <\/div>\n\n            <\/div>\n\n    <\/div>\n<\/section>\n\n<p>Carleton University<br>\n<a href=\"https:\/\/carleton.ca\/scs\/research\/scs-technical-reports\/technical-reports-2009\/\">Technical Report<\/a> TR-09-03<br>\nFebruary 2, 2009<\/p>\n\n\n\n<h2 id=\"lightweight-hierarchical-clustering-of-network-packets-using-pn-grams\" class=\"wp-block-heading\">Lightweight Hierarchical Clustering of Network Packets Using (p,n)-grams<\/h2>\n\n\n\n<div class=\"tr_t3\">\n<div class=\"tr_t3\">\n<div class=\"tr_t3\">\n<div class=\"tr_t3\">\n<div class=\"tr_t3\">\n<div class=\"tr_t3\">\n<div class=\"tr_t3\">\n<div class=\"tr_t3\">\n<p class=\"tr_t3\">A. Hijazi, H. Inoue, A. Matrawy, P.C. van Oorschot, A. Somayaji<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div>\n<h3>Abstract<\/h3>\n<p>The complexity of current Internet applications makes understanding network traffic a challenging task. By providing larger-scale aggregates for analysis, unsupervised clustering approaches can greatly aid in the identification of new applications, attacks, and other changes in network usage patterns. In this paper we introduce ADHIC, a new algorithm that clusters similar network traffic together without prior knowledge of protocol structures. Packet similarity is determined through comparisons of (p, n)-grams (substrings within packets at distinguishing offsets). ADHIC is notable in that it 1) assumes no prior knowledge of packet structure, 2) produces a hierarchical decomposition of network traffic, and 3) has the potential to cluster packets at wire speeds. We find that ADHIC appropriately segregates well-known protocols, clusters together traffic of the same protocol running on multiple ports, and segregates traffic from applications, such as p2p, that do not use standard ports. Potential applications for ADHIC include network performance analysis, real-time alerts of flash crowds or worm activity, and dynamic DoS-resistant bandwidth management. We also introduce NetADHICT, our implementation of ADHIC.<\/p>\n<p><a href=\"https:\/\/carleton.ca\/scs\/wp-content\/uploads\/sites\/260\/TR-09-02.pdf\">TR-09-02.pdf<\/a><\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Carleton University Technical Report TR-09-03 February 2, 2009 Lightweight Hierarchical Clustering of Network Packets Using (p,n)-grams A. Hijazi, H. Inoue, A. Matrawy, P.C. van Oorschot, A. Somayaji Abstract The complexity of current Internet applications makes understanding network traffic a challenging task. By providing larger-scale aggregates for analysis, unsupervised clustering approaches can greatly aid in the [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":12434,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"_cu_dining_location_slug":"","footnotes":"","_links_to":"","_links_to_target":""},"cu_page_type":[],"class_list":["post-13295","page","type-page","status-publish","hentry"],"acf":{"cu_post_thumbnail":false},"_links":{"self":[{"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/pages\/13295","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/comments?post=13295"}],"version-history":[{"count":1,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/pages\/13295\/revisions"}],"predecessor-version":[{"id":13296,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/pages\/13295\/revisions\/13296"}],"up":[{"embeddable":true,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/pages\/12434"}],"wp:attachment":[{"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/media?parent=13295"}],"wp:term":[{"taxonomy":"cu_page_type","embeddable":true,"href":"https:\/\/carleton.ca\/scs\/wp-json\/wp\/v2\/cu_page_type?post=13295"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}