<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Luigi Grimaudo</style></author><author><style face="normal" font="default" size="100%">Marco Mellia</style></author><author><style face="normal" font="default" size="100%">Elena Baralis</style></author><author><style face="normal" font="default" size="100%">Ram Keralapura</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">SeLeCT: Self-Learning Classifier for Internet Traffic</style></title><secondary-title><style face="normal" font="default" size="100%">IEEE Transactions on Network and Service Management</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">clustering</style></keyword><keyword><style  face="normal" font="default" size="100%">self-seeding</style></keyword><keyword><style  face="normal" font="default" size="100%">Traffic Classification</style></keyword><keyword><style  face="normal" font="default" size="100%">unsupervised machine learning</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2014</style></year><pub-dates><date><style  face="normal" font="default" size="100%">06/2014</style></date></pub-dates></dates><volume><style face="normal" font="default" size="100%">11</style></volume><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Network visibility is a critical part of traffic engineering, network management, and security. 
The most popular current solutions - Deep Packet Inspection (DPI) and statistical classification, deeply rely on the availability of a training set. Besides the cumbersome need to regularly update the signatures, their visibility is limited to classes the classifier has been trained for. Unsupervised algorithms have been envisioned as a viable alternative to automatically identify classes of traffic. However, the accuracy achieved so far does not allow to use them for traffic classification in practical scenario.&lt;/p&gt;&lt;p&gt;To address the above issues, we propose SeLeCT, a Self-Learning Classifier for Internet Traffic. It uses unsupervised algorithms along with an adaptive seeding approach to automatically let classes of traffic emerge, being identified and labeled. Unlike traditional classifiers, it requires neither a-priori knowledge of signatures nor a training set to extract the signatures. Instead, SeLeCT automatically groups flows into pure (or homogeneous) clusters using simple statistical features. SeLeCT simplifies label assignment (which is still based on some manual intervention) so that proper class labels can be easily discovered. Furthermore, SeLeCT uses an iterative seeding approach to boost its ability to cope with new protocols and applications.&lt;/p&gt;&lt;p&gt;We evaluate the performance of SeLeCT using traffic traces collected in different years from various ISPs located in 3 different continents. 
Our experiments show that SeLeCT achieves excellent precision and recall, with overall accuracy close to 98%. Unlike state-of-art classifiers, the biggest advantage of SeLeCT is its ability to discover new protocols and applications in an almost automated fashion.&lt;/p&gt;</style></abstract><issue><style face="normal" font="default" size="100%">2</style></issue><section><style face="normal" font="default" size="100%">144</style></section></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Luigi Grimaudo</style></author><author><style face="normal" font="default" size="100%">Marco Mellia</style></author><author><style face="normal" font="default" size="100%">Elena Baralis</style></author><author><style face="normal" font="default" size="100%">Ram Keralapura</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Self-Learning Classifier for Internet Traffic</style></title><secondary-title><style face="normal" font="default" size="100%">The 5th IEEE International Traffic Monitoring and Analysis Workshop (TMA 2013)</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2013</style></year></dates><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Network visibility is a critical part of traffic engineering, network management, and security. Recently, unsupervised algorithms have been envisioned as a viable alternative to automatically identify classes of traffic. However, the accuracy achieved so far does not allow to use them for traffic classification in practical scenario.&lt;br /&gt;In this paper, we propose SeLeCT, a Self-Learning Classifier for Internet traffic. 
It uses unsupervised algorithms along with an adaptive learning approach to automatically let classes of traffic emerge, being identified and (easily) labeled. SeLeCT automatically groups flows into pure (or homogeneous) clusters using alternating simple clustering and filtering phases to remove outliers. SeLeCT uses an adaptive learning approach to boost its ability to spot new protocols and applications. Finally, SeLeCT also simplifies label assignment (which is still based on some manual intervention) so that proper class labels can be easily discovered.&lt;br /&gt;We evaluate the performance of SeLeCT using traffic traces collected in different years from various ISPs located in 3 different continents. Our experiments show that SeLeCT achieves overall accuracy close to 98%. Unlike state-of-art classifiers, the biggest advantage of SeLeCT is its ability to help discovering new protocols and applications in an almost automated fashion.&lt;/p&gt;</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Ignacio Nicolas Bermudez</style></author><author><style face="normal" font="default" size="100%">Marco Mellia</style></author><author><style face="normal" font="default" size="100%">Maurizio M Munafo'</style></author><author><style face="normal" font="default" size="100%">Ram Keralapura</style></author><author><style face="normal" font="default" size="100%">Antonio Nucci</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">DNS to the rescue: Discerning Content and Services in a Tangled Web</style></title><secondary-title><style face="normal" font="default" size="100%">Internet Measurement Conference 
2012</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">DNS</style></keyword><keyword><style  face="normal" font="default" size="100%">mPlane</style></keyword><keyword><style  face="normal" font="default" size="100%">passive measurement</style></keyword><keyword><style  face="normal" font="default" size="100%">WP2</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2012</style></year><pub-dates><date><style  face="normal" font="default" size="100%">11/2012</style></date></pub-dates></dates><urls><web-urls><url><style face="normal" font="default" size="100%">http://dl.acm.org/citation.cfm?id=2398776.2398819&amp;coll=DL&amp;dl=GUIDE&amp;CFID=225051145&amp;CFTOKEN=42401286</style></url></web-urls></urls><edition><style face="normal" font="default" size="100%">ACM</style></edition><publisher><style face="normal" font="default" size="100%">ACM</style></publisher><pub-location><style face="normal" font="default" size="100%">Boston, MA</style></pub-location><volume><style face="normal" font="default" size="100%">1</style></volume><pages><style face="normal" font="default" size="100%">413-426</style></pages><isbn><style face="normal" font="default" size="100%">978-1-4503-1705-4</style></isbn><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;div class=&quot;page&quot; title=&quot;Page 1&quot;&gt;&lt;div class=&quot;layoutArea&quot;&gt;&lt;div class=&quot;column&quot;&gt;&lt;p&gt;&lt;span&gt;A careful perusal of the Internet evolution reveals two major trends - explosion of cloud-based services and video streaming applications. 
In both of the above cases, the owner (e.g., CNN, YouTube, or Zynga) of the content and the organization serving it (e.g., Akamai, Limelight, or Amazon EC2) are decoupled, thus making it harder to understand the association between the content, owner, and the host where the content resides. This has created a tangled world wide web that is very hard to unwind, impairing ISPs’ and network administrators’ capabilities to control the traffic flowing in their networks. &lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span&gt;In this paper, we present DN-Hunter, a system that leverages the information provided by DNS traffic to discern the tangle. Parsing through DNS queries, DN-Hunter tags traffic flows with the associated domain name. This association has several applications and reveals a large amount of useful information: (&lt;/span&gt;&lt;span&gt;i&lt;/span&gt;&lt;span&gt;) Provides a fine-grained traffic visibility even when the traffic is encrypted (i.e., TLS/SSL flows), thus enabling more effective policy controls, (&lt;/span&gt;&lt;span&gt;ii&lt;/span&gt;&lt;span&gt;) Identifies flows even before the flows begin, thus providing superior network management capabilities to administrators, (&lt;/span&gt;&lt;span&gt;iii&lt;/span&gt;&lt;span&gt;) Understand and track (over time) different CDNs and cloud providers that host content for a particular resource, (&lt;/span&gt;&lt;span&gt;iv&lt;/span&gt;&lt;span&gt;) Discern all the services/content hosted by a given CDN or cloud provider in a particular geography and time interval, and (&lt;/span&gt;&lt;span&gt;v&lt;/span&gt;&lt;span&gt;) Provides insights into all applications/services running on any given layer-4 port number. &lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span&gt;We conduct extensive experimental analysis and show results from real traffic traces (including FTTH and 4G ISPs) that support our hypothesis. 
Simply put, the information provided by DNS traffic is one of the key components required for understanding the tangled web, and bringing the ability to effectively manage network traffic back to the operators.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;</style></abstract><num-vols><style face="normal" font="default" size="100%">1</style></num-vols></record></records></xml>