ADTool - Statistical Anomaly Detection

ADTool implements a statistical anomaly detection analysis module running on top of the DBStream streaming data-warehouse system. The goal of ADTool is to detect macroscopic anomalies in the traffic served to a large number of users, that is, anomalies that involve multiple flows and/or affect multiple users at the same time. To this end, it analyzes how the entire probability distributions of selected traffic descriptors (features) evolve over time.

The proposed non-parametric statistical anomaly detection algorithm works by comparing the current probability distribution of a traffic feature to a set of reference distributions describing its normal behavior. At each iteration, the algorithm determines a reference for normality by running a reference-set identification sub-routine, whose purpose is to find the distributions in the recent past (e.g., in an observation window of one or two weeks) that are best suited to represent the current one. In the testing phase, the algorithm assesses the statistical compatibility of the current distribution with the distributions included in the reference set. The detection test checks whether the average inter-distribution distance exceeds a certain anomaly upper bound. If the current distribution is flagged as anomalous, a warning is raised and the distribution is excluded from consideration as a future reference of normality. As the distance metric between two distributions p and q, ADTool uses a symmetrized and normalized version of the well-known Kullback-Leibler (KL) divergence.
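The formula itself is not reproduced in this text. As an assumption based only on the description above (a symmetrized, normalized KL divergence), one form that fits is the entropy-normalized divergence sketched below, where D(p||q) is the standard KL divergence and H(p) is the entropy of p. The deliverables cited further down remain the authoritative definition.

```latex
L(p,q) = \frac{1}{2}\left( \frac{D(p\,\|\,q)}{H(p)} + \frac{D(q\,\|\,p)}{H(q)} \right),
\qquad
D(p\,\|\,q) = \sum_{\omega} p(\omega)\,\log\frac{p(\omega)}{q(\omega)},
\qquad
H(p) = -\sum_{\omega} p(\omega)\,\log p(\omega)
```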

Feature distributions are computed on a temporal basis over time bins of fixed length, referred to as the time scale. The time scale is a design parameter that can range from 1 to 60 minutes. Functionally speaking, the algorithm consists of two phases: the training phase and the detection phase. During the training phase, the algorithm accumulates distribution time series for a period ranging between 7 and 14 days (depending on the considered time scale). Then, during the detection phase, it uses the accumulated information to identify a suitable reference for normality for the distribution under test. The results of the anomaly detection test are logged independently for each traffic dimension and each time scale.
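The time-binning step can be illustrated with a minimal Python sketch. It is not ADTool code (ADTool is written in Perl and relies on DBStream jobs for this step); the function names `bin_start` and `distributions_per_bin` are hypothetical, and the assumption is that a distribution is the histogram of feature values observed within one bin.

```python
from collections import Counter, defaultdict


def bin_start(timestamp, granularity):
    """Return the start of the fixed-length time bin containing `timestamp`.

    Both arguments are in seconds; `granularity` is the time scale
    (e.g. 600 for 10-minute bins).
    """
    return timestamp - (timestamp % granularity)


def distributions_per_bin(samples, granularity):
    """Group (timestamp, feature_value) samples into one histogram per bin.

    Assumption: the empirical distribution of a bin is the count of each
    observed feature value inside that bin.
    """
    bins = defaultdict(Counter)
    for timestamp, value in samples:
        bins[bin_start(timestamp, granularity)][value] += 1
    return bins


samples = [(1396648800, 3), (1396648923, 3), (1396649400, 5)]
dists = distributions_per_bin(samples, 600)
```

With a 600-second time scale, the first two samples fall into the same bin and the third one opens the next bin.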

For further details on ADTool, we refer the reader to deliverables D4.1 and D4.3, as well as the references therein.

Note that ADTool requires suitable DBStream jobs to compute traffic feature distributions with the required time-granularity. It is designed to run online, i.e. it processes the distributions of features as soon as they are available in the DBStream views.

List of Modules composing ADTool

ADTool consists of several modules that let the user define configuration parameters and set specific detection thresholds and conditions. The modules are listed below:

Configs.pm
This module provides an interface between the XML configuration file and the rest of the software. The XML file is parsed with the XML::Simple Perl module.

DataSrc.pm
This module provides an interface between the PostgreSQL database used by DBStream and the rest of the software. It connects to the database, queries for the latest available data to process, and writes back the output. Both the read and the write interactions with the database go through the standard Perl DBI module, using SQL queries and inserts.

ENKLd.pm
This module computes the normalized Kullback-Leibler divergence between two distributions of values. The two distributions are passed to this module as array references and do not need to be normalized in advance.
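As an illustration only, here is a Python sketch of a symmetrized KL divergence normalized by entropy, computed from two arrays of raw counts. The exact normalization used by ENKLd.pm is not stated here, so the entropy normalization, the `eps` smoothing term, and the function names are all assumptions rather than the module's actual code.

```python
import math


def normalize(counts):
    """Turn raw counts into a probability vector."""
    total = float(sum(counts))
    if total <= 0:
        raise ValueError("distribution has no samples")
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p||q); eps (an assumption) avoids log(0) for empty bins."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def entropy(p):
    """Shannon entropy of a probability vector, in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def enkl(counts_p, counts_q):
    """Symmetrized, entropy-normalized KL divergence (assumed form)."""
    if len(counts_p) != len(counts_q):
        raise ValueError("distributions must share the same bins")
    p, q = normalize(counts_p), normalize(counts_q)
    h_p, h_q = entropy(p), entropy(q)
    if h_p == 0 or h_q == 0:
        raise ValueError("entropy is zero; normalization undefined")
    return 0.5 * (kl_divergence(p, q) / h_p + kl_divergence(q, p) / h_q)
```

Because the inputs are normalized inside `enkl`, raw counters can be passed directly, mirroring the statement that distributions need not be normalized in advance.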

RefSet.pm
This module defines the package RefSet for managing reference sets (collections of past distributions). After being instantiated, a "raw" RefSet object contains all the distributions in the specified reference window. The module provides functions to discard statistically irrelevant distributions (e.g., those with too few samples).
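A rough Python sketch of building and pruning a reference set is shown below. It is not the RefSet.pm implementation: the function name `build_refset` is hypothetical, and the interpretation of the reference window width (in days) and guard period (in hours) as bounds relative to the current time bin is an assumption.

```python
def build_refset(history, now, width_days, guard_hours, min_distr_size):
    """Select past distributions eligible for the reference set.

    history: dict mapping bin start (epoch seconds) -> list of counts.
    Assumption: the window spans [now - width, now - guard), and
    distributions with fewer than `min_distr_size` samples are dropped.
    """
    oldest = now - width_days * 86400
    newest = now - guard_hours * 3600
    return {
        ts: counts
        for ts, counts in history.items()
        if oldest <= ts < newest and sum(counts) >= min_distr_size
    }


now = 1396648800
history = {
    now - 3 * 86400: [60, 50],   # inside the window, enough samples
    now - 3600: [60, 50],        # inside the guard period
    now - 10 * 86400: [60, 50],  # older than the window
    now - 2 * 86400: [1, 2],     # too few samples
}
refset = build_refset(history, now, width_days=7, guard_hours=2,
                      min_distr_size=100)
```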

ADTest.pm

This module implements the testing logic of the CDN-AD algorithm. It requires the distribution to be tested, the reference set, and other algorithm parameters (i.e., alpha and gamma).
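The decision step can be sketched in Python as follows. This is a simplified illustration, not ADTest.pm: the upper bound is passed in directly because its derivation from alpha and gamma is not described here, the order of the sample-size checks is an assumption, and `ad_test` and `distance` are hypothetical names. The return codes follow the output codes listed later in this document.

```python
from statistics import mean


def ad_test(current, refset, distance, upper_bound,
            min_distr_size, min_refset_size):
    """Return (output_code, score) for the distribution under test.

    current: list of counts; refset: list of lists of counts;
    distance: callable taking two count lists and returning a float.
    """
    if sum(current) < min_distr_size:
        return 2, None  # not enough samples in the distribution
    if len(refset) < min_refset_size:
        return 3, None  # not enough distributions in the reference set
    # Average distance between the current distribution and the references.
    score = mean(distance(current, ref) for ref in refset)
    return (1 if score > upper_bound else 0), score
```

Any distance function with the same signature can be plugged in, which keeps the test logic independent of the divergence actually used.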

DBStream Jobs for ADTool

To run ADTool, it is necessary to set up a suitable DBStream job that computes the feature counters for each variable and time bin. The output view of the job should have the following columns:

serial_time
<variable name>
<feature name>

Note that a single view can be used to collect multiple features if the <variable name> and the time resolution are compatible.

ADTool Module configuration

The configuration of the software is done via an XML file. The available options are:

[database] host
[database] port
[database] username
[database] password
[database] features table name (output of DBStream job)
[database] flags table name (output of ADTool)
[analysis] start timestamp
[analysis] end timestamp (0 means run forever)
[analysis] name of the variable over which the job has computed the distribution
[analysis] feature name
[refset] width (in days)
[refset] guard period (in hours)
[refset] min refset size (minimum number of distributions in refset)
[refset] min distr size (minimum number of samples in distribution)
[refset] m (number of top ranked distributions in refset)
[refset] k (currently unused)
[ADtest] alpha (algorithm's sensitivity)

A sample configuration file looks like the following:

<ADTool_config>

<Description>adtool on youtube (ip,imsi_cnt)</Description>

<Database host="localhost" port="5440" dbname="dbstream" user="dbstream" password="FT4hhyhL" >
<features_table>adtool_mw14_gg11_youtube_features_serverip_600</features_table>
<flags_table>adtool_youtube_flags_serverip_600</flags_table>
</Database>

<Analysis>
<start>1396648800</start>
<end>0</end>
<granularity>600</granularity>
<variable>server_ip</variable>
<feature>imsi_cnt</feature>
</Analysis>

<RefSet>
<width>7</width>
<guard>2</guard>

<min_distr_size>100</min_distr_size>
<min_refset_size>80</min_refset_size>
<slack_var>0.1</slack_var>
<m>50</m>
<k>2</k> 
</RefSet>

<ADTest>
<alpha>0.05</alpha>
</ADTest>
</ADTool_config>
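For illustration, a configuration with this layout can be read with Python's standard `xml.etree.ElementTree` module, as in the sketch below. ADTool itself parses the file in Perl through Configs.pm; the `load_config` helper, the shortened table names, and the placeholder password are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed configuration following the layout of the sample above.
SAMPLE = """<ADTool_config>
  <Database host="localhost" port="5440" dbname="dbstream"
            user="dbstream" password="secret">
    <features_table>features_view</features_table>
    <flags_table>flags_table</flags_table>
  </Database>
  <Analysis>
    <start>1396648800</start>
    <end>0</end>
    <granularity>600</granularity>
    <variable>server_ip</variable>
    <feature>imsi_cnt</feature>
  </Analysis>
  <RefSet>
    <width>7</width>
    <guard>2</guard>
    <min_distr_size>100</min_distr_size>
    <min_refset_size>80</min_refset_size>
    <m>50</m>
  </RefSet>
  <ADTest>
    <alpha>0.05</alpha>
  </ADTest>
</ADTool_config>"""


def load_config(xml_text):
    """Parse the XML text into a plain dictionary with typed values."""
    root = ET.fromstring(xml_text)
    db = root.find("Database")
    return {
        "host": db.get("host"),
        "port": int(db.get("port")),
        "features_table": db.findtext("features_table"),
        "flags_table": db.findtext("flags_table"),
        "start": int(root.findtext("Analysis/start")),
        "end": int(root.findtext("Analysis/end")),  # 0 means run forever
        "granularity": int(root.findtext("Analysis/granularity")),
        "variable": root.findtext("Analysis/variable"),
        "feature": root.findtext("Analysis/feature"),
        "width_days": int(root.findtext("RefSet/width")),
        "guard_hours": int(root.findtext("RefSet/guard")),
        "min_distr_size": int(root.findtext("RefSet/min_distr_size")),
        "min_refset_size": int(root.findtext("RefSet/min_refset_size")),
        "m": int(root.findtext("RefSet/m")),
        "alpha": float(root.findtext("ADTest/alpha")),
    }


config = load_config(SAMPLE)
```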

ADTool workflow

The logic is defined in the main executable adtool.pl. The arguments for running the program are:

--config <XML_CONFIG_FILE>
--log <LOG_FILE>

The execution workflow of ADTool is described in the figure below:

[Figure: ADTool execution workflow]

Upon completion of each iteration, the output is reported on STDOUT as well as in the database flags table specified in the configuration. For each iteration running on a time bin, the row inserted in the flags table consists of the following columns:

beginning timestamp of the timebin
feature name
output code (0,1,2,3,4)
score
gamma
Phi_{alpha}

Output codes:
- 0: distribution is "normal"
- 1: distribution is anomalous
- 2: distribution does not contain enough samples
- 3: refset does not contain enough distributions
- 4: currently unused
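When post-processing the flags table, the output codes above can be mapped to readable labels. The Python sketch below is only a convenience for consumers of the table; the enum and its member names are hypothetical and not part of ADTool.

```python
from enum import IntEnum


class OutputCode(IntEnum):
    """Output codes written by ADTool to the flags table."""
    NORMAL = 0
    ANOMALOUS = 1
    NOT_ENOUGH_SAMPLES = 2
    REFSET_TOO_SMALL = 3
    UNUSED = 4


def is_valid_test(code):
    """True only when the detection test was actually carried out."""
    return OutputCode(code) in (OutputCode.NORMAL, OutputCode.ANOMALOUS)
```

Codes 2 and 3 indicate that no test decision was made for the time bin, so they are best kept apart from both normal and anomalous results when computing statistics.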

ADTool code

--> version 1.0 (August 2014)

--> version 2.3 (August 2015, final release)

File:

952adtool.tar.gz

1238adtool2.3.tar.gz
