<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Konstantin Kutzkov</style></author><author><style face="normal" font="default" size="100%">Mohamed Ahmed</style></author><author><style face="normal" font="default" size="100%">Sofia Nikitaki</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Weighted Similarity Estimation in Data Streams</style></title><secondary-title><style face="normal" font="default" size="100%">CIKM</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2015</style></year><pub-dates><date><style  face="normal" font="default" size="100%">10/2015</style></date></pub-dates></dates><publisher><style face="normal" font="default" size="100%">ACM</style></publisher><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Similarity computation between pairs of objects is often a bottleneck in many applications that have to deal with massive volumes of data. Motivated by applications such as collaborative filtering in large-scale recommender systems, and influence probabilities learning in social networks, we present new randomized algorithms for the estimation of weighted similarity in data streams.&lt;/p&gt;&lt;p&gt;Previous works have addressed the problem of learning binary similarity measures in a streaming setting. To the best of our knowledge, the algorithms proposed here are the first that specifically address the estimation of weighted similarity in data streams.
The algorithms need only one pass over the data, making them ideally suited to handling massive data streams in real time.&lt;/p&gt;&lt;p&gt;We obtain precise theoretical bounds on the approximation error and complexity of the algorithms. The results of evaluating our algorithms on two real-life datasets validate the theoretical findings and demonstrate the applicability of the proposed algorithms.&lt;/p&gt;</style></abstract></record></records></xml>