Soheil Hassas Yeganeh

A basic step in integration is the identification of linkage points, i.e., finding attributes that are shared (or related) between data sources and that can be used to match records or entities across sources. This is usually performed using a match operator that associates attributes of one database with those of another. However, the massive growth in the amount ...
Many Web data sources and APIs make their data available in XML, JSON, or a domain-specific semi-structured format, with the goal of making the data easily accessible and usable by Web application developers. Although such data formats are more machine-processable than pure text documents, managing and analyzing such data at large scale is often nontrivial. ...
TCP is designed to operate in a wide range of networks. Without any knowledge of the underlying network and traffic characteristics, TCP is doomed to continuously increase and decrease its congestion window size to embrace changes in network or traffic. In light of the emerging popularity of centrally controlled Software-Defined Networks (SDNs), one might ...
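The continuous increase and decrease of the congestion window described above can be illustrated with a toy loop. This is only a sketch under stated assumptions: it uses the textbook additive-increase/multiplicative-decrease (AIMD) rule, which the abstract does not name, and the function names, the increase and decrease constants, and the loss sequence are all hypothetical.

```python
def aimd_step(cwnd, loss, increase=1.0, decrease=0.5, min_cwnd=1.0):
    """Return the next congestion window under a simplified AIMD rule.

    Assumed constants (not from the abstract): grow by `increase` segments
    per loss-free round, multiply by `decrease` after a loss.
    """
    if loss:
        return max(min_cwnd, cwnd * decrease)
    return cwnd + increase


def simulate(loss_events, cwnd=1.0):
    """Apply aimd_step once per round and record the window after each round."""
    trace = []
    for loss in loss_events:
        cwnd = aimd_step(cwnd, loss)
        trace.append(cwnd)
    return trace


# Three loss-free rounds, one loss, then one more loss-free round.
trace = simulate([False, False, False, True, False])
print(trace)  # [2.0, 3.0, 4.0, 2.0, 3.0]
```

The window keeps probing upward and backing off because the sender learns about the network only from losses, which is the behavior the abstract contrasts with a centrally controlled setting.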
Simplicity is a prominent advantage of Software-Defined Networking (SDN), and is often exemplified by implementing complicated control logic as a simple control application on a centralized controller. In practice, however, SDN controllers turn into distributed systems due to performance and reliability limitations, and the supposedly simple control ...
In this paper, we present the design and implementation of Beehive, a distributed control platform with a simple programming model. In Beehive, control applications are centralized asynchronous message handlers that optionally store their state in dictionaries. Beehive's control platform automatically infers the keys required to process a message, and ...
Clustering is the problem of finding relations in a data set in an unsupervised manner. These relations can be extracted using the density of a data set, where the density of a data point is defined as the number of data points around it. To find the number of data points around a given point, region queries are adopted. Region queries are the most expensive ...
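As a rough illustration of the density definition above, a region query can be written as a brute-force scan. This is a minimal sketch, not the paper's method: the radius `eps`, the Euclidean distance, the choice to count the query point itself, and the names `region_query` and `density` are all assumptions made for the example.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def region_query(points, center, eps):
    """Return every point within distance eps of center (brute-force scan)."""
    return [p for p in points if dist(p, center) <= eps]


def density(points, center, eps):
    """Density of center: the number of points returned by its region query.

    Assumption: the point itself is counted when it belongs to `points`.
    """
    return len(region_query(points, center, eps))


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
print(density(points, (0.0, 0.0), 0.5))  # 3
print(density(points, (5.0, 5.0), 0.5))  # 1
```

Each call scans the whole data set, so computing the density of every point this way takes a number of distance computations quadratic in the data set size, which gives a sense of why region queries dominate the cost.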
Among different traffic classification approaches, Deep Packet Inspection (DPI) methods are considered the most accurate. These methods, however, have two drawbacks: (i) they are not efficient, since they use complex regular expressions as protocol signatures, and (ii) they require manual intervention to generate and maintain signatures, partly due to ...