Bhaskar Ghosh

We consider the following general problem modeling load balancing in a variety of distributed settings. Given an arbitrary undirected connected graph G = (V, E) and a weight distribution w^0 on the nodes, determine a schedule to move weights across edges in each step so as to (approximately) balance the weights on the nodes. We focus on diffusive schedules for(More)
This paper presents an analysis of the following load balancing algorithm. At each step, each node in a network examines the number of tokens at each of its neighbors and sends a token to each neighbor with at least 2d+1 fewer tokens, where d is the maximum degree of any node in the network. We show that within O(∆/α) steps, the algorithm reduces the(More)
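The rule in the abstract above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's code: the adjacency-dict representation and the names `max_degree`, `balance_step`, and `run_until_stable` are assumptions, as is the choice to have every node decide from the token counts seen at the start of the step (a synchronous update).

```python
def max_degree(graph):
    """Maximum degree d over all nodes of an undirected graph (adjacency dict)."""
    return max(len(neighbors) for neighbors in graph.values())


def balance_step(graph, tokens):
    """One synchronous step: each node sends one token to every neighbor
    that holds at least 2d+1 fewer tokens than it does."""
    threshold = 2 * max_degree(graph) + 1
    delta = {node: 0 for node in graph}
    for u, neighbors in graph.items():
        for v in neighbors:
            # Decisions use the counts from the start of the step (assumption).
            if tokens[u] - tokens[v] >= threshold:
                delta[u] -= 1
                delta[v] += 1
    return {node: tokens[node] + delta[node] for node in graph}


def run_until_stable(graph, tokens):
    """Repeat balance_step until no token moves; return the final distribution."""
    while True:
        updated = balance_step(graph, tokens)
        if updated == tokens:
            return tokens
        tokens = updated


# Hypothetical example: a path 0 - 1 - 2 with all ten tokens on node 0.
path = {0: [1], 1: [0, 2], 2: [1]}
start = {0: 10, 1: 0, 2: 0}
print(balance_step(path, start))
print(run_until_stable(path, start))
```

A node sends at most d tokens per step and only sends when it holds at least 2d+1 more tokens than the receiver, so counts never go negative and the total number of tokens is conserved.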
The fundamental problems in dynamic load balancing and job scheduling in parallel and distributed computers involve moving load between processors. In this paper, we consider a new model for load movement in synchronous parallel and distributed machines. In each step of our model, each processor can transfer load to at most one neighbor; also, any amount of(More)
Espresso is a document-oriented distributed data serving platform that has been built to address LinkedIn's requirements for a scalable, performant, source-of-truth primary store. It provides a hierarchical document model, transactional support for modifications to related documents, real-time secondary indexing, on-the-fly schema evolution and provides a(More)
This paper describes the new architecture and optimizations for parallel SQL execution in the Oracle 10g database. Based on the fundamental shared-disk architecture underpinning Oracle's parallel SQL execution engine since Oracle7, we show in this paper how Oracle's engine responds to the challenges of performing in new grid-computing environments. This is(More)
We consider the following general problem modeling load balancing in a variety of distributed settings. Given an arbitrary undirected connected graph G = (V, E) and a weight distribution w^0 on the nodes, determine a schedule to move weights in each step across edges so as to (approximately) balance the weights on the nodes. We focus on diffusive schedules(More)
In Internet architectures, data systems are typically categorized into source-of-truth systems that serve as primary stores for the user-generated writes, and derived data stores or indexes which serve reads and other complex queries. The data in these secondary stores is often derived from the primary data through custom transformations, sometimes(More)
LinkedIn is among the largest social networking sites in the world. As the company has grown, our core data sets and request processing requirements have grown as well. In this paper, we describe a few selected data infrastructure projects at LinkedIn that have helped us accommodate this increasing scale. Most of those projects build on existing open(More)
We introduce and formalize a novel constrained path optimization problem that is the heart of the real-time ad serving task in the Yahoo! (formerly RightMedia) Display Advertising Exchange. In the Exchange, the ad server's task for each display opportunity is to compute, with low latency, an optimal valid path through a directed graph representing the(More)