Shanchan Wu

Web pages consist not only of actual content, but also of other elements such as branding banners, navigational elements, advertisements, copyright notices, etc. This noisy content is typically not related to the main subjects of the web pages. Identifying the part of actual content, or clipping web pages, has many applications, such as high-quality web printing, ...
Reading online content for educational, learning, training, or recreational purposes has become a very popular activity. While reading, people may have difficulty understanding a passage or wish to learn more about the topics it covers; hence they may naturally seek additional or supplementary resources for that particular passage. These resources should ...
The phenomenal growth in both scale and importance of social media, such as blogs, micro-blogs, and user-generated content, has created a need for tools that monitor information diffusion and make recommendations within these platforms. An essential element of social media, particularly blogs, is the hyperlink graph that connects various pieces of content. ...
As part of the explosion in educational software, online tools, and open educational resources, there has been a rapid devaluation of printed textbooks. While digital texts have advantages, printed textbooks still provide irreplaceable value over online media. Therefore, technology should enhance, rather than eliminate, printed text. To this end, this paper ...
We present a microblog recommendation system that can help monitor users, track conversations, and potentially improve diffusion impact. Given a Twitter network of active users and their followers, and historical activity of tweets, retweets, and mentions, we build upon a prediction tool to predict the Top K users who will retweet or mention a focal user, in ...
The phenomenal growth of social media, both in scale and importance, has created a unique opportunity to track information diffusion and the spread of influence, but can also make efficient tracking difficult. Given data streams representing blog posts on multiple blog channels and a focal query post on some topic of interest, our objective is to predict ...