Large volumes of content (bookmarks, reviews, videos, etc.) are currently being created on the “Social Web”, i.e. on Web 2.0 community sites, and this content is being annotated and commented upon. The ability to view an individual’s entire contribution to the Social Web would be an interesting and valuable service, particularly important as social networks…
Web pages can be distinguished by their topic and genre. Web page genres can help modern search engines focus on the user's information need. In this paper, web pages are represented using character n-grams. Character n-gram representation is language independent and allows automatic extraction of features from a web page.…
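As a rough illustration of the character n-gram representation mentioned in this abstract, the sketch below extracts overlapping character n-grams from a page's text and counts them. The function names `char_ngrams` and `ngram_profile`, the lowercasing and whitespace normalization, and the `top_k` cutoff are assumptions made for this example; the abstract does not say how the paper builds or selects its features.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return every overlapping character n-gram of `text`, in order."""
    # A text shorter than n yields an empty list.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_profile(text, n=3, top_k=5):
    """Count character n-grams and return the `top_k` most frequent.

    Lowercasing and collapsing whitespace are illustrative choices,
    not steps taken from the paper.
    """
    normalized = " ".join(text.lower().split())
    counts = Counter(char_ngrams(normalized, n))
    return counts.most_common(top_k)


if __name__ == "__main__":
    print(char_ngrams("genre", 3))
    print(ngram_profile("Web web WEB", n=3, top_k=1))
```

Because the features are plain character sequences, no tokenizer, stemmer, or dictionary is needed, which is what makes this kind of representation independent of the page's language.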
The World-Wide Web (WWW) is a vast repository of information, much of which is valuable but very often hidden from the user. The anarchic nature of the WWW presents unique challenges when it comes to information extraction and categorization. We view the WWW as a valuable resource for the gathering of information for Digital Libraries. In this paper we will…