What's in a session: tracking individual behavior on the web

@article{Meiss2009WhatsIA,
  title={What's in a session: tracking individual behavior on the web},
  author={Mark R. Meiss and John F. Duncan and Bruno Gon{\c{c}}alves and Jos{\'e} J. Ramasco and Filippo Menczer},
  journal={ArXiv},
  year={2009},
  volume={abs/1003.5325}
}
We examine the properties of all HTTP requests generated by a thousand undergraduates over a span of two months. Preserving user identity in the data set allows us to discover novel properties of Web traffic that directly affect models of hypertext navigation. We find that the popularity of Web sites--the number of users who contribute to their traffic--lacks any intrinsic mean and may be unbounded. Further, many aspects of the browsing behavior of individual users can be approximated by log… 
Modeling Traffic on the Web Graph
TLDR
An agent-based model is presented that can reproduce the behaviors observed in empirical data, especially heterogeneous session lengths, reconciling the narrowly focused browsing patterns of individual users with the extreme variance in aggregate traffic measurements.
Remembering what we like: Toward an agent-based model of Web traffic
TLDR
A more realistic navigation model is introduced in which agents maintain individual lists of bookmarks that are used as teleportation targets, which reproduces aggregate traffic patterns such as site popularity, while also generating more accurate predictions of diversity, link traffic, and return time distributions.
Agents, bookmarks and clicks: a topical model of web navigation
TLDR
The resulting model reproduces individual behaviors from empirical data, reconciling the narrowly focused browsing patterns of individual users with the extreme heterogeneity of aggregate traffic measurements, and leading the way to more sophisticated, realistic, and effective ranking and crawling algorithms.
DOBBS: Towards a Comprehensive Dataset to Study the Browsing Behavior of Online Users
  • C. Weth, M. Hauswirth
  • Computer Science
    2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT)
  • 2013
TLDR
DOBBS provides a browser add-on which keeps track of users' browsing behavior and is an effort to create a dataset in a non-intrusive, completely anonymous and privacy-preserving way.
Analysing Parallel and Passive Web Browsing Behavior and its Effects on Website Metrics
TLDR
First results of DOBBS are presented, showing that browsing the Web is no longer a dedicated task for many users, and the concepts of parallel browsing and passive browsing are investigated, showing their impact on the calculation of a user's dwell time.
Towards the Characterization of Individual Users through Web Analytics
TLDR
An analysis of the way individual users navigate the Web indicates a rich variety of individual behaviors and seems to preclude the possibility of defining a characteristic frequency for each user in his or her visits to a single site.
Random surfers on a web encyclopedia
TLDR
The results suggest that the behavior of the random surfer is similar to that of users, as long as users do not use search engines, and that classical website navigation structures now exert only limited influence on user navigation.
Online multitasking and user engagement
TLDR
This paper studies the effect of online multitasking on two widely used engagement metrics designed to capture users' browsing behavior with a site, and introduces a new representation of user sessions: tree-streams.
Taking into account Tabbed Browsing in Predictive Web Usage Mining
TLDR
A new strategy that better accounts for tabbing activity is proposed; it achieves an accuracy similar to that of a state-of-the-art model that implicitly accounts for parallel browsing.

References

SHOWING 1-10 OF 23 REFERENCES
What do web users do? An empirical analysis of web use
TLDR
It is shown that web page revisitation is a much more prevalent activity than previously reported, that most pages are visited for a surprisingly short period of time, that users maintain large (and possibly overwhelming) bookmark collections, and that there is a marked lack of commonality in the pages visited by different users.
Remembering what we like: Toward an agent-based model of Web traffic
TLDR
A more realistic navigation model is introduced in which agents maintain individual lists of bookmarks that are used as teleportation targets, which reproduces aggregate traffic patterns such as site popularity, while also generating more accurate predictions of diversity, link traffic, and return time distributions.
Ranking web sites with real user traffic
TLDR
The traffic-weighted Web host graph obtained from a large sample of real Web users is analyzed, finding that while search is directly involved in a surprisingly small fraction of user clicks, it leads to a much larger fraction of all sites visited.
Query-Log Based Authority Analysis for Web Information Search
TLDR
A new method is presented that incorporates the notion of query nodes into the PageRank model and integrates the implicit relevance feedback given by click streams into the automated process of authority analysis, and indicates significant improvements in the precision of search results.
Human dynamics revealed through Web analytics
TLDR
This work analyzes properly anonymized logs detailing the access history to Emory University's Web site and finds that linear preferential linking, priority-based queuing, and the decay of interest for the contents of the pages are the essential ingredients to understand the way users navigate the Web.
BrowseRank: letting web users vote for page importance
TLDR
Experimental results show that BrowseRank indeed outperforms the baseline methods such as PageRank and TrustRank in several tasks.
On the lack of typical behavior in the global Web traffic network
TLDR
Analysis of the amount of traffic handled by clients and servers and their number of connections highlights non-trivial correlations between information flow and patterns of connectivity as well as the presence of anomalous statistical patterns related to the behavior of users on the Web.
Characterizing Browsing Strategies in the World-Wide Web
Analysis of User Web Traffic with A Focus on Search Activities
TLDR
This analysis of a real Web access trace, collected over a period of two and a half months from the UCLA Computer Science Department, indicates that search engines influence about 13.6% of the users’ Web traffic directly and indirectly.
Shuffling a Stacked Deck: The Case for Partially Randomized Ranking of Search Engine Results
TLDR
It is shown that a modest amount of randomness leads to improved search results, in the context of an economic objective function based on aggregate result quality amortized over time.