This paper describes a cluster-based, high-performance web spider architecture. The architecture is designed to handle a very large number of web pages, with compression of both URLs and contents. The method used to fetch URLs is designed for maximum performance with respect to well-known spider considerations. In experiments, our…
Intrusion detection is performed at the network and host levels to detect various attacks. Port scanning can be classified as a network intrusion. This paper presents a method for detecting port scanning attacks using rule-based state diagram techniques. A set of rules with appropriate thresholds was designed for intrusion…
A common problem for large-scale search engines and web spiders is how to handle the huge number of URLs they encounter. Traditional search engines and web spiders store URLs on hard disk without any compression, which results in slow performance and larger space requirements. This paper describes a simple URL compression algorithm allowing efficient compression…
With the speed and bandwidth offered by next-generation Internet technology, there is a need for large, scalable Internet servers that can provide adequate computing power and storage for new-generation Internet applications. This requires a huge investment in a very large and expensive commercial server system. Recently, the emergence of…
This paper presents the latest status of Thai web servers. Quantitative measurements are based on a database crawl performed in July 2000. Our experiment shows that Heaps' law and Zipf's law apply strongly to documents on the Thai web. A visualization tool was developed to show server connectivity.
restoration greatly reduces the number of state synchronization transactions when the number of firewall nodes fluctuates. Therefore, high scalability and load balancing are achieved with minimal state replication.
In the IPv4/IPv6 dual-stack environment, enterprises critically need a captive-portal-based authentication system that can bind a user account to both the IPv4 and IPv6 addresses of the machine on which the user logs in, and release the binding when the user logs out. Requiring users to log in multiple times, once per address, is out of the question. In…
Search engines primarily rely on web spiders to collect large amounts of data for indexing and analysis. Data collection can be performed by several web spider agents running in a parallel or distributed manner over a cluster of workstations. This parallelization is often necessary in order to cope with a large number of pages in a reasonable amount of time.…
This paper presents a method for detecting TCP SYN flooding attacks using the BENEF model. Our model relies on the significant parameters of anomalous network packets, statistics of system behavior, and a decision based on thresholds and a fuzzy rule-based technique. With the fuzzy technique, rules or a set of rules corresponding with the appropriate membership value…