You Chen

Learn More
The Intrusion detection system deals with huge amount of data which contains irrelevant and redundant features causing slow training and testing process, higher resource consumption as well as poor detection rate. Feature selection, therefore, is an important issue in intrusion detection. In this paper we introduce concepts and algorithms of feature(More)
Collaborative information systems (CIS) are deployed within a diverse array of environments, ranging from the Internet to intelligence agencies to healthcare. It is increasingly the case that such systems are applied to manage sensitive information, making them targets for malicious insiders. While sophisticated security mechanisms have been developed to(More)
Intrusion detection is a critical component of secure information systems. Network anomaly detection has been an active and difficult research topic in the field of Intrusion Detection for many years. However, it still has some problems unresolved. They include high false alarm rate, difficulties in obtaining exactly clean data for the modeling of normal(More)
Collaborative information systems (CISs) are deployed within a diverse array of environments that manage sensitive information. Current security mechanisms detect insider threats, but they are ill-suited to monitor systems in which users function in dynamic teams. In this paper, we introduce the community anomaly detection system (CADS), an unsupervised(More)
Healthcare organizations are deploying increasingly complex clinical information systems to support patient care. Traditional information security practices (e.g., role-based access control) are embedded in enterprise-level systems, but are insufficient to ensure patient privacy. This is due, in part, to the dynamic nature of healthcare, which makes it(More)
In role-based access control (RBAC), roles are traditionally defined as sets of permissions. Roles specified by administrators may be inaccurate, however, such that data mining methods have been proposed to learn roles from actual permission utilization. These methods minimize variation from an information theoretic perspective, but they neglect the expert(More)
Computational phenotyping is the process of converting heterogeneous electronic health records (EHRs) into meaningful clinical concepts. Unsupervised phenotyping methods have the potential to leverage a vast amount of labeled EHR data for phenotype discovery. However, existing unsupervised phenotyping methods do not incorporate current medical knowledge and(More)
Collaborative information systems (CIS) enable users to coordinate efficiently over shared tasks. T hey are often deployed in complex dynamic systems that provide users with broad access privileges, but also leave the system vulnerable to various attacks. Techniques to detect threats originating from beyond the system are relatively mature, but methods to(More)
Web forum has become an important resource on the Web due to its rich information contributed by millions of Internet users every day. Meanwhile, thousands of junk or valueless messages exist in web forum. Recognizing high-quality topics should be fundamental tasks in Search Engine and Web Mining systems. However, it is not a trivial problem to quantify(More)
OBJECTIVE Data in electronic health records (EHRs) is being increasingly leveraged for secondary uses, ranging from biomedical association studies to comparative effectiveness. To perform studies at scale and transfer knowledge from one institution to another in a meaningful way, we need to harmonize the phenotypes in such systems. Traditionally, this has(More)