Dongqing Yang

Methods for efficient mining of frequent patterns have been studied extensively by many researchers. However, the previously proposed methods still encounter some performance bottlenecks when mining databases with different data characteristics, such as dense vs. sparse, long vs. short patterns, memory-based vs. disk-based, etc. In this study, we propose a...
A database design methodology is defined for the design of large relational databases. First, the data requirements are conceptualized using an extended entity-relationship model, with the extensions being additional semantics such as ternary relationships, optional relationships, and the generalization abstraction. The extended entity-relationship model is...
Many database applications require the analysis and processing of data streams. In such systems, huge amounts of data arrive rapidly and their values change over time. Variations in a stream typically imply fundamental changes in the underlying objects and carry significant domain meaning. In some data streams, successive events seem to recur in...
With the advent of cloud computing, it becomes desirable to use the cloud to efficiently process complex operations on large graphs without compromising their sensitive information. This paper studies shortest-distance computation in the cloud, which aims at the following goals: i) protecting outsourced graphs from neighborhood attacks, ii)...
Previous studies have shown that mining closed patterns provides more benefits than mining the complete set of frequent patterns, since closed pattern mining leads to more compact results and more efficient algorithms. It is particularly useful in a data stream environment, where memory and computation power are major concerns. This paper studies the problem of mining...
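To illustrate why closed patterns give a more compact result, here is a minimal sketch that uses the standard definition: a frequent itemset is closed if no proper superset has the same support. It is a brute-force enumeration over a toy dataset, not the algorithm from the paper; the function names `frequent_itemsets` and `closed_itemsets` and the sample transactions are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration: map each frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            # Support = number of transactions that contain the whole itemset.
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with equal support."""
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(
            itemset < other and support == other_support
            for other, other_support in frequent.items()
        )
    }


# Hypothetical toy data: four transactions over the items a, b, c, d.
transactions = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("d")]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)

print(len(frequent), "frequent itemsets,", len(closed), "closed itemsets")
```

On this toy data, seven frequent itemsets reduce to two closed ones, {a, b} and {a, b, c}, and the support of every other frequent itemset can be recovered from its smallest closed superset.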
Density-based clustering is a family of clustering analysis methods that can discover clusters of arbitrary shape and is insensitive to noisy data. Efficient data mining algorithms are increasingly needed as datasets grow larger. In this paper, we present a new fast clustering algorithm called CURD, which means Clustering Using References...
In this paper, we present a useful data modeling methodology for data warehousing that integrates three existing approaches normally used in isolation: goal-driven, data-driven and user-driven. It comprises four stages. The goal-driven stage produces subjects and KPIs (Key Performance Indicators) of main business fields. The data-driven stage produces subject...