Advances in Data Mining and Modeling: Hong Kong 27 - 28 June by Wai-Ki Ching, Michael Kwok-Po Ng

Data mining and data modelling are developing rapidly. Because of their wide applications and research content, many practitioners and academics are drawn to work in these areas. To promote communication and collaboration among practitioners and researchers in Hong Kong, a workshop on data mining and modelling was held in June 2002. Prof Ngaiming Mok, Director of the Institute of Mathematical Research, The University of Hong Kong, and Prof Tze Leung Lai (Stanford University), C.V. Starr Professor of the University of Hong Kong, initiated the workshop. This work contains selected papers presented at the workshop. The papers fall into two main categories: data mining and data modelling. The data mining papers deal with pattern discovery, clustering algorithms, classification and practical applications in the stock market. The data modelling papers deal with neural network models, time series models, statistical models and practical applications.

Our proposed method actually improves the final UDS solution obtained. Once the coordinates of these n objects are obtained, we can test whether there are large jumps or gaps in the coordinates. In Section 3, we introduce the Schwarz Information Criterion (SIC) given in Chen and Gupta for detecting jumps or gaps in the coordinates. We illustrate our proposed method using Fisher's famous Iris data as well as a simulated data set. We find that our method outperforms the K-means clustering method. Concluding remarks and further extensions are discussed in Section 4.

3. Apply the clustering algorithm to partition S into k clusters.
4. [...]
5. If the partition is accepted, go to step 6; otherwise, select a new k and go to step 3.
6. Attach the clusters as the children of the partitioned node. Select one as the current node S.
7. Validate S to determine whether it is a terminal node or not.
8. If it is not a terminal node, go to step 2. If it is a terminal node, but not the last one, select another node that has not yet been validated as the current node S, and go to step 7. If it is the last terminal node in the tree, stop.
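The control flow of the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the excerpt does not show which clustering algorithm or validation procedure is used, so `candidate_ks`, `partition`, `accept` and `is_terminal` are hypothetical plug-in functions, and the toy versions below (splitting 1-D values at their largest gap) are stand-ins, not the authors' method. The sketch also validates the root before partitioning it, which the excerpt does not confirm.

```python
def build_cluster_tree(items, candidate_ks, partition, accept, is_terminal):
    """Grow a cluster tree; every callable argument is a hypothetical plug-in."""
    root = {"items": list(items), "children": []}
    pending = [root]  # nodes that have not been validated yet
    while pending:
        node = pending.pop()
        # Step 7: validate the current node.
        if is_terminal(node["items"]):
            continue
        # Steps 3-5: try values of k until a partition is accepted.
        accepted = None
        for k in candidate_ks(node["items"]):
            clusters = partition(node["items"], k)
            if accept(clusters):
                accepted = clusters
                break
        if accepted is None:
            continue  # assumption: no acceptable partition means a leaf
        # Step 6: attach the clusters as children of the partitioned node.
        node["children"] = [{"items": c, "children": []} for c in accepted]
        pending.extend(node["children"])
    return root


def leaves(node):
    """Collect the items of all terminal nodes, left to right."""
    if not node["children"]:
        return [node["items"]]
    return [leaf for child in node["children"] for leaf in leaves(child)]


# Toy plug-ins (assumptions for illustration only).
def split_at_largest_gap(items, k):
    """Split sorted 1-D values in two at the largest gap; k is ignored."""
    s = sorted(items)
    gaps = [s[i + 1] - s[i] for i in range(len(s) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return [s[:cut], s[cut:]]


def gap_is_large(clusters):
    """Accept a two-way split only if the clusters are more than 1.0 apart."""
    return clusters[1][0] - clusters[0][-1] > 1.0


def is_tight(items):
    """Treat a node as terminal if it is tiny or spans at most 1.0."""
    return len(items) < 2 or max(items) - min(items) <= 1.0


data = [0.1, 0.2, 0.3, 5.0, 5.1, 9.0]
tree = build_cluster_tree(data, lambda items: [2], split_at_largest_gap,
                          gap_is_large, is_tight)
final_clusters = leaves(tree)
```

Using an explicit `pending` list instead of recursion mirrors the wording of step 8, where an unvalidated node is picked as the new current node until none remain.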

Then this rank $(T_1, \ldots, T_n)$ will be a good starting configuration for Guttman's updating algorithm (denoted by Gum) as well as Pliner's smoothing algorithm (denoted by PLm). The motivation for using this rank is as follows: if object k has the maximum total distance from the other objects, object k should probably be the first or the last object in the UDS solution. The distances of the other objects to object k, $(d_{k1}, \ldots, d_{kn})$, should reflect how similar these objects are to object k.
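The motivation above can be illustrated with a short Python sketch. It assumes that the starting rank is obtained by ordering the objects by their distance to object k, which the excerpt suggests but does not state in full. The function name `initial_order` and the sample data are hypothetical.

```python
def initial_order(dist):
    """Order object indices by distance to the most remote object.

    dist is a symmetric n x n list of lists of pairwise distances.
    """
    n = len(dist)
    # Total distance from each object to all the others.
    totals = [sum(row) for row in dist]
    # Object k: the one with the maximum total distance.
    k = max(range(n), key=lambda i: totals[i])
    # Sort every object by its distance to k (k itself comes first).
    return sorted(range(n), key=lambda j: dist[k][j])


# Five objects that truly lie on a line, listed in scrambled order.
points = [2.5, 7.0, 0.0, 4.0, 1.0]
dist = [[abs(a - b) for b in points] for a in points]

order = initial_order(dist)
ordered_points = [points[i] for i in order]
```

In this toy case the object at 7.0 has the largest total distance, and sorting by distance to it recovers the ordering along the line. With noisy dissimilarities the result would only be a starting configuration to be refined by an iterative algorithm.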

