Merging K-means solutions for clustering large datasets

Speaker:

Semhar Michael, South Dakota State University

Date and Time:

Thursday, November 14, 2019 - 4:00pm to 4:30pm

Location:

Fields Institute, Stewart Library

Abstract:

The K-means algorithm is one of the most popular clustering procedures due to its computational speed and intuitive construction. Unfortunately, the application of K-means in its traditional form, based on Euclidean distances, is limited to cases with spherical clusters of approximately equal size. At the same time, it is common practice to use the algorithm without checking the underlying assumptions, which leads to meaningless or misleading solutions. We propose merging solutions obtained by K-means to produce meaningful groupings. The notion of pairwise overlap is used to measure the closeness of the groups in the obtained solution. The ideas are illustrated through examples and real data with good results.
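The merging idea can be illustrated with a minimal sketch. This is not the speaker's implementation; it rests on several assumptions that do not appear in the abstract. It assumes K-means has already been run and has produced a list of centroids, that each cluster is modelled as a spherical Gaussian with a common standard deviation `sigma` and equal mixing weight, and that the pairwise overlap of two such clusters is the sum of their two misclassification probabilities, which under these assumptions equals 2Φ(-d / (2σ)), where d is the distance between the centroids. The function names, the `threshold` parameter, and the rule of merging every pair whose overlap exceeds it are hypothetical choices made for illustration.

```python
import math


def pairwise_overlap(c1, c2, sigma):
    """Overlap of two spherical Gaussians with equal weights and a common sigma.

    Assumption: overlap is the sum of the two misclassification
    probabilities, 2 * Phi(-d / (2 * sigma)), which equals
    erfc(d / (2 * sigma * sqrt(2))).
    """
    d = math.dist(c1, c2)
    return math.erfc(d / (2.0 * sigma * math.sqrt(2.0)))


def merge_by_overlap(centroids, sigma, threshold):
    """Group K-means clusters whose pairwise overlap exceeds `threshold`.

    Hypothetical merging rule: clusters are linked whenever their overlap
    is above the threshold, and linked clusters form one merged group
    (tracked with a small union-find structure).
    Returns a list of groups, each a list of original cluster indices.
    """
    k = len(centroids)
    parent = list(range(k))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(k):
        for j in range(i + 1, k):
            if pairwise_overlap(centroids[i], centroids[j], sigma) > threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Attach the larger root to the smaller one so that each
                    # group's root is its smallest cluster index.
                    parent[max(ri, rj)] = min(ri, rj)

    groups = {}
    for i in range(k):
        groups.setdefault(find(i), []).append(i)
    return [groups[root] for root in sorted(groups)]


# Illustrative centroids: two tight pairs that are far apart from each other.
centroids = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0)]
merged = merge_by_overlap(centroids, sigma=1.0, threshold=0.1)
print(merged)
```

Given the original K-means labels, the merged label of a point is the index of the group that contains its cluster. In practice the value of `sigma` and the threshold would have to be estimated or tuned from the data; the values above are placeholders.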
