
Enhancing Clustering of News20 Dataset Using Cosine Similarity and K-Means: An Evaluation of Performance Metrics

Abstract:
Clustering partitions a population of N data points into K subgroups so that the data points within a group are more similar to one another than to those in other groups. Its fundamental goal is to divide data into meaningful groupings based on similarity, which helps define and explore the internal structure of the data. Clustering methods can be applied to detect abnormal behavior, segment customers by their buying patterns, and reduce large datasets to a smaller number of related categories. This study uses cosine similarity with the K-means clustering method to cluster the News20 dataset. The performance of the proposed system is evaluated using homogeneity, completeness, V-measure, adjusted Rand index, and silhouette coefficient. The experimental findings show that the proposed method achieves better clustering performance on the News20 dataset.
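
To make the approach concrete, the following is a minimal sketch of K-means driven by cosine similarity (often called spherical K-means), followed by the homogeneity, completeness, and V-measure scores. It is not the authors' implementation. The toy six-document corpus, the plain term-frequency weighting, the fixed initial centroids, and all function names are assumptions chosen for illustration. The adjusted Rand index and silhouette coefficient are omitted for brevity.

```python
import math
from collections import Counter

def tf_vectors(docs):
    """Turn each document into an L2-normalised term-frequency dict (assumed weighting)."""
    vectors = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
        vectors.append({t: c / norm for t, c in counts.items()})
    return vectors

def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors, i.e. their dot product."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def normalise(vec):
    """Rescale a sparse vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else vec

def spherical_kmeans(vectors, init_indices, n_iter=20):
    """K-means that assigns each point to the centroid with the highest cosine similarity."""
    centroids = [dict(vectors[i]) for i in init_indices]  # hypothetical fixed initialisation
    labels = None
    for _ in range(n_iter):
        new_labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if not members:
                continue  # keep the previous centroid for an empty cluster
            total = Counter()
            for v in members:
                total.update(v)  # adds the weights term by term
            centroids[k] = normalise(dict(total))
    return labels

def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def conditional_entropy(a, b):
    """Conditional entropy H(a | b) in nats."""
    n = len(a)
    joint = Counter(zip(a, b))
    b_counts = Counter(b)
    return -sum((c / n) * math.log(c / b_counts[bv]) for (_, bv), c in joint.items())

def v_measure(true, pred):
    """Return (homogeneity, completeness, V-measure) for two labelings."""
    h_true, h_pred = entropy(true), entropy(pred)
    hom = 1.0 if h_true == 0 else 1.0 - conditional_entropy(true, pred) / h_true
    com = 1.0 if h_pred == 0 else 1.0 - conditional_entropy(pred, true) / h_pred
    v = 0.0 if hom + com == 0 else 2 * hom * com / (hom + com)
    return hom, com, v

# Toy corpus standing in for News20: two topics with disjoint vocabularies.
docs = [
    "rocket orbit launch nasa",
    "orbit satellite launch rocket",
    "nasa satellite orbit",
    "hockey goal team season",
    "team season goal player",
    "player hockey team",
]
true_labels = [0, 0, 0, 1, 1, 1]

vectors = tf_vectors(docs)
labels = spherical_kmeans(vectors, init_indices=(0, 3))
hom, com, v = v_measure(true_labels, labels)
print(labels, hom, com, v)
```

Because every vector is scaled to unit length, ranking centroids by cosine similarity is equivalent to ranking them by Euclidean distance, which is why the centroid is renormalised after each update. On a real corpus one would typically replace the toy documents with the actual newsgroup posts and use a random or K-means++ style initialisation rather than hand-picked indices.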