Parallel Clustering Algorithm for the k-medoids Problem in High-dimensional Space for Large-scale Datasets Full article
Conference |
Optimization Problems of Complex Systems : International Asian School-Seminar 14-22 Aug 2023 , Новосибирск |
||||||
---|---|---|---|---|---|---|---|
Source | 2023 19th International Asian School-Seminar on Optimization Problems of Complex Systems (OPCS) Compilation, IEEE. 2023. 6 c. ISBN 9798350331134. |
||||||
Output data | Year: 2023, Pages: 119-124 Pages count : 6 DOI: 10.1109/opcs59592.2023.10275752 | ||||||
Tags | k-medoids, clustering, facility location, Lagrangian relaxation, machine learning, p-median | ||||||
Authors |
|
||||||
Affiliations |
|
Funding (1)
1 | Sobolev Institute of Mathematics | FWNF-2022-0019 |
Abstract:
We present a robust, parallel primal-dual heuristic algorithm for the k-medoids clustering problem, a widely utilized method in data mining and machine learning. Our approach surpasses current algorithms by effectively addressing their limitations, such as time-consuming distance matrix calculations, inefficient nearest-neighbor searches, and difficulties in handling large-scale datasets. To overcome these challenges, we employ an efficient parallel implementation, combined with a pioneering subgradient search algorithm and a parallel column generation method. We evaluate our algorithm on the BIRCH and Stanford Dog datasets and demonstrate its superiority over existing kmedoids clustering algorithms in terms of solution quality and run time. Additionally, we introduce a novel vectorization technique that enables our algorithm to handle various types of data, such as images, text, and point data. Overall, our work contributes to the field of data mining and machine learning by providing an efficient and effective solution for the k-medoids clustering problem. The proposed algorithm offers improved performance, scalability, and versatility, making it a valuable tool for a wide range of applications.
Cite:
Vandanov S.
, Plyasunov A.V.
, Ushakov A.V.
Parallel Clustering Algorithm for the k-medoids Problem in High-dimensional Space for Large-scale Datasets
In compilation 2023 19th International Asian School-Seminar on Optimization Problems of Complex Systems (OPCS). – IEEE., 2023. – C.119-124. – ISBN 9798350331134. DOI: 10.1109/opcs59592.2023.10275752 Scopus OpenAlex
Parallel Clustering Algorithm for the k-medoids Problem in High-dimensional Space for Large-scale Datasets
In compilation 2023 19th International Asian School-Seminar on Optimization Problems of Complex Systems (OPCS). – IEEE., 2023. – C.119-124. – ISBN 9798350331134. DOI: 10.1109/opcs59592.2023.10275752 Scopus OpenAlex
Dates:
Published print: | Oct 13, 2023 |
Published online: | Oct 13, 2023 |
Identifiers:
Scopus: | 2-s2.0-85175470838 |
OpenAlex: | W4387620806 |