Package 'DBCVindex' reference manual

Title:	Calculates the Density-Based Clustering Validation (DBCV) Index
Description:	A metric called the 'Density-Based Clustering Validation' (DBCV) index, used to evaluate clustering results, following the <https://github.com/FelSiq/DBCV> 'Python' implementation by Felipe Alves Siqueira. Original 'DBCV' index article: Moulavi, D., Jaskowiak, P. A., Campello, R. J., Zimek, A., & Sander, J. (2014, April). "Density-based clustering validation", Proceedings of SDM 2014 -- the 2014 SIAM International Conference on Data Mining (pp. 839-847), <doi:10.1137/1.9781611973440.96>.
Authors:	Davide Chicco [aut, cre]
Maintainer:	Davide Chicco <[email protected]>
License:	GPL-3
Version:	1.1
Built:	2025-01-09 23:22:47 UTC
Source:	https://github.com/cranhaven/cranhaven.r-universe.dev

Function to compute pairwise distances and ensure matrix format

Description

Function to compute pairwise distances and ensure matrix format

Usage

compute_pair_to_pair_dists(data, metric = "euclidean")

Arguments

`data`	input data, one sample per row (as in the example below)
`metric`	distance metric, "euclidean" by default

Value

a matrix of pairwise distances

Examples


# Two noisy half-circles ("moons") with n / 2 points each
n <- 300; noise <- 0.05; seed <- 1782
set.seed(seed)  # make the random noise reproducible
theta <- seq(0, pi, length.out = n / 2)
x1 <- cos(theta) + rnorm(n / 2, sd = noise)
y1 <- sin(theta) + rnorm(n / 2, sd = noise)
x2 <- cos(theta + pi) + rnorm(n / 2, sd = noise)
y2 <- sin(theta + pi) + rnorm(n / 2, sd = noise)
X <- rbind(cbind(x1, y1), cbind(x2, y2))

# Pairwise distances between all rows of X
dist_matrix <- compute_pair_to_pair_dists(X)
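
To make "pairwise distances in matrix format" concrete, here is a minimal base R sketch. It is not the package's implementation: the helper name `pairwise_dists_sketch` is hypothetical, and the sketch assumes that `stats::dist()` followed by `as.matrix()` is an acceptable stand-in for the Euclidean case.

```r
# Hypothetical helper (not part of DBCVindex): pairwise distances as a full matrix.
pairwise_dists_sketch <- function(data, metric = "euclidean") {
  data <- as.matrix(data)                  # accept data frames as well as matrices
  d <- stats::dist(data, method = metric)  # compact "dist" object (lower triangle only)
  as.matrix(d)                             # expand to a full symmetric n x n matrix
}

# Three collinear points whose distances are easy to check by hand.
pts <- rbind(c(0, 0), c(3, 4), c(6, 8))
D <- pairwise_dists_sketch(pts)
print(D)
```

The result has zeros on the diagonal and is symmetric, so entry `D[i, j]` is the distance between rows `i` and `j` of the input.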

Function that calculates the Density-Based Clustering Validation index (DBCV) of clustering results

Description

Function that calculates the Density-Based Clustering Validation index (DBCV) of clustering results

Usage

dbcv(data, labels, metric = "euclidean", noise_id = -1)

Arguments

`data`	input data, one sample per row (as in the example below)
`labels`	cluster label of each sample
`metric`	distance metric, "euclidean" by default
`noise_id`	the label code that marks noise points, -1 by default

Value

a real value containing the DBCV index

Examples


# Two noisy half-circles ("moons") with n / 2 points each
n <- 300; noise <- 0.05; seed <- 1782
set.seed(seed)  # make the random noise reproducible
theta <- seq(0, pi, length.out = n / 2)
x1 <- cos(theta) + rnorm(n / 2, sd = noise)
y1 <- sin(theta) + rnorm(n / 2, sd = noise)
x2 <- cos(theta + pi) + rnorm(n / 2, sd = noise)
y2 <- sin(theta + pi) + rnorm(n / 2, sd = noise)
X <- rbind(cbind(x1, y1), cbind(x2, y2))
y <- c(rep(0, n / 2), rep(1, n / 2))  # one label per half-circle

cat("dbcv(X, y) = ", dbcv(X, y), "\n", sep="")
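
The `noise_id` argument is not exercised by the example above. The following sketch shows one way it might be used, assuming the labels come from a density-based algorithm that marks noise points with -1. The choice of ten noise points is arbitrary and purely illustrative, and the call is guarded so that the script still runs when the package is not installed. In the original article the index ranges from -1 to +1, with higher values indicating better clusterings.

```r
set.seed(1782)
n <- 300; noise <- 0.05
theta <- seq(0, pi, length.out = n / 2)
X <- rbind(
  cbind(cos(theta) + rnorm(n / 2, sd = noise),
        sin(theta) + rnorm(n / 2, sd = noise)),
  cbind(cos(theta + pi) + rnorm(n / 2, sd = noise),
        sin(theta + pi) + rnorm(n / 2, sd = noise))
)
y <- c(rep(0, n / 2), rep(1, n / 2))

# Illustrative assumption: relabel ten arbitrary points as noise (-1).
y_noise <- y
y_noise[sample(n, 10)] <- -1

# Only call the package if it is available.
if (requireNamespace("DBCVindex", quietly = TRUE)) {
  score <- DBCVindex::dbcv(X, y_noise, metric = "euclidean", noise_id = -1)
  cat("dbcv(X, y_noise) = ", score, "\n", sep = "")
}
```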

Function to remove duplicate samples from the input data

Description

Function to remove duplicate samples from the input data

Usage

remove_duplicates(data, labels)

Arguments

`data`	input data, one sample per row (as in the example below)
`labels`	cluster label of each sample

Value

a list containing the data and the labels with duplicate samples removed

Examples


# Two noisy half-circles ("moons") with n / 2 points each
n <- 300; noise <- 0.05; seed <- 1782
set.seed(seed)  # make the random noise reproducible
theta <- seq(0, pi, length.out = n / 2)
x1 <- cos(theta) + rnorm(n / 2, sd = noise)
y1 <- sin(theta) + rnorm(n / 2, sd = noise)
x2 <- cos(theta + pi) + rnorm(n / 2, sd = noise)
y2 <- sin(theta + pi) + rnorm(n / 2, sd = noise)
X <- rbind(cbind(x1, y1), cbind(x2, y2))
y <- c(rep(0, n / 2), rep(1, n / 2))  # one label per half-circle

cat("remove_duplicates(X, y) = ")
print(remove_duplicates(X, y))
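
The random data above is unlikely to contain exact duplicates, so the effect of the function is hard to see there. The idea can be sketched in base R on a tiny input with one repeated row. This is an illustration only, not the package's code: the helper name `remove_duplicates_sketch` and the list element names `data` and `labels` are assumptions.

```r
# Hypothetical helper (not part of DBCVindex): drop repeated rows and their labels.
remove_duplicates_sketch <- function(data, labels) {
  data <- as.matrix(data)
  keep <- !duplicated(data)                # TRUE for the first occurrence of each row
  list(data = data[keep, , drop = FALSE],  # element names are assumptions
       labels = labels[keep])
}

# Row 3 repeats row 1, so it is dropped together with its label.
X_dup <- rbind(c(1, 2), c(3, 4), c(1, 2), c(5, 6))
y_dup <- c(0, 1, 0, 1)
res <- remove_duplicates_sketch(X_dup, y_dup)
print(res)
```

Filtering the data and the labels with the same logical index keeps each remaining sample aligned with its label.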
