# Getting started with kdtools

#### 2019-04-23

The kdtools package searches for multidimensional points inside a box-shaped region and finds nearest neighbors in 1 to 9 dimensions. It does this by binary search on a sorted sequence of values. The package is currently limited to matrices of real values. If you are interested in using string or mixed types in different dimensions, see the methods vignette.
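To make the idea of searching a sorted sequence concrete, here is a minimal one-dimensional sketch in base R. It is an analogue for illustration only, not the kdtools implementation: the names v, lower, upper, first, last and hits are made up for this example, and the half-open interval is an assumption chosen to keep the arithmetic simple.

```r
# Base R only; a 1-D analogue, not how kdtools itself is implemented.
set.seed(42)
v <- sort(runif(1000))   # a sorted sequence of values
lower <- 0.25            # hypothetical query bounds
upper <- 0.50

# findInterval() locates positions in a sorted vector without a full scan.
# With left.open = TRUE it returns the number of elements strictly below the bound.
first <- findInterval(lower, v, left.open = TRUE) + 1  # first index with v >= lower
last  <- findInterval(upper, v, left.open = TRUE)      # last index with v < upper

hits <- if (first <= last) v[first:last] else numeric(0)

# Cross-check against a linear scan over the half-open interval [lower, upper).
stopifnot(identical(hits, v[v >= lower & v < upper]))
```

Because the values are sorted, all matches form one contiguous run, so two position lookups are enough to recover them.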

Using kdtools is straightforward. There are four steps:

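##### Step 1. Convert your matrix of values into an arrayvec object
library(kdtools)
# 1000 random points in 3 dimensions
x = matrix(runif(3e3), ncol = 3)
# Convert the matrix to an arrayvec object
y = matrix_to_tuples(x)
# Show rows 1 to 3, columns 1 and 3
y[1:3, c(1, 3)]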
#>            [,1]       [,2]
#> [1,] 0.31760246 0.26673670
#> [2,] 0.91873475 0.83353354
#> [3,] 0.06588832 0.05998507

The arrayvec object can be manipulated as if it were a matrix.

##### Step 2. Sort the data
kd_sort(y, inplace = TRUE, parallel = TRUE)
#>             [,1]       [,2]        [,3]
#> [1,] 0.100212932 0.03737731 0.013635131
#> [2,] 0.073604020 0.06123243 0.024205234
#> [3,] 0.038431772 0.07905850 0.074724742
#> [4,] 0.008127881 0.11080106 0.054616788
#> [5,] 0.014519467 0.22525169 0.006341059
#> (continues for 995 more rows)

##### Step 3. Search the data
rq = kd_range_query(y, c(0, 0, 0), c(1/4, 1/4, 1/4)); rq
#>             [,1]       [,2]       [,3]
#> [1,] 0.228518003 0.15631139 0.12409162
#> [2,] 0.100212932 0.03737731 0.01363513
#> [3,] 0.073604020 0.06123243 0.02420523
#> [4,] 0.038431772 0.07905850 0.07472474
#> [5,] 0.008127881 0.11080106 0.05461679
#> (continues for 16 more rows)
i = kd_nearest_neighbor(y, c(0, 0, 0)); y[i, ]
#> [1] 0.07360402 0.06123243 0.02420523
nns = kd_nearest_neighbors(y, c(0, 0, 0), 100); nns
#>            [,1]      [,2]       [,3]
#> [1,] 0.08993964 0.2667603 0.49467104
#> [2,] 0.09539294 0.5210289 0.20716790
#> [3,] 0.17432820 0.1044617 0.53108661
#> [4,] 0.14192878 0.5455333 0.03875645
#> [5,] 0.09197128 0.5459574 0.10459673
#> (continues for 95 more rows)
nni = kd_nn_indices(y, c(0, 0, 0), 10); nni
#>  [1]  8  9  6 10  5 16  4  3  1  2

The kd_nearest_neighbor and kd_nn_indices functions return row indices. The other functions return arrayvec objects.
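As a rough reference for what these queries compute, the sketch below reproduces a box query and a nearest-neighbor search by brute force on a plain matrix, using base R only. It makes several assumptions that the text above does not confirm: the box is treated as half-open, distance is taken to be Euclidean, and all variable names (m, lower, upper, q and so on) are hypothetical. It can be handy for checking results on small data, but it scans every row and so has none of the speed benefit of a sorted structure.

```r
# Brute-force reference in base R; does not call kdtools.
set.seed(1)
m <- matrix(runif(3e3), ncol = 3)   # plain matrix, same shape as x in Step 1
lower <- c(0, 0, 0)
upper <- c(1/4, 1/4, 1/4)

# Range query: keep the rows that fall inside the box in every dimension.
# t(m) is 3 x n, so the length-3 bounds recycle down each column (one column per point).
# Assumption: lower bound inclusive, upper bound exclusive.
in_box <- colSums(t(m) >= lower & t(m) < upper) == ncol(m)
box_rows <- m[in_box, , drop = FALSE]

# Nearest neighbors: squared Euclidean distance (an assumption) to the query point.
q <- c(0, 0, 0)
d2 <- colSums((t(m) - q)^2)
nearest_index <- which.min(d2)   # row index of the single nearest point
nn_indices <- order(d2)[1:10]    # row indices of the 10 nearest, closest first
nn_rows <- m[nn_indices, , drop = FALSE]
```

Note that this sketch returns the neighbors ordered by distance; whether the kdtools functions return them in any particular order is not stated here.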

##### Step 4. Convert back to a matrix for use in R
head(tuples_to_matrix(rq))
#>             [,1]       [,2]        [,3]
#> [1,] 0.228518003 0.15631139 0.124091624
#> [2,] 0.100212932 0.03737731 0.013635131
#> [3,] 0.073604020 0.06123243 0.024205234
#> [4,] 0.038431772 0.07905850 0.074724742
#> [5,] 0.008127881 0.11080106 0.054616788
#> [6,] 0.014519467 0.22525169 0.006341059