Given a citation matrix and a community structure, nearest_point calculates the point in the convex hull of community profiles that is nearest to a given journal profile.

nearest_point(idx, citations, communities, self = TRUE)

nearest_profile(idx = NULL, citations, communities, self = TRUE)

Arguments

idx

A journal name or index. Vectorised for nearest_profile().

citations

A matrix of citations (directed from columns to rows) or an igraph object.

communities

A membership vector or an igraph::communities object.

self

logical. Include self-citations? If FALSE, they will not be counted.

Value

nearest_point() returns the list generated by quadprog::solve.QP(); the example output also includes a cosine element, which is not part of the solve.QP() result. nearest_profile() returns a stochastic vector (non-negative entries that sum to one) whose length equals the number of rows/vertices in citations. It is vectorised, so if idx has length > 1, a matrix is returned instead.

Details

This function uses quadratic programming to calculate the closest point by Euclidean distance. The citations matrix should be arranged so that citations are directed from columns to rows. If idx is a name, it should correspond to a row name from citations.
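
As a sketch of the optimisation described above, the problem can be written as follows. The symbols are illustrative and do not come from the package: x is the journal's profile (its column of citations scaled to sum to one, as in the final example), P is the matrix whose K columns are the community profiles, and w is the vector of convex weights. The mapping onto solve.QP() in the last line is one possible arrangement and is an assumption, not a description of the package's internals.

```latex
\begin{aligned}
\hat{w} &= \arg\min_{w \in \mathbb{R}^{K}} \; \lVert x - P w \rVert_2^2
  \quad \text{subject to} \quad w_k \ge 0 \;\; (k = 1, \dots, K),
  \quad \sum_{k=1}^{K} w_k = 1, \\[4pt]
\lVert x - P w \rVert_2^2 &= w^{\top} P^{\top} P \, w \;-\; 2\, x^{\top} P \, w \;+\; x^{\top} x, \\[4pt]
\text{solve.QP form:} \quad & \min_{b} \; \tfrac{1}{2}\, b^{\top} D\, b - d^{\top} b
  \quad \text{with} \quad D = 2\, P^{\top} P, \quad d = 2\, P^{\top} x .
\end{aligned}
```

Under this reading, the solution element holds the weights, and the nearest point itself is the product of P and those weights. The constant term x'x does not affect the minimiser, but it has to be added back if the reported value is to equal the squared distance, as the check in the final example suggests it does.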

The function nearest_profile is a shorthand way to calculate the profile corresponding to the nearest point.

See also

quadprog::solve.QP(), community_profile()

Examples

counts <- citations[1:6, 1:6]
comms <- setNames(c(1, 2, 3, 2, 2, 4), colnames(counts))
nearest_point('AoS', counts, comms) # Inside hull (value == 0), because AoS is itself a community.
#> $solution
#> [1] -1.185735e-17 -3.733974e-16 1.000000e+00 0.000000e+00
#>
#> $value
#> [1] 0
#>
#> $unconstrained.solution
#> [1] -1.185735e-17 -3.733974e-16 1.000000e+00 0.000000e+00
#>
#> $iterations
#> [1] 1 0
#>
#> $Lagrangian
#> [1] 0 0 0 0 0
#>
#> $iact
#> [1] 0
#>
#> $cosine
#> [1] 1
#>
nearest_point('ANZS', counts, comms) # Outside hull, near to its own community.
#> $solution
#> [1] 0.06662942 0.68937248 0.00000000 0.24399810
#>
#> $value
#> [1] 0.1555548
#>
#> $unconstrained.solution
#> [1] -0.007145116 1.036092159 -0.377113873 0.181585934
#>
#> $iterations
#> [1] 3 0
#>
#> $Lagrangian
#> [1] 0.07742596 0.00000000 0.00000000 0.07420000 0.00000000
#>
#> $iact
#> [1] 4 1
#>
#> $cosine
#> [1] 0.9222226
#>
# To which cluster should 'Biometrika' belong?
distances <- as.dist(1 - cor(citations + t(citations)))
clusters <- cutree(hclust(distances), h = 0.8)
result <- nearest_point('Bka', citations, clusters)
# Verify that the squared Euclidean distance is calculated correctly:
point <- community_profile(citations, clusters) %*% result$solution
result$value %==% sum((citations[, 'Bka'] / sum(citations[, 'Bka']) - point)^2)
#> [1] TRUE