Given a citation matrix and a community structure, nearest_point calculates the point
in the convex hull of community profiles that is nearest to a given journal profile.
    nearest_point(idx, citations, communities, self = TRUE)
    nearest_profile(idx = NULL, citations, communities, self = TRUE)
| Argument | Description |
|---|---|
| idx | A journal name or index. Vectorised for |
| citations | A matrix of citations (from columns to rows) or an igraph object |
| communities | A membership vector or igraph::communities object |
| self | Logical. Include self-citations? If |
nearest_point() returns the list generated by quadprog::solve.QP().
nearest_profile() returns a stochastic vector whose length is equal
to the number of rows/vertices in citations. It is vectorised over idx: if
idx has length > 1, a matrix is returned instead.
This function uses quadratic programming to calculate the closest point by Euclidean distance.
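As a sketch of the optimisation problem this describes (the symbols are notation introduced here, not taken from the package): let x be the journal's profile, let P be the matrix whose K columns are the community profiles, and let w be a vector of mixing weights. The nearest point in the convex hull is then P w, where w solves

$$
\min_{w \in \mathbb{R}^K} \; \lVert P w - x \rVert_2^2
\quad \text{subject to} \quad \sum_{k=1}^{K} w_k = 1, \qquad w_k \ge 0 \;\; (k = 1, \dots, K).
$$

Under this reading, $solution holds the weights w, and the last example on this page checks that $value equals the squared Euclidean distance between x and P w.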
The citations matrix should be arranged so that citations are directed from columns to rows.
If idx is a name, it should correspond to a row name from citations.
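To illustrate the column-to-row convention, here is a minimal sketch with a hypothetical toy matrix (the journal names "A", "B", "C" and the counts are made up). It scales a journal's column to sum to one, mirroring the manual calculation in the last example on this page; it is not the package's internal code and does not handle the self argument.

```r
# Hypothetical toy counts: entry [i, j] is the number of citations
# from journal j (column) to journal i (row).
counts <- matrix(c(10, 2, 1,
                    3, 8, 4,
                    0, 5, 6),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Profile of journal "B": its column of outgoing citations,
# scaled so that the entries sum to one (a stochastic vector).
profile_B <- counts[, "B"] / sum(counts[, "B"])
```

Here idx could be given either as the name "B" or as the index 2.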
The function nearest_profile is a shorthand way to calculate the profile corresponding
to the nearest point.
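The following is a minimal sketch of how a nearest point in a convex hull can be found with quadprog::solve.QP(), using hypothetical toy data. The names P, x and w and the way the objective is set up are assumptions made for illustration; the package's own implementation may formulate the problem differently.

```r
library(quadprog)

# Hypothetical community profiles: each column is a stochastic vector.
P <- cbind(c(0.7, 0.2, 0.1),
           c(0.1, 0.8, 0.1),
           c(0.2, 0.2, 0.6))
# Hypothetical journal profile, chosen to lie outside the hull.
x <- c(0.9, 0.05, 0.05)
K <- ncol(P)

# ||P w - x||^2 = w'P'P w - 2 x'P w + x'x.
# solve.QP() minimises -d'b + (1/2) b'D b, so take D = 2 P'P and d = 2 P'x.
Dmat <- 2 * crossprod(P)
dvec <- 2 * drop(crossprod(P, x))

# Constraints are t(Amat) %*% b >= bvec; the first meq of them are equalities.
# Column 1: weights sum to one. Remaining columns: each weight is non-negative.
Amat <- cbind(rep(1, K), diag(K))
bvec <- c(1, rep(0, K))

fit <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)
w <- fit$solution                  # mixing weights over communities
nearest <- drop(P %*% w)           # nearest point in the convex hull
sq_dist <- sum((nearest - x)^2)    # squared Euclidean distance to x
```

In this formulation the constant term x'x is dropped from the objective, so fit$value equals sq_dist - sum(x^2) rather than the squared distance itself.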
quadprog::solve.QP(), community_profile()
    counts <- citations[1:6, 1:6]
    comms <- setNames(c(1, 2, 3, 2, 2, 4), colnames(counts))
    nearest_point('AoS', counts, comms) # Inside hull (value == 0), because AoS is itself a community.
    #> $solution
    #> [1] -1.185735e-17 -3.733974e-16  1.000000e+00  0.000000e+00
    #>
    #> $value
    #> [1] 0
    #>
    #> $unconstrained.solution
    #> [1] -1.185735e-17 -3.733974e-16  1.000000e+00  0.000000e+00
    #>
    #> $iterations
    #> [1] 1 0
    #>
    #> $Lagrangian
    #> [1] 0 0 0 0 0
    #>
    #> $iact
    #> [1] 0
    #>
    #> $cosine
    #> [1] 1
    #>
    nearest_point('ANZS', counts, comms) # Outside hull, near to its own community.
    #> $solution
    #> [1] 0.06662942 0.68937248 0.00000000 0.24399810
    #>
    #> $value
    #> [1] 0.1555548
    #>
    #> $unconstrained.solution
    #> [1] -0.007145116  1.036092159 -0.377113873  0.181585934
    #>
    #> $iterations
    #> [1] 3 0
    #>
    #> $Lagrangian
    #> [1] 0.07742596 0.00000000 0.00000000 0.07420000 0.00000000
    #>
    #> $iact
    #> [1] 4 1
    #>
    #> $cosine
    #> [1] 0.9222226
    #>
    # To which cluster should 'Biometrika' belong?
    distances <- as.dist(1 - cor(citations + t(citations)))
    clusters <- cutree(hclust(distances), h = 0.8)
    result <- nearest_point('Bka', citations, clusters)
    # Verify Euclidean distance is calculated correctly:
    point <- community_profile(citations, clusters) %*% result$solution
    result$value %==% sum((citations[, 'Bka'] / sum(citations[, 'Bka']) - point)^2)
    #> [1] TRUE