Given a citation matrix and a community structure, nearest_point calculates the point in the convex hull of community profiles that is nearest to a given journal profile.
nearest_point(idx, citations, communities, self = TRUE)

nearest_profile(idx = NULL, citations, communities, self = TRUE)

idx | A journal name or index. Vectorised for |
---|---|
citations | A matrix of citations (from columns to rows) or an igraph object |
communities | A membership vector or igraph::communities object |
self | logical. Include self-citations? If |

nearest_point returns the list generated by quadprog::solve.QP().
nearest_profile() returns a stochastic vector whose length is equal to the number of rows/vertices in citations. It is also vectorised, so if idx has length > 1, a matrix will be returned instead.

This function uses quadratic programming to calculate the closest point by Euclidean distance.
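
The quadratic programme described above can be sketched with quadprog::solve.QP() directly. This is a minimal illustration on made-up data, not the package's own implementation: the names P (community profiles as columns), x (journal profile) and w (hull weights) are hypothetical, and the package may set up the problem differently. The sketch minimises the squared Euclidean distance, which has the same minimiser as the distance itself.

```r
library(quadprog)

# Hypothetical toy data: 3 community profiles (columns) over 4 journals (rows).
P <- cbind(c(0.7, 0.1, 0.1, 0.1),
           c(0.1, 0.7, 0.1, 0.1),
           c(0.1, 0.1, 0.7, 0.1))
x <- c(0.25, 0.25, 0.25, 0.25)  # journal profile to approximate

g <- ncol(P)

# ||P w - x||^2 = (1/2) w' (2 P'P) w - (2 P'x)' w + x'x
# solve.QP minimises (1/2) w' Dmat w - dvec' w, so the constant x'x is dropped.
Dmat <- 2 * crossprod(P)
dvec <- 2 * drop(crossprod(P, x))

# Constraints are t(Amat) %*% w >= bvec; the first `meq` are equalities.
Amat <- cbind(rep(1, g), diag(g))  # sum(w) == 1, then each w_i >= 0
bvec <- c(1, rep(0, g))

fit <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)

w <- fit$solution                # weights of the nearest point in the hull
sq_dist <- fit$value + sum(x^2)  # add back the constant to get ||P w - x||^2
```

Because the weights are constrained to be non-negative and to sum to one, P %*% w is always a point inside the convex hull of the columns of P.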
The citations matrix should be arranged so that citations are directed from columns to rows.
If idx is a name, it should correspond to a row name from citations.
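
As a small illustration of the column-to-row convention, the sketch below builds a made-up 3 x 3 matrix and computes one journal's profile by normalising its column, in the same way as the verification step in the examples. The matrix citations_toy and its journal names are hypothetical.

```r
# Hypothetical citation counts: entry [i, j] is the number of
# citations from journal j (column) to journal i (row).
citations_toy <- matrix(c(0, 2, 2,
                          1, 0, 3,
                          4, 1, 0),
                        nrow = 3,
                        dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Profile of journal "B": its column, scaled to sum to one.
profile_B <- citations_toy[, "B"] / sum(citations_toy[, "B"])

# Looking the journal up by position gives the same profile.
profile_2 <- citations_toy[, 2] / sum(citations_toy[, 2])
```

Here profile_B is c(A = 0.25, B = 0, C = 0.75): journal B sends a quarter of its citations to A and the rest to C.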
The function nearest_profile is a shorthand way to calculate the profile corresponding to the nearest point.
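
The relationship between the nearest point and its profile can be sketched as a matrix product of community profiles and hull weights, mirroring the community_profile(...) %*% result$solution step in the examples. The values of P and w below are made up for illustration, and this is not necessarily how nearest_profile() is implemented internally.

```r
# Hypothetical community profiles: one column per community, each summing to 1.
P <- cbind(c(0.6, 0.3, 0.1),
           c(0.2, 0.2, 0.6))

# Hypothetical hull weights, e.g. the $solution component of nearest_point().
w <- c(0.25, 0.75)

# The profile is the convex combination of community profiles.
profile <- drop(P %*% w)
```

A convex combination of stochastic vectors is itself stochastic, which is why the result has non-negative entries that sum to one.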

quadprog::solve.QP(), community_profile()

counts <- citations[1:6, 1:6]
comms <- setNames(c(1, 2, 3, 2, 2, 4), colnames(counts))
nearest_point('AoS', counts, comms) # Inside hull (value == 0), because AoS is itself a community.
#> $solution
#> [1] -1.185735e-17 -3.733974e-16 1.000000e+00 0.000000e+00
#>
#> $value
#> [1] 0
#>
#> $unconstrained.solution
#> [1] -1.185735e-17 -3.733974e-16 1.000000e+00 0.000000e+00
#>
#> $iterations
#> [1] 1 0
#>
#> $Lagrangian
#> [1] 0 0 0 0 0
#>
#> $iact
#> [1] 0
#>
#> $cosine
#> [1] 1
#>
nearest_point('ANZS', counts, comms) # Outside hull, near to its own community.
#> $solution
#> [1] 0.06662942 0.68937248 0.00000000 0.24399810
#>
#> $value
#> [1] 0.1555548
#>
#> $unconstrained.solution
#> [1] -0.007145116 1.036092159 -0.377113873 0.181585934
#>
#> $iterations
#> [1] 3 0
#>
#> $Lagrangian
#> [1] 0.07742596 0.00000000 0.00000000 0.07420000 0.00000000
#>
#> $iact
#> [1] 4 1
#>
#> $cosine
#> [1] 0.9222226
#>

# To which cluster should 'Biometrika' belong?
distances <- as.dist(1 - cor(citations + t(citations)))
clusters <- cutree(hclust(distances), h = 0.8)
result <- nearest_point('Bka', citations, clusters)

# Verify Euclidean distance is calculated correctly:
point <- community_profile(citations, clusters) %*% result$solution
result$value %==% sum((citations[, 'Bka'] / sum(citations[, 'Bka']) - point)^2)
#> [1] TRUE