Description
Clustering infections by genetic similarity is a common method of characterizing the risk structure of a population. A graph is constructed by connecting infections whose genetic distance falls below a threshold, and connected components with two or more nodes are then extracted as "clusters". Studies of HIV molecular epidemiology often use logistic regression to find associations between potential risk factors and the binary outcome of appearing in any cluster. This outcome is a crude proxy for transmission rate and cannot resolve risk structure within large components. Our objective is to adapt community detection (CD) methods to broaden the standard definition of clusters.
We retrieved 12,560 HIV-1 pol sequences sampled in China, with corresponding metadata (collection date, sex, age and risk factor), from GenBank. Sequences were aligned pairwise against the HXB2 reference using MAFFT, and the alignments were manually refined in AliView. We used FastTree to reconstruct a phylogeny and FigTree to extract four major clades (subtypes and CRFs) for separate analyses. Pairwise distances were calculated using TN93, and networks were constructed at varying distance cutoffs. Finally, we used the Python modules NetworkX and CDlib to extract connected components and apply different CD algorithms.
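The network construction step can be illustrated with a minimal sketch. It is not the pipeline used in this study, which relied on NetworkX and CDlib; to stay self-contained it uses only the Python standard library. The function name `clusters_at_cutoff`, the `(seq_a, seq_b, distance)` tuple layout and the toy distances are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def clusters_at_cutoff(distances, cutoff):
    """Return connected components with two or more nodes.

    `distances` is an iterable of (seq_a, seq_b, distance) tuples
    (a hypothetical layout, not the format of any specific tool).
    """
    # Connect two sequences only if their distance is below the cutoff.
    adjacency = defaultdict(set)
    for a, b, d in distances:
        if d < cutoff:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Breadth-first search from each unvisited node yields one component.
    # Every node in `adjacency` has at least one edge, so each component
    # automatically contains two or more nodes.
    seen = set()
    clusters = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        clusters.append(sorted(component))
    return clusters


# Toy pairwise distances, invented purely for illustration.
toy = [
    ("s1", "s2", 0.010),
    ("s2", "s3", 0.020),
    ("s3", "s4", 0.040),
    ("s4", "s5", 0.005),
    ("s5", "s6", 0.200),
]

for cutoff in (0.015, 0.03, 0.05):
    print(cutoff, clusters_at_cutoff(toy, cutoff))
```

Raising the cutoff merges small clusters into larger components, which is the situation where a community detection step becomes useful for recovering finer structure.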
Different CD algorithms partitioned the components obtained at a given threshold to varying extents. For example, Louvain, SLPA and ANGEL reduced mean CRF07 cluster sizes (TN93 < 0.03) from 113.6 to 69.9, 91.2 and 20.7 sequences per cluster, respectively. Statistical associations were better resolved when large components were partitioned by CD into smaller clusters. For example, sex was not significantly associated with cluster ID for the five largest components in subtype B ($\chi^2$-test, $P = 0.056$), but was significantly associated with their Louvain partition ($P < 10^{-12}$). These preliminary findings underscore the utility of CD in characterizing risk structures by partitioning components into smaller communities.
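The association test mentioned above can be sketched as a Pearson $\chi^2$ test of independence on a table of counts, with one row per sex category and one column per cluster. This is a minimal sketch, not the code used in this study. The function name `chi_square_statistic` and the counts are invented for illustration, and the sketch assumes every row and column total is nonzero.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D table.

    `table` is a list of rows of observed counts, e.g. rows = sex
    categories and columns = cluster IDs (hypothetical layout).
    Assumes all row and column totals are nonzero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Invented counts: two sex categories (rows) across two clusters (columns).
toy_table = [[30, 10], [10, 30]]
stat, dof = chi_square_statistic(toy_table)
print(stat, dof)
```

The sketch returns only the statistic and its degrees of freedom. A $P$-value would normally come from a statistics library; `scipy.stats.chi2_contingency` is one option, though it applies a continuity correction to 2×2 tables by default.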