Description
Introduction:
Since 2015, veSEQ, a bait-capture metagenomic sequencing method, has advanced HIV genomic research through the PANGEA consortium and is being adopted in Uganda, Botswana, and Zambia. Because it is affordable and sequences diverse genomes at high throughput, veSEQ shows promise for HIV drug resistance monitoring and pathogen studies. However, the high computational demands of metagenomic data processing pose challenges, particularly in low- and middle-income countries (LMICs). Here, we present a k-mer-based approach that reduces this computational complexity and the resulting financial cost.
Methods:
Our Located k-mer Assembler (LKA) begins with a local alignment of reads against a large set of reference HIV genomes, including HXB2. LKA's key innovation is to track the k-mer composition of reads together with the position of each k-mer relative to HXB2, which makes k-mers, and therefore reads, easy to compare. This is especially useful for decontamination, whose cost grows quadratically (O(N^2)) with the number of samples compared. Decontamination is essential for quality assurance in clinical applications such as drug resistance monitoring.
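The idea of position-tagged k-mers can be illustrated with a minimal Python sketch. It is not the LKA implementation: the function names, the toy reads, the choice of k, and the assumption of ungapped alignments with a known reference start coordinate are all hypothetical. It only shows why keying each k-mer by its reference position reduces cross-sample comparison to set lookups.

```python
from collections import Counter

def located_kmers(read, ref_start, k=5):
    """Count (reference position, k-mer) pairs for one aligned read.

    Assumption: the read aligns without gaps, starting at ref_start
    in reference (e.g. HXB2) coordinates.
    """
    return Counter(
        (ref_start + i, read[i:i + k]) for i in range(len(read) - k + 1)
    )

def sample_profile(aligned_reads, k=5):
    """Merge the located k-mers of all reads in one sample."""
    profile = Counter()
    for ref_start, read in aligned_reads:
        profile.update(located_kmers(read, ref_start, k))
    return profile

def shared_fraction(profile_a, profile_b):
    """Fraction of sample A's located k-mers (weighted by count) also seen in B."""
    total = sum(profile_a.values())
    if total == 0:
        return 0.0
    shared = sum(n for key, n in profile_a.items() if key in profile_b)
    return shared / total

# Toy data: (reference start, read sequence); values are illustrative only.
sample_a = [(100, "ACGTACG"), (102, "GTACGTT")]
sample_b = [(101, "CGTACGA")]   # overlaps sample A at the same coordinates
sample_c = [(200, "ACGTACG")]   # same sequence, different genome position

profile_a = sample_profile(sample_a)
print(shared_fraction(profile_a, sample_profile(sample_b)))
print(shared_fraction(profile_a, sample_profile(sample_c)))
```

In this sketch, a high shared fraction between two samples from the same run would flag them for closer inspection. Because identical sequence at a different genome position does not match, the comparison stays specific to the aligned region.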
Results:
LKA accurately detects drug resistance by analyzing amino acid sequence abundance from aligned short k-mers. Decontamination of a sample can be achieved in minutes, whereas naively comparing the billions of reads from an Illumina sequencing run would take years. Additionally, LKA streamlines de Bruijn graph generation for genome assembly by prioritizing the most frequent k-mers.
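As an illustration of building a de Bruijn graph that favors frequent k-mers, the following Python sketch counts k-mers, discards those below an abundance threshold, and follows the most abundant edge at each node. The function names, the `min_count` threshold, and the greedy traversal are assumptions made for the example; the abstract does not specify how LKA prioritizes k-mers.

```python
from collections import Counter, defaultdict

def count_kmers(reads, k):
    """Count every k-mer across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def de_bruijn(counts, min_count=2):
    """Build a graph of (k-1)-mer nodes, keeping only k-mers seen >= min_count times.

    Each edge prefix -> suffix is weighted by the k-mer's abundance.
    """
    graph = defaultdict(dict)
    for kmer, n in counts.items():
        if n >= min_count:
            graph[kmer[:-1]][kmer[1:]] = n
    return graph

def greedy_walk(graph, start, max_steps=1000):
    """Extend a contig from start, always taking the most abundant outgoing edge."""
    contig, node, seen = start, start, {start}
    for _ in range(max_steps):
        successors = graph.get(node)
        if not successors:
            break
        node = max(successors, key=successors.get)
        if node in seen:          # stop on a cycle rather than loop forever
            break
        seen.add(node)
        contig += node[-1]
    return contig

# Toy reads: two copies of the majority sequence and one minority variant.
reads = ["ACGTTGCA", "ACGTTGCA", "ACGTAGCA"]
graph = de_bruijn(count_kmers(reads, k=4), min_count=2)
print(greedy_walk(graph, "ACG"))  # prints "ACGTTGCA"
```

Filtering by abundance before building the graph removes the k-mers unique to the minority read, so the walk follows the majority sequence. A real assembler would also need to handle branches, repeats, and reverse complements, which this sketch omits.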
Conclusion:
LKA, integral to AMPHEUS, our solar-powered lab and genome sequencing platform in Zambia, marks a significant step towards affordable and rapid clinical and public health applications in LMICs. The k-merization concept not only simplifies bioinformatics workflows but also extends to genomic studies of other pathogens.