CoV-Seq, a New Tool for SARS-CoV-2 Genome Analysis and Visualization: Development and Usability Study

Boxiang Liu, Kaibo Liu, He Zhang, Liang Zhang, Yuchen Bian, Liang Huang
Journal of Medical Internet Research
Background: COVID-19 became a global pandemic not long after its identification in late 2019. The genomes of SARS-CoV-2 are being rapidly sequenced and shared on public repositories. To keep up with these updates, scientists need to frequently refresh and reclean data sets, which is an ad hoc and labor-intensive process. Further, scientists with limited bioinformatics or programming knowledge may find it difficult to analyze SARS-CoV-2 genomes. Objective: To address these challenges, we developed…

Semi-supervised identification of SARS-CoV-2 molecular targets
A novel semi-supervised pipeline is developed for automated gene, protein, and functional domain annotation of SARS-CoV-2 genomes; it differs from existing approaches by not relying on a single reference genome and by overcoming atypical genome traits.
Semi-Supervised Pipeline for Autonomous Annotation of SARS-CoV-2 Genomes
This work comprehensively presents the molecular targets to refine biomedical interventions for SARS-CoV-2 with a scalable, high-accuracy method to analyze newly sequenced infections as they arise.
IDbSV: An Open-Access Repository for Monitoring SARS-CoV-2 Variations and Evolution
The International Database of SARS-CoV-2 Variations (IDbSV) is presented, the result of ongoing efforts in curating, analyzing, and sharing a comprehensive interpretation of SARS-CoV-2 genetic variations and variants, to provide a novel surveillance tool to the scientific and public health communities.
Temporal Analysis of SARS-CoV-2 Variants during the COVID-19 Pandemic in Nepal
The study highlights the need to structure public policy in Nepal to target the Delta variant, since it has become the predominant variant in the country, and to sample and sequence SARS-CoV-2 genomes at regular intervals to understand the dynamics of variants in the population.
SAS: A Platform of Spike Antigenicity for SARS-CoV-2
The SAS (Spike protein Antigenicity for SARS-CoV-2) platform is provided, enabling prediction of the resistance effect of emerging variants and of the dynamic coverage of SARS-CoV-2 antibodies, including representative mAbs/vaccines, among the latest circulating strains.
An in-silico study of the mutation-associated effects on the spike protein of SARS-CoV-2, Omicron variant
The current study highlighted the potential structural basis for the enhanced transmissibility and pathogenicity of the Omicron variant, although further research is needed to investigate its epidemiological and biological implications.
Mutation in a SARS-CoV-2 Haplotype from Sub-Antarctic Chile Reveals New Insights into the Spike’s Dynamics
Comparative coarse-grained molecular dynamics simulations indicated that T307I and D614G belong to a previously unrecognized dynamic domain, interfering with the mobility of the receptor binding domain of the spike.


VAPiD: a lightweight cross-platform viral annotation pipeline and identification tool to facilitate virus genome submissions to NCBI GenBank
A portable, lightweight, user-friendly, internet-enabled, open-source, command-line genome annotation and submission package to facilitate virus genome submissions to NCBI GenBank is created.
VIGOR, an annotation program for small viral genomes
VIGOR is the first publicly accessible gene prediction program for rotavirus and rhinovirus. It accurately predicts protein-coding genes for five viral types, can assign function to the predicted open reading frames, and can genotype influenza virus.
NCBI Viral Genomes Resource
The NCBI Viral Genomes Resource is a reference resource designed to bring order to this sequence shockwave and improve usability of viral sequence data.
Nextstrain: real-time tracking of pathogen evolution
Nextstrain consists of a database of viral genomes, a bioinformatics pipeline for phylodynamics analysis, and an interactive visualisation platform that presents a real-time view into the evolution and spread of a range of viral pathogens of high public health importance.
A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff
It appears that the 5′ and 3′ UTRs are reservoirs for genetic variations that change the termini of proteins during evolution of the Drosophila genus.
Data, disease and diplomacy: GISAID's innovative contribution to global health
The article finds that the Global Initiative on Sharing All Influenza Data contributes to global health in at least five ways: collating the most complete repository of high‐quality influenza data in the world; facilitating the rapid sharing of potentially pandemic virus information during recent outbreaks; supporting the World Health Organization's biannual seasonal flu vaccine strain selection process; developing informal mechanisms for conflict resolution around the sharing of virus data.
A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data
Heng Li, 2011 (Biology, Computer Science)
This work presents a statistical framework for calling SNPs, discovering somatic mutations, inferring population genetical parameters and performing association tests directly based on sequencing data without explicit genotyping or linkage-based imputation and demonstrates that this method achieves comparable accuracy to alternative methods for estimating site allele count, for inferring allele frequency spectrum and for association mapping.
MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability
This version of MAFFT has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update.
Selenium with Python. URL: [accessed 2020-09-23]