Lassa fever is a hemorrhagic fever caused by an arenavirus, the Lassa virus (LASV), and may affect 150,000–200,000 people per year in West Africa. The virus is hosted by several rodent species: Mastomys natalensis, M. erythroleucus, Hylomyscus pamfi, and Mus baoulei. People can become infected at home or on farms by touching contaminated surfaces, eating contaminated food, or inhaling aerosolized viral particles. Human-to-human transmission also occurs through infected bodily fluids. In Upper Guinea in particular, M. natalensis is the main host, with a LASV prevalence of 14 per cent and an IgG prevalence of 27 per cent. In humans, IgG prevalence is 40 per cent. The region is therefore a hot spot for LASV transmission. In a previous phylogenetic study including 132 partial nucleoprotein (NP) sequences isolated from rodents, we showed that LASV could have emerged in the area 90 years ago. Here, we aim to revise the time of emergence by analyzing the complete NP and polymerase genes of two strains from Upper Guinea: ‘Bantou 366’, a strain isolated from M. natalensis in 2003, and ‘Faranah’, a strain isolated from a human in 1996. They were aligned with 22 other LASV sequences belonging to all lineages and dated by their day of collection. For tree reconstruction in BEAST (v1.10), the following settings were used: a GTR substitution model with gamma-distributed rate variation (four discrete categories) across each codon position, and a constant population size demographic model. Four clock models were tested: strict, uncorrelated relaxed, random local, and fixed local. The best model was determined by comparing the resulting likelihoods using AICM model testing. Markov chain Monte Carlo (MCMC) sampling was performed for a total of 20 million states (sampling every 10,000 states) to obtain an effective sample size above 200 for all parameters. Results of MCMC sampling were examined in Tracer 1.6.
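As a rough illustration of the AICM comparison mentioned above, the sketch below scores each clock model from its post-burn-in posterior log-likelihood samples using the commonly cited estimator AICM = 2·Var(log L) − 2·mean(log L), where a lower score indicates a better fit. The function names, the use of the sample variance, and the toy traces are assumptions made for illustration; they are not values or code from this study, and Tracer's own implementation may differ in detail.

```python
from statistics import mean, variance


def aicm(log_likelihoods):
    """Return the AICM score for one model.

    Assumes the input holds post-burn-in posterior log-likelihood samples.
    Uses AICM = 2 * Var(logL) - 2 * mean(logL), with the sample variance.
    """
    if len(log_likelihoods) < 2:
        raise ValueError("need at least two log-likelihood samples")
    return 2.0 * variance(log_likelihoods) - 2.0 * mean(log_likelihoods)


def best_model(traces):
    """Score every model and return (name of lowest-AICM model, all scores)."""
    scores = {name: aicm(samples) for name, samples in traces.items()}
    return min(scores, key=scores.get), scores


# Toy traces with invented numbers (hypothetical, NOT results from the study).
toy_traces = {
    "strict": [-100.0, -102.0, -101.0, -99.0],
    "random_local": [-98.0, -104.0, -101.0, -97.0],
}

if __name__ == "__main__":
    name, scores = best_model(toy_traces)
    print("lowest AICM:", name)
    for model, score in scores.items():
        print(model, round(score, 3))
```

In practice the traces would be read from the BEAST log files after discarding burn-in, and the comparison would cover all four clock models for each alignment.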
The results showed that the Upper Guinea clade emerged 153 years ago when the phylogeny was reconstructed from partial NP (nt = 754, best fit with a strict clock), 208 years ago with complete NP (nt = 1,707, best fit with a random local clock), and 350 years ago with complete polymerase (nt = 6,681, best fit with a strict clock). The difference among these estimates, which place emergence roughly one, two, or three centuries ago, can be explained by the inclusion of genome regions that evolve more slowly than the partial NP. Therefore, the longer the sequence, the greater the estimated divergence time. To obtain an accurate time of divergence, we suggest using complete genes when performing a time-calibrated phylogeny.