The first transcriptomic analysis of Uropsilus gracilis and Euroscaptor kuznetsovi


Talpidae includes 54 recognized species distributed in Eurasia and North America. Talpids are characterized by diverse ecomorphologies and are promising taxa for examining adaptive evolution. Surprisingly, no transcriptome data have been reported for any species of this family. In this study, we sequenced the transcriptomes of the heart and lung of Uropsilus gracilis and of the spleen and lung of Euroscaptor kuznetsovi. U. gracilis is a terrestrial species representing a relict group endemic to southwestern China and adjacent northern Myanmar, and E. kuznetsovi is a fully fossorial species of the tribe Talpini. The distribution of E. kuznetsovi in China is also reported here for the first time. By de novo assembly, we obtained 197 092 and 225 956 transcripts, and 125 427 and 94 023 unigenes, for U. gracilis and E. kuznetsovi, respectively. By comparison with genome annotation data from GenBank, we identified 8 376 orthologous genes among three talpid species and 8 114 orthologs among talpid and shrew species. Among the highly expressed genes of each tissue, we found more than 10 tissue-specific genes. However, BUSCO analysis indicated that 43.0% and 56.6% of single-copy genes were complete in U. gracilis and E. kuznetsovi, respectively, suggesting that mRNA degraded rapidly after the death of the animals and hindered transcriptome assembly. By comparing lung gene expression between the two species, we found 335 genes that were highly expressed in the lung of E. kuznetsovi, including HMGB1, HSPD1, SF3B1, COL3A1, SUMO1 and JUNB. These genes have been reported to be associated with hypoxia or high-altitude adaptation.

Key words: Uropsilus gracilis, Euroscaptor kuznetsovi, transcriptome sequencing, de novo assembly


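Extant talpids (Talpidae) are distributed across Eurasia and North America and comprise 54 recognized species. Talpids show a rich diversity of ecological types and are a good model for studying adaptive evolution. In this study, we used next-generation sequencing to obtain transcriptome data from the heart and lung of Uropsilus gracilis and from the spleen and lung of Euroscaptor kuznetsovi. These two species represent, respectively, the Uropsilinae, an unspecialized, primitive group distributed in southwestern China and northern Myanmar, and the tribe Talpini of the Talpinae, which is highly adapted to subterranean life. We report the distribution of E. kuznetsovi in China for the first time. De novo assembly yielded 197 092 and 225 956 transcripts, and 125 427 and 94 023 unigenes, for U. gracilis and E. kuznetsovi, respectively. Comparison with genome annotation files from GenBank identified 8 376 orthologous gene families among talpid species and 8 114 orthologous gene families shared between Talpidae and Soricidae. Among the differentially expressed genes, more than 10 tissue-specific genes were found among the highly expressed genes of every tissue. However, BUSCO analysis determined that complete single-copy genes accounted for 43.0% and 56.6% in the two species, respectively, suggesting that mRNA degraded rapidly after death and affected transcriptome assembly. Comparing lung gene expression between the two species revealed 335 relatively highly expressed genes in E. kuznetsovi, including HMGB1, HSPD1, SF3B1, COL3A1, SUMO1 and JUNB; these genes have been reported to be possibly associated with hypoxia or high-altitude adaptation.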

Key words: Uropsilus gracilis, Euroscaptor kuznetsovi, transcriptome sequencing, de novo assembly