Blue Ocean Anthropology Online (蓝海人类学在线), Ryan WEI's Forum of Anthropology


Cell paper by BGI and collaborators analyzes 140,000 prenatal genetic testing samples

Posted 2018-10-7 07:54
Genomic Analyses from Non-invasive Prenatal Testing Reveal Genetic Associations, Patterns of Viral Infections, and Chinese Population History

Siyang Liu et al.
Cell 175, 347-359, October 4, 2018

https://doi.org/10.1016/j.cell.2018.08.016
https://www.cell.com/cell/fulltext/S0092-8674(18)31032-8

SUMMARY

We analyze whole-genome sequencing data from 141,431 Chinese women generated for non-invasive prenatal testing (NIPT). We use these data to characterize the population genetic structure and to investigate genetic associations with maternal and infectious traits. We show that the present day distribution of alleles is a function of both ancient migration and very recent population movements. We reveal novel phenotype-genotype associations, including several replicated associations with height and BMI, an association between maternal age and EMB, and between twin pregnancy and NRG1. Finally, we identify a unique pattern of circulating viral DNA in plasma with high prevalence of hepatitis B and other clinically relevant maternal infections. A GWAS for viral infections identifies an exceptionally strong association between integrated herpesvirus 6 and MOV10L1, which affects piwi-interacting RNA (piRNA) processing and PIWI protein function. These findings demonstrate the great value and potential of accumulating NIPT data for worldwide medical and genetic analyses.

The paper states:
    ... ... To date, over ten millions of NIPT tests have been carried out globally, among which 70% were conducted on Chinese women. These samples can be leveraged for population genetic investigations of population history, large-scale genetic association studies, and viral screening if the technical issues regarding the use of very large, very low depth (0.06x - 0.1x) samples can be addressed.
    Here, we analyze NIPT sequencing data of 141,431 pregnant women with informed consent. We demonstrate that allele frequencies can be estimated with high accuracy, allowing further population genetic analyses. We also show that efficient genotype imputation is feasible and can provide considerable mapping power. We use the data to carry out the hitherto largest analysis of population genetic variation in the Chinese population, perform a genome-wide association study (GWAS) on multiple traits in pregnant Chinese women, and survey the distribution of circulating viral DNA in the maternal plasma.
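A rough illustration of why allele frequencies can still be estimated at such low depth: although each woman contributes a read at only a small fraction of sites, pooling across 141,431 samples gives thousands of reads per site. The sketch below simulates this for a single site. The depth of 0.08x sits inside the range quoted above and the sample size is the paper's; the true frequency of 0.20, the Poisson read model, Hardy-Weinberg genotypes, and the absence of sequencing error are all assumptions made for illustration, and this is not the paper's actual estimator.

```python
import numpy as np

def simulate_pooled_frequency(n_individuals, depth, true_freq, rng):
    """Estimate an allele frequency by pooling reads from low-depth samples.

    Assumptions (illustrative only): diploid genotypes in Hardy-Weinberg
    proportions, Poisson-distributed read counts per individual per site,
    and no sequencing error.
    """
    # Genotype = number of alternate alleles carried (0, 1, or 2).
    genotypes = rng.binomial(2, true_freq, size=n_individuals)
    # Most individuals have zero reads at any given site when depth << 1.
    reads = rng.poisson(depth, size=n_individuals)
    # Each read samples one of the two chromosomes at random.
    alt_reads = rng.binomial(reads, genotypes / 2.0)
    total = int(reads.sum())
    return alt_reads.sum() / total, total

rng = np.random.default_rng(0)
estimate, n_reads = simulate_pooled_frequency(141_431, 0.08, 0.20, rng)
# Approximate standard error if reads are treated as independent draws.
std_err = np.sqrt(0.20 * 0.80 / n_reads)
print(f"pooled reads: {n_reads}, estimate: {estimate:.4f}, approx. SE: {std_err:.4f}")
```

With roughly 11,000 pooled reads at the site, the approximate standard error is well under one percentage point, which is the intuition behind the claim of accurate frequency estimates.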
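Posted 2018-10-7 11:12
I agree with litis: the CHB samples are probably not Beijing natives but people from other provinces who study or work in Beijing, which is why the provinces shaded redder are the ones with many people living in Beijing.
Original poster | Posted 2018-10-7 08:09
Which provinces are closest to the 1000 Genomes Project CHB (Beijing) samples? Figure S3G shows Jiangsu, Zhejiang, and Fujian, followed by Jiangxi, Anhui, and Shandong.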

chb.jpg
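Original poster | Posted 2018-10-7 08:15
Which provinces are closest to the 1000 Genomes Project CHS (southern China) samples? Figure S3I shows Fujian first, then Jiangxi, then Zhejiang, Hunan, and Guangdong.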


chs.jpg
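Original poster | Posted 2018-10-7 08:23
Which provinces are relatively close to the 1000 Genomes Project JPT (Tokyo, Japan) samples? Figure S3J shows Jiangsu first, then Shandong, Zhejiang, Jilin, Anhui, and Fujian (Shandong and Zhejiang look slightly darker), then Jiangxi.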
jpt.jpg
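Original poster | Posted 2018-10-7 08:50
Which self-reported Han populations are closest to the 1000 Genomes Project CDX (Xishuangbanna Dai) samples? Figure S3H shows Guangxi first, then Guangdong, then Fujian, Hunan, and Jiangxi (Jiangxi looks slightly lighter).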
cdx.jpg
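Posted 2018-10-7 09:59
Overall, the ancestry of present-day Beijing residents appears to lean toward the east.
Posted 2018-10-7 10:06
I would be very interested in D-statistic results for the regional populations.
Posted 2018-10-7 10:15
imvivi001 posted on 2018-10-7 09:59
Overall, the ancestry of present-day Beijing residents appears to lean toward the east.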

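The CHB samples are students at Beijing Normal University (北师大), not Beijing residents, I think. Actual Beijing residents are considerably more northern.
Locals in the Beijing suburbs, especially the northern suburban districts and counties, strike me as similar to people from Shanxi and northern Shaanxi, more northern than Northeasterners...
Posted 2018-10-7 11:13
cpan0256 posted on 2018-10-7 08:50
Which self-reported Han populations are closest to the 1000 Genomes Project CDX (Xishuangbanna Dai) samples? Figure S3H shows Guangxi first, then Guangdong. ...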

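Yes, Han in Guangdong and Guangxi do carry a fair amount of Zhuang-Dong ancestry, as I have mentioned before.
Posted 2018-10-8 17:53
No wonder CHB and CHS come out fairly close in analyses. It turns out CHB is not representative of northern populations at all.
Posted 2018-10-8 20:42
Fujian people are close to everyone, whoever the comparison is. Impressive.
Posted 2018-10-9 07:56
The most notable feature of this report is the sheer volume of data (having the China National GeneBank name behind it makes a difference, heh), but the depth is still insufficient, even though it seems to have found some new loci associated with height and weight.
Posted 2018-10-10 11:06
The following is excerpted from the paper's section "Population Structure, Recent Population History, and Genetic Adaptations":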
    A principal component analysis of all the 141,431 participants suggested that the first three principal components reflected sequencing read length, latitudinal genetic differentiation, and the sequencing error rate (Figures S3A–S3D). After removing participants with 49bp read length and with sequencing error rate >0.00325, a principal component analysis of 45,387 self-reported Han Chinese from the 31 administrative divisions showed that the greatest differentiation of Han Chinese is along a latitudinal gradient (Figures S3E and S3F), consistent with previous studies (Chen et al., 2009, Xu et al., 2009). In contrast, there is, perhaps surprisingly, very little differentiation from East to West. This observation may be explained by the fact that a large proportion of the western Han populations in China are recent immigrants organized by the central government starting from 1949 when the People’s Republic of China was founded (Liang and White, 1996). While the Han Chinese were found to be relatively genetically homogeneous, there was greater divergence among the minority ethnic groups for both latitude and longitude (Figures 2A and 2B). The most differentiated ethnic groups are the Turkic speaking Uyghur and Kazakhs, who reside in the Xinjiang province, and the Mongols residing primarily in Inner Mongolia. The Xibe, Tibetans, and Hui from central China, the Yi from southwestern China, and the Zhuang and Buyi minorities from southern China, also differ substantially from the Han Chinese that come from the same area. On the other hand, the Manchu from northeastern China were genetically closest to the Han Chinese in that area, consistent with historical accounts (Rhoads, 2000).
       We further explored the patterns of allele sharing between Han Chinese and major global ethnic groups using private alleles defined from the 1KG populations and using outgroup F3 statistics (Peter, 2016) (STAR Methods). In the northwest and central west, we observed private allele sharing with the 1KG European Central European of Utah (CEU) panel both for individuals self-identified as Han Chinese and for individuals self-identified as belonging to a minority group. The strongest level of private allele sharing with the CEU was observed for people in the most northwest provinces of Xinjiang and Gansu (Figure 2C), likely reflecting the Turkic speaking ancestry in these minorities. When only the Han Chinese were included, the strongest level of allele sharing with Europeans was observed for people in the Qinghai, Gansu, and Ningxia provinces (Figure 2D). These provinces are located in the Hexi corridor, the most important commercial hub on the Silk Road connecting China to the west since the establishment of the Han Dynasty (206 BC) (Yang et al., 2008). Thus, one potential explanation for the Western ancestry observed in these provinces is gene flow related to their location on the Silk Road. We also observed a pattern of increased allele sharing with the 1KG Indian ITU reference panels in southwestern populations from Xinjiang, Tibet, Yunnan, Guangxi, and Hainan provinces (Figures 2E and 2F), consistent with their geographic proximity to the Indian subcontinent (Yang et al., 2017). Analyses based on the F3 statistic are mostly consistent for the CEU analysis, but for the ITU analysis, we also show high affinity between the Han Chinese in northern provinces and the ITU, likely due to the shared ancestry of the CEU and ITU populations. Furthermore, we applied the F3 statistic to learn patterns of allele sharing between the Chinese provincial populations and 1KGP neighbor populations including three Chinese populations, the Japanese, and the Vietnamese. 
We observe a pattern of allele sharing among the 33 administrative divisions reflecting the geographical origin of the 1KGP populations (Figures S3G–S3K). Interestingly, we found that the CHB, although annotated as the Han Chinese from Beijing, did not have the closest affinity with Beijing individuals but tended to be closer to populations in the coastal provinces: Shandong, Zhejiang, Jiangsu, Fujian, and Jiangxi (Figure S3G). This likely reflects the recent multiethnic migration into Beijing consistent with the demographic information available for our samples. We also investigated the inter-provincial allele sharing between Han Chinese in the Chinese administrative divisions. The difference in f3 statistic among provinces is very small, but all southern provinces show more genetic affinity with other southern coastal provinces, while northern provinces associate with northern coastal provinces (results not shown). This observation likely reflects a combination of internal migration events organized by the central government since 1949 (Liang and White, 1996) and the country’s oriented movement of labor from the interior to the coastal areas since 1979 (Liang and Ma, 2004).
   P.S. I can't post the figures. Can anyone help?

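Posted 2018-10-10 11:11
Han in Hainan seem to have about as much Indian-related ancestry as Han in Yunnan. How to explain that?
Posted 2018-10-10 17:42
燕然山 posted on 2018-10-10 11:06
The following is excerpted from the paper's section "Population Structure, Recent Population History, and Genetic Adaptations":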
     ...

A detailed explanation follows:

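While the Han Chinese were found to be relatively genetically homogeneous, there was greater divergence among the minority ethnic groups for both latitude and longitude (Figures 2A and 2B).
Relatively speaking, Han Chinese across the country are fairly uniform in genetic structure, whereas the minority groups of different regions, whether they differ in longitude or in latitude, show marked differences.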

The most differentiated ethnic groups are the Turkic speaking Uyghur and Kazakhs, who reside in the Xinjiang province, and the Mongols residing primarily in Inner Mongolia.
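The groups that deviate most clearly from East Asians are Turkic-speaking populations such as the Uyghurs and Kazakhs; this needs no explanation.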

The Xibe, Tibetans, and Hui from central China, the Yi from southwestern China, and the Zhuang and Buyi minorities from southern China, also differ substantially from the Han Chinese that come from the same area. On the other hand, the Manchu from northeastern China were genetically closest to the Han Chinese in that area, consistent with historical accounts (Rhoads, 2000).
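The Xibe, Tibetans, Hui of central China, Yi, and the southern Zhuang and Buyi all differ markedly from the local Han. The Manchu, on the other hand, are extremely close to the local Han. (I have pointed this out many times before.)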
The strongest level of private allele sharing with the CEU was observed for people in the most northwest provinces of Xinjiang and Gansu (Figure 2C), likely reflecting the Turkic speaking ancestry in these minorities. When only the Han Chinese were included, the strongest level of allele sharing with Europeans was observed for people in the Qinghai, Gansu, and Ningxia provinces (Figure 2D). These provinces are located in the Hexi corridor, the most important commercial hub on the Silk Road connecting China to the west since the establishment of the Han Dynasty (206 BC) (Yang et al., 2008). Thus, one potential explanation for the Western ancestry observed in these provinces is gene flow related to their location on the Silk Road.
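The populations closest to the European CEU panel are those of China's far northwest (Xinjiang, Gansu), possibly because of a Turkic ancestry component. (This has long been discussed on this forum and is no surprise, though it is not necessarily the whole story: as I have said before, ancient indigenous populations of the northwest also contributed.) Looking at Han only, European-related ancestry is highest in Qinghai, Gansu, and Ningxia. These provinces lie on the ancient Silk Road, so a component from western Eurasia is explicable.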
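The "private allele sharing" measure in the excerpt above can be illustrated with a toy calculation. The sketch below is only one plausible reading: it treats an allele as private to a reference panel if it reaches a minimum frequency there and is absent from every other panel, then reports the mean frequency of those alleles in a target population. The paper's exact definition is in its STAR Methods, which are not quoted in this thread, so the `min_freq` cutoff, both function names, and all frequencies below are hypothetical.

```python
import numpy as np

def private_allele_mask(panel_freqs, focal, min_freq=0.05):
    """Sites whose alternate allele is seen only in the focal reference panel.

    panel_freqs maps panel name -> array of alternate-allele frequencies.
    min_freq is a hypothetical cutoff, not a value taken from the paper.
    """
    mask = np.asarray(panel_freqs[focal]) >= min_freq
    for name, freqs in panel_freqs.items():
        if name != focal:
            mask &= np.asarray(freqs) == 0.0
    return mask

def private_allele_sharing(target_freqs, mask):
    """Mean frequency in the target population of the focal panel's private alleles."""
    return float(np.asarray(target_freqs)[mask].mean())

# Made-up frequencies at five sites for three reference panels.
panels = {
    "CEU": np.array([0.30, 0.10, 0.00, 0.20, 0.02]),
    "CHB": np.array([0.00, 0.00, 0.40, 0.10, 0.00]),
    "YRI": np.array([0.00, 0.00, 0.00, 0.00, 0.00]),
}
ceu_private = private_allele_mask(panels, "CEU")

# Two made-up target populations: one with some CEU-private alleles, one without.
northwest = np.array([0.06, 0.02, 0.35, 0.12, 0.00])
southeast = np.array([0.00, 0.00, 0.42, 0.10, 0.00])
sharing_nw = private_allele_sharing(northwest, ceu_private)
sharing_se = private_allele_sharing(southeast, ceu_private)
print(ceu_private, sharing_nw, sharing_se)
```

Only the first two sites qualify as CEU-private in this toy data, so the hypothetical northwestern population scores higher simply because it carries those two alleles at nonzero frequency.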

We also observed a pattern of increased allele sharing with the 1KG Indian ITU reference panels in southwestern populations from Xinjiang, Tibet, Yunnan, Guangxi, and Hainan provinces (Figures 2E and 2F), consistent with their geographic proximity to the Indian subcontinent (Yang et al., 2017). Analyses based on the F3 statistic are mostly consistent for the CEU analysis, but for the ITU analysis, we also show high affinity between the Han Chinese in northern provinces and the ITU, likely due to the shared ancestry of the CEU and ITU populations.
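In populations from Xinjiang, Yunnan, Tibet, Guangxi, and Hainan, the Indian ITU component starts to increase, consistent with the observation in 付杨's 2017 study (Yang et al., 2017) that genetic similarity correlates positively with geographic proximity.
A similar signal of increased Indian ITU sharing is also seen in northern Han, possibly because the CEU and ITU components are close to each other. (This BGI study used low-depth sequencing, so it is understandable that it cannot cleanly separate CEU from ITU.)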

We also investigated the inter-provincial allele sharing between Han Chinese in the Chinese administrative divisions. The difference in f3 statistic among provinces is very small, but all southern provinces show more genetic affinity with other southern coastal provinces, while northern provinces associate with northern coastal provinces (results not shown).

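Overall, differences among Han from different regions are small, but southern Han are closer to Han of the southern coast, and likewise northern Han to those of the northern coast. (This has also been discussed on this forum before; this study confirms it further.)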
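For readers unfamiliar with the outgroup f3 statistic used in these comparisons: f3(O; A, B) averages (p_O - p_A)(p_O - p_B) over sites, where p is the allele frequency in each population, and a larger value means A and B share more genetic drift since splitting from the outgroup O. The sketch below is a naive version on simulated frequencies. The Gaussian drift model, the drift sizes, and the population names are assumptions for illustration, and it omits the bias correction and block jackknife that published analyses normally apply.

```python
import numpy as np

def outgroup_f3(p_out, p_a, p_b):
    """Naive outgroup f3(O; A, B): mean of (p_O - p_A) * (p_O - p_B) over sites.

    No finite-sample bias correction and no block jackknife are applied.
    """
    p_out, p_a, p_b = (np.asarray(x, dtype=float) for x in (p_out, p_a, p_b))
    return float(np.mean((p_out - p_a) * (p_out - p_b)))

rng = np.random.default_rng(1)

def drift(freqs, scale):
    """Toy drift: add Gaussian noise to frequencies and clip to [0, 1]."""
    return np.clip(freqs + rng.normal(0.0, scale, size=freqs.shape), 0.0, 1.0)

# Hypothetical tree: the outgroup and pop_c branch off the root separately,
# while pop_a and pop_b descend from a shared ancestor.
root = rng.uniform(0.1, 0.9, size=20_000)
outgroup = drift(root, 0.05)
shared_ancestor = drift(root, 0.05)
pop_a = drift(shared_ancestor, 0.02)
pop_b = drift(shared_ancestor, 0.02)
pop_c = drift(root, 0.05)

f3_ab = outgroup_f3(outgroup, pop_a, pop_b)  # A and B share a branch
f3_ac = outgroup_f3(outgroup, pop_a, pop_c)  # A and C share only the root
print(f"f3(O; A, B) = {f3_ab:.5f}, f3(O; A, C) = {f3_ac:.5f}")
```

In this toy setup f3(O; A, B) comes out larger than f3(O; A, C) because A and B share the extra branch of drift. Differences among real Han provincial populations are much smaller, which is why the paper describes the f3 differences among provinces as very small.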
