[1] LI Yushuang, WEI Dong, LV Yanfen. Feature extraction of DNA sequence based on the base composition and distribution and its applications [J]. Journal of Yanshan University, 2018, 42(1): 59-66. [doi:10.3969/j.issn.1007-791X.2018.01.010]

Feature extraction of DNA sequence based on the base composition and distribution and its applications

Journal of Yanshan University [ISSN: 1007-791X / CN: 13-1219/N]

Volume:
42
Issue:
No. 1, 2018
Pages:
59-66
Section:
Information and Computer Technology
Publication date:
2018-03-31

Article Info

Title:
Feature extraction of DNA sequence based on the base composition and distribution and its applications
Article ID:
1007-791X(2018)01-0059-08
Author(s):
LI Yushuang*, WEI Dong, LV Yanfen
School of Sciences, Yanshan University, Qinhuangdao, Hebei 066004, China
Keywords:
transition probability; feature vector; phylogenetic tree; essential gene; support vector machine
CLC number:
Q332
DOI:
10.3969/j.issn.1007-791X.2018.01.010
Document code:
A
Abstract:
Mining latent patterns in biological data through feature extraction is one of the basic problems in bioinformatics. A 24-dimensional feature vector is constructed from three classes of features of a DNA sequence: base transition probabilities, base contents, and base position ratios. The vector is applied to similarity comparisons of the complete coding sequences of β-globin genes from 11 species and of the whole mitochondrial genomes of 18 eutherian mammals, and the resulting phylogenetic trees agree with the known evolutionary relationships. In addition, combining the feature vector with a support vector machine identifies the essential genes of 28 bacterial strains with an average AUC of 0.808, higher than that of some other identification methods. The experimental results indicate that the transition probabilities, contents, and position ratios of the basic elements of a biological sequence can serve as alternative tools for related classification problems in bioinformatics.
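
The abstract names the three feature classes but this page does not give their formulas. The following is a minimal sketch of how a 24-dimensional vector of this kind could be assembled, assuming a split of 16 first-order transition probabilities, 4 base contents, and 4 position ratios. The function name `feature_vector` and, in particular, the position-ratio definition (sum of the 1-indexed positions of a base divided by the sum of all positions) are illustrative assumptions, not the authors' definitions.

```python
from itertools import product

BASES = "ACGT"


def feature_vector(seq):
    """Return a 24-D list: 16 transition probabilities, 4 contents, 4 position ratios.

    Assumed layout and definitions (hypothetical, not taken from the paper).
    """
    # Keep only unambiguous bases so every symbol maps to a feature.
    seq = "".join(ch for ch in seq.upper() if ch in BASES)
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two unambiguous bases")

    # Base contents: relative frequency of each base.
    contents = [seq.count(b) / n for b in BASES]

    # Transition probabilities P(b | a): count of dinucleotide "ab"
    # divided by the number of dinucleotides starting with a.
    pair_counts = {a + b: 0 for a, b in product(BASES, repeat=2)}
    for i in range(n - 1):
        pair_counts[seq[i:i + 2]] += 1
    transitions = []
    for a in BASES:
        row_total = sum(pair_counts[a + b] for b in BASES)
        for b in BASES:
            transitions.append(pair_counts[a + b] / row_total if row_total else 0.0)

    # Position ratios (assumed definition): share of the total position
    # sum 1 + 2 + ... + n that is contributed by each base.
    total_pos = n * (n + 1) / 2
    pos_sums = {b: 0 for b in BASES}
    for i, ch in enumerate(seq, start=1):
        pos_sums[ch] += i
    position_ratios = [pos_sums[b] / total_pos for b in BASES]

    return transitions + contents + position_ratios


if __name__ == "__main__":
    v = feature_vector("ACGTACGT")
    print(len(v))
```

Under these assumptions, two sequences could be compared by a distance (for example Euclidean) between their vectors, and the same vectors could be passed to a support vector machine for the essential-gene classification task; the specific distance and classifier settings used in the paper are not stated on this page.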

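Memo

Received: 2017-01-04     Responsible editor: SUN Feng
Funding: Young Top-Notch Talent Program of Higher Education Institutions of Hebei Province (BJ2014060); Yanshan University "Xinrui Project" Talent Support Program
About the author: *LI Yushuang (born 1980), female, from Chengde, Hebei; Ph.D., associate professor; main research interest: biomathematics. Email: yushuangli@ysu.edu.cn
Last Update: 2018-04-24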