Re: About blast - 未名空间(mitbbs.com)

Re: About blast

[Board: Biology] [Author: ldg], 2003-09-13 22:46:10


From: ldg (三十三块★breathe), Board: Biology
Subject: Re: About blast
Posted at: Unknown Space - 未名空间 (Sat Sep 13 22:46:36 2003), on-site message

Below are the notes from one of my classes:
Repetitive elements in query sequences can cause spurious database
matches. These elements produce artificially high alignment scores
against database sequences that are not actually related to the query.
Typical examples are Alu repeats and low-complexity regions (LCRs), i.e.
stretches with short-period repeats or overrepresented residues. In fact,
one half of the protein sequences in databases contain at least one LCR.

Reason: LCR sequences do not follow the residue-by-residue conservation
model, so a match between two LCRs does not reflect an evolutionary
relationship. Methods for measuring the statistical significance of an
alignment assume a certain degree of randomness in the sequences.
Repetitive or compositionally biased patterns in unrelated sequences
violate this assumption.
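
To make the masking idea concrete, here is a minimal sketch in Python. It is NOT the actual DUST or SEG algorithm, only a simplified stand-in that slides a window over the sequence and masks windows whose Shannon entropy is low. The function names, the window size of 12, and the threshold of 1.2 bits are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (bits per symbol) of the characters in `window`."""
    counts = Counter(window)
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mask_low_complexity(seq: str, window: int = 12, threshold: float = 1.2,
                        mask_char: str = "N") -> str:
    """Replace every position covered by a low-entropy window with mask_char.

    Simplified illustration only; window and threshold are arbitrary
    assumptions, not the parameters used by DUST or SEG.
    """
    seq = seq.upper()
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = mask_char
    return "".join(masked)


def fraction_masked(original: str, masked: str) -> float:
    """Fraction of positions that were changed by masking."""
    if not original:
        return 0.0
    changed = sum(1 for a, b in zip(original.upper(), masked) if a != b)
    return changed / len(original)


if __name__ == "__main__":
    # Hypothetical query: a poly-A run flanked by more complex sequence.
    query = "ACGTACGTACGT" + "A" * 20 + "ACGTACGTACGT"
    masked_query = mask_low_complexity(query)
    print(masked_query)
    print(f"masked fraction: {fraction_masked(query, masked_query):.2f}")
```

If the query is masked like this while the database copy is not, the masked positions can no longer count as identities, which is consistent with a self-comparison reporting less than 100% identity. Note that this sketch looks only at composition, so a long-period repeat with a balanced composition would pass through unmasked.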

[feizj (cornell) wrote:]
: Why do all the public BLAST servers (NCBI, TIGR, etc.) use default filters (DUST
: and SEG) to mask things such as repeat sequences?
: If I compare a DNA sequence that has several repeats against itself using the
: above filters, it does not give me 100% identity, and sometimes even less than 90%
: identity. That cannot be right. But there should be some reason for BLAST to do
: this. Can anyone tell me? Thanks.

--
※ Source: Unknown Space - 未名空间 mitbbs.com [FROM: 165.91.]
