Re: what does "complete cds" mean?
[Board: Biology] [Author: DimSam], 2002-05-11 01:27:49
From: DimSam (点心), Board: Biology
Subject: Re: what does "complete cds" mean?
Posted from: The unknown SPACE (Sat May 11 01:27:59 2002), internal mail

NCBI Reference Sequences
Definition: The NCBI Reference Sequence project (RefSeq) provides reference
sequence standards for the naturally occurring molecules of the central
dogma, from chromosomes to mRNAs to proteins. Toward this goal, intermediate
larger genomic regions, instantiated as accessions of the format NG_123456
(genomic sequence with curated annotation) or NT_123456 (computed assembly
and annotation), are also produced. RefSeq standards provide a foundation
for the functional annotation of the human genome. They provide a stable
reference for gene characterization, mutation analysis, expression studies,
and polymorphism discovery.
Process: The NCBI RefSeq project represents a commitment by the NCBI to
provide sequences that can be more easily maintained to reflect our current
knowledge of sequence information and the corresponding biology, and which
avoid the redundancy often present in the GenBank database archive. RefSeq
projects use distinct processes to provide different types of NCBI RefSeq
records. There are currently three primary RefSeq projects:
  Curated RefSeq (regions, transcripts, and proteins)
  Genome Annotation (contigs, transcripts, and proteins)
  Complete Genomes (genomes, chromosomes, and proteins)
Scope: Currently, RefSeq records are provided for the following molecule
types and genomes:

Accession Format  Molecule Type        Genome
NC_123456         Complete Genome      Archaea
                  Complete Chromosome  Eukaryote
                  Complete Sequence    Plasmid
NG_123456         Genomic Region       Homo sapiens
NM_123456         mRNA                 Arabidopsis thaliana
                                       Danio rerio
                                       Drosophila melanogaster
                                       Homo sapiens
                                       Mus musculus
                                       Rattus norvegicus
NP_123456         Protein              All of the above
NT_123456         Genomic Contig       Homo sapiens
                                       Mus musculus
XM_123456         mRNA                 Homo sapiens
                                       Model mRNA provided by the Genome
                                       Annotation process; sequence
                                       corresponds to the genomic contig.
XR_123456         RNA                  Homo sapiens
                                       Model non-coding transcripts provided
                                       by the Genome Annotation process;
                                       sequence corresponds to the genomic
                                       contig.
XP_123456         Protein              Homo sapiens
                                       Model proteins provided by the Genome
                                       Annotation process; sequence
                                       corresponds to the genomic contig.
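
The prefix-to-molecule-type mapping in the table above can be sketched as a small lookup. This is only an illustration, not an NCBI tool: the names REFSEQ_MOLECULE_TYPES and classify_accession are made up for this example, and the pattern assumes the two-letter prefix, underscore, and digits shown in the table, with an optional ".N" version suffix that is an extra assumption.

```python
import re

# Two-letter prefix -> molecule type, transcribed from the table above.
# The dictionary and function names are hypothetical, for illustration only.
REFSEQ_MOLECULE_TYPES = {
    "NC": "complete genome, chromosome, or plasmid sequence",
    "NG": "genomic region",
    "NM": "mRNA",
    "NP": "protein",
    "NT": "genomic contig",
    "XM": "model mRNA",
    "XR": "model non-coding RNA",
    "XP": "model protein",
}

# Assumed shape: two capital letters, an underscore, digits, and an optional
# ".N" version suffix (the suffix is an assumption, not shown in the table).
_ACCESSION_RE = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")


def classify_accession(accession):
    """Return the molecule type for a RefSeq-style accession, or None."""
    match = _ACCESSION_RE.match(accession.strip())
    if match is None:
        return None  # does not look like a RefSeq accession at all
    # Unknown prefixes also yield None rather than a guess.
    return REFSEQ_MOLECULE_TYPES.get(match.group(1))
```

For example, classify_accession("NM_123456") gives "mRNA", while a string with an unlisted prefix gives None.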
Status Codes: Each RefSeq record carries a status code that indicates the
level of review the record has undergone. Expanded descriptions of these
status codes are provided in the RefSeq FAQ document.

Status             Definition
REVIEWED           The RefSeq record has been reviewed by NCBI staff. The
                   review process includes reviewing available sequence
                   data and frequently also includes a review of the
                   literature. The genomic/mRNA/protein RefSeq records may
                   have expanded sequence and annotation, including
                   addition of publications and features, as deemed
                   relevant.
PROVISIONAL        The RefSeq record has not yet been subject to individual
                   review. The sequence-to-gene name associations have been
                   established by outside collaborators and NCBI staff.
PREDICTED          Some aspect of the RefSeq record is predicted, and there
                   is supporting evidence that the locus is valid. The
                   existence of the mRNA is supported by existing cDNA,
                   EST, and/or closely related homologous sequences. For
                   mRNA/protein records, this status code primarily
                   indicates that while an mRNA provides experimental
                   evidence for the locus, the protein has been predicted.
GENOME ANNOTATION  This identifies the contig (NT_ accessions), mRNA (XM_),
                   non-coding transcript (XR_), and protein (XP_) RefSeq
                   records provided by the NCBI Genome Annotation process.
                   These records are provided via automated processing.
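
As a rough sketch of how the four status codes and the GENOME ANNOTATION prefixes listed above might be handled in a script, assuming nothing about how NCBI actually stores the status in a record (the class and function names are hypothetical):

```python
from enum import Enum


class RefSeqStatus(Enum):
    """The four status codes listed above (class name is illustrative)."""
    REVIEWED = "REVIEWED"
    PROVISIONAL = "PROVISIONAL"
    PREDICTED = "PREDICTED"
    GENOME_ANNOTATION = "GENOME ANNOTATION"


# Prefixes the text associates with the Genome Annotation process.
GENOME_ANNOTATION_PREFIXES = ("NT_", "XM_", "XR_", "XP_")


def is_genome_annotation_accession(accession):
    """True if the accession prefix is one the text lists for GENOME ANNOTATION."""
    return accession.strip().upper().startswith(GENOME_ANNOTATION_PREFIXES)


def individually_reviewed(status):
    """Only REVIEWED records are described as reviewed by NCBI staff."""
    return status is RefSeqStatus.REVIEWED
```

So is_genome_annotation_accession("XM_123456") is true and the same call on "NM_123456" is false; the prefix check says nothing about whether an NM_ record is REVIEWED, PROVISIONAL, or PREDICTED.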
Data Availability:

Project Source     Web Availability                      FTP
Curated RefSeq     Entrez, BLAST, LocusLink              available
Genome Annotation  Entrez, BLAST, MapViewer, LocusLink   available
Complete Genomes   Entrez, BLAST                         available
Recent publications
1. RefSeq and LocusLink: NCBI gene-centered resources.
Pruitt KD, Maglott DR
Nucleic Acids Res 2001 Jan 1;29(1):137-140
2. Introducing RefSeq and LocusLink: curated human genome resources at the N
Pruitt KD, Katz KS, Sicotte H, Maglott DR
Trends Genet. 2000 Jan;16(1):44-47.
