NucleicAcid


Defines a specific nucleic acid molecule, such as DNA or RNA.
Entries in this table correspond to real-world DNA or RNA molecules
that were at one point purified and isolated ("something you can point at"). A row in this table
can define a complete molecule, a fragment of a molecule, or a molecule that has been partially
sequenced in different regions.
This table will be used in several ways: (1) to associate a sequence with an entire
replicon, or a region of a replicon, when the sequence of that replicon is known; (2) to associate
a DNA sequence with a single gene; (3) to define a DNA or RNA molecule that has not been
sequenced; (4) to define RNA molecules such as tRNAs.
Features on a NucleicAcid molecule (such as promoters or binding sites) can be defined using
the Feature table.
The Subsequence table contains zero or more full or partial sequences of this molecule.
If one of them represents the full sequence of this molecule, then Subsequence.FullSequence = 'T' in the row
whose Subsequence.NucleicAcidWID references this molecule. In this case, NucleicAcid.FullySequenced = 'T' as well.
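
The relationship between Subsequence.FullSequence and NucleicAcid.FullySequenced described above can be checked mechanically. The following is a minimal sketch in Python against an in-memory SQLite database, not the warehouse's actual MySQL or Oracle DDL. Both tables are reduced to the columns needed here; the Subsequence.WID column, the function names, and the sample rows are assumptions made for illustration.

```python
import sqlite3


def build_demo():
    """Create reduced NucleicAcid and Subsequence tables with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE NucleicAcid (
            WID            INTEGER PRIMARY KEY,
            FullySequenced CHAR(1)              -- 'T', 'F', or NULL
        );
        CREATE TABLE Subsequence (
            WID            INTEGER PRIMARY KEY, -- assumed column
            NucleicAcidWID INTEGER NOT NULL REFERENCES NucleicAcid(WID),
            FullSequence   CHAR(1)
        );
        INSERT INTO NucleicAcid VALUES (1, 'T'), (2, 'F'), (3, NULL), (4, 'F');
        INSERT INTO Subsequence VALUES
            (10, 1, 'T'), (11, 2, 'T'), (12, 3, 'T'), (13, 4, 'F');
        """
    )
    return conn


def find_flag_mismatches(conn):
    """Return WIDs of molecules that have a Subsequence row marked
    FullSequence = 'T' but are not themselves marked FullySequenced = 'T'."""
    rows = conn.execute(
        """
        SELECT DISTINCT na.WID
        FROM NucleicAcid AS na
        JOIN Subsequence AS s ON s.NucleicAcidWID = na.WID
        WHERE s.FullSequence = 'T'
          AND COALESCE(na.FullySequenced, 'F') <> 'T'
        ORDER BY na.WID
        """
    ).fetchall()
    return [wid for (wid,) in rows]


print(find_flag_mismatches(build_demo()))  # prints [2, 3]
```

The check runs in one direction only, because the text above states only that a full Subsequence implies FullySequenced = 'T'; it does not state the converse.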

Columns

Column MySQL Type Oracle Type Nullable Description
WID BIGINT NUMBER No Warehouse identifier for this NucleicAcid.
Name VARCHAR VARCHAR2(200) Yes Name or description of this molecule.
Ex: (CMR) 'Chromosome II Brucella melitensis 16M'.
Type VARCHAR VARCHAR2(30) No Enumeration: 'DNA', 'RNA', or 'NA'.

Enumerated Values:
DNA - The molecule is composed of DNA
RNA - The molecule is composed of RNA
NA - The molecule is specified as a nucleic acid but whether of type DNA or RNA is not known
Class VARCHAR VARCHAR2(30) Yes Enumeration: describes the molecule as it exists in the organism; stores values such as RNA subtype (e.g., "pre-RNA", "mRNA") or replicon type (e.g., "chromosome", "plasmid").

Enumerated Values:
pre-RNA - As it exists within the organism, the molecule is a pre-RNA molecule. NCBI: used when there is no evidence that mature RNA is produced
mRNA - As it exists within the organism, the molecule is a mature mRNA. NCBI: used when there is evidence that mature mRNA is produced
rRNA - As it exists within the organism, the molecule is an rRNA. NCBI: used when there is evidence that mature rRNA is produced
tRNA - As it exists within the organism, the molecule is a tRNA. NCBI: used when there is evidence that mature tRNA is produced
snRNA - As it exists within the organism, the molecule is a small nuclear RNA. NCBI: used when there is evidence that snRNA is produced
scRNA - As it exists within the organism, the molecule is an RNA that encodes small cytoplasmic ribonucleoproteins. NCBI: used when there is evidence that the gene codes for small cytoplasmic (sc) ribonucleoproteins (RNPs)
snoRNA - As it exists within the organism, the molecule is a small nucleolar RNA. NCBI: used when there is evidence that the transcript is a small nucleolar RNA
other - Check notes as to how we map this...
chromosome - As it exists within the organism, the molecule is a chromosome
plasmid - As it exists within the organism, the molecule is a plasmid
organelle-chromosome - As it exists within the organism, the molecule is the chromosome of an organelle
transposon - As it exists within the organism, the molecule is a transposon
virus - As it exists within the organism, the molecule is a virus
unknown - Used when the class is unknown
Topology VARCHAR VARCHAR2(30) Yes Enumeration: 'circular', 'linear' or 'other'.

Enumerated Values:
linear - The topology of the molecule is linear
circular - The topology of the molecule is circular
other - The topology of the molecule is neither circular nor linear but is known
Strandedness VARCHAR VARCHAR2(30) Yes Enumeration indicating whether the nucleic acid is single-stranded, double-stranded, or mixed-stranded.

Enumerated Values:
ss - The molecule is single stranded
ds - The molecule is double stranded
mixed - The molecule is composed of single and double stranded regions
SequenceDerivation VARCHAR VARCHAR2(30) Yes Enumeration describing how the sequence was generated
(e.g., from a single clone, from an assembly of sequences
from other sequence entries, from a collection of different clones, etc.).

Enumerated Values:
virtual - NCBI: The sequence of the molecule is NOT known. NOTE: this should map to null.
raw - POORLY DEFINED; seems to be used to indicate that the sequence of the molecule was actually generated from one single continuous molecule, as opposed to assembled together from sequences from different molecules (e.g., different clones)
seg - NCBI: The sequence of the molecule is made up of a collection of segments arranged according to specified coordinates, e.g., the sequence was derived by assembling sequences from different clones
reference - NCBI: The sequence of the molecule is constructed from existing Bioseqs; it behaves exactly like a segmented Bioseq in taking its data and character from the Bioseq to which it points.
constructed - NCBI: The sequence of the molecule is constructed by assembling other Bioseqs
consensus - NCBI: The sequence of the molecule represents a pattern typical of a sequence region or family of sequences; it summarizes attributes of an aligned collection of real sequences. Note that this is NOT A REAL OBJECT
map - NCBI: The molecule does not have a sequence describing it, but rather a set of coordinates (restriction fragment order, genetic markers, physical map, etc.)
Fragment CHAR(1) CHAR(1) Yes 'T' if this is a fragment of a molecule,
'F' if this NucleicAcid describes an entire molecule.
FullySequenced CHAR(1) CHAR(1) Yes 'T' if the molecule is completely
sequenced within this dataset, else 'F'.
MoleculeLength INT NUMBER Yes The length of the molecule, in nucleotides. This value is firm if FullySequenced is 'T';
otherwise, MoleculeLength is an approximation and is not necessarily equal to CumulativeLength,
as the molecule may not have been sequenced to completion.
This value is calculated if it is not explicitly stated in the source dataset.
MoleculeLengthApproximate VARCHAR VARCHAR2(10) Yes Enumeration; specifies whether MoleculeLength stores an approximate value:
'gt' for greater than,
'lt' for less than, or
'ne' for not equal.

Enumerated Values:
gt - The length of the molecule's sequence is greater than the actual length specified
lt - The length of the molecule's sequence is less than the actual length specified
ne - The length of the molecule's sequence is less than or greater than the actual length. All we know is that it is not the exact length
CumulativeLength INT NUMBER Yes The cumulative number of nucleotides for all Subsequences referenced by this NucleicAcid entry,
whether contiguous or not. This value is a summation of the number of nucleotides for these Subsequences.
If the molecule is completely sequenced, this value should be identical
to that of MoleculeLength, and both fields are populated.
CumulativeLengthApproximate VARCHAR VARCHAR2(10) Yes Enumeration; specifies how CumulativeLength approximates the actual total length:
'gt' for greater than,
'lt' for less than, or
'ne' for not equal.

Enumerated Values:
gt - The total length of the molecule's sequence is greater than the actual length
lt - The total length of the molecule's sequence is less than the actual length
ne - The total length of the molecule's sequence is less than or greater than the actual length. All we know is that it is not the exact length
GeneticCodeWID BIGINT NUMBER Yes References the genetic code of this molecule.
BioSourceWID BIGINT NUMBER Yes References the biological source of this molecule.
DataSetWID BIGINT NUMBER No References the data set from which the entity came.
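
The column rules above (non-nullable columns, enumerated values, and the agreement between MoleculeLength and CumulativeLength for fully sequenced molecules) lend themselves to a simple pre-load check. The following Python sketch illustrates one way to express them. The function name validate_nucleic_acid, the dictionary-based row representation, and the message wording are assumptions for illustration and are not part of the schema or its loaders.

```python
# Allowed values, copied from the "Enumerated Values" lists in the column
# descriptions. The spelling 'NA' for Type follows that list.
ENUMS = {
    "Type": {"DNA", "RNA", "NA"},
    "Class": {
        "pre-RNA", "mRNA", "rRNA", "tRNA", "snRNA", "scRNA", "snoRNA", "other",
        "chromosome", "plasmid", "organelle-chromosome", "transposon",
        "virus", "unknown",
    },
    "Topology": {"linear", "circular", "other"},
    "Strandedness": {"ss", "ds", "mixed"},
    "SequenceDerivation": {
        "virtual", "raw", "seg", "reference", "constructed", "consensus", "map",
    },
    "Fragment": {"T", "F"},
    "FullySequenced": {"T", "F"},
    "MoleculeLengthApproximate": {"gt", "lt", "ne"},
    "CumulativeLengthApproximate": {"gt", "lt", "ne"},
}

# Columns marked "No" in the Nullable column.
NOT_NULL = ("WID", "Type", "DataSetWID")


def validate_nucleic_acid(row, subsequence_lengths=()):
    """Return a list of problems found in one candidate NucleicAcid row.

    row is a dict keyed by column name (hypothetical representation);
    subsequence_lengths optionally lists the nucleotide counts of the
    Subsequence entries that reference the row.
    """
    problems = []
    for column in NOT_NULL:
        if row.get(column) is None:
            problems.append(f"{column} must not be null")
    for column, allowed in ENUMS.items():
        value = row.get(column)
        if value is not None and value not in allowed:
            problems.append(f"{column} has unexpected value {value!r}")
    cumulative = row.get("CumulativeLength")
    # CumulativeLength is described as the sum over all Subsequences.
    if cumulative is not None and subsequence_lengths:
        if cumulative != sum(subsequence_lengths):
            problems.append("CumulativeLength differs from the Subsequence total")
    # For a fully sequenced molecule both lengths are populated and equal.
    if row.get("FullySequenced") == "T":
        molecule = row.get("MoleculeLength")
        if molecule is None or cumulative is None or molecule != cumulative:
            problems.append("MoleculeLength and CumulativeLength must match")
    return problems


good = {"WID": 1, "Type": "DNA", "Class": "chromosome", "Topology": "circular",
        "FullySequenced": "T", "MoleculeLength": 1000,
        "CumulativeLength": 1000, "DataSetWID": 7}
print(validate_nucleic_acid(good, [600, 400]))  # prints []
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in a row at once.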

Referenced By

Table Column
Subsequence NucleicAcidWID
Gene NucleicAcidWID
GeneWIDNucleicAcidWID NucleicAcidWID
Entry OtherWID
Support OtherWID
InteractionParticipant OtherWID
RelatedTerm OtherWID
CitationWIDOtherWID OtherWID
CommentTable OtherWID
CrossReference OtherWID
CrossReference CrossWID
Description OtherWID
DBID OtherWID
SynonymTable OtherWID
ToolAdvice OtherWID
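
The references listed above come in two forms: a dedicated NucleicAcidWID column (as in Gene) and a generic OtherWID column (as in DBID) that holds the WID of whichever object the row describes. The Python sketch below shows how each might be followed from a NucleicAcid row, using an in-memory SQLite database. It is an illustration only: the Gene.WID, Gene.Name, and DBID.XID columns, the function names, and the sample gene names and identifier are assumptions, and the tables are reduced to the columns needed here.

```python
import sqlite3


def build_demo():
    """Create reduced NucleicAcid, Gene, and DBID tables with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE NucleicAcid (
            WID  INTEGER PRIMARY KEY,
            Name TEXT,
            Type TEXT NOT NULL
        );
        CREATE TABLE Gene (
            WID            INTEGER PRIMARY KEY,  -- assumed column
            Name           TEXT,                 -- assumed column
            NucleicAcidWID INTEGER REFERENCES NucleicAcid(WID)
        );
        CREATE TABLE DBID (
            OtherWID INTEGER NOT NULL,           -- WID of the described object
            XID      TEXT NOT NULL               -- assumed column
        );
        INSERT INTO NucleicAcid VALUES
            (1, 'Chromosome II Brucella melitensis 16M', 'DNA');
        INSERT INTO Gene VALUES (100, 'geneB', 1), (101, 'geneA', 1);
        INSERT INTO DBID VALUES (1, 'EXAMPLE-0001'), (100, 'EXAMPLE-0002');
        """
    )
    return conn


def genes_on_molecule(conn, nucleic_acid_wid):
    """Follow Gene.NucleicAcidWID back to one NucleicAcid row."""
    rows = conn.execute(
        "SELECT Name FROM Gene WHERE NucleicAcidWID = ? ORDER BY Name",
        (nucleic_acid_wid,),
    ).fetchall()
    return [name for (name,) in rows]


def external_ids(conn, wid):
    """Follow the generic DBID.OtherWID reference for any warehouse object."""
    rows = conn.execute(
        "SELECT XID FROM DBID WHERE OtherWID = ? ORDER BY XID", (wid,)
    ).fetchall()
    return [xid for (xid,) in rows]


conn = build_demo()
print(genes_on_molecule(conn, 1))  # prints ['geneA', 'geneB']
print(external_ids(conn, 1))       # prints ['EXAMPLE-0001']
```

Because OtherWID can point at rows in many tables, the same external_ids lookup works unchanged for a Gene WID or a NucleicAcid WID.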

Other Constraints

None.

Indexes

Name Columns
NUCLEICACID_Name Name
NUCLEICACID_Type Type
NUCLEICACID_Class Class
NUCLEICACID_MoleculeLength MoleculeLength
NUCLEICACID_CumulativeLength CumulativeLength
NUCLEICACID_GCWID GeneticCodeWID
NUCLEICACID_BSWID BioSourceWID
NUCLEICACID_DATASETWID DATASETWID