
Feature


Features define regions or points of interest on the protein sequence or
nucleic acid sequence specified by SequenceWID.
Features are not used to describe objects such as Genes, BioSources, etc.,
because attributes of those objects are described in their own tables.
Exceptions:
1. Since pseudogenes are not entered in the Gene table, they are listed as sequence features in Feature.

Columns

Column MySQL Type Oracle Type Nullable Description
WID BIGINT NUMBER No Warehouse identifier of this feature.
Description VARCHAR VARCHAR2(1300) Yes Textual description of this feature.
Type VARCHAR VARCHAR2(50) Yes Type of feature. These type values come from the source dataset and
are not necessarily enumerated. Example: "Promoter CYZ".
Class VARCHAR VARCHAR2(50) Yes Class of feature. Assigns our typology for types of features, or qualifiers
associated with features. These are our own enumerated type values, to allow us
to classify features without losing the original (author-provided) type values
stored in Type. Example: Class=promoter.

Enumerated Values:
binding site - Identifies the presence of a DNA binding site
promoter - Identifies the presence of a promoter
terminator - Identifies the presence of a terminator
pseudogene - Identifies a pseudogene, whether non-transcribed or processed
ORF - Identifies a truly unknown open reading frame according to warehouse definition (no strong evidence that a product is produced)
partial - Qualifier: States that feature is not complete
unknown product - Identifies that an unspecified product is produced from this genomic location, as stated in dataset
notable - Qualifier: Characterizes the feature value as notable (same as 'Exceptional' in GB dataset)
by similarity - Qualifier: States that feature was derived by similarity analysis
potential - Qualifier: States that feature may be incorrect
probably - Qualifier: States that feature is probably correct
SequenceType CHAR(1) CHAR(1) No Enumeration (see also SequenceWID) that indicates whether the sequence
is a protein or a nucleic acid and how the sequence (if available) is represented:
If 'P', feature resides on a protein.
If 'S' or 'N', feature resides on a nucleic acid.

Enumerated Values:
P - Feature resides on a protein. Implies SequenceWID (if nonNULL) references a Protein
S - Feature resides on a nucleic acid. Implies SequenceWID is nonNULL and references a Subsequence
N - Feature resides on a nucleic acid. Implies SequenceWID (if nonNULL) references a NucleicAcid
SequenceWID BIGINT NUMBER Yes References the Protein or Subsequence containing the sequence on which we
are defining a feature:
SequenceType of 'S' implies SequenceWID is nonNULL and references a Subsequence:
sequence = Subsequence.Sequence (i.e., it is stored explicitly).
SequenceType of 'N' implies SequenceWID (if nonNULL) references a NucleicAcid:
sequence is the substring Subsequence.Sequence[StartPosition : EndPosition],
where Subsequence is the full Subsequence of the nucleic acid.
SequenceType of 'P' implies SequenceWID (if nonNULL) references a Protein:
sequence is the substring Protein.AASequence[StartPosition : EndPosition].
Variant LONGTEXT CLOB Yes Amino-acid sequence for this protein, if available
RegionOrPoint VARCHAR VARCHAR2(10) Yes Specifies whether this feature is specified with starting and ending coordinates
or with a single coordinate.

Enumerated Values:
region - Feature is specified by a start point and an end point on the sequence
point - Feature is specified by a single point on the sequence
PointType VARCHAR VARCHAR2(10) Yes Only defined if RegionOrPoint='point'. Specifies where the feature is relative
to its location as encoded in StartPosition and EndPosition:

Enumerated Values:
center - Feature is centered at location.
left - Feature extends to the left (decreasing position) of location.
right - Feature extends to the right (increasing position) of location.
StartPosition INT NUMBER Yes Start position of the feature within the NucleicAcid or Protein sequence.
If Feature.RegionOrPoint is 'point', StartPosition and EndPosition will
either be equal (location is exactly at a nucleotide or amino acid) or will differ by 1
(location is centered between two adjacent nucleotides or amino acids).

EndPosition INT NUMBER Yes End position of the feature within the NucleicAcid or Protein sequence.
If Feature.RegionOrPoint is 'point', StartPosition and EndPosition will
either be equal (location is exactly at a nucleotide or amino acid) or will differ by 1
(location is between two adjacent nucleotides or amino acids).
StartPositionApproximate VARCHAR VARCHAR2(10) Yes Indicates that the start position of the coding region is an approximate value.
Allowed values are 'gt' (greater than), 'lt' (less than), and 'ne' (not equal).
This is a controlled vocabulary.

Enumerated Values:
gt - The start position of the feature is greater than the actual position specified.
lt - The start position of the feature is less than the actual position specified.
ne - The start position of the feature is less than or greater than the actual position specified. All we know is that it is not the exact position.
EndPositionApproximate VARCHAR VARCHAR2(10) Yes Indicates that the end position of the coding region is an approximate value.
Allowed values are 'gt' (greater than), 'lt' (less than), and 'ne' (not equal).
This is a controlled vocabulary.

Enumerated Values:
gt - The end position of the feature is greater than the actual position specified.
lt - The end position of the feature is less than the actual position specified.
ne - The end position of the feature is less than or greater than the actual position specified. All we know is that it is not the exact position.
ExperimentalSupport CHAR(1) CHAR(1) Yes 'T' if the feature is supported by experimental evidence, else 'F'
ComputationalSupport CHAR(1) CHAR(1) Yes 'T' if the feature is supported by computational evidence, else 'F'
DataSetWID BIGINT NUMBER No Reference to the data set from which the entity came.
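
The location columns above (RegionOrPoint, PointType, StartPosition, EndPosition, and the two approximation qualifiers) interact, so a small sketch may help when reading rows. The class FeatureLocation, the function describe_location, and the '>', '<', '~' prefixes are hypothetical names and conventions chosen for illustration; they are not part of the warehouse schema or its API. Treating a NULL RegionOrPoint as a region is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Display prefixes for the 'gt' / 'lt' / 'ne' qualifier codes.
# The symbols themselves are an illustrative choice, not a schema convention.
APPROX_PREFIX = {None: "", "gt": ">", "lt": "<", "ne": "~"}


@dataclass
class FeatureLocation:
    """Hypothetical holder for the location columns of one Feature row."""
    region_or_point: Optional[str]      # RegionOrPoint: 'region' or 'point'
    start: Optional[int]                # StartPosition
    end: Optional[int]                  # EndPosition
    point_type: Optional[str] = None    # PointType: 'center', 'left', 'right'
    start_approx: Optional[str] = None  # StartPositionApproximate
    end_approx: Optional[str] = None    # EndPositionApproximate


def describe_location(loc: FeatureLocation) -> str:
    """Render the location columns of a feature as a short string."""
    if loc.start is None or loc.end is None:
        return "unknown"
    start = f"{APPROX_PREFIX[loc.start_approx]}{loc.start}"
    end = f"{APPROX_PREFIX[loc.end_approx]}{loc.end}"
    if loc.region_or_point == "point":
        if loc.start == loc.end:
            # Location is exactly at one nucleotide or amino acid.
            where = f"at {start}"
        elif abs(loc.end - loc.start) == 1:
            # Location lies between two adjacent positions.
            where = f"between {start} and {end}"
        else:
            raise ValueError("point positions must be equal or differ by 1")
        suffix = f" ({loc.point_type})" if loc.point_type else ""
        return f"point {where}{suffix}"
    # Assumption: anything that is not a point is reported as a region.
    return f"region {start}..{end}"
```

For example, describe_location(FeatureLocation("region", 5, 20, start_approx="lt")) yields "region <5..20", which reads as a region whose true start lies somewhere before position 5.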

Denormalized References

Column References
SequenceWID Subsequence : WID
SequenceWID Protein : WID
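
Because SequenceWID is a denormalized reference, the table it points to has to be derived from SequenceType at read time. The following is a minimal sketch of that lookup, following the SequenceType enumeration in the Columns section (which also names NucleicAcid as a target for 'N'). The function resolve_sequence_reference and the mapping SEQUENCE_TYPE_TARGETS are hypothetical helpers, not part of the warehouse software.

```python
from typing import Optional, Tuple

# SequenceType code -> (referenced table, whether SequenceWID must be non-NULL),
# as described by the SequenceType enumerated values.
SEQUENCE_TYPE_TARGETS = {
    "P": ("Protein", False),
    "S": ("Subsequence", True),
    "N": ("NucleicAcid", False),
}


def resolve_sequence_reference(
    sequence_type: str, sequence_wid: Optional[int]
) -> Optional[Tuple[str, int]]:
    """Return (table name, WID) for a Feature row, or None if no sequence is referenced."""
    if sequence_type not in SEQUENCE_TYPE_TARGETS:
        raise ValueError(f"unknown SequenceType: {sequence_type!r}")
    table, wid_required = SEQUENCE_TYPE_TARGETS[sequence_type]
    if sequence_wid is None:
        if wid_required:
            # SequenceType 'S' implies SequenceWID is non-NULL.
            raise ValueError(f"SequenceType {sequence_type!r} requires a SequenceWID")
        return None
    return (table, sequence_wid)
```

A caller would then look up the returned WID in the named table; how that lookup is performed depends on the database layer in use.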

Referenced By

Table Column
SeqFeatureLocation SeqFeature_Regions
Entry OtherWID
TranscriptionUnitComponent OtherWID
Support OtherWID
RelatedTerm OtherWID
CitationWIDOtherWID OtherWID
CommentTable OtherWID
CrossReference OtherWID
CrossReference CrossWID
Description OtherWID
DBID OtherWID
SynonymTable OtherWID
ToolAdvice OtherWID

Other Constraints

None.

Indexes

Name Columns
FEATURE_Description
FEATURE_TYPE Type
FEATURE_Class Class
FEATURE_SequenceWID SequenceWID
FEATURE_START_ENDPOS StartPosition, EndPosition
FEATURE_ENDPOSITION EndPosition
FEATURE_DATASETWID DataSetWID
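
As an illustration of the kind of lookup these indexes are likely meant to support, the sketch below finds features on one sequence that overlap a coordinate range. It uses Python's built-in sqlite3 module with an in-memory database and a cut-down Feature table; the real warehouse runs on MySQL or Oracle, so the column types, the sample rows, and the function features_overlapping are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the Feature table: only the columns the query needs.
conn.executescript("""
CREATE TABLE Feature (
    WID           INTEGER PRIMARY KEY,
    Class         TEXT,
    SequenceType  TEXT NOT NULL,
    SequenceWID   INTEGER,
    StartPosition INTEGER,
    EndPosition   INTEGER,
    DataSetWID    INTEGER NOT NULL
);
CREATE INDEX FEATURE_SequenceWID  ON Feature (SequenceWID);
CREATE INDEX FEATURE_START_ENDPOS ON Feature (StartPosition, EndPosition);
""")

# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO Feature VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "promoter",     "N", 100, 10,  50,  1),
        (2, "terminator",   "N", 100, 200, 240, 1),
        (3, "binding site", "N", 100, 40,  45,  1),
        (4, "promoter",     "N", 101, 10,  50,  1),
    ],
)


def features_overlapping(conn, sequence_wid, lo, hi):
    """Return (WID, Class) of features on one sequence that overlap [lo, hi]."""
    cur = conn.execute(
        """
        SELECT WID, Class
        FROM Feature
        WHERE SequenceWID = ?
          AND StartPosition <= ?   -- feature starts before the range ends
          AND EndPosition   >= ?   -- feature ends after the range starts
        ORDER BY StartPosition
        """,
        (sequence_wid, hi, lo),
    )
    return cur.fetchall()


print(features_overlapping(conn, 100, 30, 60))
```

The query ignores the approximation qualifiers and rows with NULL positions; a production query would need to decide how to treat both.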