(C) 2006 SRI International. All Rights Reserved. See BioWarehouse Overview for license details.
Introduction
This document describes version 4.6 of the eco2dbase loader. It is one of several database loaders comprising the Bio-SPICE Warehouse. For more information regarding the eco2dbase Loader, see the eco2dbase manual.
Constant tables specify scientific data such as information from the Periodic Table of Elements, as well as constants used as column values in various warehouse tables.
Object tables describe a type of entity in a source database, such as compounds and proteins. Each column of an object table specifies a parameter that characterizes the object. In addition to the parameters defined by the source database, the loader assigns a unique warehouse ID (WID) to each object, which is used by other tables to reference the object.
Note that WIDs have been set aside and reserved for the eco2dbase data. For the eco2dbase DataSet, the reserved WID is 1001 or larger. For the rest of the eco2dbase data, the reserved WIDs are from 0 to 999,999. WIDs were reserved for the eco2dbase dataset to ensure that there are no conflicts (overlapping WIDs) with other datasets in the warehouse. This is necessary because BIO-SPICE Warehouse users do not run the full eco2dbase loader; instead, they load database dumps into their BIO-SPICE Warehouse schema. Because that schema may already contain other data, the WIDs must not overlap.
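The overlap concern can be checked before loading a dump. The following is a minimal sketch, not part of the loader: the helper name `find_wid_conflicts` is hypothetical, and it assumes the WIDs already present in the target schema have been collected into a Python list by some other means. It only covers the 0 to 999,999 range stated above.

```python
# Hypothetical helper, not part of the eco2dbase loader.
# Range reserved for eco2dbase data as stated above: 0 to 999,999 inclusive.
RESERVED_WIDS = range(0, 1_000_000)


def find_wid_conflicts(existing_wids, reserved=RESERVED_WIDS):
    """Return, sorted, the WIDs already in use that fall inside the reserved range."""
    return sorted(wid for wid in set(existing_wids) if wid in reserved)


# WIDs assumed to have been read from the user's existing schema.
conflicts = find_wid_conflicts([5, 999_999, 1_000_000, 2_500_000])
print(conflicts)
```

An empty result suggests the dump can be loaded without WID collisions in that range; any listed WID would need to be resolved first.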
A special type of warehouse object is the dataset. A dataset object is created for each dataset loaded into the warehouse; e.g., the SWISS-PROT loader adds one row to this table when it is run. Its WID is referred to as the dataset WID and is a column in each object table, specifying the source database of the object.
Linking tables describe relationships among objects. They contain the WIDs of the associated objects, and any additional columns needed to characterize the relationship. In general, many-to-many relationships are supported. Special tables exist to capture reference and cross-reference information and to facilitate lookup of objects.
Full schema information, including source files and browseable documentation, is available with this distribution.
Comments that are NULL or are "-" are ignored.
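The comment rule above can be expressed as a small filter. This is a sketch under stated assumptions: the function name `keep_comment` is hypothetical, a NULL value is assumed to arrive as Python `None`, and the comparison with "-" is exact (no whitespace trimming, since the text does not mention any).

```python
def keep_comment(comment):
    """Return True if the comment should be loaded, False if it is ignored.

    Assumption: NULL is represented as None and "-" is matched exactly.
    """
    return comment is not None and comment != "-"


notes = [None, "-", "induced by heat shock"]
print([note for note in notes if keep_comment(note)])
```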
A row is added to the Gene table for each entry in gene.
eco2dbase Attribute | Warehouse Semantics
---|---
GENE_NAME | Gene.Name ; every gene has a name, and each name is distinct.
L_END | Gene.CodingRegionStart
R_END | Gene.CodingRegionEnd
GENE_DIRECTION | Gene.Direction
B_NUM | SynonymTable.Syn
GENE_NOTE | CommentTable.Comm
A row is added to the Protein table for each entry in prot.
eco2dbase Attribute | Warehouse Semantics
---|---
PROTEIN_NAME | Protein.name ; the protein name is the protein definition. It is sometimes null, and some proteins map to the same gene and have the same name.
CALC_MW | Protein.MolecularWeightCalc ; calculated from DNA sequence
CALC_PI | Protein.PICalc ; calculated from DNA sequence
If gene_id in the eco2dbase schema is not null, a row is added to GeneWIDProteinWID to associate the protein with the gene.
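The gene_id rule can be sketched as follows. This is an illustration only, not the loader's code: the function name, the `prot_id` key, the `GeneWID` and `ProteinWID` column names, and the two lookup dictionaries from source IDs to WIDs are all assumptions made for the example.

```python
def gene_protein_links(prot_rows, gene_wids, protein_wids):
    """Build GeneWIDProteinWID rows for proteins whose gene_id is not null.

    prot_rows: dicts with the hypothetical keys 'prot_id' and 'gene_id'.
    gene_wids, protein_wids: dicts mapping source IDs to warehouse WIDs.
    """
    links = []
    for row in prot_rows:
        if row["gene_id"] is None:
            continue  # no gene recorded for this protein, so no link row
        links.append(
            {
                "GeneWID": gene_wids[row["gene_id"]],
                "ProteinWID": protein_wids[row["prot_id"]],
            }
        )
    return links


rows = [{"prot_id": 1, "gene_id": 10}, {"prot_id": 2, "gene_id": None}]
links = gene_protein_links(rows, {10: 5000}, {1: 6000, 2: 6001})
print(links)
```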
eco2dbase Attribute | Warehouse Semantics
---|---
ALPHANUMERIC | Spot.Name ; the alphanumeric ID will be used as the primary name in the BIO-SPICE Warehouse schema. An example spot name is A008.0. The A indicates a certain pI range, and the number indicates a MolWt range (see picture on p2069).
RRM_NUMBER | DBID.XID ; not every spot has an RRM number.
EST_MW | Spot.MolecularWeightEst ; molecular weight estimated from migration on reference 2D gel.
EST_PI | Spot.PIEst ; isoelectric point (pI) estimated from migration on reference 2D gel.
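The structure of a spot name such as A008.0 can be illustrated with a short parser. This is a sketch, not something the loader is known to do: the pattern assumes a single uppercase letter followed by a decimal number, which fits the one example given but may not cover every spot name, and `split_spot_name` is a hypothetical helper.

```python
import re

# Assumed pattern: one uppercase letter (pI range) followed by a number (MolWt range).
SPOT_NAME = re.compile(r"^([A-Z])(\d+(?:\.\d+)?)$")


def split_spot_name(name):
    """Split a spot name into its pI-range letter and MolWt-range number."""
    match = SPOT_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected spot name format: {name!r}")
    return match.group(1), float(match.group(2))


letter, number = split_spot_name("A008.0")
print(letter, number)
```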
A row is added to ProteinWidSpotWid to associate the spot with a protein, for cases where the spot has been mapped to a known protein.
eco2dbase Attribute | Warehouse Semantics
---|---
REF_GEL | GelLocation.refGel ; Y or N indicates whether this is the gel used to define the coordinates for this spot when determining MW and pI.
X_COORD | GelLocation.Xcoord ; X coordinate on the gel
Y_COORD | GelLocation.Ycoord ; Y coordinate on the gel
GEL_NUMBER | For each gel, a record will be created in the Experiment table to capture the conditions of that gel. GelLocation.ExperimentWID will refer to the WID of the gel (which is stored in the Experiment table). Experiment.Description will contain the gel conditions pulled from the captions of Figures 1, 2A, 2B, 3A, 3B, and 4. These descriptions are also at: ftp://ftp.ncbi.nlm.nih.gov/repository/ECO2DBASE/edition6/image.info Gels 1, 2A, 2B, and 4 were run with E. coli K-12, strain W3110; gels 3A and 3B were run with E. coli B/r, strain NC3. Experiment.BioSourceWID will link to a record in the BioSource table so that this information is captured. Experiment.GroupSize will equal 0, since there are no child experiments. DBID.XID will store the gel number and be linked to the Experiment table. Experiment.Type will contain "2D Gel."
Spot_ID | GelLocation.SpotWID ; will be used to link to a record in the Spot table
eco2dbase Attribute | Warehouse Semantics
---|---
METHOD_ABBR | SpotIdMethod.MethodAbbrev ; a short, one-letter abbreviation for the method name
METHOD_NAME | SpotIdMethod.MethodNameD ; the name of the method
METHOD_DESC | SpotIdMethod.MethodDesc ; a more detailed description of the method
eco2dbase Attribute | Warehouse Semantics
---|---
SPOT_ID | SpotWIDSpotIdMethodWID.SpotWID
METHOD_ID | SpotWIDSpotIdMethodWID.SpotIdMethodWID
eco2dbase Attribute | Warehouse Semantics
---|---
ALPHANUMERIC and RRM_NUMBER | ExperimentData.OtherWID ; used to link to the spot record. The combination of these two IDs allows you to link to a unique Spot ID in the spot table of the eco2dbase schema.
PROT_LEVEL | ExperimentData.Data is the amount of protein seen under those growth conditions. ExperimentData.Kind will be "O" for observed. ExperimentData.Role will be "Amount of protein seen under the growth conditions."
TREATMENT | The protein level will be stored in the ExperimentData table, and will link to the Experiment table, which will store the information about the growth conditions. Experiment.Description will contain the treatment (growth conditions). Experiment.Type will contain the text "Growth Conditions for 2D Gel Experiment." Experiment.GroupSize will equal 0, since there are no child experiments.
ExperimentRelationship.ExperimentWID will be the growth conditions WID from the Experiment table; ExperimentRelationship.RelatedExperimentWID will be the gel conditions WID from the Experiment table.
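The way a protein level ties a spot, a growth-conditions experiment, and a gel experiment together can be sketched as below. This is a minimal illustration rather than the loader's implementation: the function name is hypothetical, the rows are plain dictionaries, and the `ExperimentWID` column on ExperimentData is an assumption (the text only says the row links to the Experiment table).

```python
def protein_level_rows(spot_wid, prot_level, growth_wid, gel_wid):
    """Build the ExperimentData and ExperimentRelationship rows for one protein level."""
    experiment_data = {
        "ExperimentWID": growth_wid,  # assumed column linking to the growth conditions
        "OtherWID": spot_wid,         # links to the spot record
        "Data": prot_level,           # amount of protein seen
        "Kind": "O",                  # "O" for observed
    }
    relationship = {
        "ExperimentWID": growth_wid,      # growth conditions WID
        "RelatedExperimentWID": gel_wid,  # gel conditions WID
    }
    return experiment_data, relationship


data_row, relationship_row = protein_level_rows(
    spot_wid=42, prot_level=1.7, growth_wid=900, gel_wid=901
)
print(data_row)
print(relationship_row)
```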