(C) 2006 SRI International. All Rights Reserved. See BioWarehouse Overview for license details.
Introduction
This document describes version 4.6 of the eco2dbase Loader, one of several database loaders comprising the BioWarehouse. The eco2dbase Loader (referred to below simply as the loader) loads the eco2dbase data into the BioWarehouse, a relational database that provides a common representation for diverse bioinformatics databases. Please see the references section for more information about eco2dbase.
The eco2dbase loader differs from the other BioWarehouse loaders: SRI creates MySQL and Oracle database dump files containing the eco2dbase data. The reason for this is that the source eco2dbase data was given to SRI in the format of an Oracle database dump file, and not all BioWarehouse users use Oracle.
BioWarehouse users therefore do not have to run the full data loader; they can simply import the Oracle or MySQL dump file into their BioWarehouse schema.
Constant tables specify scientific data such as information from the Periodic Table of Elements, as well as constants used as column values in various warehouse tables.
Object tables describe types of entities in a source database, such as compounds and proteins. Each column of an object table specifies a parameter that characterizes the object. In addition to the parameters defined by the source database, the loader assigns a unique warehouse ID (WID) to each object, which is used by other tables to reference the object.
A special type of warehouse object is the dataset. A dataset object is created for each dataset loaded into the warehouse; e.g., the SWISS-PROT loader adds one row to this table when it is run. Its WID is referred to as the dataset WID and is a column in each object table, specifying the source database of the object.
A linking table describes relationships among objects. Linking tables contain the WIDs of the associated objects, plus any additional columns needed to characterize the relationship. In general, many-to-many relationships are supported. Special tables exist to capture reference and cross-reference information and to facilitate lookup of objects.
Full schema information, including source files and browseable documentation, is available with this distribution.
The eco2dbase loader should not be run concurrently with other loaders. The loader is likely to fail in this case. This limitation was imposed for performance reasons.
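One way to enforce this restriction at the shell level is to serialize loader invocations with flock(1). This is only a sketch, not part of the distribution; the lock file path is an arbitrary example.

```shell
# Hypothetical guard against concurrent loader runs using flock(1).
# The lock file path is an example; any path writable by all loader
# accounts would work.
LOCK=/tmp/biowarehouse-loader.lock
(
  # Fail immediately if another loader already holds the lock.
  flock -n 9 || { echo "another loader is running; aborting" >&2; exit 1; }
  echo "lock acquired; safe to run the loader"
  # sh ./oracle-load-eco2dbase.sh user password database ../data/eco2dbase_orcl.dmp
) 9>"$LOCK"
```

Because the lock is released when the subshell exits, a crashed loader run does not leave a stale lock behind.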
*BioWarehouse users do not have to do this step.
This step will be run by SRI to update the MySQL and Oracle database dump files. SRI will run this step only when it is necessary to create new eco2dbase dump files, i.e., when there is a schema change or there is new eco2dbase data. Information on how to run the eco2dbase loader for this purpose is contained in the document describing how to create the eco2dbase dump files.
Before importing the data, make sure the Oracle or MySQL environment is configured according to the Environment Setup (MySQL) document. Also make sure the schema is loaded into the database as specified in the Schema document.
In addition, Oracle users must have the Oracle imp binary included in their PATH environment variable, and MySQL users must have the mysql binary on their PATH.
Also, Oracle users must set the TNS_ADMIN environment variable to the directory containing their tnsnames.ora file. The tnsnames.ora file is the conventional file used to specify the connection parameters to the database.
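The two requirements above can be checked from the shell before running the import. This is a sketch; the TNS_ADMIN path shown is only an example, not a value mandated by BioWarehouse.

```shell
# Oracle users: point TNS_ADMIN at the directory holding tnsnames.ora.
# The path below is an example; substitute your own installation's path.
export TNS_ADMIN=/opt/oracle/network/admin

# Confirm the required client binaries are visible on PATH.
for bin in imp mysql; do
  if command -v "$bin" >/dev/null 2>&1; then
    echo "$bin: found on PATH"
  else
    echo "$bin: NOT found on PATH" >&2
  fi
done
```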
The shell scripts that import the data are run from the eco2dbase-loader/bin directory. The database dump files are in the eco2dbase-loader/data directory.
osprompt: sh ./mysql-load-eco2dbase.sh user password database host port ../data/eco2dbase_mysql.dmp
You will see the command about to be executed, and then you will be prompted for your password. Enter your password, and the eco2dbase data will be loaded into the database. When loading finishes, the script prints "done loading...".
osprompt: sh ./oracle-load-eco2dbase.sh user password database ../data/eco2dbase_orcl.dmp
You will see the command about to be executed, and as the data is imported you will see which table is having data loaded into it. Upon completion, examine the file orcl_import.log to identify any errors during the import.
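A quick way to scan the log for problems is to count lines carrying Oracle's standard error-message prefixes (IMP- for the import utility, ORA- for the database). This is only a convenience sketch; reading the full log is still advisable.

```shell
# Scan the Oracle import log for error lines. IMP- and ORA- prefix
# Oracle's import-utility and database error messages respectively.
LOG=orcl_import.log
if [ -f "$LOG" ]; then
  # grep -c exits nonzero on zero matches; "|| true" keeps the count.
  errors=$(grep -c -E 'IMP-|ORA-' "$LOG" || true)
  echo "error lines in $LOG: $errors"
else
  echo "$LOG not found; run the import first" >&2
fi
```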
Other than the standard warehouse creation procedure, the loader does not require that any other Warehouse tools be run prior to its execution.
The loader populates the Dataset table as follows:
Column | Value assigned by eco2dbase loader
---|---
WID | The next available reserved WID in the warehouse. Uniquely identifies this dataset in the warehouse. WIDs have been reserved for the eco2dbase dataset, so that there are no WID conflicts with other datasets.
Name | 'eco2dbase'
ReleaseDate | July-2000
Version | NULL
LoadDate | The date and time at which the loader was run to generate the dump files.
ChangeDate | The date and time at which the loader completed generating the dump files.
LoadedBy | The value of the system environment variable USER for the account running the loader.
Application | 'eco2dbase Loader'
ApplicationVersion | NULL
HomeURL | NULL
QueryURL | NULL
Please see the document describing the translation semantics for eco2dbase.