(C) 2006 SRI International.
All Rights Reserved. See BioWarehouse
Overview for license details.
This document describes how to build and run the NCBI Taxonomy
Loader.
The NCBI Taxonomy loader is located in the ncbi-taxonomy-loader/
subdirectory of the warehouse distribution.
For more information regarding the NCBI Taxonomy loader, see the
NCBI Taxonomy Manual.
Before building the loader, make sure the environment is configured according to the Environment Setup. Also make sure the schema is loaded into the database as specified in the Schema document.
To build the loader, bring up a shell and navigate to the ncbi-taxonomy-loader/src/ directory. Then:
MySQL:
osprompt: make clean
osprompt: make db=mysql
Creates the file mysql-ncbi-taxonomy-loader
Oracle:
osprompt: make clean
osprompt: make db=oracle
If the file wh_oracle_util.c is reported missing, re-run the above make command:
osprompt: make db=oracle
Creates the file oracle-ncbi-taxonomy-loader
Also, a symbolic link named "ncbi-taxonomy-loader" is created, which points to the newly created executable. It can be used as a synonym for the most recently built DBMS-specific loader if desired.
If the build fails with errors about header files that cannot be found, read the section on configuring the appropriate client in Environment Setup. Possible problems are an improper installation of Pro*C (Oracle), or library/header files installed in an incorrect place.
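The build steps above can be wrapped in a small shell function. This is only a sketch under stated assumptions: build_loader and the MAKE override are hypothetical names that do not exist in the distribution, and the function assumes it is run from the ncbi-taxonomy-loader/src/ directory. It repeats the make command once for Oracle only, mirroring the note about wh_oracle_util.c.

```shell
#!/bin/sh
# Hypothetical helper, not part of the BioWarehouse distribution.
# Usage: build_loader mysql|oracle   (run from ncbi-taxonomy-loader/src/)
# Set MAKE=echo for a dry run that only prints the make arguments.
build_loader() {
    db=${1-}
    case "$db" in
        mysql|oracle) ;;
        *) echo "usage: build_loader mysql|oracle" >&2; return 2 ;;
    esac

    ${MAKE:-make} clean || return 1

    if ! ${MAKE:-make} "db=$db"; then
        # Only the Oracle build is documented as needing a second pass
        # (when wh_oracle_util.c is reported missing).
        [ "$db" = oracle ] || return 1
        ${MAKE:-make} "db=$db" || return 1
    fi
}
```

After sourcing the function, "build_loader mysql" would run the same two make commands shown for MySQL; a non-zero return status means one of them failed.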
Obtaining the NCBI Taxonomy databases
The NCBI Taxonomy Manual contains information regarding the NCBI Taxonomy data set. See the NCBI Taxonomy Homepage for general information about NCBI.
Running the NCBI Taxonomy loader
The ncbi-taxonomy-loader/src/ directory contains scripts to run the MySQL and Oracle loaders.
MySQL:
./run-mysql host database user password datadir version releasedate
host - The machine address where the MySQL server/database resides.
database - Name of the MySQL database to be loaded.
user - MySQL userid.
password - MySQL password for userid.
datadir - Directory which contains the NCBI Taxonomy data files to be loaded.
version - Version of source database (typically the date it was downloaded).
releasedate - Release date of source database (typically the date it was downloaded).
For example:
./run-mysql 123.45.67.8 warehouse me mypwd /space/bio/databases/NCBI/released "2008-03-13" "2008-03-13"
This command loads NCBI Taxonomy data into the MySQL database named warehouse. The data files are located in the directory /space/bio/databases/NCBI/released, and the user name and password used to access MySQL are me and mypwd.
Oracle:
./run-oracle "user/passwd" datadir version releasedate
user/passwd - User name and password. Ex: "dan/mypwd", "dan/mypwd@mydb"
datadir - Directory which contains the NCBI Taxonomy data files.
version - Version of source database (typically the date it was downloaded).
releasedate - Release date of source database (typically the date it was downloaded).
For example:
./run-oracle "me/mypwd@mydb" /space/bio/databases/NCBI/released "2008-03-13" "2008-03-13"
This command loads NCBI Taxonomy data into the Oracle database mydb. The data files are located in the directory /space/bio/databases/NCBI/released, and the user name and password used to access Oracle are given as "me/mypwd".
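Because both run scripts take positional arguments, a miscounted or misplaced argument is easy to miss. The following sketch checks the argument count and the data directory before the loader is started. check_loader_args is a hypothetical helper, not something shipped with the loader; the argument counts (7 for run-mysql, 4 for run-oracle) and the position of datadir are taken from the usage lines above, and nothing is assumed about the format of version or releasedate.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of the BioWarehouse distribution.
# Usage: check_loader_args mysql host database user password datadir version releasedate
#        check_loader_args oracle "user/passwd" datadir version releasedate
check_loader_args() {
    if [ "$#" -lt 1 ]; then
        echo "usage: check_loader_args mysql|oracle args..." >&2
        return 2
    fi
    dbms=$1
    shift

    # Expected argument count and datadir position, per the usage lines.
    case "$dbms" in
        mysql)  expected=7; datadir=${5-} ;;
        oracle) expected=4; datadir=${2-} ;;
        *) echo "unknown DBMS: $dbms" >&2; return 2 ;;
    esac

    if [ "$#" -ne "$expected" ]; then
        echo "run-$dbms expects $expected arguments, got $#" >&2
        return 2
    fi
    if [ ! -d "$datadir" ]; then
        echo "data directory not found: $datadir" >&2
        return 1
    fi
}

# Possible use inside a wrapper script:
# check_loader_args mysql "$@" && ./run-mysql "$@"
```

The check does not contact the database, so a wrong host, user or password is still only reported by the loader itself.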
The loader may report parse errors; this is expected. The expected output when loading MySQL can be found in sample-loader-output-mysql.txt. The expected output when loading Oracle can be found in sample-loader-output-oracle.txt.
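To compare a run against the sample output, it helps to keep a copy of what the loader printed. A minimal sketch follows, assuming the loader writes its messages to standard output or standard error. run_logged and the log file name are hypothetical, and the location of the sample file depends on the distribution layout.

```shell
#!/bin/sh
# Hypothetical helper, not part of the BioWarehouse distribution.
# Usage: run_logged logfile command [args...]
# Runs the command, shows its output, and keeps a copy of stdout and
# stderr in logfile. The return status is that of tee, not of the command.
run_logged() {
    log=$1
    shift
    "$@" 2>&1 | tee "$log"
}

# Possible use, with the example values from this document:
# run_logged load-mysql.log ./run-mysql 123.45.67.8 warehouse me mypwd \
#     /space/bio/databases/NCBI/released "2008-03-13" "2008-03-13"
# diff load-mysql.log sample-loader-output-mysql.txt
```

Differences from the sample are to be expected when a different release of the data is loaded, so the comparison is a guide rather than a pass/fail test.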
The database should be queried to verify that the NCBI Taxonomy data has been loaded. See the document on Running the Perl Utility scripts to check this.