Adding a new Object to the BioSpice warehouse schema
This is an example of how to add a new table object to the Java Loader code.
Explanations of the Package and Class concepts can be found in LoaderHandOff.txt.
Name: pathway
Create the interface file
- public interface Pathway extends ObjectTable {
}
Create the implementation file
- class PathwayImpl extends ObjectTableImpl implements Pathway {
}
Add variables to PathwayImpl.java
- long wid;
- String name;
- String type;
- long organismWID;
- Other variables are not needed at this point since ObjectTableImpl class takes care of DataSetWID, DBID, synonyms, comments, citation, cross-references, terms and Entry information.
Add constructor to PathwayImpl.java
- public PathwayImpl( String name, SourceInformation newSource ) {
//Here is where the WID for this object is set.
wid = DBInquirer.getNextWID();
//The information about the object source is also set here.
Source = newSource;
//Since the name is a string that might contain a single quote use the setName method.
setName( name );
}
Add get/set methods for each variable into Pathway.java and PathwayImpl.java
- WID only has a get method to prevent others from setting this variable.
- Since the grammar input is only strings, the set method for a numeric variable must accept a String. Other classes contain examples of converting Strings into numbers. Remember to enclose the conversion in a try/catch statement, since a NumberFormatException is thrown if the String is not a number.
- Any string variable that might contain a single quote needs to be sent to parseQuote. parseQuote accepts a String as a parameter and returns a String in which every single quote has been doubled. parseQuote is in ObjectTableImpl, making it accessible to all object classes.
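- The two set-method conventions above can be sketched as follows. This is a minimal stand-alone illustration, not the actual ObjectTableImpl or PathwayImpl code: the class name SetterSketch, the null handling and the choice to keep the old value on bad input are assumptions. Only the quote-doubling behavior of parseQuote comes from the description above.

```java
// Hypothetical stand-alone sketch; the real methods live in ObjectTableImpl and PathwayImpl.
class SetterSketch {
    private long organismWID = 0;
    private String name;

    // Doubles every single quote so the value can sit inside a SQL string literal.
    static String parseQuote(String value) {
        if (value == null) {
            return null; // assumption: null is passed through unchanged
        }
        return value.replace("'", "''");
    }

    // The grammar only supplies Strings, so a numeric setter has to convert them.
    void setOrganismWID(String value) {
        if (value == null) {
            return; // assumption: ignore missing input
        }
        try {
            organismWID = Long.parseLong(value.trim());
        } catch (NumberFormatException error) {
            // assumption: keep the previous value when the input is not a number
            System.err.println("Not a number: " + value);
        }
    }

    long getOrganismWID() {
        return organismWID;
    }

    void setName(String newName) {
        name = parseQuote(newName);
    }

    String getName() {
        return name;
    }
}
```

- With this sketch, setName( "Krebs' cycle" ) stores the name with the quote doubled, and setOrganismWID( "abc" ) leaves the previous WID untouched.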
Edits to SchemaFactory.java
- Add a method into SchemaFactory to return a new Pathway object.
- public Pathway newPathway ( String name, SourceInformation source ) {
return new PathwayImpl( name, source );
}
Edits to DataSet.java and DataSetImpl.java
- Add prototypes to DataSet.java. (See notes in DataSet.java and DataSetImpl.java for an explanation.)
- public Pathway getCurrentPathway();
- public Vector getPathwayList();
- Add variables to DataSetImpl.java.
- private Pathway currentPathway = null;
- The object starts off as null, so the program does not make all objects for every entry and then not use half of them.
- private Vector pathwayList = new Vector();
- Stores one or more pathway objects.
- Add methods to DataSetImpl.java.
- To allow access to the pathway object from the grammar, a getCurrentPathway method is used. This method returns the current pathway if the object has already been created; otherwise it gets a new object from the SchemaFactory. The name is set to "unknown" since at this point it is not known whether the pathway will have a name. The current source information is also passed in, and the entry information of this object is set here as well.
- public Pathway getCurrentPathway() {
if( currentPathway == null ) {
currentPathway = SchemaFactory.getFactory().newPathway ( "unknown", getCurrentSource() );
currentPathway.setEntry(newEntry(currentPathway.getWID()));
}
return currentPathway;
}
- Access to the list of pathways needs to be created.
- public Vector getPathwayList() {
return pathwayList;
}
- If more than one object can exist in the same entry from the flat file, then another method needs to be added:
- getNewPathway.
- (See getNewGene or getNewOrganism in DataSetImpl.java for examples.)
- The getNewPathway method should check whether the name field of the object has been set.
- If the name has been set then it is assumed that a new object needs to be made.
- The old object is put into its vector list and a new object is created and returned.
- If the name has not been set, then it is assumed that this object has not been used yet and the current object is returned.
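- The getNewPathway steps above might look like the following sketch. It is written from the description only, not copied from getNewGene or getNewOrganism. The stand-in classes SketchPathway and DataSetSketch are hypothetical, and treating the "unknown" placeholder as "name not set" is an assumption.

```java
import java.util.Vector;

// Hypothetical stand-in for the Pathway object so the sketch compiles on its own.
class SketchPathway {
    private String name;

    SketchPathway(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    void setName(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for the relevant part of DataSetImpl.
class DataSetSketch {
    private static final String UNKNOWN = "unknown"; // placeholder name given to a fresh object
    private SketchPathway currentPathway = null;
    private final Vector<SketchPathway> pathwayList = new Vector<SketchPathway>();

    SketchPathway getCurrentPathway() {
        if (currentPathway == null) {
            currentPathway = new SketchPathway(UNKNOWN);
        }
        return currentPathway;
    }

    // Returns an unused pathway: the current one if its name was never set, otherwise a new one.
    SketchPathway getNewPathway() {
        SketchPathway current = getCurrentPathway();
        if (!UNKNOWN.equals(current.getName())) {
            // The name was set, so the object is in use: move it to the list and start over.
            pathwayList.add(current);
            currentPathway = null;
            current = getCurrentPathway();
        }
        return current;
    }

    Vector<SketchPathway> getPathwayList() {
        return pathwayList;
    }
}
```

- Calling getNewPathway twice in a row without setting a name returns the same object both times, so unused objects never reach the list.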
- Add a couple of lines to saveEntry method in DataSetImpl.java.
- Before the code that checks the output variable, put the last used pathway object into the list of similar objects that will be used to load the database.
- if( currentPathway != null ) {
pathwayList.add( currentPathway );
}
- After the code that checks the output variable, clear the object variable and clear the object list variable.
- currentPathway = null;
- pathwayList = new Vector();
- That completes the work on the DataSetImpl class.
Edits to SQLOutput.java and SQLOutputImpl.java
- SQLOutput has three ways it can output information. An explanation of the SQLOutput class files can be found in LoaderHandOff.txt. Each of these three ways has to be handled for every new table added to the schema.
- Add a new BufferedWriter variable for the table name: pathwayOut. This variable will point to the Oracle loader control file.
- BufferedWriter pathwayOut;
- Add two lines to the closeLoaderFiles method in SQLOutputImpl.java.
- pathwayOut.flush();
- pathwayOut.close();
- Add lines of code to the doTable method in SQLOutputImpl.java.
- Add a new list variable that is filled from the DataSet object.
- Vector pathwayList = storage.getPathwayList();
- Add a new object variable for temporary storage.
- Currently there are no new holding areas for linking tables and look-up tables. This will be done later on.
- The if statements of the doTable method start with a SourceInformation object, then proceed to each object in the schema. Place the new insertion code after the other objects' insertion code but before the insertion code for the linking and look-up tables.
- // store Pathway information
Enumeration tempPathwayList = pathwayList.elements();
while( tempPathwayList.hasMoreElements() ) {
mainPathway = (Pathway)tempPathwayList.nextElement();
try {
if (writeFlag) {
commonOut.write(mainPathway.getSQLInsert(null) + ";\n");
}
if (insertFlag) {
DBInquirer.runInsert(mainPathway.getSQLInsert(DBLoader.PARENTNAME));
}
if (oracleFlag) {
pathwayOut.write(mainPathway.toLoader() + "\n");
}
// store the Pathway's entry information
entryList.put(new Long(mainPathway.getWID()), mainPathway.getEntry());
// store the comments
tempList = mainPathway.getComment();
if (tempList != null) {
extraList = tempList.elements();
while (extraList.hasMoreElements()) {
Object temp = extraList.nextElement();
commentList.add(new Long(mainPathway.getWID()));
commentList.add(temp);
}
}
// store the citations
tempList = mainPathway.getCitation();
if (tempList != null) {
extraList = tempList.elements();
while (extraList.hasMoreElements()) {
Object temp = extraList.nextElement();
citationList.add(new Long(mainPathway.getWID()));
citationList.add(temp);
}
}
// store the cross references
tempCrossReff = mainPathway.getCrossReference();
if (tempCrossReff != null) {
crossReferenceList.put(new Long(mainPathway.getWID()), tempCrossReff);
}
// store the dbid
tempList = mainPathway.getDBID();
if (tempList != null) {
extraList = tempList.elements();
while (extraList.hasMoreElements()) {
Object temp = extraList.nextElement();
dbidList.add(new Long(mainPathway.getWID()));
dbidList.add(temp);
}
}
// store the synonyms.
tempList = mainPathway.getSynonym();
if (tempList != null) {
extraList = tempList.elements();
while (extraList.hasMoreElements()) {
Object temp = extraList.nextElement();
synonymList.add(new Long(mainPathway.getWID()));
synonymList.add(temp);
}
}
}
catch (Exception error) {
String temp = "SQL To Database Pathway table";
log.error(temp, error);
log.error("Line: " + mainPathway.getEntry().getLineNumber() );
log.error( mainPathway.getSQLInsert(null));
reportError(temp);
setEntryError(mainPathway.getEntry(), temp, error);
}
}
- The extra methods that are not found in the impl class file will be found in the ObjectTableImpl file. Since linking tables and look-up tables that contain some of the pathway object's information have yet to be filled in, the remainder of the doTable method can be skipped.
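- In the insertion code above, the holding lists for comments, citations, DBIDs and synonyms are filled with alternating elements: the owning object's WID, then the value. The following sketch shows that layout and one way such a list could be walked afterwards. The class PairListSketch and both of its methods are hypothetical, and how the real doTable method consumes these lists is an assumption here.

```java
import java.util.Enumeration;
import java.util.Vector;

// Hypothetical helper illustrating the flat (WID, value, WID, value, ...) list layout.
class PairListSketch {
    // Appends one (WID, value) pair in the same order the insertion code uses.
    static void addPair(Vector<Object> list, long wid, Object value) {
        list.add(Long.valueOf(wid));
        list.add(value);
    }

    // Walks the list two elements at a time and formats each pair as text.
    static Vector<String> describe(Vector<Object> list) {
        Vector<String> lines = new Vector<String>();
        Enumeration<Object> e = list.elements();
        while (e.hasMoreElements()) {
            Long wid = (Long) e.nextElement();
            Object value = e.nextElement(); // present because entries are always added in pairs
            lines.add(wid.longValue() + " -> " + value);
        }
        return lines;
    }
}
```

- Because one object can have several comments, the same WID may appear in several pairs, which is why a flat list is used here rather than a table keyed by WID.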
- Add two lines to the setUpLoaderFiles method in SQLOutputImpl.java.
- These lines create the Oracle loader control file and write out its header information.
- pathwayOut = new BufferedWriter(new FileWriter("pathway.ctl"));
- pathwayOut.write(DataLoader.getpathwayLoaderSetUp());
- This completes the installation of a new object into the loader code, but if the object's information is also used in different tables, then other variables might need to be added to the object or to other objects. One example is the PathwayLink table. Since pathway is used in the PathwayLink table, the information to fill this table needs to be stored. Because a link is not an object and does not have its own class, the object that references the link needs to contain the information about the link. In addition to the current pathway's WID, PathwayLink contains two pieces of information: a pathway WID and a chemical WID.
- The two variables should be added into the PathwayImpl class as well as get and set methods:
- long linkedPathwayWID = 0;
- long chemicalWID = 0;
- This is done for ease of storing the information and loading the information into the database, but the grammar file does not know about the database and thus does not contain WIDs. DataSetImpl class is where WID information is gathered together. By adding a method to collect the information for the PathwayLink table, the code in the grammar file remains based on parsing the file and remains ignorant of the database. The grammar can now call a single method to save the PathwayLink Table information.
- public void addPathwayLink( String linkedPathwayName, String linkedChemicalName ) {
// set the pathway wid based on the name
getCurrentPathway().setLinkedPathwayWID( getPathwayWID( linkedPathwayName ) );
// set the chemical wid based on the name
getCurrentPathway().setChemicalWID( getChemicalWID( linkedChemicalName ));
}
- By shifting the work to more specific methods, this procedure remains simple. DataSetImpl already contains a method to get the chemical WID from the name of a chemical, which leaves the pathway WID for the linked pathway. Adapting the code from getChemicalWID for use in getPathwayWID completes the retrieval of a pathway WID by name. The getPathwayWID method gets a little tricky because duplicate pathway objects are not wanted in the database, so each pathway object is stored in a hash table for quick reference by name. This means that the type of pathwayList needs to be changed to a Hashtable, which also changes getPathwayList and other methods inside SQLOutput. The method to query the WID from an object's name is present in DBInquirer. getObjectWID is a blind method that looks up the WID in the table whose name is passed in.
- protected long getPathwayWID( String name ) {
// return a reference to a pathway of the given name
Pathway tempPath;
long number = 0;
String fixedName = parseQuote( name );
number = DBInquirer.getObjectWID( "PATHWAY", fixedName, mainSource.getWID() );
if( number <= 0 ) {
// nothing found in the query
// check if already made.
if( pathwayList.containsKey( fixedName ) ){
tempPath = (Pathway)pathwayList.get( fixedName );
}
else {
// create a new pathway object
tempPath = SchemaFactory.getFactory().newPathway( fixedName, getCurrentSource() );
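tempPath.setEntry( newEntry( tempPath.getWID() ) );
// store the new object by name so later look-ups reuse it instead of creating a duplicate
pathwayList.put( fixedName, tempPath );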
}
number = tempPath.getWID();
}
return number;
}
- When the WID of the linked pathway is identified and the WID of the linking chemical is identified, then these two pieces of information can be stored in the current pathway object through the addPathwayLink method.
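- The look-up order used by getPathwayWID (database first, then the hash table, then a new object) can be tried out in isolation with the sketch below. Everything in it is a stand-in: queryDatabase replaces DBInquirer.getObjectWID, the nextWid counter replaces DBInquirer.getNextWID, and the table stores WIDs directly instead of Pathway objects. The sample names and numbers are invented for the illustration.

```java
import java.util.Hashtable;

// Hypothetical stand-alone version of the name-to-WID look-up.
class PathwayWidSketch {
    private final Hashtable<String, Long> pathwayWidByName = new Hashtable<String, Long>();
    private long nextWid = 1000; // stand-in for DBInquirer.getNextWID()

    // Stand-in for the database query: pretend only "glycolysis" is already loaded.
    private long queryDatabase(String name) {
        return "glycolysis".equals(name) ? 17L : 0L;
    }

    long getPathwayWID(String name) {
        long number = queryDatabase(name);
        if (number <= 0) {
            // Nothing found in the database, so check whether it was already made.
            Long cached = pathwayWidByName.get(name);
            if (cached == null) {
                // Create a new WID and remember it so the next call reuses it.
                cached = Long.valueOf(nextWid++);
                pathwayWidByName.put(name, cached);
            }
            number = cached.longValue();
        }
        return number;
    }
}
```

- Asking twice for a name that is not in the database yields the same WID both times, which is the property that keeps duplicate pathway rows out of the database.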
- The next step is to take the information from the pathway object and distribute it. The code for this step goes into SQLOutputImpl. The loading of the pathway object is already complete, so only a few additional lines of code are needed to store the information for the PathwayLink table. A place to store the linking table information is declared with the rest of the holding area variables.
- Hashtable pathwayLinkList = new Hashtable();
- The hashtable is populated when the pathway objects are analyzed. After the entry information has been stored, the linked pathway WID and the chemical WID can be placed into their holding area.
- if( mainPathway.getLinkedPathwayWID() != 0 ) {
Vector tempLinkingList = new Vector();
tempLinkingList.add( new Long( mainPathway.getLinkedPathwayWID() ));
tempLinkingList.add( new Long( mainPathway.getChemicalWID() ));
pathwayLinkList.put( new Long( mainPathway.getWID() ), tempLinkingList );
}
- Java collections such as Vector and Hashtable can only hold objects, not primitive values, so the long values are wrapped in Long objects before they are stored.
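- The wrapping and unwrapping of the two WIDs can be sketched on its own as shown below. The class BoxingSketch and its method names are hypothetical. Long.valueOf is used in place of new Long, which does the same job and is the form recommended since the Long constructors were deprecated in Java 9.

```java
import java.util.Vector;

// Hypothetical helper showing how two long WIDs travel through a Vector.
class BoxingSketch {
    // Packs the two WIDs of one PathwayLink row into a Vector of Long objects.
    static Vector<Long> pack(long linkedPathwayWID, long chemicalWID) {
        Vector<Long> row = new Vector<Long>();
        row.add(Long.valueOf(linkedPathwayWID)); // position 0: linked pathway WID
        row.add(Long.valueOf(chemicalWID));      // position 1: chemical WID
        return row;
    }

    // Unwraps position 0 back into a primitive long.
    static long linkedPathwayWID(Vector<Long> row) {
        return row.elementAt(0).longValue();
    }

    // Unwraps position 1 back into a primitive long.
    static long chemicalWID(Vector<Long> row) {
        return row.elementAt(1).longValue();
    }
}
```

- The positions matter: whatever order is used when the Vector is filled must be used again when it is read back.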
- Once all PathwayLink information is stored locally the doTable method can continue on its way. To keep the method somewhat organized the code to distribute the PathwayLink table information goes near the end of the method. Since the table is not an object table, the SQL commands need to be generated locally.
- // create Pathway links
e = pathwayLinkList.keys();
while (e.hasMoreElements()) {
Long pathwayWID = (Long) e.nextElement();
Vector tempLinkingList = (Vector)pathwayLinkList.get( pathwayWID);
Long linkedPathwayWID = (Long)tempLinkingList.elementAt(0);
Long chemicalWID = (Long)tempLinkingList.elementAt(1);
try {
if (writeFlag) {
sqlLine = "INSERT into PathwayLink( Pathway1WID, Pathway2WID, ChemicalWID)" +
" values (" + String.valueOf(pathwayWID.longValue()) + ", " +
String.valueOf(linkedPathwayWID.longValue()) + ", " +
String.valueOf(chemicalWID.longValue()) + ")";
commonOut.write(sqlLine + ";\n");
}
if (insertFlag) {
sqlLine = "INSERT into ";
if (DBLoader.PARENTNAME != null) {
sqlLine += DBLoader.PARENTNAME + ".";
}
sqlLine += "PathwayLink( Pathway1WID, Pathway2WID, ChemicalWID)" +
" values (" + String.valueOf(pathwayWID.longValue()) + ", " +
String.valueOf(linkedPathwayWID.longValue()) + ", " +
String.valueOf(chemicalWID.longValue()) + ")";
DBInquirer.runInsert(sqlLine);
}
if (oracleFlag) {
sqlLine = String.valueOf(pathwayWID.longValue()) + ", " +
String.valueOf(linkedPathwayWID.longValue()) + ", " +
String.valueOf(chemicalWID.longValue()) + "\n";
enzReaOut.write(sqlLine);
}
}
catch (Exception error) {
mainPathway = (Pathway)pathwayList.get(pathwayWID);
String temp = "SQL To Database Pathway Link table";
log.error(temp, error);
log.error("Line: " + mainPathway.getEntry().getLineNumber() );
log.error(sqlLine);
reportError(temp);
setEntryError(mainPathway.getEntry(), temp, error);
}
}
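- The three output branches above build nearly the same text. As a sketch only, the string building could be collected in one place as shown below. The class PathwayLinkSqlSketch and its methods are hypothetical and are not part of SQLOutputImpl; the table and column names are taken from the code above.

```java
// Hypothetical helper that builds the PathwayLink output strings.
class PathwayLinkSqlSketch {
    // Builds the INSERT statement; parentName may be null when no schema prefix is wanted.
    static String buildInsert(String parentName, long pathwayWID,
                              long linkedPathwayWID, long chemicalWID) {
        StringBuilder sql = new StringBuilder("INSERT into ");
        if (parentName != null) {
            sql.append(parentName).append(".");
        }
        sql.append("PathwayLink( Pathway1WID, Pathway2WID, ChemicalWID)")
           .append(" values (")
           .append(pathwayWID).append(", ")
           .append(linkedPathwayWID).append(", ")
           .append(chemicalWID).append(")");
        return sql.toString();
    }

    // Builds the comma-separated data line written to the Oracle loader file.
    static String buildLoaderLine(long pathwayWID, long linkedPathwayWID, long chemicalWID) {
        return pathwayWID + ", " + linkedPathwayWID + ", " + chemicalWID + "\n";
    }
}
```

- With such a helper, the writeFlag branch would pass null as the parent name and the insertFlag branch would pass DBLoader.PARENTNAME, so the two statements cannot drift apart.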
- With this section completed, one of the linking tables has been taken care of. Follow the same approach used to add this linking table to the loader and the remaining tables should fall into place.
- That completes the work on the SQLOutput classes.