When you have many microarray results to enter, typing the file names one by one on a web page is too time-consuming. In that case, you can construct a batch file that provides all the information you would otherwise enter on the web page. The Data Entry for Microarray Experiment form then enters the whole batch of experiments into the database according to the instructions specified in the batch file. This help page describes the information you need to enter an experiment and how to create the batch file. Small numbers of experiments can also be entered individually, and a separate help page is available for that procedure.
To load a group of experiments into the database, you first need to assemble a batch file in which each line contains all the information needed to enter one experiment. This is the same information that goes on the web form for individual experiment entry. The batch file must be tab-delimited, and its header line must contain the column names described below. These column names are the same as the entry box titles on the individual experiment entry form. The columns in the batch file can be in any order.
Sample batch files from which you can copy headings are available here for GenePix/ScanAlyze/SpotReader, Affymetrix, NimbleGen and Agilent data. From the File menu of your browser, choose "Save As Text" and then copy or edit the resulting file. Your complete batch file must be in your loader account before you can enter the data.
Batch File Columns
As on the Individual Experiment Entry form, providing values for Experiment Date, Experiment Description, Collaborative Group, Individual User and IS_REVERSE is optional. You can either provide these columns in your batch file and leave them blank, or you can omit them from the batch file entirely; either will work. The Is_Reverse value defaults to 'N'. As in individual entry, Slide Name and Experiment Name must be unique. Unlike the individual form, however, where Print Name, Experiment Category and SubCategory, Normalization Types, and the various access selections are provided by selectable menus, you must know the exact values in advance, by consulting the Print, Category, Subcategory, user groups and user lists while putting together your batch file.
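A batch file of this kind can be generated with a short script instead of being typed by hand. The following is a minimal sketch in Python, not part of the database tools. The header spellings in COLUMNS are assumptions taken from the column names mentioned on this page; copy the exact headings from a sample batch file before using it. The helper names build_batch_text and check_unique are hypothetical.

```python
import csv
import io

# Assumed header spellings; replace them with the exact headings
# copied from one of the sample batch files.
COLUMNS = [
    "Experiment Name",
    "Slide Name",
    "Print Name",
    "Normalization Type",
    "Norm Value",
    "Is_Reverse",
]


def build_batch_text(rows, columns=COLUMNS):
    """Return tab-delimited text: one header line, then one line per experiment.

    Columns missing from a row are written as blank fields.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        delimiter="\t",
        lineterminator="\n",
        restval="",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def check_unique(rows, column):
    """Return the sorted values that occur more than once in a column."""
    seen = set()
    duplicates = set()
    for row in rows:
        value = row.get(column, "")
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return sorted(duplicates)
```

Running check_unique on the Slide Name and Experiment Name columns before uploading catches duplicates within the batch file itself. It cannot detect names that are already in use in the database.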
Normalization Type is required and can be "Computed", "Regression" or "User Defined". If Normalization Type is "User Defined", then Norm Value is required, and any number can be entered as the Norm Value. If Normalization Type is "Computed", then the default computed normalization is used. If Normalization Type is either "Computed" or "Regression", the Norm Value column should be left blank. A Normalization Type must also be supplied when entering Agilent and Affymetrix data, but its value is ignored.
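The normalization rules above can be checked row by row before a batch file is uploaded. The sketch below is a hypothetical helper, not the loader's own validation. It assumes the values are compared exactly as written here; whether the loader is case-sensitive is not stated on this page.

```python
ALLOWED_TYPES = ("Computed", "Regression", "User Defined")


def check_normalization(norm_type, norm_value):
    """Return a list of problems found in one row's normalization columns."""
    problems = []
    norm_type = norm_type.strip()
    norm_value = norm_value.strip()

    if norm_type not in ALLOWED_TYPES:
        problems.append("Normalization Type must be one of: " + ", ".join(ALLOWED_TYPES))
    elif norm_type == "User Defined":
        if not norm_value:
            problems.append("Norm Value is required when Normalization Type is User Defined")
        else:
            try:
                float(norm_value)
            except ValueError:
                problems.append("Norm Value must be a number")
    elif norm_value:
        # "Computed" and "Regression" expect a blank Norm Value.
        problems.append("Norm Value should be blank for " + norm_type)

    return problems
```

An empty list means the row satisfies the rules as they are described here.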
Experiment Date will default to the date the experiment is entered if the column is left blank. Two date formats are accepted. One is a 4-digit year, 2-digit month, and 2-digit day (YYYY-MM-DD, e.g. 2000-01-18), and the other is the Excel default (MM/DD/YY, e.g. 01/18/00).
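To confirm that every date in a batch file uses one of the two accepted formats, a check along the following lines may help. This is a sketch using Python's standard datetime module; the function name is hypothetical. Python's own parsing is slightly more lenient than the description above (it also accepts months and days without a leading zero), and it reads two-digit years 00 to 68 as 2000 to 2068. How the loader itself interprets two-digit years is not stated on this page.

```python
from datetime import datetime

# YYYY-MM-DD and the Excel default MM/DD/YY.
ACCEPTED_FORMATS = ("%Y-%m-%d", "%m/%d/%y")


def parse_experiment_date(text):
    """Return a date for an accepted format, or None for a blank field.

    A blank field is allowed because the date then defaults to the
    day the experiment is entered. Any other text raises ValueError.
    """
    text = text.strip()
    if not text:
        return None
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError("Unrecognized Experiment Date: " + repr(text))
```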
Data files (.sag, .dat, .tif, etc.) can be identified in the batch file by filename alone only if the batch file is in the same directory as all the data files. If some data files are organized in subdirectories inside incoming/, then the batch file should include the path to those data files relative to the batch file. For example, if some of the data files are in the "worm_aging" directory inside the incoming/ directory, the path would be: "worm_aging/1234.gpr".
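The path to write in the batch file can be derived from the locations of the batch file and the data file. The sketch below uses Python's posixpath module so that the result always uses forward slashes. The function name and the file name batch.txt are hypothetical, and it assumes both paths are given relative to the same starting directory.

```python
import posixpath


def data_file_entry(batch_file, data_file):
    """Return data_file's path relative to the directory holding batch_file."""
    batch_dir = posixpath.dirname(batch_file) or "."
    return posixpath.relpath(data_file, start=batch_dir)


# A batch file in incoming/ and a data file in incoming/worm_aging/
print(data_file_entry("incoming/batch.txt", "incoming/worm_aging/1234.gpr"))
```

For a data file in the same directory as the batch file, the function returns the bare filename.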
Experiment loading begins when your load request is placed in a queue. The rate of loading is determined by a number of factors, including the load on the database and how many other array-load requests were made before yours. If there are no delays, it usually takes at least five minutes per array, and may take quite a bit longer if your arrays have a large number of spots (human arrays) or if many other users are using the database. During this time, you can check the progress of your experiment load within the queue. After your data is successfully entered into the queue (note: this is not the same thing as final entry into the database), you should receive a confirmation screen as well as an email notifying you:
Your database entry request (batch number XXXX) has been queued for loading. Please note the data for your array(s) ARE NOT YET IN THE DATABASE. Do NOT delete any of your files until you receive email confirmation that the data have been loaded. Progress of this batch within the queue can be viewed at: http://genome-www5.stanford.edu/cgi-bin/tools/queue/nph-ProgressQuery?batchno=XXXX You may also view this file in your loader account under the ORA-OUT directory. If you have any questions please contact the curators (array@genome.stanford.edu)
You can check the progress of your experiment load based on the batch number reported to you with either the link on the queue confirmation page or from the URL in the email.
If all goes well, you will eventually get an email message that says:
Loading of your array data (batch number XXXX), has
successfully completed. 1 out of 1 were successfully loaded.
Details of the load process have been written to :
/loader/ftphome/username/ORA-OUT/XXXX.log,
or you can temporarily view the details via the web at:
http://genome-www5.stanford.edu/cgi-bin/tools/queue/nph-ProgressQuery?batchno=XXXX
If you have any questions please contact the curators
(array@genome.stanford.edu).
The following message should appear at the bottom of the HTML confirmation page, or in the log file in your ORA-OUT directory on loader.stanford.edu:
***All data for this experiment ('slidename') have been successfully inserted into oracle database***
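For a large batch, scanning the log file by eye for this message is tedious. The sketch below searches log text for the message and collects the slide names it reports. It is a hypothetical helper and assumes that 'slidename' in the message above stands for the actual slide name in single quotes, and that the rest of the wording appears in the log exactly as shown.

```python
import re

# Matches the success message and captures the quoted slide name.
SUCCESS_PATTERN = re.compile(
    r"\*\*\*All data for this experiment \('(?P<slide>[^']*)'\) "
    r"have been successfully inserted into oracle database\*\*\*"
)


def loaded_slides(log_text):
    """Return the slide names reported as successfully inserted."""
    return [match.group("slide") for match in SUCCESS_PATTERN.finditer(log_text)]
```

Comparing the returned list with the Slide Name column of the batch file shows which arrays have not yet been reported as loaded.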