Contents
- Description
- Types of Microarray data accepted
- First Time Users
- Entering Your Data
- Data Files Needed to Load Your Data
- Putting your data on loader using sftp
- Result Entry Options
The Experiment and Result Entry form is used to load microarray results into the database. To do this, you need an "unrestricted" account and an account on loader.stanford.edu. For more information regarding both the accounts and access required to enter results into the database, please refer to the Accounts and Access document.
Software and accepted versions:
- Agilent: A.6.1.1, A.7.1.1, and A.7.1.2
- Affymetrix: MAS 5 (5.0, 5.1), GCOS (1.0, 1.1), and dChip 1.3
- GenePix: 3.0, 4.1, and 5.0
- NimbleScan: 2.1.16
- ScanAlyze: All
- SpotReader: 1.0
Experiment entry is performed by software that reads through your data files and then inserts the appropriate information into the database. First, the software must be able to find your data files in a special directory in your loader account. This means that you cannot enter data without having a loader account and without having copied your data files onto loader. There are two directories in your loader account that are used for data entry: the "incoming" directory and the "ORA-OUT" directory.
The data files that you want to enter into the database must be in the incoming/ directory on your loader account. You can only copy files onto loader using SFTP (Secure FTP), as regular, non-secure FTP access is now blocked for security reasons. If you are unsure how to upload your data, see the Moving Files via SFTP section. Please note that the incoming directory on loader is cleaned out regularly, so enter data into the database soon after uploading your data files and verify your data in the database promptly. We also recommend that you personally save a copy of ALL files. You can SFTP your files from loader to your personal computer so you will have an archived copy of your original data.
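If you use a command-line sftp client, the upload into incoming/ can be scripted. The sketch below only builds the text of a batch file; the function name and file names are hypothetical, and it assumes your client is OpenSSH sftp, which reads commands from a file given with -b.

```python
def build_sftp_batch(local_files, remote_dir="incoming"):
    """Return the text of an sftp batch file that uploads local_files into remote_dir."""
    lines = [f"cd {remote_dir}"]
    for name in local_files:
        if '"' in name:
            raise ValueError(f"file name contains a double quote: {name}")
        # Quote the name so that file names containing spaces survive.
        # A plain "put" (without -p) does not preserve the local timestamp.
        lines.append(f'put "{name}"')
    lines.append("bye")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(build_sftp_batch(["exp1.gpr", "exp1_green.tif", "exp1_red.tif"]), end="")
```

Saved to a file such as upload.batch, the output could be run with "sftp -b upload.batch your_username@loader.stanford.edu". Batch mode cannot prompt for a password, so this only works where non-interactive authentication is available; whether loader allows that is not stated here.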
The default location for the database output files (e.g. error files, search results, etc.) is a subdirectory on your loader account called ORA-OUT. This subdirectory will be created when your loader account is created. All logs and error reports will be placed in the ORA-OUT directory.
Please note that loader is a communal machine used by everyone who enters microarray results into the database. Storing files that are not related to the use of the database (even on a "temporary" or "emergency" basis) can prevent other users from loading legitimate data and is therefore grounds for having your account revoked.
Depending on the feature extraction software that you used to generate your data, different files and information will be needed to load your experiment data. For all types of feature extraction software, you need the items described in the table below before you enter an experiment into the database. All submitted files can be compressed or uncompressed. Compressed files have the normal suffix with .gz on the end, for example .dat.gz.
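As an illustration of the compressed naming described above, the following sketch gzips a file and appends .gz to its normal suffix. It uses only the Python standard library; the function name and the example file name are hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(path):
    """Compress path to '<name>.gz' next to it and return the new path.

    The original file is left in place so you keep an uncompressed copy.
    """
    src = Path(path)
    dst = src.with_name(src.name + ".gz")  # e.g. exp1.dat -> exp1.dat.gz
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst
```

For example, gzip_file("exp1.dat") would write exp1.dat.gz in the same directory.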
Experiment Details Required for Data Entry for All Feature Extraction Software
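Each item below states whether it must be unique, whether it is required, and its maximum number of characters. Numbers in parentheses refer to the table notes.
- Print name (note 1): required; max # of characters N/A. Often the spotlist or Godlist name.
- Slide name: unique (note 2), required; max 30 characters. Usually a systematic name assigned by the slide printing facility.
- Experiment Name: unique (note 2), required; max 100 characters.
- Green (note 3) or Single (note 4) Channel Description: required; max 100 characters.
- Red Channel Description (note 3): required; max 100 characters.
- Experiment Description: max 2000 characters. Unformatted text to describe experiment details.
- Category (note 5): required; max 30 characters. Choose from a list in the database.
- Subcategory (note 5): required; max 30 characters. Choose from a list in the database.
- Experiment Type (note 6): required; max 40 characters. Choose from a list in the database.
- Normalization Type (note 7): required; max 30 characters. Choose from a list in the database.
- Norm Value (note 7): required; max 30 characters. For normalization type "user-defined", a normalization value must be entered.
- Result_Set_Name (note 8): unique, required; max 100 characters. The name of your result set.
- Result Set Description (note 8): max 100 characters. Free text description of your result set.
- Probe Set Algorithm (note 4): required; max 100 characters. Accepted values are 'Affymetrix MAS 5' or 'dChip MBEIs', depending on what software was used to 'normalize' your data.
Table Notes
1 If you don't know which print to use for your experiment, after login, click Print List under "List Data" from the Main page or click "Print Name" on the individual experiment entry form. For further information, contact the microarray database curators.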
2 Every slide name (e.g., array serial #) and experiment (hybridization) name must be unique - you may not re-use them, or use the same names as any other user. Result set names may be re-used, but only once per slide - you may have a result set called "simple normalization" for each of your slides, but only one per slide.
3 This field is required only for GenePix, ScanAlyze, or Agilent data.
4 This field is required only for Affymetrix data.
5 The database requires a category and subcategory for each experiment, both of which are chosen from lists stored in the database. Any category can be paired with any subcategory to describe an experiment. Categories, subcategories, and their descriptions can be found by clicking Category or SubCategory under "List Data" on the Main page, or by clicking Experiment Category or Experiment SubCategory on the individual experiment entry form. If you need a category or subcategory which is not already in the database, contact the microarray database curators.
6 This field is required for NimbleGen data, and is ignored for the others. Experiment types and their descriptions can be found by clicking Experiment types under "List Data" on the Main page. If you need an experiment type which is not already in the database, please send an email to the microarray database curators.
7 This field is always required, but it is ignored for Agilent and Affymetrix data.
8 These values are needed for Affymetrix and Agilent data only. See note 2 above on unique slide, experiment, and result set names.
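The length limits and uniqueness rules above can be checked on your own machine before you fill in the entry form. The sketch below is only an illustration: the dictionary keys and the function name are hypothetical, it covers a subset of the items, and it can compare names only against lists you supply yourself, whereas the database checks uniqueness against the names of all users.

```python
# Maximum lengths taken from the table above; the keys are hypothetical labels.
MAX_LENGTHS = {
    "slide_name": 30,
    "experiment_name": 100,
    "experiment_description": 2000,
    "category": 30,
    "subcategory": 30,
    "result_set_name": 100,
}

# Items that are required for every kind of feature extraction software.
ALWAYS_REQUIRED = ("slide_name", "experiment_name", "category", "subcategory")


def check_experiment_details(details, used_slide_names=(), used_experiment_names=()):
    """Return a list of problems found in details (a dict of strings)."""
    problems = []
    for field, limit in MAX_LENGTHS.items():
        value = details.get(field, "")
        if len(value) > limit:
            problems.append(f"{field} is {len(value)} characters; the limit is {limit}")
    for field in ALWAYS_REQUIRED:
        if not details.get(field):
            problems.append(f"{field} is required")
    # Only names you already know about can be checked locally.
    if details.get("slide_name") in used_slide_names:
        problems.append("slide_name has already been used")
    if details.get("experiment_name") in used_experiment_names:
        problems.append("experiment_name has already been used")
    return problems
```

An empty list means that none of these particular checks failed; it does not guarantee that the database will accept the entry.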
Each item below lists its accepted suffixes and its maximum number of characters. Numbers in parentheses refer to the table notes.
- Data File (note 1): suffix .dat, .gpr, .srr, or .txt; max 50 characters. Please check the print dimensions within the database before gridding your array (note 2). For Agilent data, be sure that you have the text file and not the xml file.
- Grid File: suffix .sag, .gps, .sra, or .shp; max 50 characters.
- Green Scan File (note 3): suffix .tif; max 50 characters. Typically the 532nm scan.
- Red Scan File (note 3): suffix .tif; max 50 characters. Typically the 635nm scan.
Table Notes
1 Do not change the default column names in the data file. SpotReader in particular gives you a choice of "channel shortcut names." Any of the default two-channel options are acceptable (Ch1/Ch2, Cy3/Cy5, Green/Red, 532/635). Any other names will cause experiment loading to fail.
2 Attempts to load array data not matching print dimensions (tips/blocks x rows x columns) are disallowed. If using GenePix, do not pre-filter any of the spot-features. Only a full gpr file, with an entry for every spot, in order, can be loaded.
3 Automatic .gif generation requires that you submit two .tif (not .scn) files when entering your experiment (one for each channel). The automatic .gif generation fails occasionally. However, if your .gif is not created at experiment entry, the microarray database curators can make it for you in most cases. It is also possible to upload a preferred .gif file if you don't like the generated ones.
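File names can be checked against the suffixes and length limit above before uploading. The sketch below is an illustration only. It assumes that the 50-character limit applies to the whole file name, that a trailing .gz is allowed on any file, and that suffixes must match in lower case; none of these details are confirmed here, and the function and dictionary names are hypothetical.

```python
# Accepted suffixes from the table above, grouped under hypothetical labels.
ALLOWED_SUFFIXES = {
    "data": (".dat", ".gpr", ".srr", ".txt"),
    "grid": (".sag", ".gps", ".sra", ".shp"),
    "scan": (".tif",),
}
MAX_NAME_LENGTH = 50  # assumed to apply to the full file name


def check_file_name(name, kind):
    """Return a list of problems with a file name of the given kind."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"{name} is {len(name)} characters; the limit is {MAX_NAME_LENGTH}")
    # Compressed files carry the normal suffix followed by .gz.
    base = name[:-3] if name.endswith(".gz") else name
    if not base.endswith(ALLOWED_SUFFIXES[kind]):
        expected = ", ".join(ALLOWED_SUFFIXES[kind])
        problems.append(f"{name} does not end in one of: {expected}")
    return problems
```

For example, check_file_name("exp1_green.scn", "scan") reports a problem, because scan files must be .tif rather than .scn.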
Please note that the database accepts only gene expression data from Affymetrix and dChip software (see below). Affymetrix mapping, resequencing, and universal array data cannot be entered.
Each item below lists its accepted suffixes and its maximum number of characters.
- Data File: suffix .dat; max 50 characters. The image file is generated by the Affymetrix chip scanning software. It is a 16-bit tiff file that will be archived and converted to an 8-bit gif file for viewing in the database.
- Cell File: suffix .cel; max 50 characters. This file is generated from the .dat file by Affymetrix MAS 5 or Affymetrix GeneChip Operating Software (GCOS). The native .cel file format for GCOS is a proprietary binary format. To upload GCOS .cel files into the database, open the GCOS Manager program and export the .cel file. This converts it into a text file the database can understand.
- Gene File: suffix .txt or .xls; max 50 characters. This file is generated from the .cel file by Affymetrix MAS 5, Affymetrix GeneChip Operating Software (GCOS), or dChip. To upload the Probe Set file into the database, export it from the analysis software as a tab-delimited text file; see Uploading a Probe Set File.
- Experiment File: suffix .exp; max 50 characters. This file is generated by the Affymetrix chip scanning software and contains chip protocol information.
Uploading a Probe Set File
Data Files Required for NimbleGen Data entry
Please note that the database currently accepts only single
channel data from NimbleGen.
Each item below lists its accepted suffixes and its maximum number of characters.
- Image File: suffix .tif; max 50 characters. The image file should be a tiff file of the single channel scan. This file will be archived and a gif copy of it will be created for viewing in the database.
- Cell File: suffix .xys; max 50 characters. This file contains processed result data for each feature on the slide. It is a tab-delimited text file.
- FTR File: suffix .ftr; max 50 characters. This file is a tab-delimited text file that contains result information about features on the slides.
- Gene Intensity File: suffix .txt or .calls; max 50 characters. This file is a tab-delimited text file that contains result information for genes.
Moving Files via SFTP
Files can now be transferred onto your loader account using SFTP only. For more information about SFTP, please see the GSSG page on Secure Remote Shells and File Transfers. There are sftp clients available for Stanford affiliates for Mac and PC users. For detailed information on how to transfer files via sftp, please see the SFTP Help Page. Please note that you will connect to loader at your University, which is not necessarily the university displayed in the photos on the help page. If your files are older than 2-3 weeks, disable the 'preserve timestamp' option of the sftp client (you can do this in Fugu or WinSCP). Otherwise the uploaded files keep their original dates and will be deleted at night by the process that cleans up loader files.
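To see which local files are old enough for this to matter, you can list the files whose modification time is older than a chosen cutoff before you upload. This is a minimal sketch: the function name is hypothetical, and the default of 14 days is an assumption taken from the low end of the "2-3 weeks" mentioned above, not a documented cleanup threshold.

```python
import time
from pathlib import Path


def files_at_risk(directory, max_age_days=14, now=None):
    """Return sorted names of files in directory last modified over max_age_days ago.

    These are the files that should be uploaded with 'preserve timestamp'
    turned off. The function only reads timestamps; it changes nothing.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

For example, files_at_risk("my_arrays") lists the old files in a local directory called my_arrays (a hypothetical name).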
Result Entry Options
On the Experiment and Result Entry form, you will need to select the following: