SMD : Help : Help Data Selection for Analysis



Description

The Data Selection for Analysis tool is available only after you have selected a set of hybridized arrays using either the Basic Search or the Advanced Search programs. Once a set has been selected, Data Selection for Analysis allows you to select genes or spots to cluster, and to filter data based on a variety of parameters. This tool can be used to generate a preclustering (.pcl) file, or the files needed for viewing a cluster with TreeView. In addition, Data Selection for Analysis will lead you to tools that will let you view clustered data via the web.
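If you download the preclustering file and want to inspect it outside of SMD, a short script can read it. The following is a minimal sketch only. It assumes the commonly used PCL layout, which this page does not describe: a tab-delimited table with an identifier column, a NAME column, a GWEIGHT column, then one column per array, and an EWEIGHT row directly under the header. The function name `read_pcl` and the sample data are hypothetical.

```python
import csv
import io


def read_pcl(handle):
    """Parse a PCL-style tab-delimited table into (array_names, rows).

    Assumed layout (not confirmed by this help page):
    column 0 = identifier, column 1 = NAME, column 2 = GWEIGHT,
    remaining columns = one value per array; an EWEIGHT row may follow
    the header and is skipped.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    array_names = header[3:]
    rows = []
    for fields in reader:
        if not fields or fields[0] == "EWEIGHT":
            continue  # skip blank lines and the array-weight row
        gene_id = fields[0]
        name = fields[1] if len(fields) > 1 else ""
        # Empty cells are treated as missing values (None).
        values = [float(v) if v.strip() else None for v in fields[3:]]
        # Pad short rows so every gene has one slot per array.
        values += [None] * (len(array_names) - len(values))
        rows.append((gene_id, name, values))
    return array_names, rows


# Hypothetical sample data, for illustration only.
sample = (
    "UID\tNAME\tGWEIGHT\tarray1\tarray2\n"
    "EWEIGHT\t\t\t1\t1\n"
    "geneA\tgene A description\t1\t0.5\t-1.2\n"
    "geneB\tgene B description\t1\t\t2.0\n"
)
array_names, rows = read_pcl(io.StringIO(sample))
```

To read a real file, pass an open file handle instead of the `io.StringIO` sample, for example `read_pcl(open("mydata.pcl", newline=""))`, where the file name is a placeholder.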

Data Selection for Analysis is split into three large steps:

Gene Selection Options

Although we use the word 'gene,' it refers here to any DNA sample spotted on the microarrays. A 'gene' might be a PCR product representing an entire section of a gene, a portion of a gene, a clone associated with a gene, an intergenic region, or anything else spotted on the array.

This section allows you to specify which genes are of interest to you, then decide how to collapse your data, how to identify genes in your output file, which biological annotation to include, and how to label the arrays you're using.

Data Filtering Options

This section of the tool allows you to choose which data you consider reliable enough to include in your analysis. The steps are:

Gene Filtering Options

There are several steps to this part of the tool. Which options appear depends on what sort of data you have retrieved. Operations are carried out in the order in which they are presented on the page. The steps are:

Viewing Clustering Results

Once you've submitted a clustering query, you will see a page that writes text to your screen as the preclustering file is generated. When the preclustering file is complete, the last line will read, "...genes were selected."

Data Analysis

SMD allows you to perform some data analysis on your preclustering file, using either of two methods:

Clustering Options

You have to define the following options when hierarchically clustering:

  • If you want to generate a file of sorted correlations, note that the default correlation is 0.8. Click 'Submit' when you have chosen the appropriate options.
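To illustrate what a file of sorted correlations might contain, here is a minimal sketch. It rests on two assumptions that this page does not state: that the correlation is the Pearson correlation between two genes' expression profiles, and that the 0.8 value acts as a minimum cutoff for reporting a gene pair. The function names and the sample profiles are hypothetical and do not reflect SMD's actual implementation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation over positions where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None  # a flat profile has no defined correlation
    return sxy / sqrt(sxx * syy)


def sorted_correlations(profiles, cutoff=0.8):
    """Return (gene1, gene2, r) for pairs with r >= cutoff, highest r first.

    The 0.8 default mirrors the value mentioned in the text; treating it
    as a minimum cutoff is an assumption.
    """
    results = []
    for (g1, v1), (g2, v2) in combinations(profiles.items(), 2):
        r = pearson(v1, v2)
        if r is not None and r >= cutoff:
            results.append((g1, g2, r))
    results.sort(key=lambda item: item[2], reverse=True)
    return results


# Hypothetical expression profiles, one value per array.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
hits = sorted_correlations(profiles)
```

In this sample only the geneA/geneB pair passes the cutoff, because geneC is anti-correlated with both of the other profiles.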

Image Generation Options

Here are a couple of tips that will help you reduce the time it takes to analyze the experiments you selected.

Browsing, Viewing, and Downloading Clustered Data

To browse the clustered data interactively, click the red and green image in the lower left-hand corner of the window. This takes you to the 'Hierarchical Cluster View', where you can focus on specific gene sub-clusters. You can view the clustered data in the following ways. The other links at the bottom of the screen download files to your machine.


Please send comments or questions to: array@genome.stanford.edu