Analyst/Heatmap Help and Definitions (A Work in Progress)
Input File Format

The input file format is based on the output of the Analyst™ Platereader, but the actual format requirements are fairly straightforward and flexible. The file can be a Microsoft Excel™ spreadsheet (.xls) file or a tab-delimited ASCII text file. The heatmap program will automatically detect which of these file types was used.

Note to Open Office users: As of the time this was written, the portion of the program that reads Excel files could not read .xls files created by Open Office Calc, only those created by Microsoft Excel itself. The author of this code, an Open Office user, shares your frustration and is looking for a solution.

The file format is a series of plates, each represented by a set of metadata, followed by a matrix of well values, followed by one or more blank rows/lines. The metadata is a series of name-value pairs in columns one and two. The only metadata that is required (or used) is the "Barcode" field, whose name can be any of "Barcode:", "Barcode", "Plate ID:" or "Plate ID". ("Barcode:" has precedence. In the example below, both "Barcode:" and "Plate ID:" are given. The value for "Barcode:" - *001+59* - will be used.) The value associated with the Barcode field is used as a unique name for the plate for display (and for data submission in the internal version). The well matrix is a series of rows corresponding to the rows of the plate. The first column contains the row letter and later columns contain well values. The matrix must be preceded by a row with an empty first field and column numbers for the plate in later columns. Example:
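To make the layout concrete, here is a minimal sketch of a parser for the tab-delimited variant of the format described above. The field names ("Barcode:", "Plate ID:", etc.) and their precedence come from the text; the function itself is hypothetical illustration, not the program's actual code.

```python
# Hypothetical parser sketch for the tab-delimited plate format described
# above; not the heatmap program's actual code.
def parse_plates(text):
    """Return a list of (barcode, {well: value}) tuples, one per plate."""
    barcode_keys = ("Barcode:", "Barcode", "Plate ID:", "Plate ID")  # "Barcode:" wins
    plates = []
    for block in text.split("\n\n"):             # plates end with blank line(s)
        if not block.strip():
            continue
        meta, wells, columns = {}, {}, []
        for line in block.splitlines():
            fields = line.split("\t")
            if fields[0] == "" and len(fields) > 1:
                columns = fields[1:]             # header row: empty field, then column numbers
            elif columns and fields[0]:
                for col, value in zip(columns, fields[1:]):
                    wells[fields[0] + col] = value   # matrix row: row letter + values
            elif len(fields) >= 2:
                meta.setdefault(fields[0], fields[1])  # metadata name-value pair
        barcode = next((meta[k] for k in barcode_keys if k in meta), None)
        plates.append((barcode, wells))
    return plates
```

Given a file containing both a "Barcode:" and a "Plate ID:" row, the sketch picks the "Barcode:" value, matching the precedence rule stated above.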
Sample Data

There are two sample files available for experimenting with the Heatmap page:
Computation Format

This radio button group selects the computation to be performed on the data before the results are displayed. Regardless of which format is chosen, the raw data from the input file is what will be stored in the database. The computation affects what data is displayed and what data goes into the tab-delimited output file that is created. Cutoff calculations are performed against the computed values.
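As an illustration of how a computed format differs from the raw data, the sketch below applies a per-plate z-score, one common plate computation. This is only an assumed example; the computations actually offered are those listed by the radio buttons on the page.

```python
# Hypothetical example of one common per-plate computation (a z-score);
# the page's actual options are those shown in the radio button group.
import statistics

def z_scores(wells):
    """Map raw {well: value} readings for one plate to z-scores."""
    values = list(wells.values())
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)
    return {well: (value - mean) / stdev for well, value in wells.items()}
```

Note that only the displayed values and the tab-delimited output change; as stated above, the raw readings are what the database stores.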
Highlighting

This collection of radio buttons determines how outliers (defined in other sections) are indicated on the heatmapped plate tables.
Combine raw data from this column ...

If this checkbox is checked, the data column (not plate column) indicated above will be combined, well by well, with the data from a second data column for the same plate. If data is not available for the second plate, it will not be shown. This does not create a new, third column of data in the database, but the results can be retrieved as a tab-delimited file, which DRSC staff can then add to the database as a third column if asked to do so.

The method of combination can best be described by designating two fields: the original field/data column, which is selected from the "type of data being collected" pulldown menu (or named in the next field if this is a new data type), is field F1; the field/data column indicated in the pulldown menu in the combine data area is field F2.
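The well-by-well pairing of F1 and F2 can be sketched as below. The drop rule (wells missing from the second column are not shown) comes from the text; the ratio F1/F2 used here is purely illustrative, since the actual combination methods are those offered on the page.

```python
# Hypothetical sketch of combining two data columns well by well.
# The ratio is an assumed example operation, not the page's defined method.
def combine_columns(f1, f2, op=lambda a, b: a / b):
    """Combine two {well: value} dicts; wells absent from f2 are dropped,
    matching the page's behavior when the second column has no data."""
    return {well: op(value, f2[well]) for well, value in f1.items() if well in f2}
```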
Statistical Normality

Statistical normality vs. normalization: It is important to distinguish between the terms "statistical normality", described here, and "normalization". Determining statistical normality means examining the distribution of a single dataset to see if the results show bias or are skewed in some way. Testing for statistical normality is an essential step in the analysis of HTS datasets. Most statistical tests that are applied to large datasets, including most of those used in the Computation Format section of this page, work under the assumption that the dataset to be tested is at least close to statistically normal. Most datasets contain some bias, which can be compensated for through the use of normalization techniques. "Normalization" refers to the manipulation of data from multiple sets to make data from different plates comparable. The calculations available in the Computation Format section can be classed as normalization techniques.

Statistical tests: We provide tests of the normality of plate values by running the Jarque-Bera and Shapiro-Wilk statistical tests (against all wells, non-edge wells and non-control wells), and providing summary values for the whole dataset. These tests are not performed by default because they increase page display time considerably. The Jarque-Bera score indicates deviation from normality, with a perfect score being 0. The Shapiro-Wilk score indicates degree of normality, with a perfect score being 1. In both cases a probability of normality score (P-value) is also provided. The P-values are generally easier to interpret, with values greater than 5.0x10^-2 being considered good for our 384-well RNAi experiments. P-values below 1.0x10^-5 should be cause for reevaluating the plate or possibly the experiment.

The normality of the data can also be evaluated visually by selecting the "Q-Q Plot" button for a Quantile-Quantile plot of the data against a normal curve. Perfectly normal data should form a straight diagonal line across the graph. The Q-Q Plot page will include the results of the S-W and J-B tests for all wells and all non-edge wells, regardless of whether statistical tests were selected for the main page.
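The two tests named above can be reproduced with SciPy's implementations, as sketched below on simulated 384-well readings. This is an assumed illustration of the scoring conventions described (Jarque-Bera near 0, Shapiro-Wilk near 1, for normal data), not the site's actual code.

```python
# Hypothetical illustration of the Jarque-Bera and Shapiro-Wilk tests
# using SciPy; not the heatmap site's actual implementation.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
plate = rng.normal(loc=1000.0, scale=50.0, size=384)  # simulated 384-well readings

jb = stats.jarque_bera(plate)   # statistic near 0 for normal data
sw = stats.shapiro(plate)       # statistic near 1 for normal data

print(f"Jarque-Bera:  {jb.statistic:.3f} (P = {jb.pvalue:.3g})")
print(f"Shapiro-Wilk: {sw.statistic:.3f} (P = {sw.pvalue:.3g})")
# P-values above 5.0x10^-2 suggest the plate is close to statistically normal.
```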