
Differential Scanning Fluorimetry (Tm): tm_calc.pl

The Perl script tm_calc.pl reformats and analyzes Tm curves. This document describes the script's requirements, input format, command line options, processing, and output format. The basic command,

    > perl tm_calc.pl path/filename.csv

writes the XML output file path/filename.tmc.csv, which contains the original data and values, such as Tm, derived from a basic analysis of the major transition. The derived values are also written as a human-readable comma-separated value (CSV) table near the top of the file, which can be viewed in a spreadsheet program such as Excel. The command

    > perl tm_calc.pl path/filename.csv -f

fits one or more transition models to the data for each well, and includes derived values for those models in the output file. The command

    > perl tm_calc.pl path/filename.csv -w

does the same as -f and also produces a directory structure containing web pages and plots of the observed data and models:

    path/filename_fit/
    path/filename_fit/TM_Summary.html
    path/filename_fit/TM_Details.html
    path/filename_fit/1/                     (for well 1)
    path/filename_fit/1/filename_1_1.png     (well 1, transition 1)
    ...

Use the -h option or see below for more documentation of the command line options. PROSPERO uses the first form of the command for initial input and the second form for complete analysis after well selection. The third form is useful for viewing plots when you run tm_calc.pl on your own computer.

Requirements for running the script

Input Format

The script can read a variety of input formats:

Files in other formats must be converted first. For example, if the temperatures are in one row and each following row holds a sample header and its fluorescence values, use Excel's Edit > Paste Special > Transpose to put the data into columns instead of rows.
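
The same transposition can be done without a spreadsheet. The following is a minimal Python sketch, not part of tm_calc.pl; the function name transpose_csv and the sample data are hypothetical, and it assumes a simple delimited file with the temperatures in the first row.

```python
import csv
import io

def transpose_csv(text, delimiter=","):
    """Swap the rows and columns of a delimited table."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]
    width = max(len(row) for row in rows)
    # Pad short rows so that zip() does not silently drop trailing cells.
    padded = [row + [""] * (width - len(row)) for row in rows]
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerows(zip(*padded))
    return out.getvalue()

# Hypothetical row-oriented input: temperatures in the first row,
# then one row per sample (heading followed by fluorescence values).
row_oriented = "Temp,25,26,27\nA1,100,110,130\nA2,90,95,105\n"
column_oriented = transpose_csv(row_oriented)
print(column_oriented)
```

The result has a "Temp" column followed by one column per sample, which is the column-oriented layout described above.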

Command Line Options

Running tm_calc.pl with no input, or with the -h or --help option, prints a help message that summarizes the options and their defaults. See below for more details on the options for input, output, sample selection, fitting and plotting, fit parameters, and tracking.


USAGE:
perl tm_calc.pl <in_file> | -  [-o (- | <out_file>)]  [-c <columns>] [-r|--raw]
  [-w  | --web]  |  [(-f | --fitdir) [<fit_dir>]]  [(-p | --plot) [<plot_type>]]
  [(-m | --mindelta) <min_delta> ]  [-x | --exp]  [(-s | --smooth) <window>]
  [-t | --track]  [-v | --verbose]  [-d | --debug]  [--version]  [-h | --help]

>>>> You must supply either <in_file> or '-' (dash) meaning STANDARD IN. <<<<<

	Items in <angle brackets> should be replaced by the real thing,
	Items in [square brackets] are optional. "a | b" means a or b.
	-h or --help or no input specifier prints this text.
Output: XML including a csv table of derived values including Tm as T at 
max. slope and Tm as mean of T's at half max. slope, followed by raw data
wrapped in XML tags: <X>X1,X2...</X> and <SAMPLE>Y1,Y2...</SAMPLE>. 
Flags are:

-r or --raw 	Include raw data as standardized csv in the XML output.
-f or --fit 	Fit 1 or more Boltzmann curves to data using gnuplot
-p or --plot	Plot raw data, total fit, each transition and derivatives;
            	-p png puts Tm_Details.html & Tm_Summary.html in <fit_dir>.
-w or --web 	shortcut for '-f <in_name>_fit -p png' where <in_file> is
            	<in_name>.csv; puts web pages in subfolder next to <in_file>.
-x or --exp 	Use exponential decay background in curve fitting.
-t or --track  	Write the sample number and name (column heading) to STD ERR
-v or --verbose	Write some intermediate values to STD OUT
-d or --debug  	Write copious intermediate values to STD OUT
All flags are OFF by default.
Default values are:
<out_file> ... 	<in_file>.tmc.csv if <in_file> is given, or STANDARD OUTPUT
	        	    (e.g. screen) if no <in_file> is given.
<columns>  ... 	Use all sample columns.
<fit_dir>  ... 	Use a temporary fit directory e.g. /tmp/<in_file>/<column>;
               	    do NOT read existing fits or write to a named directory.
<min_delta>    	0.02 = fraction of total intensity change for smallest peak.
<window>   ...	3 = size in degrees of smoothing window	(15 points for OM RTPCR)
<plot_type>    	ps for Postscript; the only other tested option is png.
               	   Only works if -f is used

Use -w to fit and make web pages with png plots in <in_file>_fit or ./fit,
OR use -f <fit_dir> -p png to fit and make web pages in <fit_dir>.  Pages are
Tm_Summary.html with 1 plot per sample and Tm_Details.html, 3 per transition+.

If given, <columns> can be a comma-separated list of numbers, wells, or ranges,
e.g."1-3,B3-5,C10-d02". The first sample column after "Temp" is number 1
Wells match the letter and number ignoring case and leading 0: "A2" = "a02".
Ranges are number-number or well-well where well ranges go from the first
to the last column as found in the file, if both first and last are found.
	NOTE: Lists must be either WITHOUT_SPACES or "enclosed in quotes."
	Buffers with spaces and commas are not (yet) recognized.
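
The <columns> rules above can be illustrated with a short Python sketch. It is not the script's own code: the names normalize_well and select_columns are hypothetical, the column headings are assumed to be plain well names, and a range such as "B3-5" is assumed to mean B3 through B5.

```python
import re

def normalize_well(name):
    """Return 'A2' for 'a02', or None if name does not look like a well."""
    m = re.fullmatch(r"\s*([A-Za-z])0*(\d+)\s*", name)
    return f"{m.group(1).upper()}{m.group(2)}" if m else None

def select_columns(spec, headers):
    """Expand a column spec into 1-based sample column numbers.

    headers holds the sample column headings in file order, without "Temp".
    """
    wells = {}
    for number, heading in enumerate(headers, start=1):
        well = normalize_well(heading)
        if well is not None:
            wells.setdefault(well, number)

    def resolve(token, row_letter=None):
        if token.isdigit():
            if row_letter is None:
                return int(token)           # plain column number
            token = row_letter + token      # assumed: "B3-5" ends at B5
        return wells.get(normalize_well(token))

    selected = []
    for item in spec.split(","):
        first, _, last = (part.strip() for part in item.partition("-"))
        start = resolve(first)
        letter = None if first.isdigit() else first[:1]
        end = resolve(last, letter) if last else start
        if start is None or end is None:
            continue  # an endpoint was not found in the file: skip the item
        # Well ranges follow the column order found in the file.
        selected.extend(range(start, end + 1))
    return selected

headers = ["A1", "A2", "A3", "B3", "B4", "B5", "C10", "C11", "D1", "D2"]
print(select_columns("1-3,B3-5,C10-d02", headers))
```

With these ten hypothetical headings, the spec from the help text selects every column, because "C10-d02" runs from the column headed C10 to the column headed D2 in file order.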

Input file specification

Output file specification

Sample Selection: Columns

Fitting and Plotting: Web, Fitdir and Plot

Curve Fitting Parameters: Minimum Delta and Exponential Background

Tracking and other options


Data Processing

The basic steps carried out by the script are:
  1. Parse the input file: find the delimiter, the header line, and the columns.
  2. Preliminary analysis: for each column, find the maximum slope and half-max slope points.
  3. Fit one transition using maximum slope and half-max width as initial estimates.
  4. Subtract the model from the data and look for max slope and width of the residual.
  5. If the max slope is still high enough above zero, use it to fit another transition, then repeat the previous step.
  6. Report the resulting model parameters.
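
Step 2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the script's implementation: the function name preliminary_tm is hypothetical, the slope is taken by central differences, the transition is assumed to be an increase in signal, and the smoothing controlled by -s is omitted.

```python
import math

def preliminary_tm(temps, signal):
    """Estimate Tm from the steepest point of a melting curve."""
    # Central-difference slope at each interior point.
    slopes = [
        (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        for i in range(1, len(temps) - 1)
    ]
    mid_temps = temps[1:-1]
    peak = max(range(len(slopes)), key=slopes.__getitem__)
    half = slopes[peak] / 2.0

    def crossing(step):
        """Walk away from the peak until the slope drops to half its maximum."""
        i = peak
        while 0 <= i + step < len(slopes) and slopes[i + step] > half:
            i += step
        j = i + step
        if not 0 <= j < len(slopes):
            return mid_temps[i]  # the curve ends before the half-max point
        # Linear interpolation between the points on either side of half-max.
        frac = (slopes[i] - half) / (slopes[i] - slopes[j])
        return mid_temps[i] + frac * (mid_temps[j] - mid_temps[i])

    low, high = crossing(-1), crossing(+1)
    return {
        "tm_max_slope": mid_temps[peak],     # Tm as T at max. slope
        "tm_half_max": (low + high) / 2.0,   # Tm as mean of T's at half max. slope
        "width": high - low,                 # half-max width
    }

# Synthetic single transition centered at 50 degrees (made-up data).
temps = list(range(25, 76))
signal = [1.0 / (1.0 + math.exp((50.0 - t) / 2.0)) for t in temps]
result = preliminary_tm(temps, signal)
print(result)
```

For a symmetric curve the two Tm estimates coincide; on real data they can differ when the transition is skewed or overlaps another one.
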
In more detail:

Parsing

The script first determines whether the input is a .tmc.csv file that it produced itself. If so, it reads the XML fields with special processing that puts the data into the same form that regular processing would. If not:

Preliminary Analysis

Basic analysis of the major transition is always done, whether or not fitting is used. The steps are:

Curve Fitting

When the -f or -w option is used, tm_calc.pl performs iterative Levenberg-Marquardt curve fitting via gnuplot:
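
As a rough illustration of the model being fitted, the Python sketch below uses one common parameterization of a Boltzmann sigmoid and shows how the maximum slope and half-max width from the preliminary analysis could be turned into starting values. The exact model, parameter names, and background terms that tm_calc.pl passes to gnuplot are not given here, so the functions boltzmann and initial_estimates and their parameters are assumptions.

```python
import math

def boltzmann(t, base, amplitude, tm, s):
    """One sigmoidal transition rising from base to base + amplitude around tm."""
    return base + amplitude / (1.0 + math.exp((tm - t) / s))

def initial_estimates(tm, max_slope, half_max_width):
    """Map slope-peak measurements onto the parameters of boltzmann()."""
    # The slope of boltzmann() is amplitude / (4 * s) at tm and falls to half
    # of that at tm +/- 2 * acosh(sqrt(2)) * s, so the full width at half-max
    # slope is about 3.525 * s.
    s = half_max_width / (4.0 * math.acosh(math.sqrt(2.0)))
    amplitude = 4.0 * s * max_slope
    return {"tm": tm, "s": s, "amplitude": amplitude}

# Made-up measurements for a transition with tm = 50, s = 2, amplitude = 1.
est = initial_estimates(tm=50.0, max_slope=0.125, half_max_width=7.05)
print(est)
```

Starting values like these only need to be close enough for the iterative fit to converge; the fit itself then refines them.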

Output Format: tmc.csv and html files

The main output, <outfile>.tmc.csv, is an XML file containing one or more XML-wrapped comma-separated value tables for human readability. Curve fitting also produces directories of intermediate files and, if requested, plots and HTML files. This document covers the main output file in detail and briefly describes the other files.

See an example of the XML output from the command:
    perl tm_calc.pl DSF_sample_data.csv -f -r
which uses the sample input file shown above, DSF_sample_data.csv.

The XML file has the following hierarchy of TAGS and attributes, with brief descriptions or links:

The values of these tags, where not obvious, are: