BioPNG format

From ArrayWiki


Try an upload for yourself!

Use this sample file:

Image:BioPNG Sample File Gaussian.png Gaussian Sample File

How to Interpret What You See

Probe Variance: Dark spots represent probes with high variance. Localized clusters of dark spots indicate data 'artifacts'. Probes in these regions are most likely reporting 'non-biological' processes, probably due to mishandling or manufacturing defects. Even 'clean' chips will contain noisy dark areas.
Artifact Mask: Dark spots represent regions flagged as 'artifacts' by our artifact recognition algorithm. In the cleaned data, each flagged probe value is replaced with the median value of that probe from the other chips in the dataset. Our artifact algorithm is data-conservative: it is biased toward preserving data, even in the presence of diffuse noise.
Original Intensity/Clean Intensity: This image actually contains the data reported by each probe! The image should be primarily blue (blue is the channel that holds the small integer values), with random green pixels and a black to reddish-purple background.
Number of Pixels: This image contains the number of pixels GCOS used to calculate each intensity value from the DAT file. Usually, this value is identical across the whole chip. Because the values usually fall between 16 and 35, this is essentially a black & white image, and the low intensity makes it difficult to see any patterns in the data. We're developing an "enhanced" version that will amplify these differences.
Standard Deviation: This image contains the standard deviation value reported by the GCOS software. The image will be primarily green with a red background, because we have switched the blue and green color channels to distinguish this data from intensity data. Many, but not all, of our artifacts can also be seen in this image. For experiments with chips processed individually or in several batches, this data is of limited use, as GCOS can only calculate STDEV for chips loaded in a single session. This is another reason to retain DAT files for future reference.
Original/Clean Expression: This image contains the calculated expression data, currently computed using the Bioconductor-distributed R implementation of PLIER. The image will be significantly smaller than the intensity image. Its dimensions are calculated as 1+sqrt(number of measurements); for this reason, the image may contain a thin white line on the last row of pixels. Expression measurements are ordered alphabetically by the probe set IDs in the CDF file associated with the chip platform. For custom chips, the layout of the intensity measurements will be preserved. However, the exact X,Y to probe ID mapping must be stored independently of the data.
Gel Plot (Histogram): This image contains the distribution of the log(16) intensity values. Bins are 0.008 wide (the first is centered at 0.004) and range from 0 to 5.6 (700 bins). Each chip in the dataset is a horizontal stripe. The width of each stripe is proportional to the available vertical space (roughly 700 pixels), with a minimum of 3 pixels. Red marks on the histogram indicate counts higher than the maximum allowable value (2^16, or 65536). This usually (but not always) indicates data corruption, as most chips are log-normally distributed. Gel Plots generated before normalization give a clear idea of the spread of the original distributions (and of which chips are likely to have low SNR). Gel Plots generated after normalization and cleaning should have nearly identical distributions, with minor variations due to artifact data replacement.
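The median replacement described under Artifact Mask can be sketched roughly as follows. This is only an illustration, not the actual cleaning code: the function name, the list-of-lists layout, and the choice to take the median over the unmodified values of every other chip (whether or not those are flagged too) are all assumptions.

```python
from statistics import median

def clean_probe_values(chips, masks):
    """Replace flagged probe values with the median from the other chips.

    chips: list of per-chip intensity lists, all in the same probe order.
    masks: list of per-chip boolean lists, True where a probe is flagged.
    Both layouts are hypothetical; the real pipeline may differ.
    """
    cleaned = [list(chip) for chip in chips]
    for c, mask in enumerate(masks):
        for p, flagged in enumerate(mask):
            if not flagged:
                continue
            # Assumption: use the original values of all other chips.
            others = [chips[o][p] for o in range(len(chips)) if o != c]
            if others:
                cleaned[c][p] = median(others)
    return cleaned

chips = [[10.0, 200.0, 30.0], [12.0, 21.0, 33.0], [11.0, 19.0, 900.0]]
masks = [[False, True, False], [False, False, False], [False, False, True]]
print(clean_probe_values(chips, masks))
# [[10.0, 20.0, 30.0], [12.0, 21.0, 33.0], [11.0, 19.0, 31.5]]
```

Unflagged values pass through untouched, which matches the data-conservative intent described above.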
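For the Original/Clean Expression image, the size rule 1+sqrt(number of measurements) can be worked through with a small sketch. Two things here are assumptions rather than documented behavior: that the square root is truncated to an integer, and that measurements fill the image row by row. The function names are hypothetical.

```python
import math

def expression_image_side(n_measurements):
    # Side length in pixels, assuming sqrt is truncated to an integer.
    return 1 + math.isqrt(n_measurements)

def index_to_xy(index, side):
    # Assumption: measurements are laid out row by row (row-major order).
    return index % side, index // side

n = 1000
side = expression_image_side(n)
unused = side * side - n
print(side, unused)             # 32 24
print(index_to_xy(n - 1, side)) # (7, 31)
```

With 1000 measurements the image would be 32 x 32 pixels, leaving 24 unused pixels at the end of the last row, which is consistent with the thin line of empty pixels mentioned above.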
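The Gel Plot binning can be illustrated with a short sketch. It takes values that are already log-transformed, so it makes no claim about the logarithm used. Clamping out-of-range values into the first or last bin is an assumption, as are all the names below.

```python
BIN_WIDTH = 0.008    # bin width stated above
N_BINS = 700         # covers 0 to 5.6
MAX_COUNT = 2 ** 16  # counts above this are the ones marked red

def bin_index(log_value):
    # Bin i spans [i * BIN_WIDTH, (i + 1) * BIN_WIDTH).
    idx = int(log_value / BIN_WIDTH)
    # Assumption: out-of-range values are clamped to the edge bins.
    return min(max(idx, 0), N_BINS - 1)

def gel_plot_row(log_values):
    """Return (counts per bin, indices of bins exceeding MAX_COUNT)."""
    counts = [0] * N_BINS
    for value in log_values:
        counts[bin_index(value)] += 1
    overflow = [i for i, c in enumerate(counts) if c > MAX_COUNT]
    return counts, overflow

counts, overflow = gel_plot_row([0.004, 0.004, 0.004, 1.0])
print(counts[0], sum(counts), overflow)  # 3 4 []
```

Each chip would contribute one such row of counts, drawn as a horizontal stripe.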

Example PHP Code

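// Decode a BioPNG intensity image into tab-delimited text: x, y, intensity.
$file = "sampleBioPNG.png";
$im = imagecreatefrompng($file);
list($xDim, $yDim) = getimagesize($file);
$fileOut = fopen("Intensities.txt", "w");
for ($i = 0; $i < $yDim; $i++) {
	for ($j = 0; $j < $xDim; $j++) {
		$rgb = imagecolorat($im, $j, $i);
		$r = ($rgb >> 16) & 0xFF; // red: digits after the decimal point
		$g = ($rgb >> 8) & 0xFF;  // green: high byte of the integer part
		$b = $rgb & 0xFF;         // blue: low byte of the integer part

		// Combine the two bytes into one 16-bit integer (0 to 65535).
		$integer_part = ($g << 8) | $b;
		$intensity_value = $integer_part . '.' . $r;
		// Double quotes are required so that \t and \n are interpreted.
		fputs($fileOut, $j . "\t" . $i . "\t" . $intensity_value . "\n");
	}
}
fclose($fileOut);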

Example Python Code

Coming Soon!
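Until an official version is posted, here is a minimal, unofficial sketch that mirrors the PHP example. It assumes the same channel layout (green and blue form the integer part, red supplies the digits after the decimal point) and uses the third-party Pillow library to read the image. The function names are illustrative only.

```python
def decode_pixel(r, g, b):
    # Green is the high byte and blue the low byte of the integer part.
    integer_part = (g << 8) | b
    # As in the PHP example, red is appended after the decimal point.
    return f"{integer_part}.{r}"

def write_intensities(png_path, out_path):
    """Write one 'x<TAB>y<TAB>intensity' line per pixel."""
    from PIL import Image  # Pillow (third-party), needed only here

    with Image.open(png_path) as im:
        rgb = im.convert("RGB")
        width, height = rgb.size
        with open(out_path, "w") as out:
            for y in range(height):
                for x in range(width):
                    r, g, b = rgb.getpixel((x, y))
                    out.write(f"{x}\t{y}\t{decode_pixel(r, g, b)}\n")

print(decode_pixel(5, 1, 2))  # 258.5
```

Calling `write_intensities("sampleBioPNG.png", "Intensities.txt")` should produce the same file as the PHP version, provided the channel layout assumption holds.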
