Picky: shRNA design using Picky and Perl programs



RNA interference (RNAi) is a natural phenomenon in cells in which small interfering RNAs (siRNAs) guide the recognition, inhibition, and potential degradation of target messenger RNAs (mRNAs), resulting in the loss of their gene functions. RNAi is an important molecular biology technology for functional genomic studies. It has also been suggested that siRNAs can be used as drugs to stop oncogenes in cancer cells or to fight off viruses. Although there are many different transfection methods to induce siRNAs in cells to knock out specific target genes, the most popular method is to introduce a DNA template that expresses a short hairpin RNA (shRNA). The shRNA is expected to be cleaved by the Dicer enzyme, after which its antisense strand becomes the siRNA that is incorporated into the RNA-induced silencing complex (RISC), which subsequently cleaves the target mRNA. A typical shRNA-expressing template looks like the following:


The color-coded regions on the shRNA template are as follows: siRNA sense strand, siRNA loop area, siRNA antisense strand (the one that actually knocks out genes), sticky vector linkers (use yours), RNA polymerase terminator (use yours), and DNA strand not transcribed.
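The layout of the transcribed strand described above can be sketched in a few lines of Python. This is only an illustration, not the code of the Perl programs discussed on this page. The loop and terminator sequences below are assumed placeholders, and the function names are hypothetical; substitute the loop, linkers, and terminator that match your own vector system.

```python
# Hypothetical placeholders: replace with the elements of your own vector.
LOOP = "TTCAAGAGA"     # assumed loop sequence, for illustration only
TERMINATOR = "TTTTTT"  # assumed terminator (a run of T's), for illustration only

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def shrna_template(sense: str, loop: str = LOOP, terminator: str = TERMINATOR,
                   linker5: str = "", linker3: str = "") -> str:
    """Assemble linker + sense + loop + antisense + terminator + linker.

    The antisense strand is the reverse complement of the sense strand,
    so the transcript can fold back on itself into a hairpin.
    """
    sense = sense.upper()
    antisense = reverse_complement(sense)
    return linker5 + sense + loop + antisense + terminator + linker3
```

Calling `shrna_template(sense)` with a candidate sense sequence returns the full template string; the two linker arguments default to empty because they depend entirely on the cloning vector.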

Many papers in the literature describe how to select shRNAs that are mechanistically favorable for inducing siRNAs in cells, but few of them go beyond sequence-level comparison tools such as BLAST to achieve gene specificity against a whole-genome background. When an siRNA is not gene-specific, two problems can arise: 1) off-target genes can be accidentally silenced, causing side effects and imprecise experimental results; and 2) if some off-target genes are highly expressed housekeeping genes, they can significantly dilute the siRNAs available to silence the actual target genes. Although Picky is microarray design software, its rigorous whole-genome thermodynamic screening can be used to greatly enhance the potency and gene specificity of shRNAs.

In the following, we present two general shRNA design approaches, both of which use Picky to avoid off-target genes. The first approach uses Picky to identify good microarray probes, and then uses a Perl program to find valid siRNAs within the probe target regions of each gene. Since each microarray probe designed by Picky targets only a thermodynamically unique gene region, this design approach prevents off-target genes. Nevertheless, not all probe target regions contain valid siRNA constructs, so not every gene can acquire a dedicated shRNA with this approach. The second approach aims to target a few specific genes by using a Perl program to select all valid siRNA constructs from the whole gene transcripts, and then sending those siRNA candidates to Picky for thermodynamic screening. This approach is exhaustive in the sense that every siRNA that can be found is selected for screening. However, it does not always produce high-quality siRNAs for each target gene. The details of both design approaches are outlined below.

Whole-genome Picky probe design first, and then siRNA selection based on these probes

This is literally a "cherry-picking" approach to siRNA design, because we consider the complete gene transcript set of a species and harvest valid siRNAs that are both high quality and unique to specific target genes. The steps are as follows: 1) Load the complete gene transcript set into Picky; 2) Ask Picky to design short probes for these transcripts; 3) Send the resulting *.picky output file to the Perl program; and 4) Obtain the shRNA design in either HTML table format or plain text FASTA format. In Step 1, it is recommended to load full-length transcripts, including the UTRs, into Picky to increase the chance of finding unique probes. When designing probes in Step 2, the recommended Picky parameter settings are oligo sizes of 25 to 35 bp, 20 probe candidates per gene, 5 probes per gene, and a 10°C minimum temperature separation. These settings increase the chance of finding usable siRNA target regions. Using a much longer oligo size (e.g., over 60 bp) may result in less specific siRNA targeting, because short non-unique regions may still be found even within a long probe target region that is thermodynamically unique as a whole. The output from the program contains all shRNAs that satisfy both the gene specificity determined by the Picky parameters and the shRNA construction requirements described in the literature. Nevertheless, not all genes can acquire targeting shRNAs with this design approach.
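The idea of harvesting siRNA candidates from inside a probe target region can be sketched as a sliding window over that region. The sketch below is not the Perl program itself: the siRNA length of 19, the GC-content bounds, and the rule against runs of four T's are assumed stand-ins for the construction requirements from the literature, and all function names are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an uppercase DNA sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def is_valid_sirna(seq: str, gc_min: float = 0.30, gc_max: float = 0.55) -> bool:
    """Placeholder validity check; the thresholds are assumptions."""
    if "TTTT" in seq:  # assumed rule: avoid a terminator-like run of T's
        return False
    return gc_min <= gc_fraction(seq) <= gc_max


def sirnas_in_probe_region(transcript: str, probe_start: int, probe_len: int,
                           sirna_len: int = 19):
    """List (position, sequence) for valid siRNA windows inside one probe region.

    Positions are 0-based offsets into the transcript.
    """
    region = transcript[probe_start:probe_start + probe_len].upper()
    hits = []
    for offset in range(len(region) - sirna_len + 1):
        candidate = region[offset:offset + sirna_len]
        if is_valid_sirna(candidate):
            hits.append((probe_start + offset, candidate))
    return hits
```

Because every window lies inside a region that Picky already judged to be unique, the validity filter is the only remaining test in this sketch; a probe shorter than the siRNA length simply yields no candidates.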

Exhaustive siRNA selection from specific genes, and then Picky whole-genome screening

If there are a few important target genes that must be silenced, the reverse of the above approach can be used. First, the gene transcripts are sent to the Perl program, which produces short siRNA candidates that are all structurally valid. Next, the unscreened shRNAs are screened using Picky to determine their thermodynamic uniqueness, which is measured by the melting temperature difference between their target and closest nontarget genes. The target melting temperature is determined as if the siRNAs were PCR primers. The estimated melting temperature with the closest nontarget, whose computation is a unique capability of Picky, is determined similarly. Although each melting temperature taken alone has little meaning for RNAi experiments, which are conducted at physiological temperature (e.g., 37°C), the difference between the target and nontarget melting temperatures does provide valuable insight into the specificity of the siRNAs. Therefore, it can be used to determine the gene specificity of each siRNA candidate. The screening output from Picky is then combined with the original unscreened shRNA output by the Perl program, whose final output is similar to that of the first approach and can be in either HTML table format or plain text FASTA format.
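The specificity measure described above is simply the target melting temperature minus the closest nontarget melting temperature. A minimal sketch of ranking candidates by that difference follows. The class and function names are hypothetical, the melting temperatures are assumed to come from a Picky screening run, and the default threshold of 10°C is borrowed from the probe design setting recommended for the first approach rather than a value prescribed for this step.

```python
from dataclasses import dataclass


@dataclass
class ScreenedSirna:
    name: str
    target_tm: float     # melting temperature with the intended target (°C)
    nontarget_tm: float  # estimated Tm with the closest nontarget (°C)

    @property
    def separation(self) -> float:
        """Target Tm minus closest nontarget Tm; larger means more specific."""
        return self.target_tm - self.nontarget_tm


def select_specific(candidates, min_separation: float = 10.0):
    """Keep candidates meeting the separation threshold, most specific first."""
    keep = [c for c in candidates if c.separation >= min_separation]
    return sorted(keep, key=lambda c: c.separation, reverse=True)
```

Only the difference is used for ranking, which matches the point made above that the individual melting temperatures carry little meaning at physiological temperature.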

Two design examples

We present two actual examples here, one for each shRNA design approach. In the first example, the original C. elegans gene set was obtained from WormBase. It was then compacted with a program that removes sequence redundancy and processed with a program that shortens sequences longer than the Picky processing limit of 16,384 bp. The prepared data set was then sent to Picky for microarray probe design using the parameters reported in the Picky report file. The Picky probe output file was then sent to the Perl program with default parameters. The final output is an HTML table that contains detailed information about each shRNA and can be sorted dynamically according to the desired selection preferences.
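The two preparation steps in this example can be illustrated with a short sketch. How the original programs remove redundancy and shorten long sequences is not described here, so the sketch assumes the simplest possible behavior: dropping exact duplicate sequences and cutting long sequences into consecutive pieces. Only the 16,384 bp limit comes from the text; the function names and the `_partN` naming scheme are hypothetical.

```python
PICKY_MAX_LEN = 16_384  # Picky processing limit stated in the text


def remove_exact_duplicates(records):
    """Keep the first (name, sequence) record for each distinct sequence.

    Assumption: redundancy means exact, case-insensitive duplicates.
    """
    seen = set()
    unique = []
    for name, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((name, seq))
    return unique


def split_long(records, max_len: int = PICKY_MAX_LEN):
    """Cut sequences longer than max_len into consecutive pieces.

    Assumption: simple non-overlapping splitting; shorter records pass through.
    """
    out = []
    for name, seq in records:
        if len(seq) <= max_len:
            out.append((name, seq))
            continue
        for part, start in enumerate(range(0, len(seq), max_len), start=1):
            out.append((f"{name}_part{part}", seq[start:start + max_len]))
    return out
```

A non-overlapping split can cut through a potential target region at a piece boundary, so an overlapping split may be preferable in practice; that choice is left open here.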


In the second example, the human hepatitis B virus (HBV) is the target, so its genome was sent to the Perl program to discover all possible siRNAs targeting it. The unscreened shRNAs were then examined by Picky twice: once against the human gene transcript set and once against the mouse gene transcript set. Both gene sets had been processed with a program that shortens extra-long sequences. Because the HBV inhibition experiments were mostly conducted in mice, it is necessary to guarantee the gene specificity of the siRNAs against the entire mouse transcriptome. The eventual goal is to apply the designed shRNAs to cure human diseases, so it is also necessary to screen against the whole human transcriptome. It is this essential thermodynamic whole-genome screening that ensures the high quality and low toxicity of the designed shRNAs. In both Picky screening runs, the HBV genome itself was loaded as a target, and one of the host genomes was loaded as another target set. The host genomes were loaded as targets rather than nontargets so that Picky would identify multiple-targeting shRNAs instead of rejecting them outright. The Picky screening was carried out using the parameters recorded in the Picky report file. The final HTML table was produced by the Perl program, which combines the two Picky output files from screening against the human and mouse genomes.
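Combining the two screening runs amounts to keeping only the siRNAs that hit nothing but the virus in both runs. The sketch below shows that logic on a hypothetical data structure, a dictionary that maps each siRNA name to the set of target-set labels it matched in one run. It does not parse real Picky output files, and the labels and function name are assumptions made for illustration.

```python
def virus_only(human_run: dict, mouse_run: dict, virus_label: str = "HBV"):
    """Return names of siRNAs whose hits in both runs are limited to the virus.

    Each argument maps an siRNA name to the set of target-set labels it
    matched in that run (hypothetical structure, not the Picky file format).
    An siRNA missing from either run is not kept.
    """
    keep = []
    for name in sorted(human_run.keys() & mouse_run.keys()):
        hits = human_run[name] | mouse_run[name]
        if hits == {virus_label}:
            keep.append(name)
    return keep
```

Loading the host set as a second target set is what makes this possible: a multiple-targeting siRNA shows up with a host label attached, so it can be identified and filtered at this stage.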

Additional technical information

More detailed algorithm descriptions are included in the three Perl programs. The generated dynamically sortable HTML tables require a set of auxiliary files to function correctly. These files should be placed in the same web server directory from which the HTML files produced by the Perl programs are to be viewed.


This work was supported in part by the National Science Foundation grant DBI0850195. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Last modified June 13, 2008. All rights reserved.
