Vect Reference Manual

Convert Data Panel (continued)

2. Convert Data Panel: Rules

The Convert Data Panel creates only rule-dependent rules, that is, rules that operate on the data produced by another rule. The panel currently offers the following rule types:

1) Concatenate Data Rule
2) Extract Quoted Data Rule
3) Filter Data Rule
4) Pick Data Rule
5) Merge Data Rule
6) Convert/Reverse Data Rule
7) Translate Data Rule
8) Extract Substrings Rule
9) Write your own Perl Script

To apply a rule to a data set, click the 'Insert' icon in the toolbar and select the rule type you wish to apply. Give the new rule a descriptive name, then choose the existing rule it should operate on by clicking the grey highlighted question marks.

If a rule has an incomplete or incorrect definition, it appears in red in the left panel, and the erroneous parameters are also shown in red. Select the grey boxes containing the incomplete red text to supply the missing information. Change the other grey highlighted boxes as needed to extract only the data you want.

1) Concatenate Data Rule

This rule allows users to connect data in multiple lines into one-line strings based on the level of the data.  Users can also choose to add punctuation or other text to visually separate concatenated lines. 

There are seven levels, numbered 0 to 6. Level 6 is the deepest level (no concatenation at all), and level 0 concatenates everything globally into a single string (one line). The levels in between concatenate according to the level at which the data sits, which is determined by the block selection rules.

If, for example, all data pieces belonging to a DNA sequence are labeled as (3) or above in the original data set, then concatenating to level 3 results in a single DNA string.

The number of stars shown for your rule in the left panel tells you how many separate DNA sequences there are, even when they are all at the same level. It is a good idea to practice extracting data at various levels and concatenating it at various levels to see how the rule works.
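
As a rough illustration of level-based concatenation, the Python sketch below groups text pieces by a block path and joins each group. The (path, text) representation, the function name, and the reading of "level" as the length of the shared path prefix are assumptions made for this example; they are not taken from Vect itself. Paths are kept to four levels for brevity.

```python
from itertools import groupby

def concatenate(pieces, level, separator=""):
    """Join consecutive text pieces that share the same block path down to `level`.

    `pieces` is a list of (path, text) pairs in document order, where `path`
    is a tuple of block indices, one per nesting level (hypothetical model).
    Level 0 joins everything; a level as deep as the paths joins nothing.
    """
    def key(piece):
        return piece[0][:level]
    return [separator.join(text for _, text in group)
            for _, group in groupby(pieces, key=key)]

# Two DNA sequences, each split over two lines inside its own level-3 block.
pieces = [
    ((0, 0, 0, 0), "ATGC"),
    ((0, 0, 0, 1), "GGTA"),
    ((0, 0, 1, 0), "TTAA"),
    ((0, 0, 1, 1), "CCGG"),
]

print(concatenate(pieces, 3))  # ['ATGCGGTA', 'TTAACCGG']
print(concatenate(pieces, 0))  # ['ATGCGGTATTAACCGG']
print(concatenate(pieces, 4))  # ['ATGC', 'GGTA', 'TTAA', 'CCGG']
```

Passing a separator such as "-" or ", " corresponds to adding punctuation between the concatenated lines.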

2) Extract Quoted Data Rule
The Extract Quoted Data Rule makes it easy to extract phrases that sit between, or next to, punctuation marks or other common characters. A common use is extracting the data between quotation marks. The user specifies the marks that enclose the wanted data, and Vect extracts it.
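
The idea can be sketched in Python with a regular expression. The function name and the sample line are hypothetical, and this is only one plausible way to implement such an extraction, not Vect's own code.

```python
import re

def extract_between(text, open_mark='"', close_mark='"'):
    """Return every substring found between open_mark and close_mark."""
    # re.escape lets marks such as ( or [ be used literally.
    pattern = re.escape(open_mark) + r"(.*?)" + re.escape(close_mark)
    return re.findall(pattern, text)

line = '/product="DNA polymerase" /note="putative (partial)"'
print(extract_between(line))            # ['DNA polymerase', 'putative (partial)']
print(extract_between(line, "(", ")"))  # ['partial']
```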

3) Filter Data Rule
The Filter Data Rule extracts only characters of a specific kind, such as integers, words, uppercase characters, lowercase characters, or alphanumeric characters. A pull-down menu lets users pick which kind of characters they want.
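
A minimal Python sketch of this kind of filtering follows. The filter names and the regular expressions behind them are assumptions for illustration; the exact definitions used by Vect's menu may differ.

```python
import re

# Hypothetical filter names mapped to regular expressions.
FILTERS = {
    "integers": r"\d+",
    "words": r"[A-Za-z]+",
    "uppercase": r"[A-Z]+",
    "lowercase": r"[a-z]+",
    "alphanumeric": r"[A-Za-z0-9]+",
}

def filter_data(text, kind):
    """Return all runs of characters in `text` that match the chosen filter."""
    return re.findall(FILTERS[kind], text)

line = "LOCUS AB000263 368 bp mRNA"
print(filter_data(line, "integers"))      # ['000263', '368']
print(filter_data(line, "uppercase"))     # ['LOCUS', 'AB', 'RNA']
print(filter_data(line, "alphanumeric"))  # ['LOCUS', 'AB000263', '368', 'bp', 'mRNA']
```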

4) Pick Data Rule
The Pick Data Rule lets users choose which blocks to extract. They can pick every other block, every third block, and so on, or define their own selection.
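
In Python terms, picking blocks resembles list slicing and indexing. The function names below are hypothetical, and the choice to start counting from the first block is an assumption.

```python
def pick_every(blocks, step, start=0):
    """Keep every `step`-th block, beginning with the block at index `start`."""
    return blocks[start::step]

def pick_indices(blocks, indices):
    """Keep only the blocks at the given zero-based positions (a custom selection)."""
    return [blocks[i] for i in indices]

blocks = ["b1", "b2", "b3", "b4", "b5", "b6"]
print(pick_every(blocks, 2))         # ['b1', 'b3', 'b5']
print(pick_every(blocks, 3))         # ['b1', 'b4']
print(pick_indices(blocks, [0, 5]))  # ['b1', 'b6']
```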

5) Merge Data Rule
The Merge Data Rule lets users combine data from two different rules into one document. A good example is attaching a body to its title: the user selects all titles in one rule and all bodies in another rule, then uses the Merge Data Rule to combine them.
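
The title-and-body example might look like this in Python. Pairing the two rule outputs by position is an assumption made for this sketch, as are the function name and the sample data.

```python
def merge(first, second, separator="\n"):
    """Pair the outputs of two rules by position and join each pair.

    zip() stops at the shorter list, so unmatched trailing items are dropped.
    """
    return [a + separator + b for a, b in zip(first, second)]

titles = ["Title A", "Title B"]
bodies = ["Body of A.", "Body of B."]
print(merge(titles, bodies, separator=": "))  # ['Title A: Body of A.', 'Title B: Body of B.']
```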

6) Convert/Reverse Data Rule
The Convert/Reverse Data Rule reverses a sequence of data or, in the case of DNA, converts it to its complement. Users can easily obtain the complementary strand of the DNA string they are editing, or reverse the string so that it reads from the 3' end to the 5' end.
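
Both operations are simple string transformations, sketched here in Python. The function names are hypothetical, and only the four unambiguous bases A, C, G and T are handled.

```python
# Translation table for the standard base pairs A-T and C-G (both cases).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def complement(dna):
    """Return the complementary strand, read in the same direction."""
    return dna.translate(COMPLEMENT)

def reverse(dna):
    """Return the same strand read in the opposite direction."""
    return dna[::-1]

def reverse_complement(dna):
    """Return the complementary strand read in the opposite direction."""
    return complement(dna)[::-1]

seq = "ATGGCC"
print(complement(seq))          # TACCGG
print(reverse(seq))             # CCGGTA
print(reverse_complement(seq))  # GGCCAT
```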

7) Translate Data Rule
The Translate Data Rule converts DNA to RNA or RNA to DNA. A pull-down menu lets users pick which type of translation is needed.
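
One common way to express this conversion is to swap thymine (T) and uracil (U), as in the Python sketch below. Whether Vect converts sequences exactly this way is an assumption, and the function names are hypothetical.

```python
def dna_to_rna(dna):
    """Replace thymine with uracil, preserving case."""
    return dna.translate(str.maketrans("Tt", "Uu"))

def rna_to_dna(rna):
    """Replace uracil with thymine, preserving case."""
    return rna.translate(str.maketrans("Uu", "Tt"))

print(dna_to_rna("ATGGCCTTA"))  # AUGGCCUUA
print(rna_to_dna("AUGGCCUUA"))  # ATGGCCTTA
```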

8) Extract Substrings Rule
The Extract Substrings Rule lets Vect use information produced by a different rule to extract data. For example, users can tell Vect which rule holds the coordinates for a set of data, and Vect uses those coordinates to grab the corresponding substrings.
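
Coordinate-based extraction can be sketched in Python as follows. The 1-based, inclusive coordinate convention is an assumption chosen for this example, and the function name and sample data are hypothetical.

```python
def extract_substrings(sequence, coordinates):
    """Return the substring for each (start, end) pair.

    Coordinates are assumed to be 1-based and inclusive at both ends.
    """
    return [sequence[start - 1:end] for start, end in coordinates]

sequence = "ATGGCCTTAAGG"
coordinates = [(1, 3), (7, 10)]  # e.g. produced by another rule
print(extract_substrings(sequence, coordinates))  # ['ATG', 'TTAA']
```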

9) Write your own Perl Script
The Perl Script Rule lets users write their own rule when none of the rules provided does what they need. Users who are not familiar with Perl syntax should consult Perl documentation on how to write scripts.

The 'Output Data' panel is where a template can be built to organize the data collected during extraction. Use the 'Copy' button in the 'Convert Data' panel to copy all wanted rules over.

Each rule appears in angle brackets (< >). These rules should not be edited at this point unless the user understands how to edit them. Users may add their own text and formatting by positioning it where needed. Selecting the 'Output' icon shows how the example will look in that format.

To make more changes, simply go back to the 'Template' view. Users may also add text to the 'Preamble Output' and 'Appendix Output' sections, which appear only above and below the body, respectively.
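
The template idea can be illustrated with a small Python sketch in which each bracketed rule name is replaced by that rule's data. Filling the template once per record, the record layout, and all names below are assumptions for this example and do not describe how Vect renders output internally.

```python
import re

def render(template, records, preamble="", appendix=""):
    """Fill the template once per record, then wrap the body.

    Each <name> in the template is replaced by record["name"].
    """
    def fill(record):
        return re.sub(r"<([^<>]+)>", lambda m: record[m.group(1)], template)
    body = "".join(fill(record) for record in records)
    return preamble + body + appendix

template = "Name: <name>\nDNA: <dna>\n"
records = [
    {"name": "seqA", "dna": "ATG"},
    {"name": "seqB", "dna": "CCG"},
]
print(render(template, records, preamble="# start\n", appendix="# end\n"))
```

The preamble and appendix are emitted once each, before and after the repeated body.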


Last modified June 13, 2008. All rights reserved.
