Tripal BLAST Module Description
This module provides a basic interface to allow your users to utilize your server's NCBI BLAST+.
—
Setup Instructions
  - 
    Install NCBI BLAST+ on your server (Tested with 2.2.26+). There is a 
    package available 
    for Ubuntu to ease installation. Optionally you can set the path to your 
    BLAST executable 
    in the settings.
  
- 
    Optionally, create Tripal External Database References to link the records
    in your BLAST database to further information. To do this, go to
    Tripal > Chado Modules > Databases > Add DB and fill in the Database
    prefix, which is concatenated with the record IDs in your BLAST database
    to determine the link-out to additional information. Note that a regular
    expression can be used when creating the BLAST database to indicate what the
    ID is.
  
- 
    Create "Blast Database" 
    nodes for each dataset you want to make available for your users to BLAST 
    against. BLAST databases should first be created using the command-line 
    makeblastdb program with the -parse_seqids flag.
- 
    It's recommended that you also install the Tripal Job Daemon 
    to manage BLAST jobs and ensure they are run soon after being submitted by the 
    user. Without this additional module, administrators will have to execute the 
    Tripal jobs either manually or through cron jobs.
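
The database-creation step above can be sketched as a small Python helper that assembles a makeblastdb command line including -parse_seqids. The helper name and the file names are hypothetical and only for illustration. The options used (-in, -dbtype, -parse_seqids, -out, -title) are standard makeblastdb options. The sketch only builds the command and does not run it.

```python
import shlex


def makeblastdb_command(fasta_path, db_type, out_prefix, title=None):
    """Return the argument list for a makeblastdb call that parses sequence IDs.

    db_type must be "nucl" or "prot", the two values makeblastdb accepts
    for -dbtype.
    """
    if db_type not in ("nucl", "prot"):
        raise ValueError("db_type must be 'nucl' or 'prot'")
    cmd = [
        "makeblastdb",
        "-in", fasta_path,
        "-dbtype", db_type,
        "-parse_seqids",  # the flag this module's setup instructions ask for
        "-out", out_prefix,
    ]
    if title:
        cmd += ["-title", title]
    return cmd


# Hypothetical input and output names, for illustration only.
cmd = makeblastdb_command("cajca.fasta", "nucl", "cajca_genome")
print(shlex.join(cmd))
```

To actually create the database, the returned list could be passed to subprocess.run(cmd, check=True) on a server where BLAST+ is installed.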
  
—
Highlighted Functionality
  - Supports blastn, 
    blastx, 
    blastp and 
    tblastx with separate forms depending upon the database/query type.
  
- 
    Simple interface allowing users to paste or upload a query sequence and then 
    select from available databases. Additionally, a FASTA file can be uploaded 
    for use as a database to BLAST against (this functionality can be disabled).
  
- 
    Tabular Results listing with alignment information and multiple download 
    formats (HTML, TSV, XML) available.
  
- 
    Completely integrated with Tripal Jobs 
    providing administrators with a way to track BLAST jobs and ensuring long 
    running BLASTs will not cause page time-outs.
  
- 
    BLAST databases are made available to the module by 
    creating Drupal Pages 
    describing them. This allows administrators to 
    use the Drupal Field API to add any information they want to these pages.
  
- 
    BLAST database records can be linked to an external source with more 
    information (e.g., NCBI) per BLAST database.
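
As a minimal sketch of the link-out idea, assuming the link is simply the database prefix followed by the record ID (as described in the setup instructions), the Python below extracts an ID from a FASTA header with a regular expression and builds the URL. The function names, the example.org prefix, the header, and the pattern are all hypothetical. The module's own implementation may differ.

```python
import re


def extract_record_id(header, id_pattern):
    """Return the first capture group of id_pattern in a FASTA header, or None."""
    match = re.search(id_pattern, header)
    return match.group(1) if match else None


def linkout_url(db_prefix, record_id):
    """Concatenate the database prefix with the record ID to form the link-out."""
    return db_prefix + record_id


# Hypothetical values, for illustration only.
header = ">ABC123.1 example sequence"
id_pattern = r"^>(\S+)"          # ID = everything up to the first whitespace
db_prefix = "https://example.org/record/"

record_id = extract_record_id(header, id_pattern)
url = linkout_url(db_prefix, record_id)
print(url)
```

A pattern that does not match returns None from extract_record_id, so a caller would need to skip the link for that record.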
  
—
Protection Against Large Jobs
Depending on the size and nature of your target databases, you may wish to constrain use 
of this module.
  - Limit the number of results displayed via the admin page. The recommended number is 500.
- 
    Limit the maximum upload file size in the PHP settings. This is less useful because some 
    very large queries may be manageable, and others not.
  
- 
    Repeat-mask your targets, or provide repeat-masked versions. Note that some 
    researchers may be looking for repeats, so this may limit the usefulness of the BLAST 
    service.
  
—
Whole Genome Visualization
This module can be configured to use 
CViTjs to display BLAST hits on 
a genome image. To configure this module to use CViTjs:
  - 
    Download CViTjs and copy
    the code to your webserver. It might make the most sense to put the code directly into
    this module's directory, in a subdirectory named js.
  
- 
    Enable CViTjs from the BLAST module administration page and provide the path to the
    root directory for the CViTjs code relative to this module. For example, js/cvitjs.
  
- 
    CViTjs will have a config file in its root directory named cvit.conf. This file 
    provides information for whole genome visualization for each genome BLAST target.
    Make sure the config file can be edited by your web server.
  
- 
    Edit the configuration file to define each genome target. These will look like:
    
[data.Cajanus cajan - genome]
conf = data/cajca/cajca.conf
defaultData = data/cajca/cajca.gff

    Where:
    - the section name, "data.Cajanus cajan - genome", consists of "data." followed
      by the name of the BLAST target node,
    - the file "cajca.conf" is a cvit configuration file which describes how to draw the
      chromosomes and BLAST hits on the Cajanus cajan genome,
    - and the file "cajca.gff" is a GFF3 file that describes the Cajanus cajan
      chromosomes.
 You will have to put the target-specific conf and gff files (e.g. cajca.conf and 
    cajca.gff) on your web server, in the directory js/cvitjs/data. You may choose
    to group files for each genome into subdirectories, for example, 
    js/cvitjs/data/cajca.
 
 At the top of the configuration file there must be a [general] section that defines
    the default data set. For example:
[general]
data_default = data.Cajanus cajan - genome 
- 
    Edit the nodes for each genome target (nodes of type "Blast Database") and enable whole 
    genome visualization. Remember that the names listed in the CViTjs config file must 
    match the BLAST node name. In the example above, the BLAST database node for the
    Cajanus cajan genome assembly is named "Cajanus cajan - genome".
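
The configuration steps above can be sketched with Python's standard configparser module, which can generate the [general] section and one section per genome target so that the section names stay in sync with the BLAST node names. The helper names are hypothetical, and the keys are taken from the example above. Whether CViTjs accepts the output exactly as written is an assumption, so compare the result with the cvit.conf shipped with your copy.

```python
import configparser
import io


def section_name(node_name):
    """Section names are "data." followed by the BLAST target node name."""
    return "data." + node_name


def build_cvit_conf(default_node, targets):
    """Return config text with a [general] section and one section per target.

    targets maps a BLAST node name to a (conf_path, gff_path) tuple.
    """
    parser = configparser.ConfigParser()
    # configparser lowercases keys by default; keep "defaultData" as written.
    parser.optionxform = str
    parser["general"] = {"data_default": section_name(default_node)}
    for node_name, (conf_path, gff_path) in targets.items():
        parser[section_name(node_name)] = {
            "conf": conf_path,
            "defaultData": gff_path,
        }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


text = build_cvit_conf(
    "Cajanus cajan - genome",
    {"Cajanus cajan - genome": ("data/cajca/cajca.conf", "data/cajca/cajca.gff")},
)
print(text)
```

Deriving the section name from the node name in one place avoids the mismatch between the config file and the BLAST node name that the last step warns about.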