Manual
Introduction
CRISPETa is a flexible tool to design optimal pairs of sgRNAs for deletion of desired genomic regions. These target regions can be supplied in BED or UCSC format. CRISPETa can be run on any number of targets - from one to thousands. At present, designs can be performed for 5 species' genomes: human, mouse, zebrafish, fruitfly, worm (if your favourite species is not listed here, drop us an email and we'll try to help).
The only required inputs for a design are:
- Target coordinate(s)
- Species
CRISPETa designs can be controlled by a range of other parameters (see below), all of which take default values if not set.
The following are the outputs of every run:
- DESIGN FILE: n ranked pairs of sgRNAs for every target region (n can be modified by the user).
- DESIGN BED: A BED file of designed sgRNAs ready to be uploaded and visualised in the UCSC Genome Browser.
- SETTINGS: A summary of the run settings.
- STATISTICS: Statistics and summary of the design performance.
Steps
- Supply the target coordinates. This can be done in two different ways:
a. Upload a file from local disk. This file must contain one target coordinate on each line in any of the supported formats.
b. Copy and paste target coordinates directly into the text area. Be sure that coordinates apply to the correct genome assembly version!
- Modify the settings, as desired (otherwise, defaults are supplied)
- Submit
- Wait - designs take a couple of minutes, depending on the number of targets.
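Before submitting, it can help to check that every input line is a well-formed coordinate. The following is a minimal sketch of such a check, not CRISPETa's own parser: the function name `parse_target` and the accepted patterns (a UCSC-style `chrom:start-end` string, or whitespace-separated BED fields with an optional name in the fourth column) are assumptions made for illustration. Coordinates are returned as written; note that BED starts are 0-based while UCSC position strings are 1-based, and how CRISPETa reconciles the two is not covered here.

```python
import re

# UCSC-style position, e.g. "chr1:1,000-2,000" (thousands separators allowed).
UCSC_RE = re.compile(r"^(\S+):([\d,]+)-([\d,]+)$")


def parse_target(line):
    """Return (chrom, start, end, name) for one BED or UCSC-style line.

    Hypothetical helper for pre-checking input; name is None when absent.
    """
    line = line.strip()
    match = UCSC_RE.match(line)
    if match:
        chrom, start, end = match.groups()
        return chrom, int(start.replace(",", "")), int(end.replace(",", "")), None

    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"not a BED or UCSC coordinate: {line!r}")
    name = fields[3] if len(fields) > 3 else None
    return fields[0], int(fields[1]), int(fields[2]), name


# Illustrative coordinates only (not taken from a real design).
targets = [parse_target(l) for l in ["chr1:1000-2000", "chr1\t1000\t2000\tmyTarget"]]
```

Lines that fail to parse raise a `ValueError`, so a malformed file is caught locally rather than after submission.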
Options
- Input: Target regions in BED or UCSC format pasted in input box or uploaded from file. Be sure that coordinates apply to the correct genome assembly version!
- Genome: Genome of the desired species (CRISPETa supports the following genome assemblies: Human (hg19), Mouse (mm10), Zebrafish (danRer10), Drosophila (dm6) and C. elegans (ce11))
- Off-targets: Number of off-target matches genome-wide allowed with each number of mismatches, from 0 (i.e. identical protospacers) up to 4 (sequences differing at 4 positions). "-" can be used to indicate no limit.
- Number of sgRNAs per target: Maximum number of pairs to be returned per target region.
- Diversity: The maximum fraction of returned pairs per target that contain the same sgRNA. For example, a diversity value of 0.5 will allow a single sgRNA to appear in at most half of the sgRNA pairs for one target.
- Upstream exclude region (bp): Length of the upstream region adjacent to the target that is excluded from the sgRNA search.
- Downstream exclude region (bp): Length of the downstream region adjacent to the target that is excluded from the sgRNA search.
- Upstream design region (bp): Length of the upstream region used for the sgRNA search.
- Downstream design region (bp): Length of the downstream region used for the sgRNA search.
- Positive mask: Favoured regions of the genome. Must be in BED6 format.
- Negative mask: Disfavoured regions of the genome. Must be in BED6 format.
- Individual score: The minimum individual efficiency score that each sgRNA must have to be considered. Range: 0-1.
- Paired score: The minimum combined score that a sgRNA pair must have to be considered. Range: 0-2.
- Score combination: Method by which individual scores are combined to yield pair score: by addition ("sum") or multiplication ("product").
- Construct method: Method applied when making sgRNA pairs and constructing oligos: "General" (no constraint) or "DECKO" (one sgRNA in each pair should start with G. If neither does, an extra "G" is prepended to the sgRNA in the upstream design region, making this sgRNA 21 nt long. Compatible with vectors expressing the first sgRNA from the U6 promoter).
- Ranking method: Criterion for ranking sgRNA pairs ("score" or "distance"). "Score" returns the first n pairs ranked by highest paired score; "Distance" returns the first n pairs ranked by shortest distance between the two sgRNAs.
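To make the Score combination and Diversity options concrete, here is a small sketch of how they could be applied. It is an illustration under assumptions, not CRISPETa's implementation: the function names `pair_score` and `select_pairs` are hypothetical, and the greedy pass over already-ranked pairs, with the per-sgRNA cap computed as diversity times the requested number of pairs, is one possible reading of the Diversity description.

```python
from collections import Counter


def pair_score(score_1, score_2, method="sum"):
    """Combine two individual scores by addition ("sum") or multiplication ("product")."""
    if method == "sum":
        return score_1 + score_2
    if method == "product":
        return score_1 * score_2
    raise ValueError(f"unknown score combination: {method!r}")


def select_pairs(ranked_pairs, n, diversity):
    """Keep up to n pairs, best first, limiting how often one sgRNA is reused.

    Assumption: the cap is floor(diversity * n), with a minimum of 1 so that
    at least one pair can always be returned.
    """
    max_uses = max(1, int(diversity * n))
    uses = Counter()
    selected = []
    for sgrna_1, sgrna_2 in ranked_pairs:
        if uses[sgrna_1] >= max_uses or uses[sgrna_2] >= max_uses:
            continue  # this pair would over-represent one of its sgRNAs
        selected.append((sgrna_1, sgrna_2))
        uses[sgrna_1] += 1
        uses[sgrna_2] += 1
        if len(selected) == n:
            break
    return selected


# Placeholder sgRNA labels, already ranked best to worst.
ranked = [("a", "x"), ("b", "x"), ("c", "x"), ("d", "y")]
kept = select_pairs(ranked, n=4, diversity=0.5)
```

With n = 4 and a diversity of 0.5, each sgRNA may appear in at most two pairs, so the third pair containing "x" is skipped. Because the cap is relative to the requested n, a target that yields fewer than n pairs can still end up with one sgRNA in more than the stated fraction of its returned pairs.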
Outputs
DESIGN FILE: n ranked pairs of sgRNAs for every target region. Each line corresponds to one pair. The tab-separated fields in this file are:
- Target sequence ID. Those without identifiers in the input are assigned a random ID
- Chromosome location of sgRNA1
- Start coordinate of sgRNA1
- End coordinate of sgRNA1
- Sequence of sgRNA1 + PAM sequence
- Score of sgRNA1
- Chromosome location of sgRNA2
- Start coordinate of sgRNA2
- End coordinate of sgRNA2
- Sequence of sgRNA2 + PAM sequence
- Score of sgRNA2
- Distance of sgRNA1 to upstream exclude region
- Distance of sgRNA2 to downstream exclude region
- Distance between both sgRNAs
- Score of paired sgRNAs
- Mask score. (If no masks are supplied, this score is 2 by default)
- Oligo sequence. Only available for DECKO construction method
Sequence_ID(#pair) | chromosome | start | end | sgRNA_1 | score_1 | chromosome | start | end | sgRNA_2 | score_2 | distance_to_exclude_up_region | distance_to_exclude_down_region | distance_between_gRNAs | paired_score | mask_score | oligo |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
MALAT1(1) | chr11 | 65254653 | 65254676 | TTATAGAGCCATTAGCCCAAGGG | 0.868 | chr11 | 65256189 | 65256212 | AGTTCTGCCTCAGCTCAGGACGG | 0.684 | -50 | 291 | 1513 | 1.552 | 2 | . |
MALAT1(2) | chr11 | 65254685 | 65254708 | GTTGGTCAAGTAAAGACACGTGG | 0.835 | chr11 | 65256189 | 65256212 | AGTTCTGCCTCAGCTCAGGACGG | 0.684 | -18 | 291 | 1481 | 1.519 | 2 | . |
MALAT1(3) | chr11 | 65254574 | 65254597 | ACAGGAATCAAACTCCAAAGGGG | 0.729 | chr11 | 65256189 | 65256212 | AGTTCTGCCTCAGCTCAGGACGG | 0.684 | -129 | 291 | 1592 | 1.413 | 2 | . |
MALAT1(4) | chr11 | 65254645 | 65254668 | CTAATGGCTCTATAAATTGGAGG | 0.659 | chr11 | 65256189 | 65256212 | AGTTCTGCCTCAGCTCAGGACGG | 0.684 | -58 | 291 | 1521 | 1.343 | 2 | . |
MALAT1(5) | chr11 | 65254653 | 65254676 | TTATAGAGCCATTAGCCCAAGGG | 0.868 | chr11 | 65256126 | 65256149 | GCACCAGCCCAAGGCTGCATGGG | 0.435 | -50 | 228 | 1450 | 1.303 | 2 | . |
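For downstream processing, each line of the design file can be read into a record keyed by field. The sketch below assumes the 17 tab-separated fields appear in the order listed above; the key names and the function `parse_design_line` are hypothetical, and whether the real file starts with a header line is not assumed either way. The example line reproduces the first MALAT1 row of the table.

```python
# Hypothetical key names, in the field order described above.
COLUMNS = [
    "sequence_id", "chromosome_1", "start_1", "end_1", "sgrna_1", "score_1",
    "chromosome_2", "start_2", "end_2", "sgrna_2", "score_2",
    "distance_to_exclude_up", "distance_to_exclude_down",
    "distance_between", "paired_score", "mask_score", "oligo",
]
INT_COLUMNS = {
    "start_1", "end_1", "start_2", "end_2",
    "distance_to_exclude_up", "distance_to_exclude_down", "distance_between",
}
FLOAT_COLUMNS = {"score_1", "score_2", "paired_score", "mask_score"}


def parse_design_line(line):
    """Split one tab-separated design line into a dict with typed values."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    row = dict(zip(COLUMNS, values))
    for key in INT_COLUMNS:
        row[key] = int(row[key])
    for key in FLOAT_COLUMNS:
        row[key] = float(row[key])
    return row


# First MALAT1 row from the table above, joined with tabs.
example = "\t".join([
    "MALAT1(1)", "chr11", "65254653", "65254676", "TTATAGAGCCATTAGCCCAAGGG", "0.868",
    "chr11", "65256189", "65256212", "AGTTCTGCCTCAGCTCAGGACGG", "0.684",
    "-50", "291", "1513", "1.552", "2", ".",
])
row = parse_design_line(example)

# Consistency checks that hold for this example row.
distance_ok = row["start_2"] - row["end_1"] == row["distance_between"]
score_ok = abs(row["score_1"] + row["score_2"] - row["paired_score"]) < 1e-9
```

In the example rows, the paired score equals the sum of the two individual scores, and the distance between sgRNAs equals the start of sgRNA2 minus the end of sgRNA1; the two checks at the end confirm this for the first row.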
DESIGN BED: A BED file of designed sgRNAs ready to be uploaded and visualised in the UCSC Genome Browser. This file will create two custom tracks: a target regions track and a paired sgRNAs track.
SETTINGS: A summary of the run settings including all selected options and useful information about sgRNAs and pairs designed.
STATISTICS: Statistics and summary of the design performance: distribution of individual and paired scores of the sgRNAs and pairs found, distribution of pair distances, and a pie chart with the percentage of complete and incomplete designs.