Rtools Help

Created: Jan 7, 2016; Last Update: Jul 5, 2018.

An RNA molecule can adopt any structure in a huge space of candidate secondary structures, each with only a small probability. To visualize the whole distribution of secondary structures, this web server provides rich information about *single* RNA sequences using 8 tools.

Enter your RNA sequence in the main input box. The input must be a single sequence in FASTA format, no longer than 400 nt.
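The input constraints above can be checked locally before submitting. The following is a minimal sketch, not the server's own validation: the function name `check_single_fasta` is hypothetical, and the sequence alphabet is not checked because this page does not specify one.

```python
MAX_LENGTH = 400  # length limit stated on this help page (nt)


def check_single_fasta(fasta_text):
    """Return (name, sequence) if the text is one FASTA record of at most 400 nt."""
    lines = [ln.strip() for ln in fasta_text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("input must start with a FASTA header line ('>')")
    if any(ln.startswith(">") for ln in lines[1:]):
        raise ValueError("input must contain exactly one sequence")
    # Sequence lines may be wrapped, so join them before measuring the length.
    sequence = "".join(lines[1:])
    if not sequence:
        raise ValueError("the sequence is empty")
    if len(sequence) > MAX_LENGTH:
        raise ValueError(f"sequence is {len(sequence)} nt; the limit is {MAX_LENGTH} nt")
    return lines[0][1:], sequence
```

The function raises `ValueError` with a short message for each violated constraint.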

Rtools stores results for only 30 days, so be sure to download any results you wish to keep.

Software

CentroidFold, based on a generalized centroid estimator, is one of the most accurate tools for predicting RNA secondary structures. The predicted secondary structure is coloured according to base pairing probabilities.

Option

inference engine: McCaskill(BL) (default), McCaskill(Turner), CONTRAfold.

weight of base pairs: 2^-5 - 2^10 (default: 2^2).

Output (text)

The secondary structure is given in dot-bracket format with a fasta-like header line. In this format, each dot represents an unpaired base, and each matching pair of opening and closing brackets represents a base pair.

>sequence_name
ACGCUGUCUGUACUUGUAUCAGUACACUGACGAGUCCCUAAAGGACGAAACAGCGC
.((((((.((((((......)))))).......((((.....))))...)))))). (g=4,th=0.2,e=-9.14)
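The dot-bracket notation can be turned into an explicit list of base pairs with a simple stack. The sketch below assumes the record is laid out on three lines (header, sequence, then structure followed by the annotation in parentheses). The function names are illustrative, and only round brackets are handled, so pseudoknotted structures written with additional bracket types would need extra bracket stacks.

```python
def parse_dot_bracket(structure):
    """Return a sorted list of 0-based (i, j) base pairs from a dot-bracket string."""
    stack = []
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)          # remember the opening position
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def read_structure_record(text):
    """Split a three-line record into (name, sequence, structure, annotation)."""
    header, sequence, third = [line.strip() for line in text.strip().splitlines()]
    # The structure string ends at the first blank; the rest is the annotation.
    structure, _, annotation = third.partition(" ")
    if len(structure) != len(sequence):
        raise ValueError("sequence and structure lengths differ")
    return header.lstrip(">"), sequence, structure, annotation
```

Note that the pairs returned here are 0-based, whereas the base pairing probability output uses 1-based positions.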

The base pairing probability output lists the base-pairing probabilities above the threshold set by "weight of base pairs", in a blank-delimited format.

pos1 nt1 pair-pos11:prob11 pair-pos12:prob12 ...
pos2 nt2 pair-pos21:prob21 pair-pos22:prob22 ...
:

pos1,pos2 are sequence positions in 1-based coordinates.

nt1,nt2 are nucleotides.

pair-pos11,pair-pos12 ... are sequence positions of pairing partners in 1-based coordinates.

prob11,prob12 ... are base pairing probabilities.
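One line of the base pairing probability output can be read as follows. This is a minimal sketch that assumes each position occupies its own line with blank-separated fields; the function name and the example values in the usage note are made up for illustration.

```python
def parse_bpp_line(line):
    """Parse 'pos nt partner:prob partner:prob ...' into (pos, nt, {partner: prob})."""
    fields = line.split()
    pos, nt = int(fields[0]), fields[1]
    partners = {}
    for item in fields[2:]:
        partner, prob = item.split(":")
        partners[int(partner)] = float(prob)   # positions stay 1-based
    return pos, nt, partners
```

For example, `parse_bpp_line("2 C 55:0.98 50:0.02")` gives position 2, nucleotide `C`, and two candidate partners. A position with no listed partners yields an empty dictionary.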

CentroidHomfold predicts RNA secondary structures by employing automatically collected homologous sequences of the target. Homologous sequences are collected from Rfam using LAST. If homologous sequences are available, CentroidHomfold can predict secondary structures for the target sequence more accurately than CentroidFold, because it uses the homologous sequence information together with the probabilistic consistency transformation for base-pairing probabilities.

Option

inference engine (secondary structure): McCaskill(BL) (default), McCaskill(Turner), CONTRAfold.

inference engine (pairwise alignment): CONTRAlign (default), ProbCons.

E-value: threshold used for the homology search against Rfam. 0.0001 - 0.1 (default: 0.01).

number of homologous sequences: 10 - 50 (default: 30).

Output (text)

Secondary structure, Base pairing probability: same as CentroidFold.

IPknot predicts RNA secondary structures including a wide class of pseudoknots. IPknot can also predict the consensus secondary structure when a multiple alignment of RNA sequences is given. IPknot runs fast and predicts the maximum expected accuracy (MEA) structure using integer programming (IP) with threshold cut.

The structure is predicted at IPknot level 1. The figure is created by VARNA.

Option

inference engine: McCaskill(BL) (default), McCaskill(Turner), CONTRAfold, NUPACK (Dirks-Pierce model).

weight of base pairs: 2^-5 - 2^10 (default: 2^2).

Output (text)

Secondary structure: same as CentroidFold.

Rchange

Rchange computes entropy and internal energy changes of secondary structures for single-point mutated sequences. The figure shows the upper and lower bounds of the relative changes of the ensemble free energy (dF/|F|). The upper bound values indicate the largest energy increase caused by a single mutation. The lower bound values indicate the largest energy decrease caused by a single mutation.

Option

maximal span of base pairs: constraint for base pair prediction. 50-500 (default: 100).

Output (text)

Rchange output is a tab delimited format with a fasta-like header line.

>sequence_name
mut_pos1 mut_pos2 original_base(s) mutation_data

Sequence positions of mutations (mut_pos1, mut_pos2) use 0-based indexing. "mutation_data" is a list of semicolon (';') delimited entries, one for each mutation of the original base(s). Each mutation has 4 comma (',') delimited fields:

mutated_base(s),entropy_change,internal_energy_change,helmholtz_free_energy_change;

When mut_pos1 < 0, the line gives the thermodynamic variables of the original sequence as follows:

-seq_len -seq_len N N:entropy:internal_energy:helmholtz_free_energy
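A record in this layout can be parsed as sketched below. The sketch follows the description literally: four tab-separated columns, ','-separated fields for mutation lines, and ':'-separated fields for the original-sequence line (mut_pos1 < 0). The names `Change` and `parse_rchange_line` are hypothetical, and the numbers in the usage note are invented.

```python
from typing import NamedTuple


class Change(NamedTuple):
    mutated: str            # mutated base(s), or 'N' on the original-sequence line
    entropy: float
    internal_energy: float
    free_energy: float


def parse_rchange_line(line):
    """Parse one tab-delimited Rchange line into (pos1, pos2, original, [Change, ...])."""
    pos1, pos2, original, data = line.rstrip("\n").split("\t")
    pos1, pos2 = int(pos1), int(pos2)
    # Assumption taken from the format description: ':' separates the fields on
    # the original-sequence line (negative positions), ',' on mutation lines.
    sep = ":" if pos1 < 0 else ","
    changes = []
    for entry in data.split(";"):
        entry = entry.strip()
        if not entry:
            continue                      # tolerate a trailing ';'
        base, s, u, f = entry.split(sep)
        changes.append(Change(base, float(s), float(u), float(f)))
    return pos1, pos2, original, changes
```

For instance, `parse_rchange_line("3\t3\tA\tC,0.1,-0.2,0.05;G,0.0,0.3,-0.1;")` returns two `Change` entries for position 3.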

CapR

CapR calculates the probabilities that each RNA base position is located within each secondary structural context, for long RNA sequences. Six categories of RNA structures are taken into account: stem part (S), hairpin loop (H), bulge loop (B), internal loop (I), multibranch loop (M), and exterior loop (E). The figure shows the structural profile of an RNA base as a set of six probabilities that the base belongs to each category.

Option

maximal span of base pairs: constraint for base pair prediction. 50-500 (default: 100).

Output (text)

CapR output is a space delimited format with a fasta-like header line.

>sequence_name
Bulge prob1 prob2 ...
Exterior prob1 prob2 ...
Hairpin prob1 prob2 ...
Internal prob1 prob2 ...
Multibranch prob1 prob2 ...
Stem prob1 prob2 ...

The probabilities (prob1 prob2 ...) are ordered by sequence position.
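The structural profile can be loaded into a dictionary keyed by category and then summarized per position. This is a minimal sketch assuming one category per line after the header; `parse_capr` and `most_likely_context` are illustrative names, not part of CapR.

```python
def parse_capr(text):
    """Return (name, {category: [probabilities ordered by sequence position]})."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    name = lines[0].lstrip(">").strip()
    profile = {}
    for ln in lines[1:]:
        label, *values = ln.split()
        profile[label] = [float(v) for v in values]
    # Every category should cover the same number of positions.
    if len({len(v) for v in profile.values()}) > 1:
        raise ValueError("categories have different numbers of positions")
    return name, profile


def most_likely_context(profile):
    """For each position, return the category with the highest probability."""
    labels = list(profile)
    n = len(profile[labels[0]])
    return [max(labels, key=lambda lab: profile[lab][i]) for i in range(n)]
```

Taking the maximum per position discards the rest of the profile, so it is only a quick summary of the six probabilities shown in the figure.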

Raccess

Raccess computes the accessibility of every segment [a, b] = [x, x+l-1] in the transcript, for all positions x and for the fixed lengths l (Acc.len) = 5, 10, 20. The thermodynamic energy required to keep range [a, b] accessible is given by

access_energy([a, b]) = -RT log(prob([a, b]))

prob([a, b]) = sum_{s in S[a, b]} exp(-E(s)/RT) / sum_{s in S0} exp(-E(s)/RT)

where S0 is the set of all possible secondary structures of the transcript and S[a, b] is the set of all secondary structures that have range [a, b] as a loop region. In the figure, the access_energy([a, b]) values are plotted at (x + l/2).
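The two formulas can be illustrated on a toy ensemble in which the energies of all structures are listed explicitly. This is only a sketch of the arithmetic: explicit enumeration is feasible for tiny examples only, and the units (kcal/mol) and default temperature (310.15 K) are assumptions not stated on this page.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K); assumed energy unit is kcal/mol


def access_probability(energies_all, energies_accessible, temperature=310.15):
    """prob([a, b]): Boltzmann weight of S[a, b] divided by that of S0."""
    rt = GAS_CONSTANT * temperature
    z_all = sum(math.exp(-e / rt) for e in energies_all)          # sum over S0
    z_acc = sum(math.exp(-e / rt) for e in energies_accessible)   # sum over S[a, b]
    return z_acc / z_all


def access_energy(prob, temperature=310.15):
    """access_energy([a, b]) = -RT log(prob([a, b]))."""
    return -GAS_CONSTANT * temperature * math.log(prob)
```

A segment that is accessible in every structure has probability 1 and access energy 0; the less probable the accessible state, the larger the energy required.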

Option

maximal span of base pairs: constraint for base pair prediction. 50-500 (default: 100).

Output (text)

Raccess output is a tab delimited format with a fasta-like header line.

>sequence_name
pos1 len1,ene11;len2,ene12...;
pos2 len1,ene21;len2,ene22...;
:

pos1,pos2 are sequence positions x in 0-based coordinates.

len1,len2 are fixed lengths l (Acc.len).

ene11,ene12 are access energies.
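A data line of this output can be read into a mapping from segment length to access energy. The sketch assumes the position and the energy list are separated by whitespace (a tab in the actual output); the function names and the example values in the usage note are illustrative.

```python
def parse_raccess_line(line):
    """Parse 'pos<TAB>len1,ene1;len2,ene2;...;' into (pos, {length: energy})."""
    pos, data = line.strip().split(None, 1)
    energies = {}
    for entry in data.split(";"):
        entry = entry.strip()
        if not entry:
            continue                       # tolerate the trailing ';'
        length, energy = entry.split(",")
        energies[int(length)] = float(energy)
    return int(pos), energies


def plot_x(pos, length):
    """x coordinate used in the figure for segment [pos, pos + length - 1]."""
    return pos + length / 2
```

For example, `parse_raccess_line("0\t5,1.2;10,3.4;20,7.8;")` gives position 0 with energies for the lengths 5, 10 and 20.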

RintD

RintD validates RNA secondary structures. Target secondary structures are predicted by CentroidFold (inference engine: McCaskill) and RNAfold (minimum free energy structure).

Option

weight of base pairs: 2^-5 - 2^10 (default: 2^2).

Your structure: user-supplied secondary structure for RintD validation. It must be in dot-bracket format, without the sequence.

Output (text)

RintD output is a tab delimited format.

For validation of one structure,

dist1 prob1
dist2 prob2
:

dist1,dist2 are Hamming distances from the structure.1.

prob1,prob2 are probabilities.

For validation of two structures,

dist11 dist21 prob1
dist12 dist22 prob2
:

dist11,dist12 are Hamming distances from the structure.1.

dist21,dist22 are Hamming distances from the structure.2.

prob1,prob2 are probabilities.

The figures show probability distributions.
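Both layouts (one or two distance columns followed by a probability) can be read with the same small parser. This is a sketch assuming one row per line; `parse_distribution` and `marginal` are hypothetical helper names, and the numbers used in the usage note are invented.

```python
def parse_distribution(text):
    """Return a list of ((dist, ...), prob) rows from tab-delimited text."""
    rows = []
    for line in text.strip().splitlines():
        *dists, prob = line.split()
        rows.append((tuple(int(d) for d in dists), float(prob)))
    return rows


def marginal(rows, axis=0):
    """Sum the probabilities over all rows sharing the same distance on one axis."""
    out = {}
    for dists, prob in rows:
        out[dists[axis]] = out.get(dists[axis], 0.0) + prob
    return out
```

With two reference structures, `marginal(rows, axis=0)` recovers the distribution over the distance from the first structure, and `axis=1` the distribution over the distance from the second.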

RintW

This analysis detects essential alternative secondary structures of an RNA sequence by decomposing the base pairing probability matrix. The decomposition is calculated by RintW, which efficiently computes the base pairing probability distributions on the Hamming distance from a canonical secondary structure.
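To make the notion of distance concrete, the sketch below computes a Hamming distance between two dot-bracket structures. It assumes the distance is the number of base pairs present in exactly one of the two structures; this page does not spell out the definition, so treat that as an assumption. Only round brackets are handled, and the function names are illustrative.

```python
def base_pairs(structure):
    """Return the set of 0-based (i, j) base pairs of a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def hamming_distance(a, b):
    """Number of base pairs found in exactly one of the two structures (assumed definition)."""
    if len(a) != len(b):
        raise ValueError("structures must have the same length")
    # Symmetric difference of the two base pair sets.
    return len(base_pairs(a) ^ base_pairs(b))
```

Under this definition, a structure at distance 0 from the canonical structure is the canonical structure itself, and probability mass at large distances points to alternative folds.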

Option

inference engine for canonical structure: McCaskill(BL) (default), McCaskill(Turner), CONTRAfold.

weight of base pairs: 2^-5 - 2^10 (default: 2^2).

Output (text)

The probability distribution on Hamming distance is given in a tab-delimited format.

dist1 prob1
dist2 prob2
:

dist1,dist2 are Hamming distances from the canonical secondary structure.
prob1,prob2 are probabilities.

The validation output is given in a tab-delimited format.

dist11 dist21 prob1
dist12 dist22 prob2
:

dist11,dist12 are Hamming distances from the alternative structure.1.
dist21,dist22 are Hamming distances from the alternative structure.2.
prob1,prob2 are probabilities.