When analyzing RNA-Seq data, reads derived from rRNA and tRNA can be removed from the sequencing files before downstream analysis. Here, I briefly describe how to do this using ERNE.
Prepare software
Download software from https://sourceforge.net/projects/erne/files/2.1.1/
Unzip and move erne-create and erne-filter to ~/local/bin
Install SeqKit with conda install -c bioconda seqkit; its rmdup subcommand is used to remove duplicate FASTA records.
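Before going further, it can help to confirm that the three tools are reachable from the shell. The sketch below assumes the ERNE binaries were placed in ~/local/bin as described above; `check_tools` is a hypothetical helper name used only for illustration, not part of ERNE or SeqKit.

```shell
# Make sure ~/local/bin (where the ERNE binaries were copied) is searched first.
PATH="$HOME/local/bin:$PATH"
export PATH

# check_tools: hypothetical helper that reports whether each named command
# can be found on PATH. Returns non-zero if any of them is missing.
check_tools() {
    status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: not found"
            status=1
        fi
    done
    return "$status"
}

# Usage: check_tools erne-create erne-filter seqkit
```

Adding the PATH line to a shell startup file would make the change permanent; whether that is desirable depends on the local setup.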
Prepare rRNA and tRNA sequences
rRNA
rRNA sequences were downloaded from SILVA:
    wget https://www.arb-silva.de/fileadmin/silva_databases/current/Exports/SILVA_128_LSUParc_tax_silva.fasta.gz
tRNA
tRNA sequences were downloaded from GtRNAdb:
    wget http://gtrnadb2009.ucsc.edu/download/tRNAs/GtRNAdb-all-tRNAs.fa.gz
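Both downloads are gzip-compressed, while the later steps refer to uncompressed file names, so a decompression step is implied. A minimal sketch of that step follows; `decompress_keep` is a hypothetical helper name, and keeping the compressed copy is an assumption rather than something the original workflow states.

```shell
# decompress_keep: hypothetical helper that writes each FILE (without the
# .gz suffix) next to FILE.gz and leaves the compressed copy in place.
decompress_keep() {
    for gz in "$@"; do
        gzip -dc "$gz" > "${gz%.gz}"
    done
}

# Usage:
# decompress_keep SILVA_128_LSUParc_tax_silva.fasta.gz GtRNAdb-all-tRNAs.fa.gz
```

Plain `gunzip` would work as well if the compressed files are no longer needed.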
Combine the two FASTA files and remove duplicates
Decompress both downloads, concatenate them into contaminate_rna.fa, then remove duplicate records to produce contaminate_rna_uniq.fa.
    cat GtRNAdb-all-tRNAs.fa SILVA_128_LSUParc_tax_silva.fasta > contaminate_rna.fa
    cat contaminate_rna.fa | seqkit rmdup > contaminate_rna_uniq.fa
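To see how many records the deduplication step removed, the header lines of the combined and deduplicated files can be counted and compared. The sketch below uses only awk; `count_fasta_records` and `report_dedup` are hypothetical helper names, and the file names in the usage comment are the ones used in this post.

```shell
# count_fasta_records: number of FASTA records, i.e. lines starting with ">".
count_fasta_records() {
    awk '/^>/ {n++} END {print n + 0}' "$1"
}

# report_dedup BEFORE AFTER: summarise how many records were dropped.
report_dedup() {
    before=$(count_fasta_records "$1")
    after=$(count_fasta_records "$2")
    echo "$before records before, $after after, $((before - after)) removed"
}

# Usage: report_dedup contaminate_rna.fa contaminate_rna_uniq.fa
```

Note that seqkit rmdup compares sequence IDs by default; its -s option compares sequence content instead, which may be closer to the intent here.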
Prepare the alignment reference
    erne-create --output-prefix contaminate_rna --fasta contaminate_rna_uniq.fa &  # takes about three hours
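Because erne-create is started in the background and runs for hours, the filtering step should not begin until the reference file has actually been written. A small sketch of such a check follows; `require_nonempty` is a hypothetical helper, and contaminate_rna.ebh is the reference file name that the erne-filter step in this post expects.

```shell
# require_nonempty: hypothetical helper; succeeds only if the file exists
# and has a size greater than zero.
require_nonempty() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Usage, in the same shell session that started erne-create with "&":
# wait                                  # block until background jobs finish
# require_nonempty contaminate_rna.ebh
```

A non-empty file is only a rough signal; it does not prove that the build finished without errors, so the erne-create log output is still worth reading.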
Run erne-filter to remove rRNA/tRNA
    erne-filter --contamination-reference contaminate_rna.ebh --threads 20 --query1 test_trimmed.fq --output-prefix rmcontac &
As a result, the cleaned file rmcontac_1.fastq contains 55167445 sequences, while the original file contained 55301662, so about 99.76% of the reads were retained. This step is probably more important for small RNA sequencing.
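The retention figure can be reproduced from the two read counts. The sketch below assumes uncompressed FASTQ files with exactly four lines per read; `count_fastq_reads` and `percent_retained` are hypothetical helper names.

```shell
# count_fastq_reads: number of records in an uncompressed FASTQ file,
# assuming four lines per read (no wrapped sequence lines).
count_fastq_reads() {
    awk 'END {printf "%d\n", NR / 4}' "$1"
}

# percent_retained KEPT TOTAL: percentage of reads kept, two decimals.
percent_retained() {
    awk -v kept="$1" -v total="$2" 'BEGIN {printf "%.2f\n", 100 * kept / total}'
}

percent_retained 55167445 55301662   # prints 99.76
```

With the files from this post, the two counts would come from `count_fastq_reads rmcontac_1.fastq` and `count_fastq_reads test_trimmed.fq`.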