Download

BERMP was built on multi-species m6A site datasets derived from previous publications. The Saccharomyces cerevisiae dataset [1] contains 1307 positive m6A sites and 1307 negative sites, each a 51-nucleotide sequence. We separated them into two groups: one for ten-fold cross-validation (positives: 1100; negatives: 1100) and the other for independent testing (positives: 207; negatives: 207). The Arabidopsis thaliana dataset [2] contains 2518 positives and the same number of negatives, each a 101-nucleotide sequence. Of these, 2100 m6A sites and 2100 non-m6A sites were used as the training dataset for five-fold cross-validation, and the remaining 418 m6A sites and 418 non-m6A sites were used for independent testing. The mammalian dataset [3] contains over 50,000 positives and ten times as many negatives. Four fifths of this dataset was used as the training dataset for five-fold cross-validation, and the remainder was used for independent testing. All of the datasets can be downloaded here.

[1] Chen, W., et al., iRNA-Methyl: Identifying N(6)-methyladenosine sites using pseudo nucleotide composition. Anal Biochem, 2015. 490: p. 26-33.
[2] Wang, X. and R. Yan, RFAthM6A: a new tool for predicting m(6)A sites in Arabidopsis thaliana. Plant Mol Biol, 2018. 96(3): p. 327-337.
[3] Zhou, Y., et al., SRAMP: prediction of mammalian N6-methyladenosine (m6A) sites based on sequence-derived features. Nucleic Acids Res, 2016. 44(10): p. e91.
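
The partitioning described for the Saccharomyces cerevisiae dataset (1100 training and 207 independent-test sites per class, followed by ten-fold cross-validation on the training part) can be illustrated with a minimal sketch. How the sites were actually assigned to each group is not stated here, so the random shuffle, the seeds, the placeholder sequence IDs, and the helper names `split_train_test` and `k_fold_indices` are all assumptions made for illustration only.

```python
import random


def split_train_test(items, n_train, seed=0):
    """Shuffle a copy of `items` and return (train, test).

    Random assignment is an assumption; the source does not say
    how sites were allocated to the two groups.
    """
    if n_train > len(items):
        raise ValueError("n_train exceeds the number of items")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def k_fold_indices(n_items, k):
    """Yield (train_idx, val_idx) pairs for k folds built by striding."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


# Hypothetical placeholder IDs standing in for the real 51-nucleotide sequences.
positives = [f"pos_{i}" for i in range(1307)]
negatives = [f"neg_{i}" for i in range(1307)]

# 1100 sites per class for cross-validation, the remaining 207 for testing.
train_pos, test_pos = split_train_test(positives, 1100, seed=1)
train_neg, test_neg = split_train_test(negatives, 1100, seed=2)

# Label positives 1 and negatives 0, then build ten cross-validation folds.
train = [(s, 1) for s in train_pos] + [(s, 0) for s in train_neg]
folds = list(k_fold_indices(len(train), 10))
```

Because the training list places all positives before all negatives and the folds are taken by striding, each validation fold in this sketch holds 110 positives and 110 negatives. The same two helpers would apply to the Arabidopsis thaliana split by changing the counts to 2518 and 2100 and using five folds.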

Who is using BERMP?