Fast String Kernels using Inexact Matching for Protein Sequences

Christina S. Leslie and Rui Kuang
Journal of Machine Learning Research
We describe several families of k-mer based string kernels related to the recently presented mismatch kernel and designed for use with support vector machines (SVMs) for classification of protein sequence data. These new kernels (restricted gappy kernels, substitution kernels, and wildcard kernels) are based on feature spaces indexed by k-length subsequences (“k-mers”) from the string alphabet Σ. However, for all kernels we define here, the kernel value K(x,y) can be computed in O(cK(|x|+ |y…
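As an illustration of what a feature space indexed by k-mers looks like, the following is a minimal sketch of a gappy-style kernel computed by brute-force enumeration. It rests on assumptions the abstract does not state: that each length-g window of a sequence contributes 1 to every distinct k-mer occurring in it as an ordered, possibly non-contiguous subsequence, and that K(x,y) is the inner product of the resulting count vectors. The names `gappy_features` and `gappy_kernel` and the parameters `g` and `k` are hypothetical. This naive version is not the fast algorithm the paper refers to; it only makes the feature map concrete.

```python
from collections import Counter
from itertools import combinations


def gappy_features(seq, g, k):
    """Count, over all length-g windows of seq, the distinct k-mers that
    occur in the window as an ordered (possibly gapped) subsequence.

    Assumed definition for illustration only: each window contributes
    at most 1 to any given k-mer feature.
    """
    if k > g:
        raise ValueError("k must not exceed the window length g")
    feats = Counter()
    for start in range(len(seq) - g + 1):
        window = seq[start:start + g]
        # combinations() preserves order, so each tuple is a subsequence.
        # The set keeps one copy of each k-mer per window.
        kmers = {"".join(chars) for chars in combinations(window, k)}
        feats.update(kmers)
    return feats


def gappy_kernel(x, y, g, k):
    """Inner product of the two sparse feature vectors."""
    fx = gappy_features(x, g, k)
    fy = gappy_features(y, g, k)
    return sum(count * fy[kmer] for kmer, count in fx.items())


if __name__ == "__main__":
    # With g == k the features reduce to plain contiguous k-mer counts.
    print(gappy_kernel("ABAB", "ABAB", 2, 2))
    # With g > k, gapped matches inside each window are also counted.
    print(gappy_kernel("ABC", "ABD", 3, 2))
```

The cost of this enumeration grows with the number of k-subsets of each window, which is the kind of overhead a linear-time kernel computation would avoid.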
This paper has highly influenced 20 other papers.
This paper has 169 citations.


Publications citing this paper.

A Multi-Granularity Pattern-Based Sequence Classification Framework for Educational Data

2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA) • 2016

A study of spam filtering using support vector machines

Artificial Intelligence Review • 2010

Kernels for sequentially ordered data


Transfer String Kernel for Cross-Context DNA-Protein Binding Prediction

IEEE/ACM transactions on computational biology and bioinformatics • 2016

Efficient multivariate kernels for sequence classification

Pavel P. Kuksa



Publications referenced by this paper.

On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach

Data Mining and Knowledge Discovery • 1997

Cluster kernels for semi-supervised protein classification

J. Weston, C. Leslie, D. Zhou, A. Elisseeff, W. S. Noble
Neural Information Processing Systems • 2003

Optimizing kernel alignment over combinations of kernels

J. Kandola, J. Shawe-Taylor, N. Cristianini
NeuroCOLT Technical Report NC-TR-2002-121 • 2002

Computer alignment of sequences. In Phylogenetic Analysis of DNA Sequences

M. S. Waterman, J. Joyce, M. Eggert

Dynamic Alignment Kernels

