n-Gram Statistics for Natural Language Understanding and Text Processing

@article{Suen1979nGramSF,
  title={n-Gram Statistics for Natural Language Understanding and Text Processing},
  author={Ching Y. Suen},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={1979},
  volume={PAMI-1},
  pages={164--172}
}
n-gram (n = 1 to 5) statistics and other properties of the English language were derived for applications in natural language understanding and text processing. They were computed from a well-known corpus composed of 1 million word samples. Similar properties were also derived from the most frequent 1000 words of three other corpuses. The positional distributions of n-grams obtained in the present study are discussed. Statistical studies on word length and trends of n-gram frequencies versus…
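
As an illustration of the kind of statistics the abstract describes, the following is a minimal sketch of n-gram frequency counting and positional distributions. It assumes character n-grams taken within individual words, which the abstract does not spell out; the function names and the toy word list are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def ngram_counts(words, n):
    """Count character n-grams across all words (no n-gram spans a word boundary)."""
    counts = Counter()
    for word in words:
        w = word.lower()
        for i in range(len(w) - n + 1):
            counts[w[i:i + n]] += 1
    return counts


def positional_counts(words, n):
    """Map each n-gram to a Counter of the 0-based offsets at which it starts."""
    positions = defaultdict(Counter)
    for word in words:
        w = word.lower()
        for i in range(len(w) - n + 1):
            positions[w[i:i + n]][i] += 1
    return positions


# Toy word list for illustration only; this is not the paper's corpus.
sample = ["the", "then", "other", "bathe"]
bigrams = ngram_counts(sample, 2)
th_positions = positional_counts(sample, 2)["th"]
```

Here `bigrams` holds overall bigram frequencies, and `th_positions` records how often "th" begins at each offset within a word. Running the same functions with n from 1 to 5 would cover the range of n mentioned in the abstract.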

Citations

Publications citing this paper.
SHOWING 1-10 OF 128 CITATIONS

Analysis of Mobility Patterns During a Large Social Event

  • 2018 IEEE 16th International Symposium on Intelligent Systems and Informatics (SISY)
  • 2018
CITES METHODS

Identifying Features in Forks

  • 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE)
  • 2018
CITES METHODS

Performance Comparison of Machine Learning Models Trained on Manual vs ASR Transcriptions for Dialogue Act Annotation

  • 2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI)
  • 2018
CITES METHODS

A Privacy-Preserving Multi-Pattern Matching Scheme for Searching Strings in Cloud Database

  • 2017 15th Annual Conference on Privacy, Security and Trust (PST)
  • 2017

Creating and utilizing section-level Web service tags in service replaceability

  • Service Oriented Computing and Applications
  • 2017
CITES BACKGROUND

CITATION STATISTICS

  • 1 Highly Influenced Citations

References

Publications referenced by this paper.
SHOWING 1-10 OF 70 REFERENCES

A simplified heuristic version of a recursive Bayes algorithm for using context in text recognition

G. Silva, H. Love
  • IEEE Trans. Syst.
  • 1978

Advances in recognition of handprinted characters

C. Y. Suen, C. Shiau, R. Shinghal, C. C. Kwan
  • Proc. 4th Int. Joint Conf. Pattern Recognition, Nov.
  • 1978

Low error rate optical character recognition of unconstrained handprinted characters based on a model of human perception

M. Symonds, A. J. Szanser
  • IEEE Trans. Syst.
  • 1977