SciELO - Scientific Electronic Library Online

 


Latin American applied research

Print version ISSN 0327-0793

Abstract

PIANTANIDA, J. P. and ESTIENNE, C. F. A new estimator based on maximum entropy. Lat. Am. appl. res. [online]. 2005, vol.35, n.2, pp.143-147. ISSN 0327-0793.

In this paper, we propose a new formulation of the classical Good-Turing estimator for n-gram language models. The new approach is based on defining a dynamic model for language production. Instead of assuming a fixed probability distribution for the occurrence of an n-gram over the whole text, we propose a maximum entropy approximation of a time-varying distribution. This approximation leads to a new distribution, which in turn is used to calculate the expectations in the Good-Turing estimator. The result is a new estimator that we call the Maximum Entropy Good-Turing estimator. In contrast to the classical Good-Turing estimator, the new formulation requires neither approximations of the expectations nor windowing or other smoothing techniques. It also contains the well-known discounting estimators as special cases. Performance is evaluated in terms of both perplexity and word error rate in an N-best rescoring task, and the estimator is compared with other classical estimators. In all cases our approach performs significantly better than the classical estimators.

Keywords: Language Models; Maximum Entropy; Good-Turing Estimation.
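
For reference, the classical Good-Turing estimator that the abstract takes as its starting point replaces an observed count r with the adjusted count r* = (r + 1) N_{r+1} / N_r, where N_r is the number of distinct n-grams seen exactly r times, and reserves a probability mass of N_1 / N for unseen n-grams. The Python sketch below illustrates only this textbook form, not the Maximum Entropy Good-Turing estimator proposed in the paper. The function names and the toy bigram counts are hypothetical, and falling back to the raw count when N_{r+1} = 0 is a simplifying assumption standing in for the smoothing that the classical estimator normally requires.

```python
from collections import Counter


def good_turing_adjusted_counts(counts):
    """Map each observed count r to r* = (r + 1) * N_{r+1} / N_r.

    `counts` maps an n-gram to its raw count. N_r is the number of
    distinct n-grams observed exactly r times.
    """
    freq_of_freq = Counter(counts.values())  # r -> N_r
    adjusted = {}
    for r, n_r in freq_of_freq.items():
        n_next = freq_of_freq.get(r + 1, 0)
        if n_next > 0:
            adjusted[r] = (r + 1) * n_next / n_r
        else:
            # Assumption for this sketch: keep the raw count when N_{r+1}
            # is zero, since the unsmoothed formula would give r* = 0.
            adjusted[r] = float(r)
    return adjusted


def unseen_mass(counts):
    """Total probability reserved for unseen n-grams: N_1 / N."""
    total = sum(counts.values())
    n_1 = sum(1 for c in counts.values() if c == 1)
    return n_1 / total


# Hypothetical toy bigram counts, for illustration only.
toy_counts = {
    "the cat": 3,
    "a dog": 2,
    "a cat": 2,
    "the dog": 1,
    "cat sat": 1,
    "dog ran": 1,
}

adjusted = good_turing_adjusted_counts(toy_counts)
p_unseen = unseen_mass(toy_counts)
```

In this toy data the largest count (r = 3) has no N_4 to draw on, so the raw formula breaks down exactly where counts are sparse. That is the kind of gap which, according to the abstract, the classical estimator fills with approximated expectations or smoothing.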


 

Creative Commons License: All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.