Friday, March 30, 2007

A Bayesian Approach to Filtering Junk E-Mail

As promised in the previous post, from now on I will be posting here some of the articles used in my master's program; where applicable, the article will be one I wrote myself.

The article A Bayesian Approach to Filtering Junk E-Mail, by Mehran Sahami, Susan Dumais, David Heckerman, and Eric Horvitz, discusses the use of Bayesian algorithms to filter junk e-mail. The abstract follows below. If you are interested, click here to download the full article.

Abstract

In addressing the growing problem of junk E-mail on the Internet, we examine methods for the automated construction of filters to eliminate such unwanted messages from a user's mail stream. By casting this problem in a decision theoretic framework, we are able to make use of probabilistic learning methods in conjunction with a notion of differential misclassification cost to produce filters which are especially appropriate for the nuances of this task. While this may appear, at first, to be a straight-forward text classification problem, we show that by considering domain-specific features of this problem in addition to the raw text of E-mail messages, we can produce much more accurate filters. Finally, we show the efficacy of such filters in a real world usage scenario, arguing that this technology is mature enough for deployment.
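To make the idea of "probabilistic learning plus differential misclassification cost" concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not name a specific learner, so a simple naive Bayes model over word presence is assumed, and the names `train`, `junk_probability`, `is_junk` and the default `cost_ratio` value are hypothetical, chosen only for illustration. The key point is the decision rule: if wrongly discarding a legitimate message is taken to be `cost_ratio` times worse than letting a junk message through, a message is flagged only when its junk probability exceeds `cost_ratio / (1 + cost_ratio)`.

```python
import math
from collections import Counter

LABELS = ("junk", "legit")

def train(messages):
    """Fit word-presence counts from (tokens, label) pairs.

    Assumes every label in LABELS appears at least once in the data.
    """
    word_counts = {label: Counter() for label in LABELS}
    class_counts = Counter()
    for tokens, label in messages:
        class_counts[label] += 1
        word_counts[label].update(set(tokens))  # count each word once per message
    vocab = set().union(*(word_counts[label] for label in LABELS))
    return word_counts, class_counts, vocab

def junk_probability(tokens, model):
    """Return P(junk | message) under a naive Bayes word-presence model."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    present = set(tokens) & vocab
    log_score = {}
    for label in LABELS:
        n = class_counts[label]
        score = math.log(n / total)  # class prior
        for word in vocab:
            # Laplace-smoothed probability that the word appears in this class
            p = (word_counts[label][word] + 1) / (n + 2)
            score += math.log(p if word in present else 1.0 - p)
        log_score[label] = score
    # Normalize in log space to avoid underflow
    top = max(log_score.values())
    junk = math.exp(log_score["junk"] - top)
    legit = math.exp(log_score["legit"] - top)
    return junk / (junk + legit)

def is_junk(tokens, model, cost_ratio=999.0):
    """Flag as junk only if doing so has the lower expected cost.

    cost_ratio is an illustrative assumption: how many times worse it is
    to discard a legitimate message than to let a junk message through.
    """
    threshold = cost_ratio / (1.0 + cost_ratio)
    return junk_probability(tokens, model) > threshold

if __name__ == "__main__":
    # Toy data, invented for this example only
    data = [
        (["free", "money", "now"], "junk"),
        (["free", "offer", "click"], "junk"),
        (["meeting", "tomorrow", "agenda"], "legit"),
        (["project", "report", "meeting"], "legit"),
    ]
    model = train(data)
    print(junk_probability(["free", "money"], model))
    print(is_junk(["free", "money"], model, cost_ratio=1.0))
    print(is_junk(["free", "money"], model))
```

With equal costs (`cost_ratio=1.0`) the threshold is 0.5, which is ordinary classification. Raising `cost_ratio` makes the filter more conservative, so fewer legitimate messages are lost at the price of letting more junk through.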
