DataMining Project

  • Status: Closed
  • Budget: $10 - $30 USD
  • Total bids: 7

Project description

There are two sets of Wikipedia articles. The first set consists of Wikipedia featured articles of a certain type and becomes the class Featured. The second set consists of non-featured Wikipedia articles of a similar type to the featured ones and becomes the class Non-Featured.

We are dealing with a binary classification problem. 

To create attributes, extract all possible tokens from the entire dataset after stemming and stop-word removal. Create 1-grams, 2-grams, and 3-grams from these tokens. Use these n-grams as the attributes for the ARFF files.
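
The attribute-creation step above can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the required implementation: the stop-word list is a tiny placeholder, `crude_stem` is a simple suffix stripper standing in for a real stemmer such as Porter's, attributes are binary presence flags (the posting does not say whether presence or counts are wanted), and the helper names and the `__class__` attribute name are hypothetical.

```python
import re

# Tiny illustrative stop-word list (assumption; a real run would use a fuller list).
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "to", "in"}


def crude_stem(token):
    """Strip a few common suffixes. A stand-in for a real stemmer, not Porter's algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text):
    """Lowercase, keep alphabetic runs, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(w) for w in words if w not in STOP_WORDS]


def ngrams(tokens, n):
    """Return the n-grams of a token list, joined with '_', in order of appearance."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def to_arff(docs, n, relation="wiki_articles"):
    """Build ARFF text with one 0/1 presence attribute per n-gram.

    docs is a list of (text, label) pairs; label is 'Featured' or 'Non-Featured'.
    """
    doc_grams = [(set(ngrams(tokenize(text), n)), label) for text, label in docs]
    vocab = sorted(set().union(*(grams for grams, _ in doc_grams)))
    lines = [f"@relation {relation}_{n}gram", ""]
    lines += [f"@attribute '{gram}' {{0,1}}" for gram in vocab]
    # '__class__' cannot collide with an n-gram, since n-grams never start with '_'.
    lines += ["@attribute '__class__' {Featured,Non-Featured}", "", "@data"]
    for grams, label in doc_grams:
        row = ["1" if gram in grams else "0" for gram in vocab]
        lines.append(",".join(row + [label]))
    return "\n".join(lines)
```

Calling `to_arff(docs, 1)`, `to_arff(docs, 2)`, and `to_arff(docs, 3)` on the same list of labeled articles would give one ARFF file per n-gram size, which matches the per-size attribute selection asked for in the next step.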

Perform attribute selection on each of the 1-gram, 2-gram, and 3-gram attribute sets using information gain and gain ratio. Perform classification using a decision tree and naïve Bayes.
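
For reference, the two selection criteria can be written out directly. The sketch below computes information gain, H(class) - H(class | attribute), and gain ratio, which divides that gain by the entropy of the attribute itself. It only illustrates the formulas on nominal values; in WEKA the equivalent evaluators are `InfoGainAttributeEval` and `GainRatioAttributeEval`, and the function names here are hypothetical.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (base 2) of a list of nominal values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())


def info_gain_and_ratio(values, labels):
    """Return (information gain, gain ratio) of one attribute against the class labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Class entropy remaining after splitting on the attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    split_info = entropy(values)  # entropy of the attribute itself
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


def rank_attributes(columns, labels, use_ratio=False):
    """Sort attribute names by gain (or gain ratio), best first.

    columns maps an attribute name to its list of values, one per document.
    """
    index = 1 if use_ratio else 0
    scores = {name: info_gain_and_ratio(vals, labels)[index] for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

An n-gram that appears only in Featured articles gets a high score under both measures, while one that is spread evenly across both classes scores close to zero.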

Write a Wiki report on your findings, including the various statistical evaluation measures given by WEKA for each classifier.
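
As a reading aid for the report, the sketch below shows how a few of the standard measures follow from a binary confusion matrix, treating Featured as the positive class. It is an illustrative assumption about which measures to highlight; WEKA's own output also includes others (for example the kappa statistic and ROC area) that are not reproduced here, and the function name is hypothetical.

```python
def evaluation_measures(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F-measure from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives, with Featured taken as the positive class (assumption).
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }
```

Reporting these per classifier and per n-gram size makes the decision tree and naïve Bayes results directly comparable.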
