Data Mining Project



There are two sets of Wikipedia articles. The first set consists of Wikipedia featured articles of a certain type and forms the class Featured. The second set consists of non-featured Wikipedia articles of a similar type and forms the class Non-Featured. This is therefore a binary classification problem.

To create attributes, extract all tokens from the entire dataset after stemming and stop-word removal. Build 1-grams, 2-grams and 3-grams from these tokens, and use the n-grams as the attributes in the ARFF files.
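As a rough illustration of the attribute-creation step, the sketch below tokenizes text, removes stop words, stems, builds n-grams and writes binary presence attributes in ARFF form. The stop-word list, the `naive_stem` suffix stripper, the `ng_` attribute prefix and the relation name are all assumptions made for this example; an actual run would likely use a full stop-word list and a proper stemmer (for example a Porter stemmer), and might use counts rather than 0/1 values.

```python
import re

# Tiny illustrative stop-word list (assumption); a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "is", "of", "the", "to", "in"}

def naive_stem(token):
    """Crude suffix stripper standing in for a real stemmer such as Porter."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def tokenize(text):
    """Lowercase, keep alphabetic words, drop stop words, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(w) for w in words if w not in STOP_WORDS]

def ngrams(tokens, n):
    """Return the n-grams (joined with '_') in order of appearance."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_arff(docs, n, relation="wiki_articles"):
    """docs is a list of (text, label) pairs; returns ARFF text with 0/1 attributes."""
    per_doc = [(set(ngrams(tokenize(text), n)), label) for text, label in docs]
    vocab = sorted(set().union(*(grams for grams, _ in per_doc)))
    lines = [f"@RELATION {relation}_{n}gram", ""]
    # The 'ng_' prefix keeps an n-gram such as 'class' from clashing with the class attribute.
    lines += [f"@ATTRIBUTE ng_{g} {{0,1}}" for g in vocab]
    lines += ["@ATTRIBUTE class {Featured,Non-Featured}", "", "@DATA"]
    for grams, label in per_doc:
        row = ["1" if g in grams else "0" for g in vocab]
        lines.append(",".join(row + [label]))
    return "\n".join(lines)
```

Calling `build_arff(docs, 1)`, `build_arff(docs, 2)` and `build_arff(docs, 3)` on the same list of labeled articles would give one ARFF text per n-gram size, which keeps the three attribute sets separate for the later selection step.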

Perform attribute selection separately on the 1-gram, 2-gram and 3-gram attribute sets, using information gain and gain ratio. Then perform classification using a decision tree and naïve Bayes.
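To make the two selection criteria concrete, here is a minimal sketch of how information gain and gain ratio can be computed for a single nominal attribute. It follows the standard textbook definitions (class entropy minus conditional entropy, divided by the attribute's split information for the ratio); it is not WEKA's code, and the function names are made up for this example.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())

def info_gain_and_ratio(values, labels):
    """Return (information gain, gain ratio) of one attribute against the class labels."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Conditional entropy H(class | attribute).
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    # Split information is the entropy of the attribute itself.
    split_info = entropy(values)
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio
```

An n-gram that appears in every Featured article and in no Non-Featured article scores a gain of 1 bit on a balanced dataset, while one that appears equally often in both classes scores 0. Ranking the attributes by either score and keeping the top ones is the selection step described above.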

Write a wiki report on your findings, including the statistical evaluation measures that WEKA reports for each classifier.
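For the report, it may help to recall how the common per-class measures relate to a confusion matrix. The sketch below computes accuracy, precision, recall and F-measure from the four cells of a binary confusion matrix, treating Featured as the positive class. The function name and the choice of positive class are assumptions for this example, and WEKA's own output contains further measures not covered here.

```python
def evaluation_measures(tp, fp, fn, tn):
    """Compute basic measures from binary confusion-matrix counts.

    tp: Featured predicted as Featured      fp: Non-Featured predicted as Featured
    fn: Featured predicted as Non-Featured  tn: Non-Featured predicted as Non-Featured
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }
```

Reporting these side by side for the decision tree and naïve Bayes, for each n-gram size and each selection method, gives a compact comparison table for the wiki report.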

Skills: Python

Project ID: #12124371