Machine Learning and Data Mining in Practice 2013
Introduction to the course
The main purpose of this course is to participate in KDD Cup, a well-known data mining competition held in conjunction with KDD-2013, the premier conference on data mining and knowledge discovery.
In the competition, you are expected to get your hands dirty and do data mining on large real-world data sets. This year's task is the 'Author-Paper Identification Challenge', and the competition is hosted by Kaggle. You can find the KDD Cup 2013 competition details there.
Last year, we won the championship in KDD Cup 2012 (link). We hope to get a good result in this year's competition as well!
What you will learn from the course
You will learn how to apply data mining and machine learning techniques to real-world problems. You will cooperate with your teammates, learn new algorithms and techniques, implement them, and test them on the data sets. We hope this will provide an alternative to the Fatworm course project for students interested in these topics: students attending this course will no longer need to work on Fatworm.
Note that we expect you to already have some background in data mining and machine learning. The course is intensive: we expect you to spend more than 40 hours each week on average, and each team will need to present its ideas to the others every week.
Organization of this course
To make effective use of our collective intelligence and become more competitive in KDD Cup, we will:
- Build a private wiki to share the knowledge base (related papers and software).
- Build a platform and code framework to develop and test algorithms. Each student will be assigned a seat in APEXLab, with appropriate computing resources.
- Form independent teams. Independent teams will help discover different algorithms and achieve better performance in the end. Each team will have 3 members (as in Fatworm projects), including 1 team leader. We will merge all the teams and work closely together during the last phase of the contest; our final goal is to win KDD Cup as ONE team. The general rules for submission will be given by the TFs after the contest starts.
- Hold regular meetings. One of our ultimate goals is to win KDD Cup, so we will frequently share experience between teams. We will hold meetings every week, and each team will report its progress at these meetings. It is OK not to use PPT in such meetings.
The final score reflects your contribution to this course, not just the performance of your code. If you implement a model that is not strong by itself but helps other models achieve better results, that is just wonderful. If you devise an algorithm that others find effective, you will also receive credit. The final grades will be given by the teaching fellows (TFs) of the course. Since we expect to meet regularly and work together, the TFs will be very familiar with your contribution.
- Special consultant: Tianqi Chen
- Kailong Chen
- Xuezhi Cao
- Enpeng Yao