{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## April 6 - More Learning - Ensemble and Adaboost\n", "\n", "Mostly chapter 18 and 20 from AIMA" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from learning import *\n", "from notebook import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ENSEMBLE LEARNER\n", "\n", "### Overview\n", "\n", "Ensemble Learning improves the performance of our model by combining several learners. It improves the stability and predictive power of the model. Ensemble methods are meta-algorithms that combine several machine learning techniques into one predictive model in order to decrease variance, bias, or improve predictions. \n", "\n", "\n", "\n", "![ensemble_learner.jpg](images/ensemble_learner.jpg)\n", "\n", "\n", "Some commonly used Ensemble Learning techniques are : \n", "\n", "1. Bagging : Bagging tries to implement similar learners on small sample populations and then takes a mean of all the predictions. It helps us to reduce variance error.\n", "\n", "2. Boosting : Boosting is an iterative technique which adjusts the weight of an observation based on the last classification. If an observation was classified incorrectly, it tries to increase the weight of this observation and vice versa. It helps us to reduce bias error.\n", "\n", "3. Stacking : This is a very interesting way of combining models. Here we use a learner to combine output from different learners. It can either decrease bias or variance error depending on the learners we use.\n", "\n", "### Implementation\n", "\n", "Below mentioned is the implementation of Ensemble Learner." ] }, { "cell_type": "code", "execution_count": 225, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "
\n", "def EnsembleLearner(learners):\n",
" """Given a list of learning algorithms, have them vote."""\n",
" def train(dataset):\n",
" predictors = [learner(dataset) for learner in learners]\n",
"\n",
" def predict(example):\n",
" return mode(predictor(example) for predictor in predictors)\n",
" return predict\n",
" return train\n",
"
\n", "One way to think about this is that you have more to learn from your mistakes than\n", "from your success. If you predict X and are correct, you have nothing to learn.\n", "\n", "\n", "\n", "All the examples start with equal weights and a hypothesis is generated using these examples. Examples which are incorrectly classified, their weights are increased so that they can be classified correctly by the next hypothesis. The examples that are correctly classified, their weights are reduced. This process is repeated K times (here K is an input to the algorithm) and hence, K hypotheses are generated.\n", "\n", "These K hypotheses are also assigned weights according to their performance on the weighted training set. The final ensemble hypothesis is the weighted-majority combination of these K hypotheses.\n", "\n", "The speciality of AdaBoost is that by using weak learners and a sufficiently large *K*, a highly accurate classifier can be learned irrespective of the complexity of the function being learned or the dullness of the hypothesis space.\n", "\n", "### Implementation\n", "\n", "As seen in the previous section, the `PerceptronLearner` does not perform that well on the iris dataset. We'll use perceptron as the learner for the AdaBoost algorithm and try to increase the accuracy. \n", "\n", "Let's first see what AdaBoost is exactly:" ] }, { "cell_type": "code", "execution_count": 230, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "If you predict X and are wrong, you made a mistake. You need to update your model. You have something to learn.\n", " \n", "In the cognitive science world, this is known as failure-driven learning.\n", "It is a powerful notion.\n", "\n", "Think about it. How do you know that you have something to learn? You make a \n", "mistake. You expected one outcome and something else happened.\n", "\n", "Actually, you do not need to make a mistake. Suppose you try something and expect\n", "to fail. If you actually succeed, you still have something to learn because \n", "you had an **expectation failure**.\n", " \n", "
def AdaBoost(L, K):\n",
" """[Figure 18.34]"""\n",
"\n",
" def train(dataset):\n",
" examples, target = dataset.examples, dataset.target\n",
" N = len(examples)\n",
" epsilon = 1/(2*N)\n",
" w = [1/N]*N\n",
" h, z = [], []\n",
" for k in range(K):\n",
" h_k = L(dataset, w)\n",
" h.append(h_k)\n",
" error = sum(weight for example, weight in zip(examples, w)\n",
" if example[target] != h_k(example))\n",
"\n",
" # Avoid divide-by-0 from either 0% or 100% error rates:\n",
" error = clip(error, epsilon, 1 - epsilon)\n",
" for j, example in enumerate(examples):\n",
" if example[target] == h_k(example):\n",
" w[j] *= error/(1 - error)\n",
" w = normalize(w)\n",
" z.append(math.log((1 - error)/error))\n",
" return WeightedMajority(h, z)\n",
" return train\n",
"
def WeightedLearner(unweighted_learner):\n",
" """Given a learner that takes just an unweighted dataset, return\n",
" one that takes also a weight for each example. [p. 749 footnote 14]"""\n",
" def train(dataset, weights):\n",
" return unweighted_learner(replicated_dataset(dataset, weights))\n",
" return train\n",
"
def err_ratio(predict, dataset, examples=None, verbose=0):\n",
" """Return the proportion of the examples that are NOT correctly predicted.\n",
" verbose - 0: No output; 1: Output wrong; 2 (or greater): Output correct"""\n",
" examples = examples or dataset.examples\n",
" if len(examples) == 0:\n",
" return 0.0\n",
" right = 0\n",
" for example in examples:\n",
" desired = example[dataset.target]\n",
" output = predict(dataset.sanitize(example))\n",
" if output == desired:\n",
" right += 1\n",
" if verbose >= 2:\n",
" print(' OK: got {} for {}'.format(desired, example))\n",
" elif verbose:\n",
" print('WRONG: got {}, expected {} for {}'.format(\n",
" output, desired, example))\n",
" return 1 - (right/len(examples))\n",
"