  • Bootstrap Aggregation (bagging). By default a BaggingClassifier samples m training instances with replacement (bootstrap=True), where m is the size of the training set. A sketch of the three bagging-style variants appears after the list below.

    1. Bagged Decision Trees

    2. Random Forest (at each split point, the best split is chosen from a random subset of the features)

    3. Extra Trees (Extremely Randomized Trees). Compared with random forests, this method drops the idea of using bootstrap copies of the learning sample, and instead of trying to find an optimal cut-point for each of the K randomly chosen features at each node, it selects a cut-point at random. This idea is rather productive for problems characterized by a large number of numerical features that vary more or less continuously.
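A minimal sketch comparing the three bagging-style ensembles above. The synthetic dataset and hyperparameter values are illustrative assumptions, not tuned settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              ExtraTreesClassifier)
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

models = {
    # Bagged decision trees: each tree sees a bootstrap sample of m instances
    # drawn with replacement (bootstrap=True).
    "bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                                 bootstrap=True, random_state=42),
    # Random forest: each split considers only a random subset of the features.
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
    # Extra trees: no bootstrap copies by default, and split thresholds are
    # chosen at random rather than optimized.
    "extra_trees": ExtraTreesClassifier(n_estimators=100, random_state=42),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f}")
```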

  • Boosting

    1. AdaBoost was perhaps the first successful boosting ensemble algorithm. It generally works by weighting instances in the dataset by how easy or difficult they are to classify, allowing the algorithm to pay more or less attention to them in the construction of subsequent models (both boosting variants are sketched after this list).

    2. Stochastic Gradient Boosting
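A minimal sketch of the two boosting algorithms above; the data and hyperparameters are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# AdaBoost: reweights instances so later estimators focus on the hard cases.
ada = AdaBoostClassifier(n_estimators=100, random_state=42)

# Stochastic gradient boosting: subsample < 1.0 fits each tree on a random
# fraction of the training instances.
sgb = GradientBoostingClassifier(n_estimators=100, subsample=0.8, random_state=42)

for name, model in [("adaboost", ada), ("stochastic_gb", sgb)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```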

  • Voting (worst performance among these methods)

    • You can create a voting ensemble model for classification using the VotingClassifier class (see the sketch after this list).

    • The predictions of the sub-models can be weighted, but specifying the weights for classifiers manually or even heuristically is difficult. More advanced methods can learn how to best weight the predictions from sub-models; this is called stacking (stacked generalization) and was not provided by scikit-learn at the time of writing (recent releases add a StackingClassifier).
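A minimal sketch of a voting ensemble built with the VotingClassifier class; the choice of sub-models and the equal weights are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

voting = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier()),
        ("svm", SVC(probability=True)),  # probability=True so soft voting can use it
    ],
    voting="soft",       # average predicted probabilities; "hard" uses majority class votes
    weights=[1, 1, 1],   # manual weights are possible, but hard to choose well
)

print(cross_val_score(voting, X, y, cv=5).mean())
```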

  • The BaggingClassifier automatically performs soft voting instead of hard voting if the base classifier can estimate class probabilities (i.e., if it has a predict_proba() method), which is the case with Decision Tree classifiers.
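A small check of that behavior under assumed synthetic data: a tree-based BaggingClassifier exposes predict_proba(), so its predictions are aggregated by averaging class probabilities (soft voting).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                        random_state=0).fit(X, y)

print(hasattr(bag, "predict_proba"))  # True: trees expose class probabilities
print(bag.predict_proba(X[:3]))       # averaged probabilities across the ensemble
```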

  • If your AdaBoost ensemble is overfitting the training set, you can try reducing the number of estimators or more strongly regularizing the base estimator.
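A minimal sketch of those two levers; the depth and estimator count are illustrative assumptions, not recommended values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

ada = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),  # shallow base trees regularize the ensemble
    n_estimators=50,                      # fewer boosting rounds than an aggressive setting
    random_state=42,
)
print(cross_val_score(ada, X, y, cv=5).mean())
```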
