2013-02-22-Gini

1 HW: Gini    slide

  • Calculate Gini Index

2 Gini Index    slide

Gini(D) = 1 - sum(frac**2 for frac in classes)

One minus the sum of the squares of the fraction of items in each class
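
The formula above can be sketched as a small Python 3 function. This is a minimal illustration rather than the course's reference solution; the function name `gini` and the choice to score an empty set as 0 are assumptions made here.

```python
from collections import Counter

def gini(labels):
    """Gini index of a sequence of class labels:
    1 minus the sum of squared class fractions."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumption: treat an empty set as pure
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

print(gini(["a", "a", "b", "b"]))  # two equally sized classes -> 0.5
print(gini(["a", "a", "a"]))       # a single class -> 0.0
```

A score of 0 means every item falls in one class; the score rises toward 1 as items spread evenly over more classes.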

3 Data: Campaign Contributions    slide

  • Calculate the Gini Index for the Candidate Names for the entire data set
  • Partition by zip code, calculate the weighted average Gini Index score over all partitions
  • Partitions are weighted by the number of records they contain divided by the total number of records in the data set
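
The partition-and-weight steps above might look like the following Python 3 sketch. The record layout (pairs of zip code and candidate name), the sample values, and the names `gini_from_counts` and `weighted_gini` are all hypothetical; the real data set has its own fields and parsing.

```python
from collections import Counter, defaultdict

def gini_from_counts(counts):
    """Gini index from a mapping of class -> number of items."""
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumption: treat an empty partition as pure
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

def weighted_gini(records):
    """records: iterable of (zip_code, candidate) pairs (hypothetical layout).
    Returns the weighted average Gini index over the zip code partitions."""
    partitions = defaultdict(Counter)
    for zip_code, candidate in records:
        partitions[zip_code][candidate] += 1
    total = sum(sum(c.values()) for c in partitions.values())
    if total == 0:
        return 0.0
    # Each partition is weighted by its share of all records.
    return sum(sum(c.values()) / total * gini_from_counts(c)
               for c in partitions.values())

# Made-up records, for illustration only.
records = [
    ("00001", "A"), ("00001", "A"),
    ("00002", "A"), ("00002", "B"),
]
print(gini_from_counts(Counter(c for _, c in records)))  # whole data set: 0.375
print(weighted_gini(records))                            # after partitioning: 0.25
```

In this toy data the weighted score after partitioning is lower than the score for the whole set, which is what a useful split looks like.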

4 Extra Credit    slide

  • Find a best split of a continuous field
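
One common way to approach this, offered as a sketch and not necessarily the intended solution: sort the records by the continuous field, try a threshold midway between each pair of adjacent distinct values, and keep the threshold with the lowest weighted Gini index. The function name `best_split`, the input layout, and the sample values are assumptions for illustration.

```python
from collections import Counter

def gini_from_counts(counts, total):
    """Gini index from class counts; zero counts contribute nothing."""
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

def best_split(pairs):
    """pairs: list of (value, label) tuples (hypothetical layout).
    Returns (threshold, weighted_gini) for the best two-way split,
    or (None, None) if the values cannot be split."""
    pairs = sorted(pairs, key=lambda p: p[0])
    n = len(pairs)
    left = Counter()                               # labels at or below the threshold
    right = Counter(label for _, label in pairs)   # labels above the threshold
    best_threshold, best_score = None, None
    for i in range(n - 1):
        value, label = pairs[i]
        left[label] += 1
        right[label] -= 1
        next_value = pairs[i + 1][0]
        if value == next_value:
            continue  # cannot split between equal values
        n_left, n_right = i + 1, n - (i + 1)
        score = (n_left / n * gini_from_counts(left, n_left)
                 + n_right / n * gini_from_counts(right, n_right))
        if best_score is None or score < best_score:
            best_threshold = (value + next_value) / 2
            best_score = score
    return best_threshold, best_score

# Made-up (amount, candidate) pairs, for illustration only.
print(best_split([(10, "A"), (20, "A"), (30, "B"), (40, "B")]))
```

Moving one record at a time from `right` to `left` keeps the scan linear after the sort; ties keep the first threshold found.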

5 Python Tips    slide

  • collections
  • defaultdict autovivifies keys
  • Counter returns a count of 0 for missing keys
zipcodes = defaultdict(Counter)
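
A short sketch of how the nested structure above behaves in Python 3. The zip codes and candidate names are made up for illustration.

```python
from collections import Counter, defaultdict

zipcodes = defaultdict(Counter)

# Hypothetical (zip code, candidate) records.
for zip_code, candidate in [("00001", "A"), ("00001", "A"), ("00002", "B")]:
    # No setup needed: the Counter for a new zip code is created on first access,
    # and a new candidate starts from a count of 0.
    zipcodes[zip_code][candidate] += 1

print(zipcodes["00001"]["A"])      # 2
print(zipcodes["00001"]["B"])      # 0
print("B" in zipcodes["00001"])    # False: reading a missing Counter key does not store it
```

The two types differ on reads: looking up a missing key in the `defaultdict` stores a new empty `Counter`, while looking up a missing key in a `Counter` only returns 0.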

6 Git Tips    slide

$ git checkout master
$ git pull jblomo master
$ git checkout -b hw-gini

Date: 2013-02-23 16:11:38 PST

Author: Jim Blomo
