2013-02-22-Gini
1 HW: Gini
- Calculate Gini Index
2 Gini Index
Gini(D) = 1 - sum(frac**2 for frac in classes)
The sum is over the squares of the fraction of items in each class, so frac is one class's share of the items in D
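The formula above can be sketched in Python roughly as follows. This is a minimal sketch, assuming the items arrive as a plain list of class labels; the function name `gini` and the choice to score an empty list as 0.0 are assumptions, not part of the assignment.

```python
from collections import Counter

def gini(labels):
    """Return 1 minus the sum of squared class fractions for a list of labels."""
    total = len(labels)
    if total == 0:
        return 0.0  # assumption: an empty set is treated as perfectly pure
    counts = Counter(labels)  # label -> number of items with that label
    # float() keeps the division exact-ish under Python 2 as well as Python 3
    return 1.0 - sum((float(n) / total) ** 2 for n in counts.values())
```

A list with a single class scores 0.0, and two equally sized classes score 0.5.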
3 Data: Campaign Contributions
- Calculate the Gini Index of the candidate names for the entire data set
- Partition by zip code, then calculate the weighted average Gini Index score over all partitions
- Partitions are weighted by the number of records they contain divided by the total number of records in the data set
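One way the partition-and-weight step might look is sketched below. It assumes the records have already been reduced to (zip code, candidate name) pairs; that layout, the helper names `gini_from_counts` and `weighted_gini`, and the sample labels in the usage note are all hypothetical.

```python
from collections import Counter, defaultdict

def gini_from_counts(counts):
    """Gini Index from a mapping of class -> number of records."""
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumption: an empty partition is treated as pure
    return 1.0 - sum((float(n) / total) ** 2 for n in counts.values())

def weighted_gini(records):
    """Weighted average Gini over partitions.

    records: iterable of (zip_code, candidate) pairs (hypothetical layout).
    Each partition's weight is its record count divided by the total count.
    """
    partitions = defaultdict(Counter)  # zip_code -> Counter of candidates
    for zip_code, candidate in records:
        partitions[zip_code][candidate] += 1
    total = sum(sum(c.values()) for c in partitions.values())
    if total == 0:
        return 0.0
    return sum(
        float(sum(c.values())) / total * gini_from_counts(c)
        for c in partitions.values()
    )
```

For example, with four records in one zip code split evenly between two candidates (Gini 0.5, weight 4/6) and two records in another zip code for a single candidate (Gini 0, weight 2/6), the weighted average is 1/3.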
4 Extra Credit
- Find a best split of a continuous field
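The notes do not say how the split should be found. The sketch below shows one common approach, labeled here as an assumption: sort by the continuous field, try the midpoint between each pair of consecutive distinct values as a threshold, and keep the threshold with the lowest weighted Gini Index of the two sides. The names `gini` and `best_split` and the (value, label) input layout are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Return 1 minus the sum of squared class fractions for a list of labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((float(n) / total) ** 2 for n in Counter(labels).values())

def best_split(pairs):
    """Find a threshold on a continuous field that minimizes weighted Gini.

    pairs: list of (value, label) tuples (hypothetical layout).
    Returns (threshold, score); threshold is None if no split improves on
    the unsplit data.
    """
    pairs = sorted(pairs, key=lambda p: p[0])
    values = [v for v, _ in pairs]
    labels = [label for _, label in pairs]
    n = len(pairs)
    best = (None, gini(labels))  # baseline: no split at all
    for i in range(1, n):
        if values[i] == values[i - 1]:
            continue  # no threshold can separate equal values
        threshold = (values[i - 1] + values[i]) / 2.0
        left, right = labels[:i], labels[i:]  # value < threshold, value > threshold
        score = (len(left) * gini(left) + len(right) * gini(right)) / float(n)
        if score < best[1]:
            best = (threshold, score)
    return best
```

This version recomputes the Gini Index for every candidate threshold, which is quadratic in the number of records; keeping running class counts while scanning would be faster but is left out to keep the sketch short.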
5 Python Tips
- collections
  - defaultdict: autovivifies keys
  - Counter: missing keys count as 0, so counts can be incremented without initializing them
zipcodes = defaultdict(Counter)
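A short illustration of how that nested structure behaves, using made-up rows; the row layout and the zip code and candidate labels are placeholders, not values from the data set.

```python
from collections import Counter, defaultdict

zipcodes = defaultdict(Counter)  # zip code -> Counter of candidate names

# Made-up sample rows: (zip_code, candidate)
rows = [("Z1", "A"), ("Z1", "B"), ("Z1", "A"), ("Z2", "B")]

for zip_code, candidate in rows:
    # defaultdict creates the Counter on first access to a new zip code;
    # Counter treats a missing candidate as 0, so += 1 just works.
    zipcodes[zip_code][candidate] += 1

count_a_in_z1 = zipcodes["Z1"]["A"]  # 2
count_a_in_z2 = zipcodes["Z2"]["A"]  # 0, and "A" is not stored in the Counter
```

Note that reading `zipcodes[some_new_zip]` does insert an empty Counter for that key, while reading a missing candidate from a Counter does not insert anything.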
6 Git Tips
$ git checkout master
$ git pull jblomo master
$ git checkout -b hw-gini