2013-05-03-Visualization

Table of Contents

1 Visualization in Data Mining    slide

2 Your Brain    slide two_col

  • Pattern detector
  • Visualizations help you search for possible models
  • Help intuitively understand the data

img/memory-recall.png

2.1 Visual    notes

  • For most people, vision is the strongest sense
  • Recall improves by 55 percentage points (10% => 65%) with the addition of a picture
  • We've talked about the need to understand the data before using algorithms on it. Visualization can speed that process up.

3 Patterns    slide

  • Use visualizations that surface patterns and relationships
  • Know the context for the visualization
  • Verify results

3.1 Steps    notes

  • For gaining intuition, focus on simple visualizations that help you see relationships in the data.
  • At this stage, labels, titles, etc. are not very important. Multiple dimensions in multiple windows? Fine!
  • We'll discuss this more, but the context in which a visualization will be used matters a lot. Don't feel like you have to import every cool infographic into your project.
  • Clustering, classification, and outlier selection can be verified visually, e.g. by highlighting points. Use it to gut-check conclusions, even if you have to drastically reduce dimensionality.

4 Scatter    slide

  • Great for multidimensional data
  • For more than 2 dimensions, just plot pairs of dimensions in different plots
  • Reveals correlation, clustering, distribution, …

4.1 Data Mining    notes

  • DM bread and butter. We often deal with high dimensionality, so scatter plots are one of the best ways to visualize
  • A wide variety of patterns can be searched for
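
As a minimal sketch of the "different plots" idea, the pairs of dimensions to plot can be enumerated, and each pair given a quick correlation score to decide which scatter plots to look at first. The dimension names and values below are made up for illustration, and the helper `pearson` is only one possible screening measure.

```python
from itertools import combinations
from math import sqrt

# Hypothetical data set: each key is a dimension, each list a column of values.
data = {
    "dim_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "dim_b": [2.0, 4.0, 6.0, 8.0, 10.0],
    "dim_c": [5.0, 3.0, 4.0, 1.0, 2.0],
}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length columns."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# One scatter plot per unordered pair of dimensions: d * (d - 1) / 2 plots.
pairs = list(combinations(data, 2))
for a, b in pairs:
    print(f"{a} vs {b}: r = {pearson(data[a], data[b]):+.2f}")
```

A correlation score only captures linear relationships, so it is a way to order the plots, not a replacement for looking at them.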

4.2 Multiple Dimensions    slide center

img/vp-sample.png

4.3 vp    notes

  • This data is for body positions over time
  • Dimensions are the angles of different body parts, like hip, ankle, knee, over time
  • We can see some strong patterns. Maybe we'll need to kernelize them to make them learnable, but we get a good understanding of whether or not there are relationships in the data

5 Geographic    slide

img/cancer-county.jpg

5.1 Trade-offs    notes

  • Coordinates intuitively understandable
  • Lots of ways to bucket/aggregate
  • The display depends on geographical area (e.g. when you'd like it to depend on human impact instead)
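
One way to bucket geographic data, sketched here under the assumption that observations arrive as (latitude, longitude) pairs, is to snap each point to a fixed-size grid cell and count per cell. The points and the function names `grid_cell` and `bucket_counts` are hypothetical.

```python
from collections import Counter
from math import floor

# Hypothetical (latitude, longitude) observations.
points = [
    (37.77, -122.42),
    (37.80, -122.27),
    (37.34, -121.89),
    (40.71, -74.01),
    (40.73, -73.99),
]

def grid_cell(lat, lon, cell_size=1.0):
    """Return the south-west corner of the grid cell containing the point."""
    return (floor(lat / cell_size) * cell_size,
            floor(lon / cell_size) * cell_size)

def bucket_counts(points, cell_size=1.0):
    """Count observations per grid cell."""
    return Counter(grid_cell(lat, lon, cell_size) for lat, lon in points)

counts = bucket_counts(points)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Note that the last two points are close together but fall into different cells. The choice of bucket boundaries changes the picture, which is part of the trade-off.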

6 Other Chart Types    slide

Box plot
aggregate data
Bar charts
simple summaries
Pie charts
compound proportions

6.1 Types    notes

  • Box plots still carry a lot of information about real data
  • Bar charts are nice for summarizing, not great for exploring
  • Same for pie charts. Pie charts are mostly bad, but can be used in particular circumstances
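
To make the box plot point concrete: a box plot draws a five-number summary plus any points flagged as outliers. The sketch below computes both for a made-up sample. The 1.5 * IQR rule is the common whisker convention, assumed here rather than taken from the slides.

```python
from statistics import quantiles

def five_number_summary(values):
    """Minimum, lower quartile, median, upper quartile, maximum."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

def outliers(values, k=1.5):
    """Values more than k * IQR outside the quartiles."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]

# Hypothetical sample with one extreme value.
sample = [2, 4, 4, 5, 6, 7, 8, 9, 30]
summary = five_number_summary(sample)
flagged = outliers(sample)
print(summary, flagged)
```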

7 Aesthetics    slide

  • The visual aesthetics you use should be tied to the data

img/graphics-aesthetics.png

7.1 Aesthetics    notes

7.2 Larger Value?    slide

  • Position
  • Length / Angle
  • Area / Volume
  • Color: Chroma, Luminance

7.3 Slide Switch    notes

  • Hadley Wickham slides, OSCON

8 Color: HCL    slide two_col

Hue
color type, relative to RGBY
Chroma
colorfulness, perceived color intensity
Luminance
brightness, light-dark

img/Munsell.png
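
A rough sketch of building a sequential palette by holding hue fixed and varying only lightness. Python's standard library has no HCL conversion, so this uses the HLS space from `colorsys` as a stand-in. HLS is not perceptually uniform, so treat the result as an illustration of the idea, not as a proper HCL palette. The function name and default values are assumptions.

```python
import colorsys

def sequential_palette(hue, steps, saturation=0.6):
    """Return hex colors with a fixed hue, going from light to dark.

    `steps` must be at least 2.
    """
    colors = []
    for i in range(steps):
        # Lightness runs from 0.9 (light) down to 0.3 (dark).
        lightness = 0.9 - 0.6 * i / (steps - 1)
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors

palette = sequential_palette(hue=0.6, steps=5)
print(palette)
```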

8.1 Color Spaces    notes

8.2 ColorBrewer    slide

9 Careful    slide

9.1 Line Lengths    notes

  • Line lengths can look smaller when the lines are extended instead of placed right next to each other

9.2 Careful    slide

10 Grammar of Graphics    slide

Geom
Graphic element
Aesthetics
appearance of a geom
Data
raw, context, statistical aggregations of data
Mapping
functions which map data to geom properties or aesthetics

10.1 Bringing Together    notes

  • We've talked about different aesthetics for showing data, and we've talked about data. All that's needed is to bring them together.
  • Wilkinson, L. (2005), The Grammar of Graphics (2nd ed.). Statistics and Computing, New York: Springer.
  • Rigorous way of describing graphics beyond "scatter plot" or "bar chart"
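
The four terms above can be sketched as a small data structure. This is an illustrative toy, not Wilkinson's formalism or any library's API: the class name `Plot`, its `render` method, and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Plot:
    geom: str                                      # graphic element, e.g. "point"
    data: List[dict]                               # raw records
    mapping: Dict[str, Callable[[dict], object]]   # aesthetic -> function of a record

    def render(self):
        """Apply every mapping to every record, giving one geom spec per record."""
        return [
            {"geom": self.geom,
             **{aes: fn(row) for aes, fn in self.mapping.items()}}
            for row in self.data
        ]

# Hypothetical records and mappings.
scatter = Plot(
    geom="point",
    data=[{"temp": 20, "sales": 200}, {"temp": 30, "sales": 500}],
    mapping={"x": lambda r: r["temp"], "y": lambda r: r["sales"]},
)
specs = scatter.render()
print(specs)
```

Changing only `geom` or only `mapping` yields a different chart from the same data, which is the sense in which the grammar goes beyond named chart types.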

11 Scatter Plot    slide animate

img/scatter-ice-cream.gif

  • Geoms?
    • points, tick marks
  • Data?
    • temperature, sales
  • Mapping?
    • sales -> y, temp -> x
    • Note: this is not a simple 1:1 mapping. We must map the values to something visual, like pixels
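
The data-to-pixel mapping can be sketched as a linear scale. The axis ranges and plot size below are assumptions chosen for illustration, and `linear_scale` is a hypothetical helper.

```python
def linear_scale(domain, pixel_range):
    """Return a function mapping a data value in `domain` to a pixel position."""
    d0, d1 = domain
    p0, p1 = pixel_range

    def scale(value):
        return p0 + (value - d0) * (p1 - p0) / (d1 - d0)

    return scale

# Hypothetical axes: temperature 10..30 across a 400 px wide plot,
# sales 0..600 on a 300 px tall plot. Pixel y grows downward on most
# screens, so the y range is flipped.
x_scale = linear_scale((10, 30), (0, 400))
y_scale = linear_scale((0, 600), (300, 0))

print(x_scale(20), y_scale(200))
```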

11.1 Ice Cream    notes

12 Bar Plot    slide animate

img/bar-graph-fruit.gif

  • Geoms?
    • rectangles (ticks, text)
  • Data?
    • Fruit to popularity
  • Mapping?
    • popularity -> height, fruit type -> x, color
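
The same mapping can be sketched in text, with bar length standing in for bar height. The counts below are invented and do not reproduce the figure. `text_bars` is a hypothetical helper.

```python
# Hypothetical popularity counts per fruit.
popularity = {"apple": 35, "orange": 30, "banana": 10}

def text_bars(counts, width=20):
    """Map each count to a bar whose length is proportional to the count."""
    largest = max(counts.values())
    lines = []
    for label, count in counts.items():
        bar = "#" * round(count / largest * width)
        lines.append(f"{label:<8}{bar} {count}")
    return lines

lines = text_bars(popularity)
print("\n".join(lines))
```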

12.1 Fruit    notes

13 Hipmunk    slide

img/hipmonk.png

  • Geoms?
    • rectangles, text, ticks
  • Data?
    • Carrier, flight time, layover time, cost, wifi available, airports
  • Mapping?
    • travel time -> bar length, flight times -> sub-bars, "agony" -> y, airline -> color

13.1 Hipmunk    notes

  • Shows travel options from SFO to Ithaca, connecting flights, airports, etc.
  • More complex, but still expressible via Grammar
  • img: http://www.hipmunk.com

14 Recursive    slide

img/grammar-af.png

  • Geoms?

14.1 Complex    notes

  • The reading will go into a further extension of this, where the geoms are themselves other plots

15 Tufte    slide

  • Clarity from data
  • Avoid chart junk
  • Techniques for displaying many types

img/tufte-books.jpg

15.1 Tufte    notes

  • No talk on visualization would be complete without mentioning Tufte
  • Great examples

16 Break    slide

Date: 2013-05-03 12:26:13 PDT

Author: Jim Blomo
