

1 Visualizing Data at Yelp    slide center

2 Visualizing Data is Important    slide

  • Effectively summarizes data
  • Highlights patterns
  • Improves recall

2.1 Metrics    notes

  • Can't improve something until you measure it
  • True, but you have to look at and understand the data!
  • Often the best way to understand data is visually
  • Keeping the metrics you care about visible makes you focus on improving them
  • The more sophisticated your visualizations, the more sophisticated your goals

3 Visualizing Data is Difficult    slide

  • Requires investment
  • Dimensions of success
  • Successful visualizations in Yelp

3.1 Role in Yelp    notes

  • Often requires specific domain knowledge of both the data and the tools
  • Moving to a new office
  • Ideally have 2 big screens per pod
  • "That's a lot of TVs!"
  • Get motivated every day
  • Show what you care about
  • Don't want a sterile office, decorate with the results of your work

4 Birth of a City    slide

4.1 Review activity over time on a map    notes

  • Written as part of Yelp's quarterly Hackathons
  • A lot of feedback from our Community Managers on understanding their city
  • Demonstrable value to advertisers
  • But main feature is… Cool

5 Cool    slide

  • Good looking is a dimension of any visualization
  • We want to be Neo or John Anderton, not Milton


5.1 Cool is OK    notes

  • Engineers need to come to grips with the fact that a visualization has to look nice to be visually compelling
  • Just like the most compelling novels need to be well written
  • We realize this, we just don't like to admit it

6 Avoid Chart Junk    slide two_col

  • Edward Tufte rightfully suspicious of cool
  • Worry about data/ink ratio
  • But remember tradeoffs: memorability, fun


6.1 Useful Junk?    notes

  • Data/ink ratio describes the share of a chart's ink (or pixels) that displays information
  • If you remove a pixel, will you remove information?
  • Best paper by Scott Bateman et al. (HCI): some chart junk is useful
  • Noted the context of the chart
  • Bad ratio limits richness, especially important on mobile
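
The ratio in these notes can be sketched as a small calculation. This is an illustration only: the function name `dataInkRatio` and the pixel counts are made up, and counting "data pixels" in a real chart is a judgment call.

```javascript
// Tufte's data-ink ratio: ink that encodes data divided by all ink used.
// The function name and the pixel counts are hypothetical illustrations.
function dataInkRatio(dataPixels, totalPixels) {
  if (totalPixels <= 0) {
    throw new RangeError("totalPixels must be positive");
  }
  return dataPixels / totalPixels;
}

// Same data pixels, but the second chart adds heavy decoration.
const plain = dataInkRatio(600, 800);
const decorated = dataInkRatio(600, 2400);
console.log(plain, decorated);
```

A lower ratio is not automatically worse: as the slide says, memorability and fun are tradeoffs against it.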

7 Grapperr    slide

7.1 Shows errors live from log    notes

  • Error activity
  • Highlight error type UnicodeDecodeError
  • Text details available
  • Still Cool!
    • Colors slick, modern
    • But used for differentiation (data)

7.2 Grapperr Snapshot    slide


8 Actionable    slide two_col

  • Realtime*
  • Context
  • Connections


8.1 Definitions    notes

  • As realtime as problem domain requires
    • Seconds matter when fixing site problems, so should be up to the second
    • Days or weeks might matter when deciding budget issues
  • Context: Is this a normal amount of errors?
  • Connections: Ability to drill down to specific instance
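
One way to give an error count context is to compare it against recent history. The sketch below is a minimal example under assumptions: the helper `isAbnormal`, the threshold of three standard deviations, and the sample counts are all hypothetical, not how any Yelp tool works.

```javascript
// Hypothetical helper: flags a count as abnormal when it exceeds the
// historical mean by more than `k` standard deviations.
// Assumes `history` is a non-empty array of numbers.
function isAbnormal(current, history, k = 3) {
  const mean = history.reduce((a, b) => a + b, 0) / history.length;
  const variance =
    history.reduce((a, b) => a + (b - mean) ** 2, 0) / history.length;
  const threshold = mean + k * Math.sqrt(variance);
  return current > threshold;
}

// Made-up errors-per-minute samples, for illustration only.
const recentCounts = [10, 12, 9, 11, 10, 8];
console.log(isAbnormal(11, recentCounts));
console.log(isAbnormal(40, recentCounts));
```

Drawing the same threshold on the chart lets a viewer answer "is this normal?" at a glance instead of from memory.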

9 Dimensions    slide

cool, pretty, engaging
realtime, contextual, connecting

9.1 Agenda    notes

  • Dimensions important to visualizations
  • Axes on which you can evaluate them
  • Tradeoffs in developing them

10 A Tale of Two Datacenters    slide

  • Testing datacenter failover
  • Tracking metrics in a new way
  • Did we spend a week preparing a dashboard?

10.1 How?    notes

  • Yelp used to be in only one datacenter
  • Moving to two datacenters is a huge undertaking, but worth it for reliability reasons
  • Don't want to bring down a worldwide site when freak electrical storms hit your datacenter
  • After months of work, how did we watch over our site when we finally flipped the switch?
  • This was the first time Yelp had done this: we didn't have a premade dashboard so everyone could track the important metrics

11 Firefly    slide

Github: Yelp/firefly

11.1 Demo    notes

  • One of our many open source projects
  • Hosted on Github
  • Existing extension to Ganglia

12 Easy    slide

  • Make repeated operations fast and within reach
  • Must understand problem domain
  • Accessible

12.1 Definitions    notes

  • Sophisticated Tool: Data discovery, stacking options, coloring, layout
  • But all of the steps are repeated, formulaic: we're making similar things over and over
  • So make it easy!
  • Not much is more accessible than the Web: share links, etc.

13 Easy from Simple    slide

  • Avoid temptation to make visualizations easy from the start
  • Easy systems are designed for non-experts
  • Long term investment in the system to manage complexity

13.1 Non-experts    notes

  • Simple Made Easy, Rich Hickey
  • Still potentially technical users
  • Just don't know the details of how metrics are collected, or how to display across browsers
  • Will always require experts to make changes
  • Users are always going to want new features
  • Make sure you have the ability to add them
  • Not extensible

14 Search Maps    slide

img/yelp-beer.png Mo' Map

14.1 Times Change    notes

  • 2005, 8 years ago
  • May not seem like an important visualization, but times have changed
  • Full page refresh for each map square
  • Now we take zooming and panning for granted
  • Sign of a great visualization: don't think about it: it's a tool
  • What else are we not plotting on maps that we should be?

15 Interactive    slide two_col

  • Fast
  • Explorable
  • Feedback


15.1 Definitions    notes

  • Fast: one of the reasons it's a fairly recent technology is that it is hard to make fast
    • Speed gives the UI illusion that you are interacting with a physical thing, something we're much more comfortable with
  • Explorable: multiple levels of detail that can be discovered by the user
  • Feedback: update all other dependent displays (search results)

16 Creation    slide

  • Michael Bostock had a problem
  • Protovis useful, but not flexible
  • How to provide coherent description for visualizing data?

16.1 D3 Intro    notes

  • Mike Bostock was a graduate student at Stanford
  • Protovis was a declarative JavaScript charting library
  • But hard to keep up with changes in technology
  • Wasn't quite flexible enough for new visualizations

17 D3: Data-Driven Documents    slide center

18 Flexible    slide

  • Language level
  • Access to medium
  • Access to data
  d3.selectAll("p")
      .data([4, 8, 15, 16, 23, 42])
      .style("font-size", function(d) { return d + "px"; });

18.1 Why?    notes

  • Metaphor: natural language
  • General language is the most flexible tool humans have to describe new things
  • Full access to the medium, to be able to take advantage of all its possibilities
    • and new tech
  • Not D3 specific, but need full data to find new ways to summarize, explore, drill
  • Need to understand where data came from to clean, normalize
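
The data-driven idea behind the slide's D3 fragment can be sketched without a browser: each datum is passed through an ordinary function to produce a style value. This is plain JavaScript, not D3 itself, and the helper name `fontSizes` is hypothetical.

```javascript
// Hypothetical helper, not part of D3. It mimics, without a DOM, how a
// data-driven style maps each bound datum to a CSS value.
function fontSizes(data) {
  return data.map(function (d) { return d + "px"; });
}

const sizes = fontSizes([4, 8, 15, 16, 23, 42]);
console.log(sizes.join(", "));
```

Because the mapping is a general-purpose function, any expression the language allows can drive the display, which is the "language level" flexibility the slide refers to.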

19 Dimensions    slide

cool, pretty, engaging
realtime, contextual, connecting
available for non-experts, remove repetition
fast, explorable
expressive, full access to lowest level

19.1 Tension    notes

  • Obvious: Flexible vs Easy. Too many options is confusing.
  • Less obvious: Interactive vs Actionable. Spend too long playing, not enough fixing
  • In fact: All in contention for your time

20 Understand Usage Context    slide

20.1 Press: Fun    slide


20.2 Alerting: Actionable    slide


20.3 Search Metrics    notes

  • This is a visualization of the status of our search cluster

20.4 Product Managers: Easy    slide


20.5 Investigation: Interactive    slide

img/ipy_0.13.png img/IPy_header.png

20.6 Explorable: Interactive    slide


20.7 Another Case    notes

  • Another case for Interactivity is geographical data

20.8 New tools: Simple    slide


20.9 New tools: Flexible    slide


20.10 Unique    notes

  • You can see this is not a standard visualization
  • It is one that is customized to its purpose
  • Made possible by flexible tools

21 Dimensions    slide

cool, pretty, engaging
realtime, contextual, connecting
available for non-experts, remove repetition
fast, explorable
expressive, full access to lowest level

21.1 Consider Tradeoffs    notes

  • Visualization is just part of making an effective biz, team
  • Interested in working at Yelp?

Date: 2013-05-03 09:46:19 PDT

Author: Jim Blomo

