2013-02-01-Lab

1 Lab: Obtain and Explore Data    slide

  • Set up a GitHub account
  • Find a data set or external API
  • Superficially examine it
  • Summarize findings
  • Submit assignment via GitHub

2 Why GitHub?    slide

  • The git tool is an industry standard
  • GitHub provides the best tools for sharing and commenting on code
  • This assignment involves no code; it is just practice with submitting

3 Set up GitHub account    slide

4 Set up git repository on ischool server    slide

  • On the server ischool.berkeley.edu
$ git clone git://github.com/jblomo/datamining290.git
  • On the server, in the datamining290 directory run
$ git remote rename origin jblomo

5 Connect it to GitHub    slide

  • After you receive your free micro account on GitHub, create a private repository called datamining290
  • GitHub will provide you with an SSH git path; let's call it PATH
  • You must use the SSH PATH, which starts with git@github.com: (not the read-only path starting with git://)
  • On the server, in the datamining290 directory, run
$ git remote add origin PATH
$ git push origin master
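
The clone, rename, and push steps from the last two slides can be rehearsed offline. This is a minimal sketch, not the class setup itself: two local directories are hypothetical stand-ins for the instructor's repository and for your private GitHub repository (PATH), and the branch name is read from git rather than assumed to be master.

```shell
#!/bin/sh
# Sketch only: local directories stand in for the course repo and for GitHub.
set -e
work=$(mktemp -d)

# Hypothetical stand-in for the instructor's repository.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.email "student@example.com"   # placeholder identity
git config user.name "Student"
echo "course notes" > README
git add README
git commit -q -m "Initial commit"
branch=$(git rev-parse --abbrev-ref HEAD)     # master or main, depending on git config

# Hypothetical stand-in for your private GitHub repository (PATH in the slides).
git init -q --bare "$work/private.git"

# The steps from the slides: clone, rename the remote, add your own origin, push.
git clone -q "$work/upstream" "$work/datamining290"
cd "$work/datamining290"
git remote rename origin jblomo
git remote add origin "$work/private.git"
git push -q origin "$branch"

# Both remotes should now be listed: jblomo (read from) and origin (push to).
git remote -v
```

After these steps, jblomo points at the course repository, where new material comes from, and origin points at your own copy, where your work is pushed.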

6 Share with us    slide

  • Hopefully you now have a private copy of my repository
  • Add Shreyas and me (users: seekshreyas, jblomo) as collaborators on your private repository

7 Obtain Data    slide

  • Look through the links in the slides for interesting data sets, or find your own
  • Or find a service API, like NYTimes
  • Explore the data available to answer the following questions

8 Questions    slide

  • What are the types of data available to you?
  • For data sets: how many records are in the data set?
  • For APIs: what are the limits on fetching data?
  • Provide an "interesting" record; explain its properties and why it is interesting
  • What are 3 questions you could answer using your data?
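
For a data set stored as a CSV file, a few standard shell tools may be enough for a first look. This is a minimal sketch: the file contents and column names below are made up for illustration and do not come from any of the linked data sets.

```shell
#!/bin/sh
# Sketch only: the sample data and its columns are invented for illustration.
set -e
data=$(mktemp)
cat > "$data" <<'EOF'
id,city,temperature_f
1,Berkeley,58
2,Oakland,61
3,Fresno,104
EOF

# What types of data are available? The header row names the fields.
fields=$(head -n 1 "$data")
echo "fields: $fields"

# How many records? Count the lines, excluding the header.
records=$(tail -n +2 "$data" | wc -l | tr -d ' ')
echo "records: $records"

# A candidate "interesting" record: the row with the highest temperature.
hottest=$(tail -n +2 "$data" | sort -t, -k3,3 -n -r | head -n 1)
echo "interesting: $hottest"
```

The same three commands (head, wc, sort) work on a real file by pointing data at its path, provided the file has one record per line and a header row.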

9 Submit Homework    slide

  • On the ischool server, create a branch called hw-obtain-data
  • Create a text file for your solution; pico is a simple editor to use
  • git add the file
  • git commit the change
  • git push origin hw-obtain-data to put it on GitHub
  • On GitHub, submit a "pull request" from the hw-obtain-data branch to your master branch
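
The command-line part of the submission can be sketched as follows. Assumptions: a local bare repository stands in for GitHub so the sketch runs without network access, and the file name obtain-data.txt is hypothetical. The pull request itself is created on the GitHub web site and is not shown.

```shell
#!/bin/sh
# Sketch only: a local bare repository stands in for your GitHub repository.
set -e
work=$(mktemp -d)
git init -q --bare "$work/github.git"          # hypothetical stand-in for GitHub
git init -q "$work/datamining290"
cd "$work/datamining290"
git config user.email "student@example.com"    # placeholder identity
git config user.name "Student"
git remote add origin "$work/github.git"
git commit -q --allow-empty -m "Starting point"

# The homework steps from the slide.
git checkout -q -b hw-obtain-data              # create and switch to the branch
echo "Answers to the lab questions go here." > obtain-data.txt   # hypothetical file name
git add obtain-data.txt
git commit -q -m "Add obtain-data homework"
git push -q origin hw-obtain-data

# Confirm the branch now exists on the remote.
pushed=$(git ls-remote --heads origin hw-obtain-data | wc -l | tr -d ' ')
echo "branches pushed: $pushed"
```

On the ischool server, only the five commands under "The homework steps" are needed, since the repository and the origin remote already exist there.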

9.1 Pull Requests    notes

  • Pull requests present your updates in a form that lets me comment on them and receive notifications
  • This is the first time I've tried pull requests for a class, so you're on the cutting edge. Hopefully it will work; give me feedback if it does not

10 Going Forward    slide

  • Other homework assignments will involve completing code
  • General workflow:
    • Start a new branch
    • Add required files
    • Push to GitHub
    • Submit a pull request

Date: 2013-02-01 17:25:49 PST

Author: Jim Blomo
