Rational Girl

Attempting to be rational while dreaming of 3.141592653589793...

Pandas: create dataframe

I often need to extract data from disparate sources and combine it into a CSV or Excel file for others to work with. Of course I fall back on pandas, as it is a data-wrangling guru...

To create an empty dataframe:

import numpy as np
import pandas
columns = ['some', 'column', 'headers']
index = np.arange(103) # array of numbers for the number of samples
df = pandas.DataFrame(columns=columns, index = index)

Now I can populate my new data frame:

myarray = np.random.random((10,3))
# .ix was removed in pandas 1.0; .loc assigns by index label instead
for row, values in enumerate(myarray):
    df.loc[row] = values
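
When the whole array is already in hand, the row-by-row loop can be skipped. A minimal sketch, assuming the same column headers as above; the name `df_direct` is introduced here only for illustration. Note that this frame has exactly as many rows as the array (10), not the 103 pre-allocated above.

```python
import numpy as np
import pandas

columns = ['some', 'column', 'headers']
myarray = np.random.random((10, 3))

# Build the frame in one step: one row per array row, one column per header.
df_direct = pandas.DataFrame(myarray, columns=columns)

print(df_direct.shape)
print(df_direct.head())
```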

If my data is already in a list of dicts, it is even easier:

mydata = [{'subid' : 'B14-111', 'age': 75, 'fdg':1.78},
          {'subid' : 'B14-112', 'age': 22, 'fdg':1.56},]
df = pandas.DataFrame(mydata)
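
Since the goal is usually a CSV file for someone else, here is a rough sketch of the export step. It renders the CSV as a string so nothing is written to disk; the file name `subjects.csv` in the comment is a made-up example, not something from my actual data.

```python
import pandas

mydata = [{'subid': 'B14-111', 'age': 75, 'fdg': 1.78},
          {'subid': 'B14-112', 'age': 22, 'fdg': 1.56}]
df = pandas.DataFrame(mydata)

# With no path given, to_csv returns the CSV text as a string.
# index=False leaves out the 0, 1, ... row labels.
csv_text = df.to_csv(index=False)
print(csv_text)

# To write a file instead (hypothetical file name):
# df.to_csv('subjects.csv', index=False)
```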

Award IPython

Kudos to the IPython team (esp. Fernando) for a well-deserved award.

Free Software Award

While I was at PyData, the percentage of talks that used the IPython notebook was at 98%. For the other talks I saw, the ones not done in IPython, I remember feeling a bit sad.

For a great collection of notebooks, including my favorite, XKCD plots in matplotlib, check out http://nbviewer.ipython.org/

pydata 2013

Peter Norvig - Learning Python

I attended this amazing talk at pydata2013

Learning Python by Peter Norvig (vimeo)

Here are a few of the key points I noted.

Major errors: the hard parts of learning Python, and common mistakes

  • create list where each item is a reference
    • change one -> change all
  • setting value vs checking equality
    • x = 1 vs x == 1
  • understanding reference vs. copy:

a = 1
b = a
a = 2

# what is b?

Compare that with this behavior:

a = [1, 2, 3, ]
b = a
a[0] = 9

b[0]  # what is b[0]?

  • implicit operators ( x + 1 )( x * 2 )
    • understanding why this fails: Python has no implicit multiplication, so this tries to call the result of x + 1 as a function
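
My own sketch of the pitfalls in the list above (not code from the talk). The names `c`, `grid`, `fresh`, and `call_failed` are just for illustration.

```python
# Rebinding a name: b still points at the original object.
a = 1
b = a
a = 2
print(b)  # 1

# Mutating a shared list: both names see the change.
a = [1, 2, 3]
c = list(a)   # an explicit (shallow) copy, made before the change
b = a         # a second name for the same list
a[0] = 9
print(b[0])  # 9
print(c[0])  # 1

# List where each item is a reference: change one -> change all.
grid = [[0] * 2] * 3
grid[0][0] = 1
print(grid)  # [[1, 0], [1, 0], [1, 0]]

# A comprehension builds a separate inner list for every row.
fresh = [[0] * 2 for _ in range(3)]
fresh[0][0] = 1
print(fresh)  # [[1, 0], [0, 0], [0, 0]]

# No implicit multiplication: this tries to call the int (x + 1).
x = 3
call_failed = False
try:
    (x + 1)(x * 2)
except TypeError as err:
    call_failed = True
    print("TypeError:", err)
```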

The 2 Sigma Effect

Benjamin Bloom (Wikipedia: Bloom's 2 sigma problem)

Students tutored one-to-one using mastery learning perform two standard deviations better than a control class taught with conventional classroom instruction.

Effective Measures for MOOCs (Massive Open Online Courses)

  • willpower / due dates
  • peer support / peer grading
  • faculty engagement (email), a personal feel to contact
  • authenticity (reputable teachers, institution)

Try to influence how a student learns

Introduce the problem, then the explanations

Rate of Learning

  • mastery vs tutoring: hold the student back until 90% mastery
  • interactive assignments: use tools to see how students solve problems, and where they struggle
  • student control over rewind (esp. video, which allows questions without self-consciousness)
  • changing opinions: value struggling through problems to find a solution
  • flipped classroom: video at home, discussion in class

Hal Varian

Get together the problem sets and exams you want students to be able to solve, and then write the book that teaches them how to solve them.

Similar to test-driven development.

Software Carpentry

Greg Wilson has been at the forefront of teaching computing to scientists; check out Software Carpentry.