# Pandas: create dataframe

I often need to pull data from disparate sources and combine it into a CSV or Excel file for others to work with. Of course I fall back on pandas, as it is a data-wrangling guru...

To create an empty dataframe:

```
import numpy as np
import pandas
columns = ['some', 'column', 'headers']
index = np.arange(103)  # one row label per sample
df = pandas.DataFrame(columns=columns, index=index)
```

Now I can populate my new data frame:

```
myarray = np.random.random((10, 3))
# .ix has been removed from pandas; .loc assigns by index label
for val, item in enumerate(myarray):
    df.loc[val] = item
```
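When the values are already sitting in a NumPy array, the row-by-row loop is not strictly needed. A minimal sketch, assuming the same column names as above; the `pd` alias and the seeded generator are my additions, used only to keep the example repeatable:

```python
import numpy as np
import pandas as pd

columns = ['some', 'column', 'headers']
rng = np.random.default_rng(0)   # seeded only so the sketch is repeatable
myarray = rng.random((10, 3))    # 10 samples, 3 values each

# Pass the array straight to the constructor; the index defaults to 0..9
df = pd.DataFrame(myarray, columns=columns)
print(df.shape)
```

This also keeps the columns as floats, whereas filling an empty frame row by row tends to leave them as generic `object` columns.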

If my data is already in a list of dicts, it is even easier:

```
mydata = [{'subid': 'B14-111', 'age': 75, 'fdg': 1.78},
          {'subid': 'B14-112', 'age': 22, 'fdg': 1.56}]
df = pandas.DataFrame(mydata)
```
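Since the end goal is usually a CSV for someone else, here is a rough sketch of going from the list of dicts to CSV text. The explicit `columns` order and the in-memory `io.StringIO` buffer are illustrative choices of mine, not part of the snippet above; in practice you would probably pass a filename to `to_csv` instead.

```python
import io
import pandas as pd

mydata = [{'subid': 'B14-111', 'age': 75, 'fdg': 1.78},
          {'subid': 'B14-112', 'age': 22, 'fdg': 1.56}]

# columns= fixes the column order instead of relying on dict key order
df = pd.DataFrame(mydata, columns=['subid', 'age', 'fdg'])

buf = io.StringIO()
df.to_csv(buf, index=False)   # index=False leaves the 0..n-1 row labels out
csv_text = buf.getvalue()
print(csv_text)
```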

# Award IPython

Kudos to the IPython team (especially Fernando) for a well-deserved award.

While at PyData, about 98% of the talks I saw used the IPython notebook. For the few that did not, I remember feeling a bit sad.

For a great collection of notebooks, including my favorite, **XKCD plots in matplotlib**, check out http://nbviewer.ipython.org/

# PyData 2013

## Peter Norvig - Learning Python

I attended this amazing talk at PyData 2013:

Learning Python by Peter Norvig (vimeo)

Here are a few of the key points I noted.

## Major errors

The hard parts of learning Python, common mistakes:

- create list where each item is a reference
  - change one -> change all
- setting value vs checking equality
  - x = 1 vs x == 1
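
The first item is probably easiest to see in code. This is my guess at the kind of example meant, not one taken from the talk; the names `grid` and `fresh` are made up for illustration:

```python
# Multiplying a list of lists repeats references to ONE inner list
grid = [[0] * 3] * 2
grid[0][0] = 9
print(grid)    # [[9, 0, 0], [9, 0, 0]]

# A list comprehension builds a separate inner list for each row
fresh = [[0] * 3 for _ in range(2)]
fresh[0][0] = 9
print(fresh)   # [[9, 0, 0], [0, 0, 0]]
```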

Understanding a reference to an object vs a copy of it:

```
a = 1
b = a
a = 2
# what is b?
```

Compare that with this behavior:

```
a = [1, 2, 3, ]
b = a
a[0] = 9
b[0] # what is b[0]
```
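
For anyone who wants to check their answers, here is a small sketch that runs both cases and then adds an explicit copy. The copy step and the name `c` are my additions:

```python
a = 1
b = a
a = 2
print(b)            # 1: rebinding a leaves b on the original integer

a = [1, 2, 3]
b = a               # b is a second name for the same list
a[0] = 9
print(b[0])         # 9: the mutation is visible through both names

c = list(a)         # shallow copy: a new list holding the same items
a[0] = 0
print(b[0], c[0])   # 0 9: b follows a, c does not
```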

- implicit operators
  - **( x + 1 )( x * 2 )** - understanding why this fails
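
A short sketch of why that expression fails: Python has no implicit multiplication, so the second pair of parentheses is read as a function call. The value of `x` here is arbitrary:

```python
x = 3

try:
    (x + 1)(x * 2)    # parsed as a call: the int 4 is "called" with argument 6
except TypeError as err:
    message = str(err)
    print(message)    # 'int' object is not callable

product = (x + 1) * (x * 2)   # the multiplication has to be written out
print(product)                # 24
```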

## The 2 Sigma Effect

**Benjamin Bloom**
Wikipedia: Bloom's 2 sigma problem

Students tutored one-to-one using mastery learning performed 2 standard deviations better than the control class (conventional classroom instruction).
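
To get a feel for the size of that effect, here is a back-of-the-envelope sketch. It assumes control-class scores are roughly normally distributed, which is my simplification and not something from my notes on the talk:

```python
from statistics import NormalDist

# Assumption: control-class scores are roughly normal.
# Fraction of the control class scoring below someone 2 SD above its mean:
percentile = NormalDist(mu=0, sigma=1).cdf(2)
print(f"{percentile:.1%}")   # 97.7%
```

Under that assumption, the average tutored student would outscore roughly 98% of the control class.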

## Effective Measures for MOOCs (Massive Open Online Courses)

- willpower / due dates
- peer support / peer grading
- faculty engagement (email), personal feel to contact
- authenticity (reputable teachers, institution)

Try to influence how a student learns

Introduce the problem, then the explanations

## Rate of Learning

- mastery vs tutoring: hold the student back until 90% mastery
- interactive assignments: use tools to see how students solve problems, and where they struggle
- student control over rewind (especially video, which allows questions without self-consciousness)
- changing opinions: value struggling through problems to find a solution
- flipped classroom: video at home, discussion in class

**Hal Varian**

Gather the problem sets and exams you want students to be able to solve, and then write the book that teaches them how to solve them.

Similar to test-driven development.

## Software Carpentry

Greg Wilson has been at the forefront of teaching computing to scientists. Check out Software Carpentry.