Rational Girl

Attempting to be rational while dreaming of 3.141592653589793...

Pandas: create dataframe

I often need to extract data from disparate sources and combine it into a CSV or Excel file for others to work with. Of course I fall back on pandas, as it is a data wrangling guru...

To create an empty dataframe:

import numpy as np
import pandas
columns = ['some', 'column', 'headers']
index = np.arange(103)  # one row label per sample (103 samples here)
df = pandas.DataFrame(columns=columns, index=index)  # every cell starts out as NaN

Now I can populate my new data frame:

myarray = np.random.random((10, 3))
for row_label, row in enumerate(myarray):
    # .ix was removed in pandas 1.0; .loc assigns by index label
    df.loc[row_label] = row
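
Filling rows one at a time gets slow as the array grows. Here is a minimal sketch of an alternative, assuming the array's columns line up with the column headers: build the frame straight from the array, then reindex to the full number of samples. The names `df_direct` and `df_full` are just mine for illustration.

```python
import numpy as np
import pandas

columns = ['some', 'column', 'headers']
myarray = np.random.random((10, 3))

# Build the frame in one step; the index defaults to 0..9
df_direct = pandas.DataFrame(myarray, columns=columns)

# Expand to all 103 sample labels; rows with no data are filled with NaN
df_full = df_direct.reindex(np.arange(103))
```

The first 10 rows hold the array values and the remaining 93 stay empty, which should match what the loop produces.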

If my data is already in a list of dicts, it is even easier:

mydata = [{'subid' : 'B14-111', 'age': 75, 'fdg':1.78},
          {'subid' : 'B14-112', 'age': 22, 'fdg':1.56},]
df = pandas.DataFrame(mydata)
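
Since the end goal is usually a csv file for someone else, here is a rough sketch of that last step. Depending on the pandas version, the column order may not be the one I want, so I pass `columns=` to pin it down. I write to an in-memory buffer so the example has no side effects; `buffer`, `csv_text`, and the filename in the comment are placeholders of my own, not anything pandas requires.

```python
import io
import pandas

mydata = [{'subid': 'B14-111', 'age': 75, 'fdg': 1.78},
          {'subid': 'B14-112', 'age': 22, 'fdg': 1.56}]

# Passing columns= fixes the column order explicitly
df = pandas.DataFrame(mydata, columns=['subid', 'age', 'fdg'])

# Write to an in-memory buffer; pass a path such as 'subjects.csv'
# (hypothetical filename) instead to write a file on disk
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False leaves out the 0, 1, ... row labels
csv_text = buffer.getvalue()
```

For an Excel file, `df.to_excel` works much the same way, though it needs an Excel writer package such as openpyxl installed.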