Pete Skomoroch

Arlington, VA

38.890511, -77.086296

www.datawrangling.com

About: Interested in connecting with Python people who are into machine learning, collaborative filtering, text mining, or web analytics...

http://www.juiceanalytics.com

http://del.icio.us/pskomoroch

Where do you work? Juice Analytics
What do you use Python for? Machine Learning, Data Mining, Web Scraping
Primary Python frameworks/libraries you use? NumPy, SciPy, BeautifulSoup, NLTK
What Python topics most interest you? performance, distributed programming, Django, OLPC
How many years have you been using Python? 6

Blog Posts

Hidden Video Courses in Math, Science, and Engineering

--

Over the last few years, a large number of open courseware directories and video lecture aggregators have popped up on the web. These sites often ...

PyCon 2008 ElasticWulf Slides

--

Here are the ElasticWulf slides from my talk. The video will eventually be posted to the PyCon site. The cluster management scripts I used to run...

Python Montage Code for Displaying Arrays

--

This post will show how to replicate the Matlab montage function using Python. The Data Wrangling blog seems to be getting search traffic from peo...

The Colbert Bump in Amazon Data

--

Last month, I took a position as Director of Advanced Analytics at Juice. I’m primarily a machine learning guy, so I will be focused on ...

Some Datasets Available on the Web

--

The Datawrangling blog was put on the back burner last May while I focused on my startup. Now that I have some bandwidth again, I am getting back ...

Google Paper on Parallel EM Algorithm using MapReduce

--

I hadn’t seen much discussion of this on the web, so I thought I would post the link to this May 2007 paper from Google: Google News Per...

Amazon EC2 Considered Harmful

--

“The TruckNumber is the size of the smallest set of people in a project such that, if all of them got hit by a truck, the project would b...

MPI Cluster with Python and Amazon EC2 (part 2 of 3)

--

Today I posted a public AMI which can be used to run a small Beowulf cluster on Amazon EC2 and do some parallel computations with C, Fortran, or P...

On-Demand MPI Cluster with Python and EC2 (part 1 of 3)

--

In this post, we will build a 20-node Beowulf cluster on Amazon EC2 and run some computations using both MPI and its Python wrapper pyMPI. This tu...

Bookmarks:

Econometric Modeling as Junk Science

Despite their scientific appearance, these models do not meet the fundamental criterion for a useful mathematical model: the ability to make predictions that are better than random chance.

Orange: MDS scatterplot in Python

MDS scatterplot: In our first example, we will take the iris data set, compute the distances between the examples, and then run MDS on the distance matrix. This is done by the following code:
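
The bookmarked excerpt breaks off before its code. As a rough stand-in, and not the Orange code the bookmark refers to, here is a minimal sketch of classical (Torgerson) MDS written with NumPy. The function name `classical_mds` and the four toy points are hypothetical choices made for illustration; they take the place of the iris measurements, which are not reproduced here.

```python
import numpy as np

def classical_mds(dist, dims=2):
    """Embed points in `dims` dimensions from a symmetric distance matrix.

    Hypothetical helper for illustration; not part of Orange.
    """
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # Keep the eigenvectors that belong to the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]
    # Clip tiny negative eigenvalues caused by rounding before taking the root.
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Toy data (an assumption): the four corners of a 3-by-4 rectangle.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])

# Step 1: pairwise Euclidean distances between the examples.
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

# Step 2: run MDS on the distance matrix.
coords = classical_mds(dist)

# Distances between the embedded points, for comparison with `dist`.
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

Because the toy points already lie in a plane, the two-dimensional embedding should reproduce the original distances up to rotation and reflection. With higher-dimensional data such as iris, the embedding would only approximate them.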

depth first search » MDS

Multidimensional Scaling - Python code adapted from Principles of Multivariate Analysis: A User's Perspective

Particletree » Smarter Auto-Linking

For example, some web sites auto-link or convert text presented in a URL format as a hyperlink to improve the user experience.

Google Testing Blog: Announcing: New Google C++ Testing Framework

We all know the importance of writing automated tests to cover our code. To make it easier for everyone to write good C++ tests, today we have open-sourced Google C++ Testing Framework (Google Test for short), a library that thousands of Googlers use

David Hawking's Personal Home Page


Network

David May: mutual friend
Moses Ting: friend
Chris Gemignani: mutual friend
Sal Uryasev: mutual friend
Travis Oliphant: want-to-meet
Chris McAvoy: mutual friend
