Archive for the 'python' Category


InfiniteSortedObjectSequence - for large data sets in Python

What do you do when you have a data set too large to fit into RAM?
You could use disk directly, replacing a dict with a shelve or bsddb. The problem is that you then take a performance hit, because every operation is disk based.
You could have a specialization […]
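To make the "disk instead of a dict" option concrete, here is a minimal sketch using the standard-library shelve module. The helper name count_words_on_disk and the temporary file path are made up for illustration, and this is not the InfiniteSortedObjectSequence code from the post.

```python
import os
import shelve
import tempfile


def count_words_on_disk(words, path):
    """Count words using a shelve file in place of an in-memory dict.

    Hypothetical helper for illustration only.
    """
    with shelve.open(path) as db:
        for word in words:
            # Same syntax as a dict, but reads and writes go through
            # the disk-backed store, which is where the slowdown comes from.
            db[word] = db.get(word, 0) + 1
        # Copy into a plain dict so the result outlives the closed shelf.
        return dict(db)


with tempfile.TemporaryDirectory() as tmp:
    result = count_words_on_disk(["x", "y", "x"], os.path.join(tmp, "counts"))
```

The code reads like ordinary dict code, which is the appeal. The cost is the one described above: every lookup and update touches the disk-backed store.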

MapReduce in 10 or so lines of Python

I’ve realized that I understand things best when I implement them myself. I was recently reading Trevor Strohman’s dissertation and was intrigued by TupleFlow, a more elaborate and improved kind of MapReduce. I was about to write my own toy implementation of TupleFlow when I decided to simplify and, just for fun, write MapReduce in […]
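For a sense of how small a toy MapReduce can be, here is a rough in-memory sketch. It is not the code from the post; the names map_reduce, word_mapper and count_reducer are invented for this example, and it assumes everything fits in one process.

```python
from collections import defaultdict
from itertools import chain


def map_reduce(inputs, mapper, reducer):
    """Toy in-memory MapReduce: map, group by key, then reduce."""
    groups = defaultdict(list)
    # Map phase: each input item yields zero or more (key, value) pairs.
    for key, value in chain.from_iterable(mapper(item) for item in inputs):
        # Shuffle phase: collect all values that share a key.
        groups[key].append(value)
    # Reduce phase: fold each key's values into a single result.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    # Emit (word, 1) for every word in the line.
    for word in line.lower().split():
        yield word, 1


def count_reducer(word, counts):
    return sum(counts)


counts = map_reduce(["a b a", "b c"], word_mapper, count_reducer)
```

Word count is the usual first example. Swapping in a different mapper and reducer gives other jobs without touching map_reduce itself.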
