All posts by admin

Theano with Anaconda

A little background: Theano is a symbolic computation library that makes writing neural network models very easy. It is Python-based and requires a range of dependent libraries, including NumPy and SciPy among others (I never worked through the labyrinth of all the dependencies, as I always failed 🙂 ).

If you are as lazy as I am, but still want to enjoy the rich environment and resources of the free software community (Python, in this case), then an all-inclusive Python distribution like Anaconda or Enthought, provided by some very good people, is probably the right choice.

Recently I switched from Enthought to Anaconda, for no other reason than that the latter is more dummy-proof: making links is hard, and with Anaconda you don’t need to fiddle with your LD_LIBRARY_PATH, DYLIB_ etc. etc., only to find yourself breaking the dependencies of some other program.


However, Anaconda does have a problem with its BLAS library. A fresh installation works well with Theano, but without BLAS. If you install the ATLAS library, Theano will complain about a missing “_gfortran_st_write_done” (googling this together with Theano is likely what led you here) when you try to build functions involving operation nodes such as tanh(). This is probably because ATLAS does not contain that particular function. Googling it returns some confusing question-and-answer threads, so this explanation is only my best guess.

So my solution is simple and brutal: I ran “conda remove atlas”. Things went back to the slow but happy state. A sensible next step would be to install OpenBLAS and let conda use it. Will try and see …
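If you want to see which BLAS the NumPy in your conda environment is linked against, and roughly how fast it is, a quick check like the one below may help. This is only a sketch: the helper name `time_matmul` and the matrix size are my own choices, and it inspects NumPy rather than Theano itself (Theano reads its own setting from the `blas.ldflags` config option), so treat the result as a hint, not a verdict.

```python
import time

import numpy as np


def time_matmul(n=500, repeats=3):
    """Return the best wall-clock time (seconds) for an n x n float64 matrix product."""
    rng = np.random.RandomState(0)
    a = rng.rand(n, n)
    b = rng.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a.dot(b)  # dispatched to BLAS when NumPy is linked against one
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Prints the BLAS/LAPACK libraries NumPy was built against.
    np.show_config()
    print("best 500x500 matmul time: %.4f s" % time_matmul())
```

Running it before and after changing the BLAS package gives a rough before/after comparison on the same machine.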

Simple Exponential Family Principal Component Analysis

This work addresses a very common problem: finding the main underlying factors behind a chunk of data. The particular point here is that we take care of situations where the elements of your data vectors are attributes in special domains. That is, sometimes you have good reasons NOT to treat all of them as general real numbers (in which case your next step is to claim the population is Gaussian and march on …), but rather as positive-only, binary, etc. We find the major components for such data AND consider the problem of choosing how many components to use.
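To make the idea concrete for the binary case, here is a minimal sketch of a Bernoulli (logistic) PCA fitted by plain gradient descent. It is NOT the algorithm from the paper: the function names, the step size and the optimiser are my own assumptions for illustration, and it takes the number of components `k` as given rather than choosing it.

```python
import numpy as np


def bernoulli_nll(X, V, W):
    """Negative log-likelihood of binary X under logits theta = V @ W."""
    theta = V @ W
    # log(1 + exp(theta)) computed stably via logaddexp
    return float(np.sum(np.logaddexp(0.0, theta) - X * theta))


def fit_bernoulli_pca(X, k, steps=300, lr=0.01, seed=0):
    """Illustrative sketch: factorise the logits of binary X as V (n x k) times W (k x d)."""
    rng = np.random.RandomState(seed)
    n, d = X.shape
    V = 0.01 * rng.randn(n, k)  # per-sample low-dimensional scores
    W = 0.01 * rng.randn(k, d)  # component loadings
    history = [bernoulli_nll(X, V, W)]
    for _ in range(steps):
        theta = V @ W
        resid = 1.0 / (1.0 + np.exp(-theta)) - X  # d(NLL)/d(theta)
        grad_V = resid @ W.T
        grad_W = V.T @ resid
        V -= lr * grad_V
        W -= lr * grad_W
        history.append(bernoulli_nll(X, V, W))
    return V, W, history
```

Swapping the Bernoulli log-partition term for another exponential family member (for example, Poisson for counts) changes only the loss and the residual line; ordinary PCA corresponds to the Gaussian case.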

Thanks for your attention. If you’d like to know more about it, keep reading this post. For more interested readers, the paper is here [pdf], as well as a bit of demo code.