Several months back I needed to compute the NMF (non-negative matrix factorization) of some relatively large matrices.
Since the native Ruby code was painfully slow, and for some reason even failed on some matrices, I decided to write a C implementation that leverages the GNU Scientific Library (GSL) and then wrap it for use in a Ruby application.
It is a neat add-on to the rb-gsl Ruby library. It adds an NMF module under GSL::Matrix, with an nmf method that takes a GSL::Matrix and the number of columns as parameters and returns two matrices.
Since this is an iterative algorithm, the number of iterations is limited to 1000, and the target value of the difference cost metric is set to 10^-6.
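To give a feel for what such an iteration looks like, here is a minimal pure-Ruby sketch. It assumes the multiplicative update rules commonly used for NMF, which may differ from what the C code actually does. The names nmf_sketch, mat_mul, scale, difcost and random_matrix are made up for this illustration and are not the rb-gsl API. It uses the same limits (1000 iterations, a cost threshold of 10^-6) and seeds both factors randomly between the smallest and largest entry of the input.

```ruby
# Illustrative only: plain arrays of rows, no GSL involved.

# Matrix product of a (n x m) and b (m x p).
def mat_mul(a, b)
  bt = b.transpose
  a.map { |row| bt.map { |col| row.zip(col).sum { |x, y| x * y } } }
end

# Sum of squared element-wise differences between two same-shaped matrices.
def difcost(a, b)
  a.flatten.zip(b.flatten).sum { |x, y| (x - y)**2 }
end

# rows x cols matrix with entries drawn uniformly from min...max.
def random_matrix(rows, cols, min, max, rng)
  Array.new(rows) { Array.new(cols) { min + rng.rand * (max - min) } }
end

# Element-wise m * num / den; eps guards against division by zero.
def scale(m, num, den, eps = 1e-12)
  m.each_with_index.map do |row, i|
    row.each_with_index.map { |x, j| x * num[i][j] / (den[i][j] + eps) }
  end
end

# Factor v (n x m, non-negative) into w (n x k) and h (k x m).
# Assumption: multiplicative updates, not necessarily what nmf.c uses.
def nmf_sketch(v, k, max_iter: 1000, tolerance: 1e-6, rng: Random.new(42))
  min, max = v.flatten.minmax
  w = random_matrix(v.size, k, min, max, rng)
  h = random_matrix(k, v.first.size, min, max, rng)

  max_iter.times do
    break if difcost(v, mat_mul(w, h)) < tolerance

    # H <- H .* (W^T V) ./ (W^T W H)
    wt = w.transpose
    h = scale(h, mat_mul(wt, v), mat_mul(mat_mul(wt, w), h))

    # W <- W .* (V H^T) ./ (W H H^T)
    ht = h.transpose
    w = scale(w, mat_mul(v, ht), mat_mul(mat_mul(w, h), ht))
  end
  [w, h]
end

# Usage: factor a small 3x3 matrix into a 3x2 and a 2x3 matrix.
v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
w, h = nmf_sketch(v, 2)
puts "cost: #{difcost(v, mat_mul(w, h))}"
```

On anything but toy inputs this is far slower than a C version, which is the whole reason for doing the heavy lifting with GSL.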
I tried to contact the author and even posted my code in the issue tracker, but haven’t received any response at the time of writing.
So I decided to create a git-svn mirror on GitHub and add my changes there.
http://github.com/romanbsd/rb-gsl
You can install the gem using this command:
$ gem sources -a http://gems.github.com/  # (you only have to do this once)
$ gem install romanbsd-gsl
December 14th, 2009 at 22:56
Hi,
I was looking for some other work on wrapping GSL with Ruby on GitHub and I found yours.
I just wanted to let you know that I’m trying to create a good wrapper for GSL using RubyFFI, in case you’re interested in it.
I’m currently wrapping Matrix so maybe you should wait a little, but you could eventually submit your work on NMF (which I know nothing about =b) as an extension to my wrapper (it should be really easy using FFI, especially if you already have the C code).
Matt
Roman Reply:
December 14th, 2009 at 23:26
What is the motivation behind this effort? Is it that you’d like to use GSL with JRuby?
Because with MRI the performance of the C version is better than that of FFI…
December 14th, 2009 at 23:30
Since GSL is really big and creating a wrapper gets quite complex (eventually, you have a ton of C code and it is impossible to maintain/extend), I thought that FFI could help in this aspect.
Regarding the performance, I’m not really sure if there’s really an important difference, although I’ll create some benchmarks when I have more code wrapped.
I have some other objectives for this wrapper; you can read about them here:
http://ruby-gsl-ng.googlecode.com
Matt
March 1st, 2010 at 23:05
Hi
Do you have some examples of the Ruby interface to the NMF code?
How do you choose the seed values?
Which algorithm did you implement?
How large of a system can you solve?
Thanks
Charles
Roman Reply:
March 2nd, 2010 at 09:59
Examples:
http://github.com/romanbsd/rb-gsl/blob/master/tests/matrix/matrix_nmf_test.rb
The seed values are picked randomly in the min..max interval.
The algorithm is the one that circulated in some PDF; I don’t remember the names of the authors. You can see the implementation here:
http://github.com/romanbsd/rb-gsl/blob/master/ext/nmf.c
I know that it can handle matrices with thousands of rows.