Large Scale Machine Learning and Other Animals: Matrix factorization algorithms

Monday, June 6, 2011

Matrix factorization algorithms - which algorithms can be parallelized?

The following matrix factorization algorithms have a rather straightforward
palatalization (and are implemented in GraphLab open source collaborative filtering library):

Probabilistic matrix/tensor factorization:
A) Liang Xiong, Xi Chen, Tzu-Kuo Huang, Jeff Schneider, Jaime G. Carbonell, Temporal Collaborative Filtering with Bayesian Probabilistic Tensor Factorization. In Proceedings of SIAM Data Mining, 2010. html (source code is also available).

B) Salakhutdinov and Mnih, Bayesian Probabilistic Matrix Factorization using Markov Chain Monte Carlo. in International Conference on Machine Learning, 2008. pdf, since our code implements matrix factorization as a special case of a tensor as well.

Alternating least squares:
C) Yunhong Zhou, Dennis Wilkinson, Robert Schreiber and Rong Pan. Large-Scale Parallel Collaborative Filtering for the Netflix Prize. Proceedings of the 4th international conference on Algorithmic Aspects in Information and Management. Shanghai, China pp. 337-348, 2008. pdf

Collaberative filtering:
D) SVD++ algorithm: Koren, Yehuda. "Factorization meets the neighborhood: a multifaceted collaborative filtering model." In Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, 426434. ACM, 2008. http://portal.acm.org/citation.cfm?id=1401890.1401944

Gradient descent method:
E) SGD (sotchastic gradient descent) algorithm: Matrix Factorization Techniques for Recommender Systems Yehuda Koren, Robert Bell, Chris Volinsky In IEEE Computer, Vol. 42, No. 8. (07 August 2009), pp. 30-37.
F) Tikk, D. (2009). Scalable Collaborative Filtering Approaches for Large Recommender Systems. Journal of Machine Learning Research, 10, 623-656.

SVD:
G) For Lanczos algorithm (SVD) see: wikipedia.

I got an additional wish-list from Marinka Zitnik, a graduate student University of Ljubljana, Slovenia. She is working on implementing them as part of the GSoC Orange project. This is a very interesting list of matrix factorization algorithms. I marked in red the ones which are already implemented as part of GraphLab matrix factorization library. I am looking for volunteers who like to help us implement some more!

Non-negative matrix factorization

Standard NMF based on Kullbach Leibler divergence[1,3],simple multiplicative updates, enhanced to avoid numerical overfow [2]

[1] Kim H, Park H. Sparse non-negative matrix factorizations via alternating non-negativity-

constrained least squares for microarray data analysis. Bioinformatics (Oxford, England)

2007, 23:1495--502.

[2] Lee, D. D. and Seung, H. S. Algorithms for non-negative matrix factorization. NIPS,2000,

556--562. [Implemented in GraphLab]

[3] Brunet, J. P., Tamayo, P., Golub, T. R., and Mesirov, J. P. Metagenes and molecular

pattern discovery using matrix factorization. Proc Natl Acad Sci U S A, 2004, 101(12),

4164--4169.

Sparse non negative matrix factorization

Sparse NMF based on alternating non-negativity constrained least squares, solved by

a fast non-negativity constrained least squares. Sparseness imposed on the left, right factor.

[4] Kim, H., Park, H. Sparse non-negative matrix factorizations via alternating non-

negativity-constrained least squares for microarray data analysis. Bioinformatics (Oxford,

England), 2007, 23(12), 1495-502.

Non-smooth NMF. Uses a modifed version of Lee and Seung's multiplicative

updates for Kullbach-Leibler divergence. It is meant to give sparser results.

[5] Pascual-Montano, A., Carazo, J. M., Kochi, K., Lehmann, D., and Pascual-Marqui, R. D.

Nonsmooth nonnegative matrix factorization (nsnmf ). IEEE transactions on pattern anal-

ysis and machine intelligence, 2006 28(3), 403--415.

Local Fisher NMF. Add the Fisher constraint

to maximize the between-class scatter and minimize the within-class scatter.

[6] Wang, Y., Turk, M. Fisher non-negative matrix factorization for learning local features.

Alternating Least Square NMF. It is meant to be very fast compared to other approaches.

[4] Kim, H., Park, H. Sparse non-negative matrix factorizations via alternating non-

negativity-constrained least squares for microarray data analysis. Bioinformatics (Oxford,

England), 2007, 23(12), 1495-502.

Comment: alternating least squares is currently implemented in GraphLab.

Probabilistic matrix factorization. Model which scales linearly

with the number of observations and performs well on large, sparse, imbalanced datasets.

[7] Salakhutdinov, R., Mnih, A. Probabilistic Matrix Factorization. Learning.

[Implemented in GraphLab]

Nonnegative matrix approximation. Method for dimensionality reduction with respect on the nonnegativity of input data. Multiplicative iterative scheme.

[8] Sra, S.,Dhillon, I. S. Nonnegative Matrix Approximation : Algorithms and Applications.

Sciences-New York, 2006, 1--36.

Interval-valued NMF.

[9] Zhiyong Shen, Liang Du, Xukun Shen, Yidong Shen, Interval-valued Matrix Factorization

with Applications, Data Mining, IEEE International Conference on, pp. 1037--1042, 2010

IEEE International Conference on Data Mining, 2010.

Interval-valued PMF.

[9] Zhiyong Shen, Liang Du, Xukun Shen, Yidong Shen, Interval-valued Matrix Factorization

with Applications, Data Mining, IEEE International Conference on, pp. 1037--1042, 2010

IEEE International Conference on Data Mining, 2010.

Probabilistic Sparse MF. PSMF allows for varying levels of

sensor noise, uncertainty in the hidden prototypes used

to explain the data and uncertainty as to the prototypes selected to explain each data vector.

[10] Dueck, D., Morris, Q. D., Frey, B. J. Multi-way clustering of microarray data using

probabilistic sparse matrix factorization. Bioinformatics (Oxford, England), 21 Suppl 1,

2005, 144-51.

Bayesian decomposition. A Bayesian treatment of NMF, based on a normal likelihood

and exponential priors, Gibbs sampler to approximate the posterior density.

[11] Schmidt, M. N., Winther, O., Kai Hansen, L. Bayesian non-negative matrix factorization.

bfrm

Bayesian factor regression model. Markov chain Monte Carlo technique.

[12] Schmidt, M. N. Bayesian non-negative matrix factorization. Bayesian Statistics 7 (Ox-

ford), 2003.

Variational Bayesian decomposition Linearly constrained Bayesian decomposition..

[13] Ochs, M. F., Kossenkov A. V. NIH Public Access. Methods, Methods Enzymol., 2009,

467: 59--77.

5 comments:

IgorJune 8, 2011 at 3:18 PM
Other matrix factorization algo of interest:
http://perception.csl.uiuc.edu/matrix-rank/sample_code.html

I don't know if they fit well into a GraphLab approach though.

cheers,

Igor.
ReplyDelete
Replies
helwyrJune 11, 2011 at 6:05 PM
thanks, added to the list: http://www.quora.com/Machine-Learning/What-are-some-good-class-projects-for-machine-learning-using-MapReduce and http://www.quora.com/What-are-some-good-resources-for-learning-about-numerical-analysis
ReplyDelete
Replies
IgorAugust 31, 2011 at 3:28 AM
Danny,

This may be of interest:

https://sites.google.com/site/igorcarron2/matrixfactorizations

Cheers,

Igor.
ReplyDelete
Replies
Danny BicksonAugust 31, 2011 at 10:40 PM
Thanks Igor! When I will grow up, my blog will be famous as yours!
ReplyDelete
Replies
flash to html5 converterMarch 5, 2012 at 4:13 AM
This comment has been removed by a blog administrator.
ReplyDelete
Replies

Add comment

Large Scale Machine Learning and Other Animals

Monday, June 6, 2011

Matrix factorization algorithms - which algorithms can be parallelized?

5 comments:

Labels

GraphLab Users Google Group

pagerank

google analytics

syntax