Monday, June 6, 2011

Matrix factorization algorithms - which algorithms can be parallelized?

The following matrix factorization algorithms have a rather straightforward
palatalization (and are implemented in GraphLab open source collaborative filtering library):

Probabilistic matrix/tensor factorization:
A) Liang Xiong, Xi Chen, Tzu-Kuo Huang, Jeff Schneider, Jaime G. Carbonell, Temporal Collaborative Filtering with Bayesian Probabilistic Tensor Factorization. In Proceedings of SIAM Data Mining, 2010. html (source code is also available).

B) Salakhutdinov and Mnih, Bayesian Probabilistic Matrix Factorization using Markov Chain Monte Carlo. in International Conference on Machine Learning, 2008. pdf, since our code implements matrix factorization as a special case of a tensor as well.

Alternating least squares:
C)  Yunhong Zhou, Dennis Wilkinson, Robert Schreiber and Rong Pan. Large-Scale Parallel Collaborative Filtering for the Netflix Prize. Proceedings of the 4th international conference on Algorithmic Aspects in Information and Management. Shanghai, China pp. 337-348, 2008. pdf

Collaberative filtering:
D) SVD++ algorithm: Koren, Yehuda. "Factorization meets the neighborhood: a multifaceted collaborative filtering model." In Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, 426434. ACM, 2008. http://portal.acm.org/citation.cfm?id=1401890.1401944


Gradient descent method:
E) SGD (sotchastic gradient descent) algorithm: Matrix Factorization Techniques for Recommender Systems Yehuda Koren, Robert Bell, Chris Volinsky In IEEE Computer, Vol. 42, No. 8. (07 August 2009), pp. 30-37.
F) Tikk, D. (2009). Scalable Collaborative Filtering Approaches for Large Recommender Systems. Journal of Machine Learning Research, 10, 623-656.


SVD:
G) For Lanczos algorithm (SVD) see: wikipedia.

I got an additional wish-list from Marinka Zitnik, a graduate student University of Ljubljana, Slovenia. She is working on implementing them as part of the GSoC Orange project. This is a very interesting list of matrix factorization algorithms. I marked in red the ones which are already implemented as part of GraphLab matrix factorization library. I am looking for volunteers who like to help us implement some more!

Non-negative matrix factorization
Standard NMF based on Kullbach Leibler divergence[1,3],simple multiplicative updates, enhanced to avoid numerical overfow [2]
[1] Kim H, Park H. Sparse non-negative matrix factorizations via alternating non-negativity-
constrained least squares for microarray data analysis. Bioinformatics (Oxford, England)
2007, 23:1495--502.
[2] Lee, D. D. and Seung, H. S. Algorithms for non-negative matrix factorization. NIPS,2000,
556--562. [Implemented in GraphLab]
[3] Brunet, J. P., Tamayo, P., Golub, T. R., and Mesirov, J. P. Metagenes and molecular
pattern discovery using matrix factorization. Proc Natl Acad Sci U S A, 2004, 101(12),
4164--4169.
 
Sparse non negative matrix factorization
Sparse NMF based on alternating non-negativity constrained least squares, solved by
a fast non-negativity constrained least squares. Sparseness imposed on the left, right factor.
[4] Kim, H., Park, H. Sparse non-negative matrix factorizations via alternating non-
negativity-constrained least squares for microarray data analysis. Bioinformatics (Oxford,
England), 2007, 23(12), 1495-502.

Non-smooth NMF. Uses a modifed version of Lee and Seung's multiplicative
updates for Kullbach-Leibler divergence. It is meant to give sparser results.
[5] Pascual-Montano, A., Carazo, J. M., Kochi, K., Lehmann, D., and Pascual-Marqui, R. D.
Nonsmooth nonnegative matrix factorization (nsnmf ). IEEE transactions on pattern anal-
ysis and machine intelligence, 2006 28(3), 403--415.

Local Fisher NMF. Add the Fisher constraint
to maximize the between-class scatter and minimize the within-class scatter.
[6] Wang, Y., Turk, M. Fisher non-negative matrix factorization for learning local features.

Alternating Least Square NMF. It is meant to be very fast compared to other approaches.
[4] Kim, H., Park, H. Sparse non-negative matrix factorizations via alternating non-
negativity-constrained least squares for microarray data analysis. Bioinformatics (Oxford,
England), 2007, 23(12), 1495-502.
Comment:  alternating least squares is currently implemented in GraphLab.

Probabilistic matrix factorization. Model which scales linearly
with the number of observations and performs well on large, sparse, imbalanced datasets.
[7] Salakhutdinov, R., Mnih, A. Probabilistic Matrix Factorization. Learning.
[Implemented in GraphLab]

Nonnegative matrix approximation. Method for dimensionality reduction with respect on the nonnegativity of input data. Multiplicative iterative scheme.
[8] Sra, S.,Dhillon, I. S. Nonnegative Matrix Approximation : Algorithms and Applications.
Sciences-New York, 2006, 1--36.

Interval-valued NMF.
[9] Zhiyong Shen, Liang Du, Xukun Shen, Yidong Shen, Interval-valued Matrix Factorization
with Applications, Data Mining, IEEE International Conference on, pp. 1037--1042, 2010
IEEE International Conference on Data Mining, 2010.

Interval-valued PMF.
[9] Zhiyong Shen, Liang Du, Xukun Shen, Yidong Shen, Interval-valued Matrix Factorization
with Applications, Data Mining, IEEE International Conference on, pp. 1037--1042, 2010
IEEE International Conference on Data Mining, 2010.

Probabilistic Sparse MF. PSMF allows for varying levels of
sensor noise, uncertainty in the hidden prototypes used
to explain the data and uncertainty as to the prototypes selected to explain each data vector.
[10] Dueck, D., Morris, Q. D., Frey, B. J. Multi-way clustering of microarray data using
probabilistic sparse matrix factorization. Bioinformatics (Oxford, England), 21 Suppl 1,
2005, 144-51.

Bayesian decomposition. A Bayesian treatment of NMF, based on a normal likelihood
and exponential priors, Gibbs sampler to approximate the posterior density.
[11] Schmidt, M. N., Winther, O., Kai Hansen, L. Bayesian non-negative matrix factorization.
bfrm

Bayesian factor regression model. Markov chain Monte Carlo technique.
[12] Schmidt, M. N. Bayesian non-negative matrix factorization. Bayesian Statistics 7 (Ox-
ford), 2003.

Variational Bayesian decomposition Linearly constrained Bayesian decomposition..
[13] Ochs, M. F., Kossenkov A. V. NIH Public Access. Methods, Methods Enzymol., 2009,
467: 59--77.








5 comments:

  1. Other matrix factorization algo of interest:
    http://perception.csl.uiuc.edu/matrix-rank/sample_code.html

    I don't know if they fit well into a GraphLab approach though.

    cheers,

    Igor.

    ReplyDelete
  2. thanks, added to the list: http://www.quora.com/Machine-Learning/What-are-some-good-class-projects-for-machine-learning-using-MapReduce and http://www.quora.com/What-are-some-good-resources-for-learning-about-numerical-analysis

    ReplyDelete
  3. Danny,

    This may be of interest:

    https://sites.google.com/site/igorcarron2/matrixfactorizations

    Cheers,

    Igor.

    ReplyDelete
  4. Thanks Igor! When I will grow up, my blog will be famous as yours!

    ReplyDelete
  5. This comment has been removed by a blog administrator.

    ReplyDelete