Wednesday, February 29, 2012

The world's coolest machine learning internships - updated

Note: Openings for summer 2014 are here.

I believe that every graduate student should try to do at least two internships in industry. It is a great experience. Below you can find a list I compiled by aggregating information from some of the companies I am in touch with as part of our GraphLab project. This list is an academic resource - I am not involved in any of the companies below. I also got some angry comments about one company or another missing - this is a personal list. I will be happy to add more companies, provided they are doing some interesting research.



Openings in the US - summer 2012

Rosie Jones, a fellow Tartan, sent me the following: the Computational Advertising team at Akamai Technologies invites applications for summer 2012 internships. Unfortunately for late applicants, all positions are now filled (with the help of this blog), so we will have to wait until next year.
Srinivasan Soundar from Bosch Research sent me the following: The Bosch Research and Technology Center, with labs in Palo Alto, CA, Pittsburgh, PA, and Cambridge, MA, focuses on innovative research and development for the next generation of Bosch products. The data mining team is developing advanced statistical and machine learning methods for application to patient health and electronic medical records. We are looking for highly qualified, motivated, and innovative individuals to join our team. Internships are expected to be at least 10-12 weeks long during the summer months. Previous internships in our group have led to successful publications and/or patents. Topics include Latent Variable Models, Unsupervised Clustering, Privacy Preserving Data Mining and Association Rule Mining.




This is what I got from Grant Ingersoll, a well known Mahout contributor: Lucid Imagination, the leading commercial company for Apache Lucene and Solr, is looking for interns to work on building next generation search, analytics and machine learning technologies based on Apache Solr, Mahout, Hadoop and other cutting edge capabilities. This internship will be practically focused on working on real problems in search and machine learning as they relate to Lucid products and technologies as well as open source. Interested students should send their resume/profile, course work and evidence of open source activity (github account, ASF patches or other, etc.) to careers@lucidimagination.com. Note: position requires eligibility to work in the US.

At the NIPS big learning workshop I had the pleasure of meeting Vaclav Petricek, who is a senior researcher working on matching at eHarmony. eHarmony is an online dating startup with around 33M users around the world, based in Santa Monica, CA.

The first time I heard about eHarmony was in John Langford's talk on Vowpal Wabbit at the same workshop. John mentioned that, of the many companies using his software, he is most proud that Vowpal Wabbit is being used by eHarmony, thus promoting love in the world.

Here is an excerpt from their website, which I was not aware of:
"Nearly 5% of all marriages in the U.S. are created by eHarmony. That’s 271 marriages per day."
This is absolutely amazing!

So if you would like to promote love, and you are a graduate student at a top US university in an area related to machine learning, you are welcome to apply here for an internship. A relevant previous internship and involvement in an open source project are a plus. And tell them I sent you!

There is no need to introduce LinkedIn, one of the most successful social and professional communities. Ron Bekkerman, a senior researcher at LinkedIn, is looking for interns for the coming summer.

With hundreds of millions of users, there is a practically endless amount of data and exciting new applications to explore.

Another cool company is RocketFuel, a company specializing in display advertising. I got the following from Abhinav Gupta, Founder and VP Engineering:


We’re hiring interns to work on machine learning/ optimization problems as well as our core platform (ad-serving, bidding, modeling and data infrastructure) built using a mix of proprietary and open-source technologies. We’re looking for those excited about working on tough problems related to scalable/ reliable/ available algorithms, machine learning, data mining and optimization. We are building a platform to do automatic targeting and optimization of ads. Our pitch to advertisers is very simple - If you can measure metrics of success of your campaign, we can optimize. We buy most of our inventory through real time auctions on exchanges such as Google Doubleclick. We’re integrated with real time exchanges processing requests @100k qps. We have over 1PB of data and growing fast.
You can apply to RocketFuel here.

Another hot company is Cloudera. Josh Wills gave an excellent talk at the NIPS big learning workshop where he identified some of the coming challenges in large-scale machine learning. And here is what I got from him:

We're hiring data science interns to work on developing new (and not necessarily MapReduce-based) optimization and model fitting algorithms that can be used on data stored in a Hadoop cluster. Specifically, we're interested in ways to more closely integrate open-source projects like Spark, GraphLab, and modifications to MapReduce (such as AllReduce) with the rest of the components of CDH in order to optimize every step of the model building process, from feature extraction to model deployment to evaluation. At Cloudera, the work you do doesn't just impact our company, it impacts the entire Hadoop community.
If that sounds like fun, and you are a graduate student at a top US university in CS/math/operations research, email me your resume at jwills+intern@cloudera.com.
An additional opening at Cloudera is with Josh Patterson: building ML / NLP tools on Hadoop, HBase, and OpenNLP. Email him at josh@cloudera.com


Shon Burton is the founder of Wildcog, a company specializing in placing technical people at top Bay Area companies. Currently they are working with Twitter, Tumblr, Palantir, and Yahoo!. And guess what? They are looking for interns! You are welcome to email Shon at: mlinterns@wildcog.com


A dream for any big data lover. Who can have more data than Walmart - ranked no. 1 in the Fortune 500 list? Patrick Harrington is looking for both interns and big data engineers:

@WalmartLabs is seeking outstanding engineers and scientists to build our next generation multi-dimensional targeting system to help revolutionize eCommerce. This targeting system aggregates a variety of user-based signals, e.g., click stream, social, web, geo-location, etc., and outputs a portfolio of relevant products on a user-specific basis. As a senior engineer, you will be joining a team devoted to increasing the percent of sales attributable to targeting via developing a portfolio of diverse data-driven algorithms and the underlying batch-oriented and real-time systems.
For more details about this opening, contact Patrick Harrington at: pharrington@walmartlabs.com
 

And here is a note I got from Mike Spreitzer from IBM. He asks us not to forget that IBM is very interested in big data, as the whole "smarter planet" initiative is about big data. IBM has internships in both product divisions and in Research.


Additional internship positions are available in the data mining and business analytics department at IBM. And here is what I got from Priya Nagpurkar, a research staff member:
Data mining for business analytics is one of the primary areas of focus in our department this year. More specifically, our focus is on systems support (software and hardware) for high performance analytics, with the goal of designing next generation systems. Potential topics include performance analysis for hardware-software co-design, acceleration (e.g. GPU), and optimization of storage systems. For more details contact Priya.
Other internship openings can be found using IBM's general job search.

Well, as a former IBMer I have a soft spot for IBM. So it definitely gets a place in my list!

This is what I got from Hassan Chafi, Oracle Labs: Oracle Labs is investing a lot in the area of domain-specific languages. One particular domain of interest is large graph-data analysis. We are developing a DSL that simplifies implementing such algorithms and we are interested in all aspects, from applications all the way down to the hardware architecture. If you are interested in a great internship program in the SF Bay Area, contact hassan.chafi@oracle.com

Anyone who has ever used Mahout (and there are thousands of users, if not more) knows Ted Dunning. He knows the answer to any question ever asked in the area of applied machine learning. After founding several successful startups, Ted has a new initiative for improving Hadoop infrastructure. He is looking for interns. His email is: tdunning@maprtech.com

And here is what I got from Jesse St. Charles from Knewton, a cool online education company:
Knewton is revolutionizing the practice of education with the world’s most powerful adaptive learning engine. We are recognized as a leader in the education and technology space by the World Economic Forum in Davos, and as one of the top 25 best places to work by Crain’s New York Business. We're looking for Machine Learning interns with the know-how to help build an innovative online education system that adapts to each individual student. Interns will join a world-class team of data scientists and engineers who are pushing the boundaries of machine learning in both scalability and complexity. You'll get to work with a mountain of data and an exciting array of projects. If you have a passion for building scalable systems that analyze huge data sets and have coursework in machine learning, statistics, and advanced mathematics, get in touch with us here.


My friend Udi Weinsberg from Technicolor brought to my attention that Technicolor is also looking for interns. The Technicolor Palo Alto research lab studies personalized computing, data privacy and recommendation systems. You can apply here.




Openings in Europe

I got this from Julien Nioche: DigitalPebble (Bristol, UK) is looking for a graduate / post-graduate student for this summer, ideally with the following interests or expertise:
* NLP / text engineering / IE
* statistical approaches and machine learning
* web crawling and IR
* large scale computing with Hadoop
* good Java skills
The internship would start in July for a duration of 2 or 3 months and will be based in Bristol. This should be a good opportunity to gain expertise in leading open source projects such as GATE, Mahout or Nutch and get directly involved in work with our clients. Note that the internship will be remunerated. To apply, email: jobs@digitalpebble.com


Now how about spending a summer in Madrid? Telefonica Research is looking for interns all year long. I heard a very impressive talk by Nuria Oliver at our big learning workshop at NIPS about research done at Telefonica Research. You can take a look at the slides here. In a nutshell, once you have mobile phone call data combined with geographical data you can make very interesting observations.



My avid reader alter0de sent me a link to internships at the Xerox Research Centre Europe: http://www.xrce.xerox.com/About-XRCE/Internships. Thanks!

23 comments:

  1. If you have any info about opportunities outside the USA those would be of interest too.

    Thanks.

    1. Smart Me Up also offers several internships related to machine learning. The company is located in France.
      See : http://www.smartmeup.org

  2. Hi Tim! Telefonica is in Madrid (Spain). I definitely know of some cool companies outside the States. If I get information about internships there I promise to update.

    Best,

    DB

  3. Xerox research has a couple of internships dealing with machine learning.

    http://www.xrce.xerox.com/About-XRCE/Internships

  4. We are a Pune, India based online display advertising startup and have a couple of machine learning internship positions as well.

    http://adelement.com/company/jobs

  5. Danny - Thanks a lot for your great post and connections. Based in Dresden, the heart of #SiliconSaxony, I will check my networks here for further opportunities.

    Side info: We are launching a high-tech innovation accelerator across disciplines here in Dresden. Anybody interested, please get in touch. State: #HTxA

  6. We are an NYC-based startup (www.styloot.com) that aims to produce visual search for women's fashion. We are hiring interns to solve challenging problems in the visual recognition space. info at stylewok

  7. How would I contact Ron Bekkerman for an internship at LinkedIn?

  8. His email is: rbekkerman@linkedin.com
    And tell him I sent u! :-)

    1. Thanks for posting the internships. I will contact Ron as well and tell him you sent me :)
      Also, good job with the GraphLab project!

  9. Hi Danny,
    I know this might be a long shot but if you hear about internships in ML in India, do put them up too.
    Thanks

  10. Wonderful post. Note that the Knewton link is not working now.

  11. How about a similar post for 2013 :)

  12. Hello! I am a junior statistics major at Carnegie Mellon University. Thank you so much for your updates. I am also looking for a data scientist/analyst internship this summer and am very interested in LinkedIn!

  13. Hi Danny,
    I recently graduated with an MS in Computational Applied Mathematics and Statistics from Stony Brook University.
    Do you know of any current opportunities in the Machine Learning/Statistics domain?

    Thanks

    1. Take a look at http://bickson.blogspot.com/2012/11/the-worst-coolest-machine-learning.html to find this year's opportunities. If you like, send me your CV and I can try to help...

  14. Hi
    I am a third-year undergraduate student at the Indian Institute of Technology, Kharagpur, India, looking for a 2-month internship in Machine Learning for summer 2014. Any suggestions?

    1. Hi Harsh,
      I am not aware of relevant internships - I guess that for graduate students it will be easier. But anyway, the field of big data and large-scale machine learning will be very hot for the foreseeable future, so I definitely recommend targeting this field.

  15. This comment has been removed by the author.

  16. Hi, I am a CS graduate student at Stony Brook University. I am looking for an internship in machine learning. Could you please suggest some current opportunities in this field?
    Thanks
    Akshay

  17. Hello Danny,
    I am an undergrad student at Birla Institute of Technology and Science, Pilani, India, pursuing my major in Electronics and Instrumentation. I have a strong background in machine learning and core concepts in data analysis. I have also completed the respective courses on Coursera. Could you please help me find an internship in this field, preferably in India, for summer 2014?

    Thanks
