Warning! This is an old version of the page, written on 2016-05-04 21:58:51.

All aboard the deep learning train: Alien Labs

Cosmin Negruseri
05 May 2016

I've been thinking about deep learning ever since I saw Andrew Ng's talk at Google back in 2013, but somehow never made much progress other than talking to people.

Deep learning has recently (around 2010-2012) become very effective across a large set of problems where progress had been very slow. It would be a shame not to spend some time on it.

I think the infoarena community should move some of its focus from solving algorithm puzzles to machine learning, and deep learning in particular.

Here are some of my notes on deep learning:

Anecdotes:

  • Google voice was unusable for me a few years ago; now it gets my bad English accent.
  • DeepMind f*king solved Go, a game that 2 years ago was thought to be 10 years out of reach.
  • Object detection now runs in real time. One can use it for self-driving cars, and the radar solution might be getting outdated.
  • Baidu is working on speech recognition for Mandarin (lots of illiterate people with phones).
  • Word embeddings give you natural language models that do away with the tweaks one used to encode all sorts of idiosyncrasies in the English language.

OpenAI CTO: "As Ilya likes to say, deep learning is a shallow field - it's actually relatively easy to pick up and start making contributions."

  • short learning curve, deep learning techniques started to be effective recently (2010? 2012?)
  • previous experience in the old techniques is not very relevant
  • a home setup is enough to start (though a Google datacenter of GPUs would help)
  • state of the art results significantly improving on the previous state of the art
    (in real time object detection, speech recognition)

Interesting concepts:

  • Word2Vec - a neat idea that maps words to points in n-dimensional space. You can then do algebra on the vector representations:
    vec(“king”) – vec(“man”) + vec(“woman”) =~ vec(“queen”), or vec(“Montreal Canadiens”) – vec(“Montreal”) + vec(“Toronto”) resembles the vector for “Toronto Maple Leafs” (Mikolov, Tomas; Sutskever, Ilya; Chen, Kai; Corrado, Greg S.; Dean, Jeff (2013). Distributed representations of words and phrases and their compositionality)
  • Backpropagation - I see it as a dynamic programming technique that fits the neural network setup well: it computes the partial derivatives one needs when running gradient descent, reusing intermediate results instead of recomputing them
  • Convolutional neural networks - a convolutional layer reuses the same k weights at every position instead of having k^2 weights between 2 fully connected layers (this weight sharing makes sense for image input and makes the algorithms much faster)
  • Rectified Linear Units (ReLU), f(x) = max(0, x), work much better than the historically used sigmoid and hyperbolic tangent functions
  • Dropout - some neurons are ignored with a set probability, inspired by how the neurons in the brain fire with some probability
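The word vector algebra from the Word2Vec note can be tried out on a toy example. This is only a sketch: the 3-dimensional vectors in TOY_VECTORS are made up by hand for illustration (real Word2Vec vectors are learned from text and have hundreds of dimensions), and the helper names cosine and analogy are my own, not part of any library.

```python
import math

# Hypothetical hand-written "embeddings", NOT trained Word2Vec vectors.
# Loosely, the three coordinates stand for (royalty, male, female).
TOY_VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c, vectors=TOY_VECTORS):
    """Return the word closest to vec(a) - vec(b) + vec(c), skipping a, b and c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

print(analogy("king", "man", "woman"))  # prints "queen" for these toy vectors
```

Skipping the three input words when searching for the nearest neighbour is a common convention for this kind of analogy query; with real embeddings the result is the closest word, not an exact match.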
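To make the backpropagation and ReLU notes concrete, here is a minimal sketch on the smallest network I could think of: one input, one ReLU hidden unit, one output, and a squared error loss. The network shape, the parameter values and the function names are all assumptions chosen for illustration. The backward pass reuses the values cached by the forward pass (the dynamic programming flavour), and a finite-difference check confirms the derivatives.

```python
def relu(z):
    return max(0.0, z)

def forward(params, x):
    """Forward pass; returns the intermediate values so backward can reuse them."""
    w1, b1, w2, b2 = params
    z = w1 * x + b1   # hidden pre-activation
    h = relu(z)       # hidden activation
    y = w2 * h + b2   # network output
    return z, h, y

def loss(params, x, target):
    _, _, y = forward(params, x)
    return 0.5 * (y - target) ** 2

def backward(params, x, target):
    """Backpropagation: apply the chain rule from the output back to the input."""
    w1, b1, w2, b2 = params
    z, h, y = forward(params, x)
    dy = y - target                     # dL/dy
    dw2, db2 = dy * h, dy               # gradients of the output layer
    dh = dy * w2                        # dL/dh, reused below
    dz = dh * (1.0 if z > 0 else 0.0)   # ReLU passes gradient only when z > 0
    dw1, db1 = dz * x, dz               # gradients of the hidden layer
    return [dw1, db1, dw2, db2]

def numeric_grad(params, x, target, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up, x, target) - loss(down, x, target)) / (2 * eps))
    return grads

params = [0.5, 0.1, -0.3, 0.2]   # arbitrary starting weights: w1, b1, w2, b2
x, target = 2.0, 1.0

analytic = backward(params, x, target)
numeric = numeric_grad(params, x, target)
print(analytic)
print(numeric)

# One gradient descent step with a hypothetical learning rate of 0.1.
new_params = [p - 0.1 * g for p, g in zip(params, analytic)]
print(loss(params, x, target), loss(new_params, x, target))
```

The two printed gradient lists should agree to several decimal places, and the loss after the step should be lower than before it. In a deeper network the same pattern repeats layer by layer, which is why caching intermediate results pays off.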
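The weight sharing in the convolutional layer note can also be sketched in a few lines. This is a 1-D toy under my own assumptions (signal values, a 2-weight kernel, and the helper names conv1d and dense are all made up for illustration): the same kernel is slid over the input, and the equivalent fully connected layer needs a whole matrix that is mostly zeros and repeats the same few numbers.

```python
def conv1d(signal, kernel):
    """'Valid' 1-D convolution without kernel flipping (cross-correlation)."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def dense(signal, weights):
    """Fully connected layer without bias: one row of weights per output."""
    return [sum(w * x for w, x in zip(row, signal)) for row in weights]

signal = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]
kernel = [-1.0, 1.0]             # shared weights: a simple difference detector

conv_out = conv1d(signal, kernel)
print(conv_out)                  # [1.0, 2.0, 3.0, 4.0, 5.0]

# The same mapping written as a dense layer: a banded matrix that repeats
# the kernel on every row, shifted by one position each time.
n_in, n_out = len(signal), len(conv_out)
weights = [
    [kernel[c - r] if 0 <= c - r < len(kernel) else 0.0 for c in range(n_in)]
    for r in range(n_out)
]
dense_out = dense(signal, weights)
print(dense_out == conv_out)     # True

conv_params = len(kernel)        # 2 shared weights
dense_params = n_in * n_out      # 30 independent weights
print(conv_params, dense_params)
```

The convolution has 2 parameters to learn regardless of the input length, while the dense layer's parameter count grows with the product of input and output sizes.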

Mircea Pasoi and Cristian Strat, after their successful stint at Twitter, recently founded Alien Labs. They use deep learning to build intelligent chat bots for an office environment. This is an awesome opportunity to work together again; it's been 8 years since we last did. At Google, we worked on an ads inventory management problem using network flows. Our claim to fame is that we got help from Cliff Stein, the S in CLRS :). This is also an opportunity for me to jump on the deep learning train and tackle real world problems.

Yesterday I started at Alien Labs, based in San Francisco. The first thing I'm going to work on is figuring out if two questions are similar. This should be fun!

I'll follow up with a few posts on deep learning that may encourage you to try it if you haven't already.
