Vadim Markovtsev

Source Code Abstracts Classification Using CNN

Convolutional neural networks (CNNs) are becoming the standard approach to many machine learning problems, usually ones involving images, audio, or natural language data. At source{d} we are trying to apply both common and novel deep learning patterns to a very different kind of input: software developers and their projects. We are at the beginning of this fascinating journey, but we already have something to share. In this talk I will present the parts of our SourceNN deep neural network that enable classification of short source code fragments (50 lines) taken at random from several projects. The input features are extracted by a syntax highlighter and look similar to the minimaps in source code editors.
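To make the minimap idea concrete, here is a minimal sketch of how a code fragment might be turned into an image-like grid of token classes. It is not the SourceNN feature extractor: it assumes Python's standard `tokenize` module as a stand-in for a syntax highlighter, and the `CLASSES` table, the `minimap` function, and the 50 x 80 grid size are hypothetical choices made for illustration only.

```python
import io
import keyword
import tokenize

# Hypothetical token classes; the real feature set is not given in the abstract.
CLASSES = {"blank": 0, "name": 1, "keyword": 2, "number": 3,
           "string": 4, "comment": 5, "op": 6}


def minimap(source, height=50, width=80):
    """Return a height x width grid of token-class ids for a Python fragment.

    Each cell corresponds to one character position; cells not covered by a
    classified token stay 0 ("blank").
    """
    grid = [[CLASSES["blank"]] * width for _ in range(height)]
    # Note: tokenize may raise on fragments that are not valid Python on
    # their own; a real highlighter is more forgiving.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            cls = "keyword" if keyword.iskeyword(tok.string) else "name"
        elif tok.type == tokenize.NUMBER:
            cls = "number"
        elif tok.type == tokenize.STRING:
            cls = "string"
        elif tok.type == tokenize.COMMENT:
            cls = "comment"
        elif tok.type == tokenize.OP:
            cls = "op"
        else:
            continue  # skip NEWLINE, INDENT, DEDENT, ENDMARKER, ...
        (start_row, start_col), (end_row, end_col) = tok.start, tok.end
        if start_row > height:
            continue  # token lies below the grid
        if end_row != start_row:
            # Simplification: paint only the first line of a multi-line token.
            end_col = width
        for col in range(start_col, min(end_col, width)):
            grid[start_row - 1][col] = CLASSES[cls]
    return grid


fragment = "x = 1  # hi\nif x:\n    pass\n"
grid = minimap(fragment)
```

A grid like this could be treated as a single-channel image (or one-hot encoded into several channels) and passed to a convolutional classifier, which is one plausible reading of the minimap-like input described above.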

Currently Vadim is a Senior Machine Learning Engineer at source{d}, where he works on DNNs that aim to understand all of the world's developers through their code. Vadim is one of the creators of the distributed deep learning platform Veles, built while he was working at Samsung. Afterwards, Vadim was responsible for the machine learning efforts to fight email spam at Mail.Ru. In the past, Vadim was also a visiting associate professor at the Moscow Institute of Physics and Technology, where he taught about new technologies and ran ACM-like internal coding competitions. Vadim is also a big fan of GitHub (vmarkovtsev) and HackerRank (markhor).
