Jean-François Gagné, the founder of leading AI company Element.AI, calculated that there are fewer than 10,000 people in the world currently qualified to do state-of-the-art AI research and engineering. Many of them are gainfully employed and hard to poach. Recruiting machine learning graduates is no easier: the head of a prominent Silicon Valley AI lab recently admitted to me that American universities graduate only about 100 new researchers and engineers each year who have the requisite skills to be hired.
The high demand for specialized AI talent coupled with the painfully low supply means that companies need to adopt fundamentally different recruiting strategies. The New York Times recently highlighted how freshly minted Ph.D.s and master’s students with just a few years of experience are paid $300,000 to $500,000 a year or more in salary and stock. Wealthier firms can afford to throw money at the problem by acqui-hiring AI startups at $1M to $5M per engineering head. Based on a study of public job listings among US employers, the top 20 AI recruiters, led by Amazon, Google, and Microsoft, spend more than $650 million annually to woo elusive researchers and engineers.
What should you do if you don’t have the deep pockets to go head-to-head against the Googles and Amazons of the world? We spoke to technical leaders from companies big and small who have successfully built strong machine learning teams despite being comparatively disadvantaged. First, make sure you’re not doing these seven things to scare off the AI talent you’re trying to hire. Then, whether you’re a new startup or an established enterprise looking to build out an AI team, these tips may help your company stand out in the noisy and competitive AI job market and hire the machine learning engineers you desperately seek.
Many companies struggle just to understand what “artificial intelligence” is, much less the myriad of titles, roles, skills, and technologies used by experts. Titles and descriptions for AI roles vary from company to company and standard terms are not well established in the industry. However, most of the AI / ML roles you encounter will resemble the following:
Machine Learning (ML) Engineers
These specialized engineers deploy models, manage infrastructure, and run operations related to machine learning projects. They manage databases and build the data pipeline and infrastructure necessary to productionize code (they make computer code usable by the end user). As their title indicates, they are “engineers” who apply machine learning and data science algorithms. You can consider them akin to structural engineers but in the business of building structure for data science.
Data Scientists
Data scientists typically work in an offline setting and do not deal directly with production experiences (what the consumer/end user sees). They focus on discrete problems using preexisting data to substantiate models. They tend to have Ph.D.s in data science or statistics, or backgrounds in computer science, math, and physics. Though not fluent in applied research, they usually work in statistics or research languages like R.
Researchers & Research Scientists
Researchers tend to focus more on greenfield technologies. Rather than tackle specific problems using data, they have the freedom to explore where the data leads. They establish hypotheses regarding particular use cases and build models to test them. They are less applied and closer to the fundamental scientists in university research labs.
Applied Research Scientists, Applied Research Engineers
These individuals straddle engineering development and research. They typically have backgrounds in data science and computer science. Depending on the individual, some are more on the research and analysis side, and others are more on the engineering side. They usually work in C++ and Scala.
Data & Distributed Systems Engineers
This role is not explicitly focused on machine learning, but is a vital complement to the ML team. Given the vast amounts of data and computational power required, most ML models face distributed scale issues. A talented infrastructure person can resolve challenges associated with large data sets, allowing researchers and data scientists to focus on their models rather than data infrastructure issues.
The composition of an ML project team varies depending on the nature and timing of the project. Projects in fundamental research require more data scientists and research scientists, whereas projects closer to production require more applied researchers and infrastructure engineers.
The skills for successful careers in ML are different from those in traditional software development. Software typically has structured tasks with well-defined timings of delivery and release. Once a program is completed, engineers usually roll off to another project. In contrast, machine learning is highly exploratory and experimental with less clear timelines. Algorithms require ongoing support, training, and feedback to perform optimally, and need someone to refine the model continuously. These are the essential qualities to look for when hiring AI talent.
Mathematical Aptitude: A background in mathematics and statistics is far more valued in ML than in traditional software engineering. Training ML models requires knowing which algorithms to apply and how to interpret and improve upon the results. For cutting-edge AI research positions, advanced mathematical intuition is a prerequisite for designing and developing novel methodologies.
Curiosity: Inquisitive people will constantly ask “why?”, a fundamental trait in training ML algorithms. Individuals in this space need to take abstract information and make sense of it through continuous experimentation. They need to enjoy learning new strategies and taking on new challenges.
Creativity: As the tools and methodologies are still relatively new, the ability to think through ideas and come up with novel ways to tackle a problem is highly valued. There will inevitably be many challenges that require new perspectives and solutions.
Perseverance: ML research is an ever-evolving pursuit. There are few simple answers and it may easily take months to train an algorithm successfully. A relentless and persistent individual will not give up and will continue trying new techniques until she can find a solution.
Rapid Learning: AI is evolving at a rapid rate and it is critical to keep up with the latest published literature. New algorithms mean that even those who have been doing AI work for years may not be up to date on the new strategies. Furthermore, given the lack of experienced AI talent, employers are broadening their hiring radius to trainable recruits. Companies are creating training programs to retrain existing engineers or bring on junior staff and train them.
Passion For Your Problems: “We get plenty of resumes from people with talented machine learning and data science backgrounds,” says Zhen Jiang, Lead Analytics Supervisor at Ford. “What I am much more concerned about is whether they have a passion for cars and mobility.” Talented engineers and researchers can go to any company in any industry that they want. Focus on finding the ones that are particularly excited by the unique problems you face and competitive datasets you own in your industry. Check whether they have done past research or projects related to your space, and seek out talent at topical events that attract enthusiasts and a more focused audience.
Cast A Wide Net
The prevailing strategy these days when looking to hire junior-level machine learning engineers is to cast a wide net. Companies tend to widen their scope beyond traditional engineering backgrounds into other sciences for talent. According to Cole Shiflett, Head of People Operations at technology company ThoughtSpot, most junior hires for AI work tend to be young and well-educated, but relatively inexperienced. “They don’t necessarily have that background or experience where we can easily go through a list of checkboxes.” Companies tend to look for adaptive learners with a commitment to tackling hard challenges. They look for people who have an interest in the space and are also excited to learn from more senior engineers.
Exploit University Partnerships
This preference for young and ambitious learners makes university partnerships another powerful means of recruitment. Companies will pitch ideas for and sponsor student projects. This mutually beneficial scenario allows businesses to identify top young talents and lets participating students experience machine learning work firsthand. It acts as a pipeline straight from academia to industry. For example, SnapLogic, a software company headquartered in San Mateo, California, sponsors projects at the University of San Francisco (USF). Successful students progress from academic work to a paid internship to eventual post-graduate employment. These partnership programs have become so popular that there are more companies proposing projects than students ready to staff them. The competition among companies is now intense: to attract student teams, they must offer exciting projects and articulate the benefit to the students.
Host A “Hackathon”
A hackathon is an event where people with technical backgrounds come together, form teams around a problem or idea, and collaboratively code a unique solution from scratch. Hackathons are increasingly used to identify top coding talent and quick-thinking creatives. ADP hosted successful hackathons at Georgia Tech and followed up with targeted recruiting events.
Recruit From Specialized Training Programs
To meet the rising demand for machine learning talent, many education programs have emerged to train junior talent and help them find job placements. Abhi Jha, Director of Advanced Analytics at McKesson, initially hired data science students from Galvanize, a technical skills training provider. “We’ve had a lot of success hiring from career fairs that Galvanize organizes, where we present the unique challenges we solve in healthcare,” he says.
Hiring experienced data scientists and machine learning researchers requires a different approach. For these roles, employers typically look for a Ph.D. or extensive experience in machine learning, statistical modeling, natural language processing, or related fields. Companies source this talent through network connections, academic papers, and academic conferences. To this end, many companies partner with universities or research departments and sponsor conferences to build their brand reputation.
Companies also host competitions on platforms like Kaggle, providing a problem, a dataset, and a prize purse for competing teams. This is a way to get international talent working on a problem and to build a reputation as a company that supports AI.
As with any industry, A-level AI talent attracts other A-level AI talent. Dominant tech companies build strong AI departments by hiring superstar leaders. For example, Google and Facebook drew university professors such as Geoffrey Hinton, Fei-Fei Li and Yann LeCun with plum appointments and endless resources. These professors either take a sabbatical from their universities or split their time between academia and industry.
Retrain Existing Engineers
Many companies are updating the skills of current employees rather than going on the hunt for new talent. What existing engineers lack in AI and ML training, they might make up for in loyalty and company knowledge. There are many ways to update an employee’s skill set. Larger firms can offer corporate training programs. Both Google and Facebook are doing this, and even their internal programs are full. Smaller businesses bring in external trainers. Alternatively, some companies provide employees paid access to extended education courses offered by online platforms like Coursera and Udacity or by local universities. Apprenticeships are another way for engineers to add to their existing skill sets. Mentoring programs with more senior engineers or with data scientists can lead to fruitful partnerships.
Find Third-Party Solutions
Despite every effort to build your own in-house machine learning talent, the process is likely to be onerous and slow. To meet business needs in the near term, consider evaluating third-party solutions built by vendors who specialize in applying AI to enterprise functions. Both startups and established enterprise vendors offer solutions to address common pain points for all departments, including sales and marketing, finance, operations and back-office, customer support, and even HR and recruiting.
At the end of an interview cycle, a strong AI candidate will typically have multiple offers in hand. To close the candidate, a company needs to differentiate itself. In addition to compensation, culture, and other general fit criteria, AI talent tends to evaluate offers in the following areas.
Availability of data. Candidates want to be able to train their models with as much data as possible. The data should go back many years if possible and be real rather than inferred data.
Quality of data. The data is ideally clean and annotated. In a recent survey of data scientists, 57% reported that data cleaning was the least enjoyable part of their job and that it accounted for 60% of their time.
Diversity of problems. Companies with smaller data stores can appeal to applicants’ intellectual side by offering different challenges to solve.
Quality of the team. A-level players want to work with other A-level players. Offering junior candidates the opportunity to work with established experts or offering experts the best and brightest of the new recruits appeals to both parties.
Impact of work. Candidates want to see their work have meaningful impact and be core to a business’s success in a reasonable timeframe. Small companies that move quickly from idea conception to algorithm development can show the impact of work faster. Larger companies with millions of customers can show impact by the number of people an algorithm affects.
This article was originally posted on TOPBOTS on November 1, 2017.