How To Find and Hire Data Scientists


So you’re building a data science team. That’s great news! For a business leader, finding qualified data scientists is a critical step toward harnessing big data and machine learning technologies for competitive advantage. But it’s also a process fraught with difficulty and pitfalls. We reached out to data science leaders to get their thoughts on the matter.

One of the most important steps to building a successful data science team is hiring a senior data scientist who can lead the further development of the data science team, says Seth Dobrin, who heads up IBM’s Data Science Elite Team.

“Until you get a credible senior person in your organization that’s a data scientist, it’s hard to get others to come on board,” says Dobrin, a PhD with more than 20 years of experience in data science fields. “There are some clients that just can’t find talent.”

Dobrin was hired by IBM two years ago to build out the Data Science Elite Team, a new endeavor in which IBM data scientists work with organizations in six- to 12-week engagements on data science and AI projects. The service is free to customers, although there are some requirements (like your willingness to serve as a public reference).

After travelling the world to meet with IBM clients for a year, Dobrin set out to assemble the team, which currently consists of 60 data scientists, machine learning experts, and others with related expertise. Dobrin is currently looking to hire 30 more data scientists this year – which means he might be competing with you.

Respecting Elders

In Dobrin’s view, hiring a senior data scientist signals to other prospective data scientists that the company is serious about data science and AI, and isn’t just jumping on the big data bandwagon. The newly hired leaders will also be able to exercise their own professional networks to fill out the data science team, much as Dobrin has done himself.

“The hardest part is getting that first person in who has that deep network, who can bring in additional talent,” he says. “We all work for people. We don’t work for companies. We go change jobs to go work with someone, not necessarily to work for a specific company.”

In some situations, the company will rely on the senior data scientist for setting its data science and AI strategy. Ideally, however, the company will already have ideas where they want to apply data science and AI technologies and techniques, and the senior data scientist is brought in to execute those ideas with process and rigor.

“Ideally it comes from the top,” Dobrin says of the idea-creation process. “In an ideal situation, it’s the CEO. That’s a rare situation though. Usually it’s one or two people who get it. It’s the CIO or CFO or CMO who gets it, that starts pushing us and starts driving it within the company and getting the resources.”

If a company is struggling to come up with the big idea, there are plenty of consultancies that can help with that. System integrators like Deloitte, KPMG, E&Y, and PwC all have large staffs of data scientists and others who are expert at analyzing business models and figuring out where data can give them a boost.

Headhunters can also be hired to bring in an experienced data scientist to get things started. “It is a little bit of a chicken and egg problem if you don’t know how to build that value proposition,” Dobrin acknowledges.

Dobrin used familiar tools and channels to work his network, including phone calls, email, and LinkedIn. Getting the right job description on job boards is critical to clearly communicating the role, and acting quickly on nibbles is also important to hooking the big fish.

“If you take six weeks to go through an interview process from first contact to offer, you’re going to lose people,” he says. “My goal is 10 days.”

Skills Matter

When it comes to specific technical skills, there are a few areas of expertise that are absolutely critical, such as Python. If you don’t know Python in this day and age, you had better have a hard-to-find talent in some other area. Apache Spark has become a critical tool in many data scientists’ toolboxes, so being familiar with how to use it is important. R is still a popular language for data science too.

On IBM’s Data Science Elite Team, XGBoost has become the go-to algorithm for traditional machine learning problems, thanks to its power, tunability, and forgiveness of overfitting, according to Dobrin. “There’s a constant barrage of new tools, methodologies, and packages that are out there that people just need to be up to date on,” he says.

Graduating from a data science bootcamp is a good start, but it’s not enough to consider yourself a full-fledged data scientist, says Pedro Alves Nogueira, who heads up the data science and AI business at Toptal.

“There are not a lot of people in the market with proven experience,” says Nogueira, who has a PhD in AI, human-computer interaction and affective computing from the University of Porto in Portugal. “Doing a bootcamp on AI and data science is probably not going to be enough for you to be a data scientist. It’s good enough for you to learn a skill…but it’s not going to give you the basic mathematical knowledge.”

Toptal prides itself on having the top 3% of talent – hence “Toptal” – in a given field of development. The company started by offering developers on demand for general application and Web development. As more clients looked to Toptal for data science and AI expertise, the company decided to formalize its data science and AI business by creating a dedicated department.

Education Rules

While data science is becoming more automated by the day, it’s still critical that a data scientist know how machine learning models work at a deep level and be able to build them by hand if necessary, Nogueira says.

“We allow developers to use whatever technology they want,” Nogueira says. “What we’re most interested in is having the fundamental ability to understand the models and to implement them from scratch.”
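To make "implement them from scratch" concrete, here is a minimal sketch of the kind of exercise a candidate might be asked to do: a one-feature logistic regression trained with batch gradient descent, using only the Python standard library. The function names, toy data, learning rate, and epoch count are illustrative assumptions, not anything Toptal has described.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, lr=0.1, epochs=2000):
    """Fit weight w and bias b by batch gradient descent on the log loss.

    xs: list of single-feature inputs; ys: list of 0/1 labels.
    lr and epochs are arbitrary illustrative settings.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of log loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Return the predicted class (0 or 1) for input x."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Toy, made-up data: small values labeled 0, large values labeled 1.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic_regression(xs, ys)
print(predict(w, b, 1.0), predict(w, b, 3.5))
```

Writing the gradient by hand, rather than calling a library, is what exposes whether someone understands why the update rule works.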

Becoming one of Toptal’s data scientists or AI experts is a rigorous process, Nogueira says. The first step is ensuring that the prospective data scientists are proficient in English, which is important considering that the company heavily recruits from eastern and southern European universities. Next, they must prove their mathematical chops by solving a series of ML and AI problems.

“Then somebody on existing screening team, which is himself or herself a senior AI or data science developer who has been working with us for two years, [takes the prospect] through a live coding session,” he says. “Then you have to do a two-week sample project that you have to present to us as if we’re the client. We spend a lot of energy and time to make sure they really know what they’re talking about.”

Ultimately, pairing a Toptal data scientist with a specific client takes careful analysis of the business outcomes that are sought and the capability of the worker to fulfill the technical requirements.

“It’s not just about building models,” Nogueira says. “It’s about knowing what you’re building and making sure what you’re building is intelligible to people who are going to be using it, and that it is solid and useful for the business itself.”

Once you find a good data scientist, retaining them is also important. Providing good data science problems that impact the bottom line is arguably the best way to keep them around. Giving them the freedom to learn new technologies and techniques is also important. Of course, offering them a competitive salary and good benefits is critical too.

IT managers look to deploy BI, analytics in the cloud

IT managers are looking to position BI and analytics in the cloud.

That’s according to TechTarget’s 2019 IT Priorities Survey, which asked a total of 624 IT professionals from a wide range of industries in North America about what’s on their to-do lists.

The survey posed the following question to respondents: Which of the following applications are you most likely to deploy in the cloud this year? Among the 231 IT professionals who responded to this question, the top response was BI/analytics (27%), followed by customer relationship management (23%) and big data platform/enterprise data warehouse/data lake (21%). In last year’s survey, respondents said they were most likely to deploy CRM (34%), ERP (29%) and business process management (27%) in the cloud.

Respondents who plan on deploying data warehouses, data lakes, BI and analytics in the cloud this year are in alignment with a growing enterprise trend, according to experts.

Jen Underwood, senior director at machine learning software vendor DataRobot, said the results are “not at all surprising.”

“Analytics in the cloud is usually ahead of on-premises offerings,” Underwood said. “With rapid weekly updates, on-demand scale, speed and ease of simply getting things done, cloud is a no-brainer for many organizations outside heavily regulated industries.”

Application deployments planned for 2019

Isaac Sacolick, president of StarCIO and author of Driving Digital, said that to drive efficiencies, improve customer experience and target optimal markets, organizations are going “from centralized BI functions to more distributed analytics teams supported by citizen data scientists using self-service BI tools.” Deploying BI and analytics in the cloud is a sensible next step in that transition.

“Cloud offerings enable organizations to quickly and more easily ramp up BI tool usage, provide access to more data sets and scale usage of produced analytics with less effort by IT to enable and support infrastructure,” Sacolick said. “IT is then better poised to partner with the business on data governance, integration and modeling initiatives that fuel ongoing analytics needs.”

Underwood and Sacolick aren’t alone in their thinking. Feyzi Bagirov, data science advisor at a B2B data insight vendor, said he also is seeing more organizations deploying BI and analytics in the cloud, but that the trend is still in the early stages. He cited 2018 Gartner research that found on-premises deployments still dominate globally, ranging from 43% to 51% of deployments.

Data governance, predictive analytics are priorities

The 2019 IT Priorities Survey also asked respondents what information management initiatives their companies will deploy in 2019. Among the 215 IT professionals who responded to this question, the top response was data governance (28.8%), followed closely by predictive analytics (27.9%) and data integration (27.9%). Big data platform/enterprise data warehouse/data lake (26.5%) rounded out the top four.
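As a rough sanity check on the figures above, the percentages can be converted back into approximate respondent counts. This is a sketch only: the counts it prints are estimates implied by the reported shares of 215 respondents, not numbers published in the survey, and the variable names are illustrative.

```python
# Reported shares (percent) of the 215 respondents to this question.
respondents = 215
shares = {
    "data governance": 28.8,
    "predictive analytics": 27.9,
    "data integration": 27.9,
    "big data platform/EDW/data lake": 26.5,
}

# Estimated head counts: an inference from the percentages, not survey data.
estimated_counts = {
    name: round(respondents * pct / 100) for name, pct in shares.items()
}

for name, count in estimated_counts.items():
    print(f"{name}: about {count} respondents")

# Gap between the first and fourth priority, in estimated respondents.
gap = (estimated_counts["data governance"]
       - estimated_counts["big data platform/EDW/data lake"])
print(f"gap between first and fourth: about {gap} respondents")
```

Seen this way, the top four initiatives are separated by only a handful of respondents, which is worth keeping in mind when reading the ranking.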

Bagirov said he thinks these results more or less align with enterprise trends. He said that priorities may vary by industry — companies in the financial sector might be more inclined to push data lake initiatives, for example.

Data governance and integration will top IT professionals’ objectives this year, Bagirov said. “Those are the steps that are essential before predictive analytics can be scaled up,” he said.

Management initiatives planned for 2019

As for Underwood, she said the European Union’s recent rollout of GDPR likely influenced data governance’s top placement in the survey. Governance probably won’t be as prominent next year, though, she said.

“In my machine learning and artificial intelligence work … I am seeing early adopters achieve astounding results that I have never seen happen throughout my entire 20-plus-year analytics career,” Underwood said. “The artificial intelligence gap is already being exploited as a game-changing competency for competitive advantage in the algorithm economy. As a result, I forecast predictive analytics to be No. 1 on your ranking next year. Artificial intelligence and automation is changing analytics as we know it today.”

AI, cross-industry collaboration will continue to reshape healthcare in 2019, Optum says

At HIMSS19, cross-industry collaboration and adoption of emerging technologies are two health IT trends that will be the focus of Optum, a health IT vendor whose products include analytics, population health and pharmacy care.

Health systems and physician practices have a long history of information-sharing to support population health goals and improve the patient experience, said Mark Morsh, vice president of technology at Optum, and that trend will only accelerate in 2019.

Practical results of emerging tech

“In the past year, we have seen a concerted effort to convene stakeholders on emerging technology capabilities like blockchain and AI/advanced analytics, which can improve the effectiveness and efficiency of care,” said Morsh.

“This year, our presence at HIMSS shows how emerging technology can have practical results, taking a closer look at personalized medicine, AI and machine learning, and IoT, and how these advancements can improve financial performance and population health, enable risk-based reimbursement programs, and modernize the military and veterans’ health systems,” he added.

Those might appear to be lofty goals, but technology has improved how stakeholders in healthcare safeguard, interpret and share a trove of healthcare data. The industry often focuses on the power of technology, he said, but it’s really about clinicians’ ability to apply it in ways that allow them to practice at the top of their license.

A standout 2019 health IT trend for Morsh is the fast pace at which healthcare organizations are adopting emerging technologies, such as artificial intelligence and advanced analytics.

“We recently surveyed 500 healthcare executives, and three out of four indicated they’re in the process of, or are going to, implement an AI strategy,” he said. “More than 91 percent of respondents also expressed confidence that their organization will see a full return on investment in about five years. We have not seen that kind of momentum since the wave of electronic health record adoption earlier this decade.”

The proliferation of data, including social determinants paired with more traditional claims and clinical data, and the ability to mine that information with new tools, is yielding new insights about cost, quality and access, and opportunities to improve, he added.

“What I’m most excited about is the way this can move us closer to a denial-free future where care is personalized and where we can redeploy resources to improve the experience of patients and families,” he said.

Let technology do it

Morsh believes people have reached a point where they are more comfortable allowing technology to augment their day-to-day experience and enhance their decision making.

“Although there are some generational differences, think of how quickly healthcare organizations have moved to adopt devices like tablets and smartphones,” he noted. “The reality now is that technology is central to the care delivery experience for providers and patients and the proliferation of devices, data and connectivity is changing how individuals work and receive care.”

On a related front, healthcare faces a talent shortage, especially when it comes to data science, he said.

“It’s difficult for health systems and physician practices that are worried about margins and changing reimbursement to compete for talent in an increasingly competitive and expensive global marketplace,” Morsh explained. “That’s one reason we are seeing a renewed emphasis on industry collaboratives and vendor partnerships to solve large-scale technology strategy and delivery needs.”

It’s also a reason that organizations are looking for practical applications of data and analytics – to make them more efficient and increase overall performance, he said.

Advice for HIMSS19 attendees

Asked what he would advise HIMSS19 attendees, Morsh stuck with his big theme: collaboration and AI.

“Collaborate for a shared vision: Healthcare is a very specialized industry and evidence-based decision making is at the core of our DNA,” he said. “For that reason, it’s really important to build the right team of multidisciplinary professionals. Sometimes you’ll be able to find the right skill set within your organization, but more often than not, success requires bringing in individuals from the outside, who can share a fresh perspective and keep you current with market movements.”

That said, Morsh recommends that healthcare organizations make sure the organizations they partner with have enough grounding in healthcare to understand their business models and operating environment so they don’t end up with technology investments that have little practical value.

“And focus on AI with ROI,” he added. “To be successful in a business setting, investments in advanced analytics and AI must have a defined objective and align with your overall technology strategy. One of the most interesting developments in healthcare is the more applied uses of artificial intelligence, like natural language processing for revenue cycle management or deep learning models that support disease prediction.”

These use-cases must be backed by a business case and defined problem where technology can augment the human – patient, provider or administrator – experience, he said.

Twitter: @SiwickiHealthIT

There is enough appetite for innovation: Craig Stires of Amazon Web Services

Big Data, Internet of Things (IoT), Artificial Intelligence (AI) or Machine Learning (ML) are buzzwords in tech circles these days, but are CIOs in sync with these technologies? Especially in emerging economies such as India? “We see CIOs wanting to make investments that are going to fundamentally change the direction of their companies. Whether you hear Big Data, IoT, or AI/ML, there is some really interesting technology behind these and there are some really interesting methodologies that have come to the forefront,” Craig Stires, Head of Analytics, AI, Big Data at Amazon Web Services, Asia Pacific, tells Sudhir Chowdhary in an interview. Excerpts:

Are investments really happening in some of these niche technologies?

CIOs, today, aren’t just investing in Big Data to understand its underlying potential. They believe that connected devices have something really interesting in helping them listen to their customers’ needs. Previously, they may have had to spend extensively to get started on a project. With AWS, instead of having to lay out millions of dollars just to get started, they can simply pull in a month of connected device data on demand, test and see if they are on the right track and do something that’s fundamentally right for their business. The reality is that the information and the technical processes that are available to CIOs today may not have been available five years ago.

What is your vision for helping businesses on their digital transformation journey?

One of the things that we really encourage is that each customer’s starting point should be different. There is a methodology that we use at Amazon when we innovate. It’s called the ‘working backwards from the customer’ approach. Everything we do starts with our customers and works backwards from there. Roughly 90-95% of our road map is driven by what our customers tell us. We start with the specific customer outcome, working backwards to a minimal technology implementation to start with and then, only scaling it if it is delivering results. We are trying to build relationships with our customers that outlast us all.

How is the data analytics market turning up in emerging markets such as India?

If we segment the Indian market based on domestic opportunities, and the ability to serve international markets, there are two different speeds that are running within India. On the domestic opportunities front, there is a vast amount of information that people are creating and there is an opportunity to provide personalised experiences for customers, where earlier, there might not have been enough information about each customer.

If you look at what is doing in this space now—it is able to provide personalised match-making services, driven from the ability to observe not just the behaviours on the website but also looking at some larger group dynamics. Then, there are companies that service the global markets, like Punchh which competes with all the big players, using the most advanced machine learning algorithms and the most advanced fully-managed Hadoop services and Data Warehousing. Therefore, I believe that data analytics is relevant for the companies that are operating in the domestic market as well as for the Indian companies that are servicing global markets.

To what extent do you think businesses are adopting Big Data or IoT?

What tends to happen when enterprises adopt an emerging technology or methodology like Machine Learning, Big Data or IoT, is that there is going to be one part of the organisation that does really well. Let’s take the telcos, for example. They have been struggling to fully adopt these new technologies because sometimes, they look at it as trying to solve a million-dollar problem when they should be thinking about billion-dollar problems. So, these innovation projects have had a hard time flourishing in the industry. However, we are seeing faster adoption even though large organisations have traditionally not been able to implement these too quickly.

There is enough appetite for innovation. The digitally native or emerging enterprise customers tend to adopt the new technologies faster.

Where is the demand for data analytics coming from?

I cannot find an industry that doesn’t have the appetite for it. For instance, retail organisations are completely driven by purchase behaviour, so they move faster. Banks have only been waiting on regulations to be updated. Now that more and more regulatory agencies are saying that they understand how to move safely to the cloud, banks have also started to move quickly.

If you look at the National Australia Bank (NAB), they have built four Cloud Guilds so they don’t just have a few people who are building, they are now enabling the whole workforce to think and build. The adoption is even expanding across other industries, like the construction industry, which has always been manually driven. They have all realised that in order to evolve they have to start adopting new technologies. So we see adoption across all industries, especially the ones that are consumer driven.

Elected Leaders Need Operations Research and Analytics to Deliver Better Results from …

With midterm elections behind us and the 116th United States Congress ahead of us, Washington has a unique opportunity to advance operations research (O.R.) and analytics as a priority to enhance how the federal government implements public policies, makes critical decisions, and conducts its day-to-day operations. What could be more important for our elected leaders than using the best data and insights to make a positive difference for the American people and to strengthen our position vis-à-vis our nation’s allies and enemies?

In an era where we have vast amounts of data readily available, the challenge for government is not to access that information, but rather to understand what the data is revealing and how best to act on it. Misunderstanding or misapplying data has serious repercussions. It is imperative that government decision-makers at all levels have the advanced scientific tools necessary to make the right decisions using the right data.

One of the best proven ways to do this is through the use of O.R. and analytics. O.R. and analytics are the application of advanced mathematical tools that enable organizations to turn complex challenges into substantial opportunities. These powerful tools do not merely evaluate existing solutions to problems. More usefully, they relentlessly seek solutions that provide the best possible outcome, and thereby deliver prescriptive value to decision-makers. They do so by structuring data into solutions and insights for making better decisions which offer improved results.

In summary, O.R. and analytics offer policymakers proven scientific and mathematical processes that save lives, save money and solve problems. Some recent examples include:

  • Lieutenant Colonel Christopher E. Marks of the U.S. Army and colleagues, Tauhid Zaman of the Massachusetts Institute of Technology and Jytte Klausen of Brandeis University, found a way to identify extremists—such as those associated with the terrorist group, ISIS—by monitoring their social media accounts and identifying them even before they post threatening content;
  • The Centers for Disease Control and Prevention eradicated the last pockets of the Wild Polio virus around the world;
  • The Federal Communications Commission completed the world’s first two-sided auction of valuable low-band electromagnetic spectrum, contributing more than $7 billion to reduce the federal deficit;
  • The Federal Aviation Administration deployed the Airspace Flow Program to improve air traffic management and reduce flight delays, saving hundreds of millions of dollars; and
  • The Transportation Security Administration (TSA) created the PreCheck passenger screening program—a risk-based aviation security policy that affects more than one million passengers in the U.S. every day—enabling low risk passengers to utilize expedited airport security screening, saving the federal government one-third of a billion dollars every year.

The power of O.R. and analytics also extends to state and local governments, including:

  • The City of Philadelphia redesigned districts for its council members based upon the 2010 census to increase public engagement and minimize gerrymandering;
  • The Pennsylvania Department of Corrections transformed the complex process of assigning inmates to one of its 25 facilities, reducing a week of seven employees’ time to less than 10 minutes; and
  • The New York City Police Department created the Domain Awareness System, a robust network of sensors, databases, devices, software and infrastructure that informs a variety of tactical and strategic decisions which officers make every day, saving at least $50 million per year.

Despite these outstanding achievements, more can and must be done to expand the use of O.R. and analytics at all levels of government. As a university professor and an operations research professional who has spent his entire career working to improve and expand this field, I am certain that expanding the use of O.R. and analytics would make significant contributions to solving our nation’s problems.

There is broad, bipartisan agreement that various federal government programs and missions continue to operate under antiquated methods, accrue excessive costs and perform inefficiently. I propose O.R. and analytics as an original, powerful and proven approach to fix these problems. The correct application of O.R. and analytic principles to compelling public policy issues will provide robust insights that inform sound, reasonable and bipartisan policy making and program implementation.

Applying these practices to a wide swath of policy areas—such as predicting future outbreaks and pandemics, influencing the design and construction of tomorrow’s communities, building new, cutting-edge transportation and power generation systems, and preventing gun violence through advanced social media monitoring and interventions—will fundamentally improve the government’s fulfillment of its public service mission.

For seven decades, O.R. and analytics have delivered proven and profound value in the private sector with documented savings amounting to many billions of dollars. Now, it is time for O.R. and analytics to become the driver of a higher level of efficiency in Washington, DC.

About the Author

Nicholas G. Hall was the 2018 president of INFORMS. With 12,500 members from nearly 90 countries, INFORMS is the largest international association of operations research and analytics professionals. He is also the Fisher College of Business Distinguished Professor at The Ohio State University. He holds a Ph.D. (Management Science, University of California, Berkeley, 1986), and B.A., M.A. (Economics, University of Cambridge). His research and teaching interests include project management, scheduling and pricing. He has published 82 articles in Operations Research, Management Science, Mathematics of Operations Research, Mathematical Programming, Games and Economic Behavior, Interfaces and other journals. He has served as Associate Editor of Operations Research (1991–) and Management Science (1993–2008). His 335 presentations include 11 keynote addresses, 8 INFORMS tutorials and 98 invited talks in 23 countries.
