TDWI Blog

TDWI Blog: Data 360

Growing Use Cases for Learning R and Python

By Fern Halper, VP Research, Advanced Analytics

There was a time when choosing a programming language for data analysis involved essentially no choice at all. The tools were few, and they were usually developed and maintained by individual corporations. Those vendors ensured a reliable level of quality, but their tools could be difficult to work with, and the vendors could be slow to fix bugs or add new features. The landscape has changed, though.

Thanks to the Web, the open source software development model has shown that it can produce robust, stable, mature products that enterprises can rely upon. Two such products are of special interest to data analysts: Python and R. Python is an interpreted, interactive, object-oriented scripting language created in 1991 and now available through the Python Software Foundation. R, which first appeared at roughly the same time, is a language and software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing.

Each comes with a large and active community of innovative developers and offers enormous resources through readily available libraries for analytics and processing: NumPy, SciPy, and scikit-learn for data manipulation and analytics in Python, and purrr, ggplot2, and rga in R. These features can go a long way toward speeding time-to-value for today’s businesses.
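To give a sense of how little code such libraries require, here is a minimal sketch in Python, assuming NumPy is installed. The variable names and the numbers are made up purely for illustration and do not come from any real data set. It fits a straight line to a handful of points and uses the result to make a prediction.

```python
import numpy as np

# Hypothetical, made-up data for illustration only.
# The values were constructed so that sales = 2 * ad_spend + 1.
ad_spend = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sales = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

# Fit a straight line by least squares.
# For deg=1, np.polyfit returns the coefficients as [slope, intercept].
slope, intercept = np.polyfit(ad_spend, sales, deg=1)

# Use the fitted line to predict sales at a new spend level.
predicted = slope * 6.0 + intercept

print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

The same kind of fit is a one-liner in R as well; the point is only that both ecosystems put common analytics tasks within easy reach.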

Python pro Paul Boal and long-time TDWI instructor Deanne Larson, who will be bringing their Quick Camps to Accelerate Seattle, both see the value in learning these tools.

“R is very popular with data scientists,” Larson says. “They’re using it for data analysis, extracting and transforming data, fitting models, drawing inferences, even plotting and reporting the results.”

Scaling to the Enterprise Level

Debraj GuhaThakurta, senior data and applied scientist at Microsoft, sounds a note of caution. “Although R has a large number of packages and functions for statistics and machine learning,” he says, “many data scientists and developers today do not have the familiarity or expertise to scale their R-based analytics or create predictive solutions in R within databases.” Work is needed, he says, to create and deploy predictive analytics solutions in distributed computing and database environments.

And although there are adjustments to be made at the enterprise level to integrate and scale the use of these technologies, the use cases continue to grow. According to Natasha Balac of Data Insight Discovery, Inc., the number of tools available for advanced analytics techniques such as machine learning has exploded to the level where help is needed to navigate the field. She chalks it up to increasingly affordable and scalable hardware. “It’s not limited to any one vertical either,” Balac says. “Users need help integrating their experience, goals, time lines, and budgets to sort out the optimal use cases for these tools—both for individuals and teams.”

There are significant resources available to companies and data professionals to close the gap between skilled users and implementation. TDWI has integrated new courses on machine learning, AI, and advanced analytics into events like TDWI Accelerate. The next event will be held in Seattle on October 16-18. Check out the full agenda here.

Build your expertise. Drive your organization’s success. Advance your career. Join us at TDWI Accelerate, October 16-18 in Seattle, WA.

Posted on July 26, 2017

