Happy New Year to everyone in the TDWI community! I wish you an enjoyable and prosperous year. Squinting down the path ahead, I can see it is going to be a busy year at TDWI as we roll out our World Conferences, Summits, Forums, Seminars, Webinars, Best Practices Reports, Checklists, and more. The next World Conference is coming up February 12-17 in Las Vegas. This event is always one of the major gatherings of the year in business intelligence and data warehousing, and I am looking forward to being there and interacting with attendees, exhibitors, TDWI faculty, and a few croupiers here and there.
In Las Vegas I will be helping out my colleague, Philip Russom, who is chairing the BI Executive Summit, February 13-15. This conference has a theme of “Executing a Data Strategy for Your Enterprise” and will feature a great selection of case studies, expert speakers, and panel sessions. Check out the program to see if this event is important for you to attend.
In Vegas and throughout many of our conferences this year, you will have the chance to learn about big data analytics, which is a big topic for TDWI. Big data is getting increasing airplay in the mainstream media, as evidenced by this recent New York Times column by Thomas Friedman (read down a bit, to the fifth paragraph, past the political commentary). Friedman points out that big data could be the “raw material for new inventions in health care, education, manufacturing, and retailing.” We could not agree more, and are focused on enabling organizations to develop the right technology and data strategies to achieve their goals and ambitions with big data in 2012.
Coming up for me on January 11 is a Webinar, “Mobile Business Intelligence and Analytics: Extending Insight to a Mobile Workforce.” This is coordinated with the just-published Best Practices Report of the same name that I authored. The impact of mobile devices, particularly tablets, on BI and analytics made nearly everyone’s list of key trends in 2012, and with good reason. The potential of mobile devices is exciting for furthering the “right data, right users, right time” goals of many BI implementations. Executives, managers, and frontline employees in operations such as customer sales, service, and support have clear needs for BI alerts, dashboard reports, and capabilities for drill-down analysis while on the go. There are many challenges from a data management perspective, so organizations need to examine carefully how, where, and when to enable mobile BI and analytics. I hope the report provides food for thought and perspectives that are helpful in making decisions about mobile.
I expect that this will be an exciting year in our industry and look forward to blogging about it as we go forward into 2012.
Posted by David Stodder on January 5, 2012
Where is the biggest battleground today in the business intelligence and analytics software market? On the technology front, one of the main battles is in the addressable memory space of systems that feature 64-bit computing and operating system platforms. The “in-memory” revolution is upon us, and no BI or analytics vendor wants to be left out. Large memory platforms will be critical to users working with tools for big data analytics, data discovery, data visualization, and more.
While the development of large-memory computing is not really new, it took a while for the software industry to adapt to 64-bit hardware processing and operating system platforms. Throw in the difficult learning curve for creating software to work with parallel processing, and it’s easy to see why the move from older systems has taken time. When large memory and parallel processing platforms were exotic, the slow pace of adaptation might have been acceptable. Now, with mainstream systems offering up to a terabyte of addressable memory, organizations can’t wait to try them out for BI and analytics.
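To put a terabyte of addressable memory in rough perspective, here is a quick back-of-the-envelope sketch; the 200-byte record width and the overhead allowance are illustrative assumptions, not benchmarks:

```python
# Rough sizing sketch: how many records might fit in 1 TB of addressable memory?
# The 200-byte record width and 30% overhead are assumptions for illustration only.
memory_bytes = 1 * 1024**4          # 1 TB of addressable memory
bytes_per_record = 200              # hypothetical average record width
overhead_factor = 0.7               # leave ~30% for OS, indexes, and working space

usable_bytes = memory_bytes * overhead_factor
records_in_memory = int(usable_bytes / bytes_per_record)
print(f"Roughly {records_in_memory:,} records (~{records_in_memory/1e9:.1f} billion) in memory")
```

Even with conservative assumptions, that is billions of detail-level rows available without a trip to disk, which is why the appetite to try these systems for BI and analytics is so strong.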
Traditionally, designers of BI and analytics systems have had to adjust to the limits of the I/O bottleneck. The preprocessing and design work for indexing and aggregating data has been necessary because of the performance constraints involved in getting data from disk through the I/O bottleneck. If large memory systems can ease or eliminate that constraint for the majority of users’ analysis needs, then the boundaries for analytics applications can be pushed out.
Users can perform “data discovery,” asking questions that lead to more questions, without as much concern for what this iterative, ad hoc style of investigation might mean to overall performance. Unlike with BI reports that simply update standard views of data, users can engage in exploratory data inquiries without knowing exactly where they will end up. Large-memory systems can offer volumes of detailed data on systems deployed closer to users. With the right tools, line-of-business (LOB) decision makers can dive into the data to test predictive models and perform fine-grained analysis on their own rather than wait for IT’s specialized business analysts and statisticians to do it for them.
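To make that iterative, ad hoc style of investigation concrete, here is a minimal sketch using pandas against a dataset held entirely in memory; the file name, columns, and thresholds are hypothetical placeholders, and the point is simply that each follow-on question is just another in-memory operation rather than a new round-trip to disk:

```python
# Minimal sketch of iterative "data discovery" on an in-memory dataset.
# The CSV file, column names, and business questions are hypothetical.
import pandas as pd

sales = pd.read_csv("sales_detail.csv")        # loaded once into memory

# Question 1: which regions are underperforming?
by_region = sales.groupby("region")["revenue"].sum().sort_values()
weak_regions = by_region.head(3).index

# Question 2 (prompted by the first answer): is one product line dragging them down?
subset = sales[sales["region"].isin(weak_regions)]
by_product = subset.groupby("product_line")["revenue"].sum().sort_values()

# Question 3: drill to transaction-level detail for the worst product line.
worst = by_product.index[0]
detail = subset[subset["product_line"] == worst].sort_values("revenue").head(20)
print(detail)
```

Each step here is shaped by the answer to the previous one, which is exactly the kind of exploration that predefined reports and pre-aggregated cubes struggle to support.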
Data discovery vendors such as QlikTech, Tableau, and TIBCO Spotfire have prospered by jumping first to seize market opportunities. However, the biggest coming battle may be between SAP and Oracle. Earlier this year, SAP introduced HANA, which competes with Oracle’s Exadata by offering in-memory analytics along with traditional disk-based storage in an appliance. Oracle has been readying a response, which will most likely come at Oracle Open World in early October and be aimed at taking in-memory capabilities for BI and analytics further. In the coming year, Oracle and SAP will battle to show which vendor is better at using analytics to increase the business value of ERP investments. In-memory capabilities will make it easier for these and other vendors to deploy rich analytics for ERP that are tailored to vertical industry and LOB requirements.
Large memory is not the whole story when it comes to the future of BI and analytics. However, it is a technology trend that users will notice firsthand through deeper, more visual, and more timely data analysis.
Posted by David Stodder on September 15, 2011
On airplanes, at coffee bars, at ballgames, and even while waiting out an oil change, I am, like many of you, encountering people intensely focused on their mobile smartphones and tablets. I can’t say that I’ve been nosy enough to check out whether those I’ve seen are using the devices for business intelligence, but some – at least the fellow at the oil change shop – do seem to be working with spreadsheets and charts, not just enjoying social media or entertainment. As technology and software options evolve, there’s less and less standing in the way of people using the devices for BI. The revolution is coming.
Mobile is on my mind in part because I am working on an upcoming TDWI Best Practices report, “Mobile BI and Analytics: Extending Intelligence to a Mobile Workforce.” If you would still like to participate in the research, we would be glad to have your input. The survey is still open.
Also, I recently had a chance to talk about mobile BI on a CIO Talk Radio program dedicated to this subject. The Internet-based show is aired through Voice America Business Radio and is hosted by Sanjog Aul, vice president of Programs for the Chicago Chapter of the Society for Information Management (SIM). Also appearing on the program was Howard Dresner, chief research officer of Dresner Advisory Services, and well known for his many years as the lead analyst for BI at Gartner. Howard, of course, had a lot of interesting things to say, and I enjoyed our discussion very much. If you would like to hear the program, follow this link.
In my initial analysis of the TDWI survey results, I am seeing that senior executives currently dominate as users of mobile BI. This is expected; senior executives often are the first to try “the new toys” for data access and analysis. However, the survey shows that the #1 benefit organizations seek to achieve from implementing mobile BI and analytics is the improvement of sales, service, and support. This indicates a strong desire to put mobile BI in the hands of frontline managers and other personnel who are in daily touch with customers.
If you have experiences with mobile BI and analytics or thoughts about how you see this technology evolving, please drop me a line at dstodder@tdwi.org.
Posted by David Stodder on September 9, 2011
Delivering value sooner and being adaptable to business change are two of the most important objectives today in business intelligence (BI) and data warehouse development. They are also two of the most difficult objectives to achieve. “Agility,” the theme of the upcoming TDWI World Conference and BI Executive Summit, to be held together the week of August 7 in San Diego, is about implementing methodologies and tools that will shorten the distance to business value and make it easier to keep adding value throughout development and maintenance cycles.
We’re very excited about the programs for these two educational events. Earlier this week, I had the pleasure of moderating a Webinar aimed at giving attendees a preview of how the agility theme will play out during the week’s keynotes and sessions. The Webinar featured Paul Kautza, TDWI Director of Education, and two Agile experts who will be speaking and leading seminars at the conference: Ken Collier and Ralph Hughes.
Agile methodology has become a mainstream trend in software development circles, but it is much less mature in BI and DW. A Webinar attendee asked whether any Agile-trained expert could do Agile BI. “No,” answered Ken Collier. “Agile BI/DW training requires both Agile expertise as well as BI/DW expertise due to the nuances of commercial off-the-shelf (COTS) system integration, disparate skill sets and technologies, and large data volumes.” Ralph Hughes agreed, adding that “generic Agile folks can do crazy things and run their teams right into the ground.” Ralph then offered several innovations that he sees as necessary, including planning work against the warehouse’s reference architecture and pipelining work functions so everyone has a full sprint to work their specialty. He also advocated small, mandated test data sets for functional demos and full-volume data sets for loading and re-demo-ing after the iteration.
If you are just getting interested in Agile or are in the thick of implementing Agile for BI and DW projects, I would recommend listening to the Webinar, during which Ken and Ralph offered many wise bits of advice that they will explain in greater depth at the conference. The BI Executive Summit will feature management-oriented sessions on Agile, including a session by Ralph, but will also take a broader view of how innovations in BI and DW are enabling these systems to better support business requirements for greater agility, flexibility, and adaptability. These innovations include mobile, self-service, and cloud-based BI.
As working with information becomes integral to more lines of business and operations, patience with long development and deployment cycles will get increasingly thin. The time is ripe for organizations to explore what Agile methodologies as well as recent technology innovations can do to deliver business value sooner and continuously, in a virtuous cycle that does not end. In Ken Collier’s words, “The most effective Agile teams view the life of a BI/DW system as a dynamic system that is never done.”
Posted by David Stodder on July 14, 2011
When you’re 100 years old, as IBM is this year, it would be easy to think that you’ve seen it all. What could possibly be new to Big Blue about “big data”? In the view of Robert LeBlanc, SVP of Middleware Software for the IBM Software Group, quite a bit.
The new problem set, defined by business opportunities opening up due to the availability of new sources of information, cannot be solved with traditional data systems alone. Kicking off the IBM Big Data Symposium for industry analysts at the Yorktown Research Center on May 11, LeBlanc itemized a number of challenges, including multi-channel customer sentiment and experience analysis, detection of life-threatening conditions at hospitals in time to intervene, Medicare fraud interdiction before payment, and weather pattern predictions to optimize wind turbine locations. (Note: The next TDWI Solution Summit, September 25-27 in San Diego, will feature case studies focused on the theme of “Deep Analytics for Big Data.”)
“Big data” is both an evolutionary and revolutionary phenomenon. Given that organizations have been working with large data warehouses and other types of files for some time, it should come as no surprise that the sheer quantity of data would continue to grow. Data is a renewable resource; the more applications and systems that use it, the more data that they tend to generate. Data warehouses will continue to be important, but even as the terabytes of structured data pile up, organizations are hunting down unstructured sources to tap their value and discover new competitive advantages.
IBM’s view of what makes big data revolutionary comes down to the convergence of the three “V’s”: volume, velocity, and variety. Volume is the easiest to understand, although IBM speakers at the Symposium described scenarios where so much data was streaming through in real time that storing it was impossible. Huge data volumes, plus the velocity with which the data flows in, are opening up opportunities for technology alternatives, including Hadoop, MapReduce, and event stream processing. Variety, the third “V,” adds in the unstructured and complex data sources growing up on the Web, particularly in social media. Some organizations, of course, do store all this data; Eric Baldeschwieler, VP of Hadoop Development at Yahoo!, described Yahoo!’s use of the Hadoop Distributed File System (HDFS) to store petabytes of data on nodes across its vast array of clusters. “Hadoop is behind everything we do,” he said.
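For readers new to the MapReduce programming model mentioned above, here is a minimal, self-contained sketch of the map and reduce phases in plain Python; it is a conceptual illustration only, not Hadoop code, and the sample log lines are invented:

```python
# Conceptual sketch of the MapReduce model: count events per user across log lines.
# Real Hadoop jobs distribute these phases across a cluster; this shows only the shape.
from collections import defaultdict

log_lines = [
    "2011-05-11 user=alice action=click",
    "2011-05-11 user=bob action=view",
    "2011-05-11 user=alice action=view",
]

# Map phase: emit (key, value) pairs from each input record.
def map_phase(line):
    user = line.split("user=")[1].split()[0]
    yield (user, 1)

# Shuffle step: group intermediate values by key.
grouped = defaultdict(list)
for line in log_lines:
    for key, value in map_phase(line):
        grouped[key].append(value)

# Reduce phase: aggregate the grouped values for each key.
def reduce_phase(key, values):
    return key, sum(values)

results = [reduce_phase(k, v) for k, v in grouped.items()]
print(results)   # [('alice', 2), ('bob', 1)]
```

The appeal on an MPP cluster is that the map and reduce functions run in parallel across many nodes, each working on its own slice of the data.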
It was not surprising news, but Baldeschwieler and IBM experts gave a full-throated defense of Apache Hadoop and the importance of having open source software at the foundation of big data programs. IBM did not mention EMC explicitly, but it was clear that the company was responding to EMC’s May 9 announcement of the new Greenplum HD Data Computing Appliance, which offers its own distribution of Apache Hadoop. IBM execs warned of the dangers of “forking,” which is what happened when vendors created their own versions of the UNIX operating system and users had to deal with competing standards. Baldeschwieler and IBM execs did acknowledge, however, that Apache Hadoop is far from a finished product, and in any case is not the solution to all problems.
I came away from the Symposium excited by the future of big data analytics but also aware that there’s a long way to go. “Big data” is not about a single technology, such as Hadoop or MapReduce (for more on Hadoop, see my colleague Philip Russom’s interview with the CEO of Cloudera here). These technologies are more of a complement to data warehousing than a replacement for it. Yahoo!’s Baldeschwieler made the point that Yahoo also has data warehouses. As each industry’s requirements become clearer, vendors such as IBM will assemble packages that bring together the strengths of their existing solutions with new technologies. Then, organizations will have a better understanding of how to compare the vendors’ offerings. We’re not quite there yet.
Posted by David Stodder on May 17, 2011
Teradata’s recent acquisition of Aster Data Systems is a huge signal that the worlds of “big data” and data warehousing are coming together. The deal itself was not a surprise; Teradata made a down payment on Aster last September, when it bought 11 percent of the company. And before making that initial investment, Teradata proved that it was not averse to bringing in other people’s database engines by acquiring Kickfire, an innovator in MySQL-based analytic appliances. However, unlike Kickfire, which was floundering in the market but offered interesting “SQL on a chip” technology, Aster was successful and well-funded. Teradata will now have an opportunity to expand its appeal beyond traditional, SQL-based data warehousing into the realm of big data, particularly unstructured data – and provide the technology to bring these worlds together.
“Big data” refers to the massive volumes of structured and unstructured data being generated by relatively new data sources such as Web and smart phone applications, social networks, sensors and robots, GPS systems, genomics and multimedia. For customer interaction, fraud detection, risk management and other purposes, it is often vital to analyze this data in something close to real time so that decision makers can be aware of events, trends and patterns for immediate response or predictive understanding.
The extreme requirements brought on by big data have accelerated the technology shift toward massively parallel processing (MPP) systems, which generally offer better speed and scale for the size and workloads involved in big data analysis compared with traditional symmetric multiprocessing (SMP) systems. TDWI survey data shows that data warehouse professionals intend to abandon SMP in favor of MPP. Not surprisingly, MPP’s growing appeal was a driver behind the market explosion in recent years of new data management systems and appliances that could take advantage of parallelism. Now, that market is consolidating; EMC bought Greenplum, IBM bought Netezza, HP bought Vertica and now Teradata has picked up Aster. And during this period, we’ve seen Oracle introduce Exadata, IBM introduce its Smart Analytics Systems and other developments that are bringing MPP into the mainstream for advanced analytics.
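The appeal of MPP comes down to partitioning both the data and the work across many independent nodes and then combining the partial results. As a rough, hedged analogy on a single machine, the sketch below uses Python’s multiprocessing module so that worker processes stand in for nodes; a real MPP database runs this pattern across shared-nothing nodes with their own storage, which this toy example does not capture:

```python
# Toy illustration of the MPP idea: partition the data, aggregate each partition
# in parallel, then combine the partial results. Worker processes stand in for
# shared-nothing nodes; the data is an invented stand-in for a fact table column.
from multiprocessing import Pool

def partial_sum(partition):
    # Each "node" aggregates only its own slice of the data.
    return sum(partition)

if __name__ == "__main__":
    data = list(range(1_000_000))                       # stand-in for a measure column
    num_nodes = 4
    partitions = [data[i::num_nodes] for i in range(num_nodes)]

    with Pool(processes=num_nodes) as pool:
        partials = pool.map(partial_sum, partitions)    # per-partition work in parallel

    total = sum(partials)                               # final combine step
    print(total)                                        # 499999500000
```

The same divide-aggregate-combine pattern underlies both SQL query plans on MPP appliances and the MapReduce-style programming models discussed next.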
To take advantage of MPP for big data, many developers, particularly at Google, Yahoo! and other firms that bet their business on analysis of online data, have chosen to look beyond SQL, the lingua franca of relational databases, and implement Hadoop and MapReduce, which offer programming models and tools specifically for building applications and services that will run on MPP and clustered systems. Aster, with its nCluster platform, has strongly supported MapReduce implementations; as part of its “universal query framework” introduced with the 4.6 release of nCluster last fall, Aster released SQL-MapReduce to support a wider spectrum of applications.
My colleague at TDWI Research, Philip Russom, notes that while there are many synergies between Teradata and Aster – the technologies from both companies are fully capable of handling extreme big data and both assume use cases involving both big data and analytics – there are significant differences. “Teradata is designed for data that’s ruthlessly structured, even third normal form, whereas Aster, especially with its recent support for Hadoop, is known for handling a far wider range of data types, models, and standards,” Philip noted. “Most Teradata users are data warehouse professionals who are hand-cuffed to SQL, whereas Aster’s user base includes lots of application developers and other non-warehouse folk who are more interested in Pig and Hive. It’s a good thing that having diversity is strength. Assuming the Teradata and Aster camps can overcome their differences, they have a lot of great things to learn from each other.”
TDWI members have been ramping up use of advanced analytics against multi-terabyte data sets for the last several years, and Teradata platforms have been in the middle of that trend. Teradata’s move gives data warehouse professionals a strong reason to evaluate whether Aster’s technology can enable them to further exploit the power of MPP for both SQL and non-SQL applications that require advanced analytics of big data.
Stay tuned to TDWI for more insight into how organizations can expand data warehousing into the realm of big data. We are in the planning stages now for our TDWI Solution Summit, “Deep Analytics for Big Data,” to be held in San Diego, September 25-27.
Posted by David Stodder on March 11, 2011