Over the last 40-plus years, we’ve seen a revolution in data analytics, starting with the invention of SQL and the relational database by IBM in the 1970s, followed by the personal computer (also IBM), the spreadsheet (Lotus 1-2-3), the internet protocol, the web browser, NoSQL databases and data lakes. Then things moved on to more sophisticated solutions like SaaS, APIs, end-user tools like Tableau (visual analytics) and Alteryx (easy ETL), Fivetran (easy EL), Matillion (cloud ETL) and Snowflake (high-performance cloud database). Data has never been more accessible or less risky to deploy at scale.
How has this big trend changed the way data analytics is done?
The Old Stack
I’ll use three example companies to explain the old stack:
- IBM
- Oracle
- Microsoft
There are many others, but these provide a window into three different generations of old-stack vendors.
IBM
IBM was founded in 1911 and invented some of the earliest commercially successful computing platforms. Many of the largest companies in the world run IBM applications. Engineers from IBM invented the foundations of the modern relational model and SQL.
Oracle
While Larry Ellison didn’t invent SQL or the relational database, he was there at the beginning. His company, founded in 1977 as Software Development Laboratories, beat IBM to market with the first commercially available SQL relational database. He was inspired by Edgar F. Codd’s (IBM) paper on the relational model of data.
Microsoft
Bill Gates supplied the operating system (DOS) for another IBM invention, the PC, and helped create a personal computing revolution. Microsoft then created the most popular graphical operating system, Windows, and the most enduring and popular office suite, Microsoft Office. Microsoft came into the database world in the 1990s with SQL Server, attacking the incumbents with a good-enough, relatively low-cost relational database. The gigantic Microsoft Office user base provided a powerful installed base that was eager for better tools to store and retrieve data.
Defining Characteristics of the Old Stack
IBM, Oracle and Microsoft represent three of the established legacy stack vendors. The defining characteristics of this group are:
- Large product ecosystems
- Huge installed base of product
- Growth via acquisition
All of these companies are huge organizations with revenues from $35 billion to $125 billion, with Oracle being the smallest, Microsoft being the largest and IBM in between ($77 billion). All of them have pursued a growth model that has been dependent on acquiring companies and technologies to fuel innovation and growth for the past 20 years.
A common feature of the old stack: the least expensive experience you’ll have is the first experience. Once you’re in, getting out of the vendor stack is costly. These vendors price their products knowing that the costs of moving to another stack are very high and the move itself is fraught with peril.
The New Stack Vendors
The new BI stack vendors are deploying BI platforms in the cloud. My sample group includes:
- Amazon (AWS)
- Microsoft (Azure)
- Google (GCP)
This trend was initiated by Amazon with AWS, followed by Microsoft with Azure and Google with GCP. The defining characteristics of these solutions are that they are cloud-native, provide near-limitless scale and solid security, and offer large, cloud-centric software marketplaces with many toolsets, some homegrown and some acquired. Each company has developed its business model around a singular strength or focus and then added complementary products (mostly via acquisition or by repurposing existing capabilities) to build out fully functional cloud stacks.
Of course, the same transition costs seen in the old stack apply if you buy the full stack within your cloud vendor’s ecosystem. Your ability to negotiate rate reductions depends on increasing your usage, and moving to another cloud stack presents challenges similar to leaving an old-stack vendor. So, once again, your overall costs in a single stack will be higher, and your user experience may be diminished.
Acquiring Technologies
All of the cloud stack providers also acquire many companies every year. They are all building out their cloud BI offerings through a combination of repurposing legacy tools, in-house development and acquisition. Time is money. They can get to the Promised Land faster by buying the best available toolsets or reusing what they already have.
New Stack? Looks Like the Old Stack, Plus the Cloud
Data portability and the cloud have enabled rapid innovation in every part of the BI stack workflow. More and more people within an organization want to turn data into actionable information. Tableau has been one of the most successful vendors with its Desktop and Server products. Tableau has developed the best drag-and-drop visualization experience across a large number of heterogeneous data sources. Tableau works on all of the cloud stacks. Google’s recent acquisition of Looker is an attempt to add a Tableau-like tool to its stack.
All of the cloud vendors are trying to create full-fledged BI platforms that they can sell to existing customers on top of their great cloud services. The problem with this approach is that the best products don’t get invented in large companies. The large companies can’t all buy the best tools in every part of the data stack.
I’m not saying that the major cloud vendors aren’t creating complete toolsets for BI. What I’m saying is that their toolsets are too expensive and not best-of-breed.
Best-of-Breed Solutions
If you are undertaking a build of a world-class BI ecosystem, do you want the second- or third-best tools in every area, or the very best tools for your particular needs? Prior to the advent of the cloud/APIs, a single-vendor approach could save time because data wasn’t as portable as it is today.
Now, the disintermediation of data and the migration from on-prem to the cloud have facilitated an explosion of toolsets within the BI workflow. We now have many cloud-based tools from a variety of vendors. The evidence we have seen over the past 15 years? The best and most cost-effective solutions for our clients always come from selecting the best-of-breed tools that work on any of the public clouds.
The old barriers and concerns regarding best-of-breed have gone away. Connecting to a huge number of data sources has never been easier. Ingesting, cleaning, storing and performance-tuning data for analytics has never been less risky. Making that data available to non-technical end users has matured to the point where the original vision of BI is being realized by many companies of all sizes.
InterWorks’ Best-of-Breed BI Toolsets
We spend a lot of time understanding our clients’ needs and searching for emerging tools that better serve those needs. We found Tableau in 2007 and Snowflake in 2016. We like to identify tools before the market recognizes their potential and help those vendors mature their products, so when mass adoption happens, we have the knowledge and experience (and the case studies) to prove value, teach methods and deploy solutions very quickly with low risk and high value.
Our preferred tools that fit 90% of our client use cases include:
- Data Ingestion/Cleaning (ETL/EL/ELT): Alteryx, Fivetran, Matillion and Dataiku
- Database: Snowflake (cloud), Exasol (on-premises)
- Visualization: Tableau
- Portals: Curator by InterWorks
There are other tools we recommend for special use cases, but this set of tools meets the needs of nearly all of our clients very cost-effectively. Some combination of these tools is appropriate for anything from a one-person company to a company with 20,000 end users.
Because they are all available as SaaS/cloud products, initial costs are low. Ongoing maintenance and operating expenses are competitive. Scalability is nearly infinite. We think the SaaS model makes a lot of sense for both software developers and clients. It aligns the interests of software developers, consulting firms and customers very tightly.
If I were going to start a company today, 100% of my data for analytics would be in the cloud. Security concerns have been addressed. Governance toolsets are built into these products, or the partners surrounding these companies have created add-on products to provide for niche requirements.
In short, there isn’t a good reason for most companies to fear having their analytical data in the cloud. Amazon, Microsoft and Google know how to manage hardware and security. You can’t deploy a server for less money than these companies do, unless you have a budgetary system that allows you to cheat on your cost model because somebody else’s budget covers some of the cost. Your analytical data belongs in the cloud where it is secure and scalable on best-of-breed tools that run on any of the public-vendor clouds. This approach gives you the best solutions and the most leverage when negotiating with the cloud providers.
The Unique InterWorks Value
At InterWorks, we focus on the client’s needs and bring together the right people and new technologies to deliver the best possible outcomes at the lowest possible cost. We try to identify technologies that show promise before the mass market adopts them, so we can gain deep experience ahead of that adoption. This takes time. We don’t have 10,000 team members to learn every new tool, so we focus our efforts on what we believe are the best available tools for each part of the BI workflow. We can provide technical know-how based on actual experience, training, enablement, strategic vision and tactical execution.
If you have an interest in exploring these ideas, please reach out. We can provide specific answers and address your concerns regarding the challenges you may face in evolving your analytics strategy for moving your BI data to high-performance cloud computing.