Stop Wasting Time with Modern Analytics

Transcript
So first off, just in terms of introductions, I'm James Wright. I am the Chief Strategy Officer here at InterWorks, and what that means, essentially, is that I spend my time thinking about data strategy all day, every day. That shapes how we try to help our clients with what their data strategy should be, and that in turn drives our own strategy in terms of looking at new technologies and deciding whether to go deeper or broader. So hopefully that informs part of our conversation today. Joining me is Ben. Ben, maybe you want to introduce yourself as well.

Yeah, I'm Ben Bausili, Global Director of Product. So Curator by InterWorks, our support offerings, all those things roll up to me. I'm really passionate about making the user experience around analytics better, so that people can more effectively use analytics to inform their decisions and change their business.

In terms of housekeeping notes, we do have a few. Ben, since you're managing the panels today, maybe you can walk through what to expect here.

Yep. James will be presenting for about thirty minutes, and we'll have time for Q&A. During the discussion, four polls will pop up, each with some multiple-choice options for you to select. These are just for fun, so make your choices and we'll share the results with everyone in that same panel. If you have questions today, we do have the chat open, but the Q&A tool is the best way to make sure your question gets answered; it gives us a good way to track those questions and mark them. So please use that Q&A panel. Most of the questions we will save towards the end, but if there's anything that's particularly relevant, I will interrupt James and bring it in. I'll be monitoring the chat as well, so if you have any commentary, feel free to post it there and we'll be watching. After the session is over, you'll get the recording and resources as an email follow-up in a few days. So if you have to drop early, you will still get the recording for the session.

Perfect. Thank you, Ben. Moving along, the time we have today is structured into five segments. This is really the beginning of a conversation where I want to walk through the big picture as we see it, and as I think we can generally describe it across our client base, and then look at how we leverage that understanding to save time, improve effectiveness and get better results for our businesses. And by the way, for those of you who don't know Ben or me or InterWorks all that well, when I say our client base, we have a really interesting cross section that's worth mentioning. Here in the States, where Ben and I both are today, we have a cross section that goes everywhere from the Fortune 50 down to companies I'd never heard of before we picked up the phone the first time. Many of those clients have been working with us for more than ten years, and in fact with Ben and me personally for more than ten years. So we have a lot of longevity and some fairly mature businesses. At the same time, maybe twenty-five percent of our business is based out of Europe: the UK, Germany, Switzerland, the Netherlands, etc. What we see there is still very mature businesses and still very long tenure with us, but they're maybe a little later on technology adoption than you might be if you were in San Francisco or Silicon Valley or somewhere like that.
That's for a variety of reasons, some of them regulatory and some of them cultural. So what we end up seeing there are interesting effects that are shifted over time. And then again, we run about ten percent of our business out of Asia Pacific, hubbed out of Singapore and Australia, and what we see there is that time shift happening again: the patterns aren't dissimilar, but they're getting refined and arriving a little later on the calendar. So what we're trying to do here is take some of the insights we have from the last ten years, and certainly the last few years here in the States, and look at how they're going to apply going forward, both here and around the world.

The genesis of this conversation is this concept of yesterday. I've been consulting with InterWorks for over a decade and working in data for more than two decades, and the funny question I ask all the time these days when a client and I get together is: when do you want this? Whether "this" is a dashboard or a dataset or a business outcome, the answer every single time is essentially "yesterday." And what that always says to me is: look, we don't have enough time. Our clients don't have enough time; they can't get enough time. Even if you're hiring, you're catching up to the amount of time you needed last month, last quarter, last year. So the ebbs and flows of the business cycle always leave us wanting more value for the time invested. And that's really, ultimately, what we're trying to talk about today.

Part one: the modern data landscape. I'm a big believer in the notion that we can't really understand where we're going if we don't understand where we are really well. What I've tried to do is simplify some aspects here for discussion's sake; I'm sure every single one of you is potentially more of an expert in any single piece of this than I am. What I'm really trying to get at is what I would call the forest-and-the-trees problem, and we do a lot of consulting on exactly that. What should an environment look like when the folks who are heavily invested in working with that environment are so deep in the weeds that it's hard for them to see the big picture? So bear with me for a few minutes; I'd like to talk about the big picture: how we got where we are and what we do about it.

Back in the day, most of our clients ran their data businesses in one big stack. The stack was often purchased from a single vendor and often ran on machines, or even appliances, custom-built machines for a specific purpose, in the back room. Conceptually, this was a much simpler world than where we are now. Sure, we still had all the same problems around semantics and business logic and internal politics that we have today, but the amount and the complexity of the tooling was at least smaller in domain. Well, that ran into some very significant challenges with the Web 2.0 explosion around the turn of the century. Those challenges were largely in terms of storage, where we were generating more data than those machines were built to hold, or compute, where we were generating more data and trying to use it in ways those systems weren't built for. So we lacked flexibility, and we lacked scalability. And where we could scale, what you saw was a step function: if you had a million-dollar machine and you wanted to increase the compute, you had to buy another million-dollar machine.
So you saw this cost function which didn't really scale with the way business functions generally work, right? Your next project isn't generally a million-dollar project. So we ended up with buying functions and cost cycles that were much longer than business cycles, and that caused challenges.

Then around 2006, we get this notion of the cloud. What cloud got us was certainly an increased diversity in terms of where our data is coming from. We saw storage essentially become infinite, because machines were interchangeable and sort of "over there": we could have two of them or ten of them or twenty of them, and for the buyer or the consumer that wasn't a step function anymore. It was much more linear in pricing. And we saw cloud SaaS providers start coming in. Today many of these exist, some of them on the screen, but there are too many to list in any number of decks, and they reduce overhead: we no longer had to manage every system in-house. This was generally good. But then, when we brought it back into that existing BI infrastructure or data landscape, we saw that the increase in volume on the left-hand side was starting to put pressure on the right-hand side. The rest of the BI stack was having challenges of scale. Certainly we had Moore's-law growth curves in the number of rows we were storing, and growth curves in the amount of those rows we were trying to analyze. There was also more to move: rather than having it all on a set of machines in the back room on the same network, all of a sudden these things were outside of our network, and we had to build bigger pipes coming in and bigger pipes going out. And because each one of these SaaS systems potentially brought its own view of the world, we had conformity challenges that were very hard to solve within the data warehouse of the early 2000s.

So, and hopefully you're seeing the trend: evolution creates challenge, which creates evolution, which creates challenge. That's been the pattern. This created stress in the data warehouse. We needed more storage, we needed to process larger workloads, and we needed to keep pace with the amount of data we were generating.

I am curious, and a few times during this conversation we're going to jump out and check how relevant this is to the folks on the call. So Ben, if you could push the poll button, please. I'd like to ask you all to spend a minute on this. It's multiple select, so please click as many of these as you use; I'd like to understand what your data landscape looks like today. We put some representative examples up here. I'll come back to this in just a minute, so keep filling it out and we'll look at it in a second.

Anyway, cue the cloud data warehouse. Cloud data sources, cue the cloud data warehouse. What we ended up with here was infinite scalability. Everyone knows that you essentially can't overload Amazon; when Microsoft is building storage centers, they're building petabytes of storage. We also saw effectively infinite power. These systems now have challenges generating enough electrical power to actually run the processors, right? And all of that really helped us fit the model of "generate as much as you can."
But that generated a lot of expense, because doing it all within the structure of a Teradata or a traditional data warehouse was quite costly, both in terms of machinery and in the overhead of structuring all that work. So what we saw next was an improvement: the notion of the lakehouse. I'm skipping a few steps here, we're skipping MapReduce, we're skipping Hadoop, there's a whole lot that happened in between, but essentially where we got to was a lot of storage for what we call objects. They're unstructured data; they could be one thing, they could be another, and we'll figure out how to use them later. When we figure out how to use them, we apply business logic, we apply structure, and we can leverage the finished object like we could a data warehouse. It's in the cloud, probably, but we have rules, we have structure, we have expectations of this dataset. This was great because we were able to detach storage from compute. And certainly today I think we can assume storage is essentially zero cost, or at least asymptotically approaching zero, and compute is the expensive piece. We could maintain SQL access, which is really important because most business intelligence tools speak SQL. And we could connect the two.

Of course, and this is where I'll jump back into your poll results, most of you, and most of our clients, actually have this hybrid world. Yes, we have cloud sources of data. Yes, we have on-prem legacy sources of data that may still be very key to our business. We certainly have a hybridized world where we have some S3 data, that cold data lake concept; we probably have a cloud data warehouse; and we probably still have an on-prem data warehouse. So Ben, I think you've shared the poll results?

Yep, everyone should see them.

And this is really pretty indicative of what we see here in the States. When we look at SQL Server and Snowflake coming in as numbers one and two, that's certainly the exact combination we have here on the screen. Not a whole lot of surprises in this list. Synapse really hasn't seen much adoption. Redshift certainly was early in, and I think we're seeing a lot of that value shift into Snowflake and Databricks. Looks like BigQuery, oh, I see, I hadn't scrolled down. And of course if I scroll down I see Excel at seventy percent. Yeah, I mean, we've got a combination of things, right? So this is what's happening, and a lot of those longer-tail sources, Excel, SQL Server, Postgres, they're not going anywhere. I think more of you are going to get to a place where you have Snowflake and/or Azure and/or Redshift alongside those local sources.

Anyway, still moving through history now. Thanks, Ben, you can probably close that poll out. When we get to the impact the lakehouse has on the big-picture architecture, what we've achieved is scale, both of storage and of compute, and linear cost expansion. We no longer have to commit a million extra dollars to the next project. Certainly for those of you on the cloud today, I think you'd find that spinning up the next project is trivially easy. In fact, that's ultimately going to create a problem we'll talk about solving later, but it's certainly the experience that most of our clients have today and frankly enjoy. And then, of course, we've actually figured out good ways to rationalize the old and the new next to each other.
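Going back to the lakehouse pattern for a second, here is a minimal sketch of what that storage/compute split can look like in SQL. It assumes a Snowflake-style external table over S3; the bucket, stage, table and column names are all hypothetical, and a real deployment would configure credentials and file formats properly.

```sql
-- Point a stage at raw objects sitting in the lake (bucket name is made up).
CREATE OR REPLACE STAGE raw_events_stage
  URL = 's3://acme-data-lake/events/';
  -- Credentials / storage integration omitted for brevity.

-- Project structure onto those objects so they behave like a table.
CREATE OR REPLACE EXTERNAL TABLE raw_events (
  event_time TIMESTAMP AS (VALUE:event_time::TIMESTAMP),
  user_id    STRING    AS (VALUE:user_id::STRING)
)
LOCATION = @raw_events_stage
FILE_FORMAT = (TYPE = PARQUET);

-- Any SQL-speaking BI tool can now query the lake like warehouse storage.
SELECT user_id, COUNT(*) AS events
FROM raw_events
GROUP BY user_id;
```

The storage stays cheap object storage, and compute is paid for only when the query runs, which is exactly the detachment of storage from compute described above.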
To be clear, we're not counseling most of our clients to turn off Teradata or SQL Server today. Maybe the technology pricing and the effort will indicate that's a good answer down the road, but today we're mostly focused on the outputs rather than on the perfection of the model behind the scenes.

Okay, well, that brings us to the ETL part of the diagram, again just looking at where modern analytics is. Ben, if you can open the next poll, I do have some questions about how you all are thinking about ETL. If we think about the last two or three years, this is probably the most exciting part of the BI space. We've seen massive expansion, mostly driven through investment, of a number of different ETL providers out there. What I expect we'll see in the next cycle of one to three years is contraction and combination of those tools, as some of them realize they're a feature instead of a product. But for the most part, what we see in our client base, and what we recommend, is that ETL moves from being a single machine or single server somewhere in the back room into a distributed pattern, where any number of cloud-based tools are amazing, and where the order of operations probably shifts from ETL to ELT. Extraction is trivial from most cloud sources, and something like Fivetran has essentially solved that problem: you simply plug your credentials in, plug your destination in and, hey presto, you're done. So we see extracting and loading happening from on-prem sources, probably still going into the on-prem data warehouse, and from cloud sources, going either into archival storage, so the lake, or into the warehouse. Tools like Matillion, Fivetran and certainly dbt are the most common tools we see, and frankly the ones we recommend and help our clients implement. We're also still seeing a lot of Alteryx, which doesn't really fit into this space for me; I'll talk about where I see it a little later.

If we look at the poll results, I'd be curious to see what we find here, Ben, so maybe you could share that back out. What we're seeing is surprisingly little Fivetran. I'm frankly pretty shocked that only ten percent of you are using either Fivetran or Matillion. But of course when we look at, sorry, that's just SSIS on the bottom there, we're looking at this largely being done in SQL. As we look at the more modern aspects of this, we're seeing that dbt, Matillion and Fivetran are a really effective combination, certainly for customers in the mid-market space. At very large enterprises, we see some advantages to those plus something like Airflow or some other orchestration layer. But certainly for the folks wondering whether their SQL Server will be around in five years, I think we'd be thinking about how a cloud data warehouse and some of these cloud ELT tools fit into that conversation. When I talk about that, it's basically what we have up on the screen.

We're moving down towards business intelligence now, the last aspect of this, but again, looking at where that time is going: how do we get better value out of the platforms? When we talk about getting lost in the trees rather than seeing the forest, I think you're starting to see there are a lot of trees out there in the modern data landscape, with transforms happening in potentially many places, and in potentially many redundant places.
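For a flavor of what that extract-load-then-transform pattern looks like in practice, and how it can consolidate transforms, here is a minimal dbt-style model sketch. It assumes Fivetran has already landed a raw orders table in the warehouse and that a matching dbt source definition exists; the source, table and column names are all hypothetical.

```sql
-- models/staging/stg_orders.sql (hypothetical dbt model)
-- Fivetran handles extract and load; dbt runs the transform in-warehouse.
{{ config(materialized='table') }}

select
    order_id,
    customer_id,
    cast(ordered_at as date) as order_date,
    amount_usd
from {{ source('fivetran_shop', 'orders') }}   -- raw table loaded by Fivetran
where not _fivetran_deleted                    -- Fivetran's soft-delete flag
```

The appeal is that the transform lives in one versioned, documented place inside the warehouse, rather than in the many redundant spots just mentioned.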
So remember that point about redundant transforms; we'll come back to it in terms of opportunities and action items. When we look at the business intelligence landscape, and now we're looking at that last mile, the BI tool, we're thinking about being able to share things like dashboards or Excel sheets or insights. Ben, we have the third of four polls; maybe you could trigger it now, please, around BI tool usage. And as you do that, I'm going to take a quick sip. I'm surprised I haven't coughed at anyone yet, which I'm frankly really pleased about. Then I'm going to jump into what we broadly see and talk about some recommendations.

I'll just jump in and say: if you want, continue to add the things that aren't on the option list in the chat. It's been neat seeing some of the things we didn't put on there.

Oh great, yeah, it's fun. I wish I could simultaneously do all the things, but I'm going to keep working on not coughing and talking into the microphone. Ben, I think we can end the poll, because I am not at all surprised by the results, and I think we can share them. As expected, what we're seeing is that essentially all of you have BI 2.0 tools like Power BI and Tableau, and I think we can easily say that ninety-plus percent of the BI market is dominated by those two players today. Looking at the poll, very few of you, maybe five percent, have what I would call legacy BI, and I'd put MicroStrategy, even though it's the most modern of what I would call legacy BI, into that space. We have it, it works, we haven't turned it off, but we probably aren't thinking about building our next big project in it. And then several of you, I'm delighted to see, have what I think of as this BI 2.5 generation. I think ThoughtSpot's a great example of that: the next generation beyond what folks do in dashboards. I don't think we're going to build a better bar chart. Certainly machine learning and artificial intelligence may build us a good-enough bar chart without us touching a button; that's certainly one future. But I think we're seeing ThoughtSpot overturn that paradigm with "let me just search for it." That's really part of what we're going to see as the future in this BI 2.0, 2.5 space, particularly when we think about interaction with AI and machine learning.

So those results, and thank you to everybody who participated, are really exactly what we're seeing across the markets. I think what we're going to see here going forward is, frankly, a little more fragmentation and a bit more adoption of some of the 2.5-generation tools. And certainly as we look at LLMs, large language models, ChatGPT, etc., over the last few months, many of our clients are looking at how that's going to interact with all this. I know ThoughtSpot has come out with Sage, their first-generation interaction layer, which is a natural and obvious addition. I think we're only going to see more of that from all the players, and eventually I think BI 3.0 is going to be dominated by how much the computer is doing for us versus how much we're actually helping ourselves.

But let's move on. The biggest challenges we see here are essentially governance and avoiding the Wild West. You'll see I added in this extra transform layer. This is where I would put Alteryx as a good example.
Excel fits in here as well. This is desktop-level transformation, or purpose- or endpoint-driven transformation. I think this is frankly essential to doing anything new in business, but as we'll discuss when we talk about visibility, the problem in a lot of cases is that it's completely invisible to the folks who run the data platform. So a lot of what we'll talk about is how to rationalize the need for this with the stability we get on the left-hand side of the screen. For completeness' sake, and because this pendulum is going to swing back, it's also worth calling out that there is some edge compute happening in most of these BI toolsets, and that will probably continue. We can talk about that in another session, but I do expect the interaction of edge compute and compute cost is going to be something folks will be thinking about, and that we'll be thinking about with our clients.

Which brings us to the completed refrigerator art. This is what I would tell you modern BI looks like for most of our clients around the world. Depending on where you are, you may have all of this, you may have this plus more, or you may have only one pathway here. What I have on the screen is almost certainly inaccurate for any one of you. What you'd need to do, and frankly what we do a good amount of, is rationalize where you are today against a map like this, and where you need to be to accomplish key business goals. If there's any take-home from this conversation, I'd say that's one of the biggest things you should do. But we can talk about that when we talk about solutions.

Okay, that's the landscape. We now have a bit of history, a current state and some notions of where the BI landscape and the data landscape are going in the future. That creates some opportunities for us all, so I'd like to spend some time looking at how this can benefit your business. We'll look at some of the common pitfalls and hazards we've seen in our client base, and then at some good tactical, strategic solutions we can put in place together to get the best of both worlds.

So, opportunities. Certainly I've said the word cloud enough in this conversation, right? What that fundamentally gets us is infinite scale and infinite speed. We're still going to be collecting more data, and that will mean more storage, but I think for most applications we now know what good enough looks like in terms of being able to crunch numbers and get a result. That's going to be less of a differentiator for most technologies going forward, because it's so ubiquitous. While it does come with a cost, our big benefit from moving into this modern landscape is linear cost control. I saw a lot of you out there on SQL Server as a platform, and I'd be really curious whether you're looking at it and asking: why are we still there? I'd hypothesize that for many of you it's "it's not broke, we don't need to fix it." But I'd also hypothesize there's the challenge of SQL Server where, at a certain scale, there's a step function of cost. A lot of folks end up stuck in a place where it's good enough, it's not as fast as it should be, but it would cost twice or three times as much to make it any better, and so we stay where we are.
I think a lot of what we see in this new SaaS and cloud-based lakehouse strategy is the ability to escape that glass-ceiling challenge. A big callout, and this is probably one we see as a consultancy more than most companies do, is that this modern infrastructure actually supports the languages the company needs to consume. When we look culturally at an organization or a customer of ours, language is a big piece of what drives siloed behaviors. So there's real opportunity in being able to mix and match tools that speak SQL, that allow end-user drag-and-drop or even just search, and that also let the very technical user code and script. Managing tools together to achieve that is an opportunity, and when we look at the best-of-breed strategy, a lot of what we help our clients understand is how to mix and match to get to the right combination, particularly when we look at what function a tool performs in the landscape. The more we can commoditize or understand the function of each tool, and the more we can abstract that, the less our clients will ever be stuck with a tool that doesn't do exactly what they need, because they can replace that function with a better version. So flexibility for the future is a big piece of why you move towards a modern data landscape.

In terms of risks, challenges and pitfalls, the biggest one, and we'll come back to this over and over again today and throughout our consulting, is visibility. Modern tools maximize output and sacrifice governance or structure, so even awareness of what's happening requires a pretty high degree of effort. Certainly, any of this transform happening near the BI tool layers is, like I said earlier, blind in terms of what's happening on the backend, and vice versa. So the number one thing we're going to talk about, and the number one thing you should be doing, is making sure you understand how to interpret your own version of this map, and that your teams understand the interrelatedness of the pieces.

That's particularly true when we talk about governance. One of the things I talk to a lot of clients about is cloud spend. When we talk about, say, Snowflake spend, it's not like there's a dial where you can turn Snowflake spend up or down. A lot of these transform tools do what's called pushdown computing: they're actually pushing data into Snowflake and using the Snowflake engine to do the transform, which is great, we want that. But it means a lot of your Snowflake spend is actually being driven by Fivetran or Matillion or one of these other toolsets. So understanding how these tools are connected, and even what the landscape is, is essential to governing and managing the system as an enterprise, and even to knowing who to talk to about where that spend may be coming from. Again, a big piece of what we do is help our clients understand that.

Once we understand that map of landmines, we can understand where the silos are going to be. Does one set of folks only speak Python? Does another only speak SQL? Unless you get those folks together and help them understand they can work in the same toolset, they're naturally going to gravitate towards two toolsets, one that speaks each language. And that's ultimately a challenge, because we want best-of-breed, not a reaction to a tiny need.
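Picking up the pushdown-compute point from a moment ago: one hedged way to start attributing warehouse activity to tools is to query Snowflake's ACCOUNT_USAGE metadata, assuming each tool connects as its own service user. The view and columns below are real Snowflake metadata; the user names are hypothetical, and execution time is only a rough proxy for credit spend, not an exact bill.

```sql
-- Which users (i.e., which tools) are driving warehouse activity?
SELECT
    user_name,        -- e.g. FIVETRAN_SVC, MATILLION_SVC, TABLEAU_SVC (hypothetical)
    warehouse_name,
    COUNT(*)                          AS query_count,
    SUM(execution_time) / 3600000.0   AS execution_hours  -- execution_time is in ms
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY user_name, warehouse_name
ORDER BY execution_hours DESC;
```

Even a rough cut like this usually makes it obvious which toolset, and which team, to talk to about a spend spike.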
And then lastly, there are a lot of pitfalls in terms of architecture out there. I see a lot of customers over-engineer a solution: yes, it might be perfect, but at what cost? We often don't see value from the long tail there. We also often see solutions get so specific that they create a heavy talent reliance, and if we've learned anything in the last three or four years, it's that talent in our data space needs to be fungible, because we're going to have challenges retaining a lot of our data talent. And I think we've seen a lot of architectures, particularly ones with frankly one of the first two problems, where response times are measured in quarters or years, not months. That creates a big problem for a lot of our clients. So those are the hazards.

In terms of solutions, I have five for you that I think are implementable by everyone here, and frankly implementable in not a huge amount of time.

Number one, build your own data map. Sure, my refrigerator art, maybe it's awful, maybe it's a great starting point, but spend some time, spend an afternoon, spend an off-site lifting your team's heads up, and challenge them to build this map for your business. How does data flow from source to customer? You'd be surprised how many unknown dependencies you have in there. Oh, I moved a little too fast on the mouse wheel there, but I'll leave it up.

Number two, define a repeatable approach. If you're going to approach a new business problem, how do you understand which tool fits the job, and how do other teams in the business, not just yours, understand the same constraints and opportunities? That's a big challenge, and frankly, writing it down goes a long way, particularly when you combine it with number five.

Number three, pay technical debt. If you don't know what technical debt is, the classic example for me is documentation. If I'm writing code or building something, it's way easier and faster to get a result right now by just keeping going: I'll do one thing, I'll do another thing, I'll keep being productive. But the day I'm out sick, or the day I need to go back and fix something, if I didn't pause and pay that debt of documentation along the way, we've got a big problem, and by then I probably can't pay it back. So if you run an analytics team, if you run a data business, please budget your interest payments on your technical debt. Get a team out there that, one sprint out of three, does nothing but look around and ask: what happened that we need to make more stable? What happened that we need to take that fast-moving piece and bring back to the rest of our stack? Being deliberate about paying technical debt, even with ten, fifteen or twenty percent of your time, will pay huge dividends when the boss comes along wanting a new project and the first two months don't have to go to fixing old problems.

And lastly, cross-train. Again, try to avoid that language silo. If you have a SQL team, help them understand what the BI tools look like. If you have BI teams, help them understand what the SQL team's tools look like. And if you can put technologies in between that they both speak, that's the best of all possible worlds, because the tax on every collaboration between those teams is reduced.
These are all things we can help you with if you need it, but in general, most clients just need to spend a little time on this, and I think you can come up with some good behavioral mechanisms to address these challenges.

The fifth bullet point is missing, because when we look at this diagram, something is missing. And yes, I know there are probably lots of things missing, but the big piece missing here, for me, is the consumer perspective. Sure, this is interesting in terms of how data moves and how we as data specialists think about this stuff, but nine out of ten of your data consumers aren't that. They spend their time in one, two or three critical systems. They're busy doing a single task, and they'll probably do tomorrow what they did yesterday, because that's important to their job description. And Ben, while I'm ranting here, do you mind pulling up the last poll, please? I'd like you all to think a little about how all the good things you do in your data business get to your customers: internal customers, and maybe external customers as well. We could talk about data monetization too, and probably should in another conversation. But where I'm going with this is: I would guess that of the fifty-five people listening to this conversation, most of you probably lack the big green thing on my next slide, which is what I would call a data hub. You need a single place where most of your customers can get information about the business, a decision support tool that not only stays in their task but connects to their next task. That's what this big green box on the screen represents. Ben, maybe you could share those poll results and we can take a look.

Those are interesting. Those are really interesting. Well, I can't say I'm surprised at all by PowerPoint, and I can't say I'm surprised at all by Excel. Those are great tools, I use them every day, they're not going anywhere, and they shouldn't. What I will say I'm a little surprised about is the number of you using SharePoint, and I'd love to drill down into that. I don't think we'll have time, but that's where I'm particularly curious. As for zero percent of you using Curator: great, I think there are some real opportunities here, and I'll show you what Curator can look like. Email, sure, a subscription is important, right? But the important thing isn't reading the email; it's what happens when you click on something in the email. Where does it go?

I think the important point here is that you need a data hub. We have one here at InterWorks, and I build them for clients around the world. I would define a data hub this way: it's tool agnostic. It should be the same experience whether you're looking at something in Tableau or Power BI or PowerPoint or Excel; you can get to every one of those toolsets in the same place. If you can do that, you can reduce the time required to get to insights, which means you can reduce the time spent rebuilding a thing you already had that someone didn't know you had. You can engage non-data users, and frankly, you can begin creating that single language your entire business speaks. And you can connect directly to tools like Slack or Teams, to email through subscriptions, or even just get something done directly in the tool. I can't overstate how valuable that is. I'll show you a good example.
What you should be seeing now is an instance of Curator. Curator is a product we build, Ben's team builds it, and essentially it's a portal for analytics. We build out-of-the-box integrations with all the tools you might have in your business. So we saw Tableau; there are any number of Tableau dashboards up here that are native. You can see we've avoided a lot of the confusion of the Tableau interface here; I don't particularly love Tableau Server as an interface tool. We've given folks very common navigation, mobile friendly, exactly the same experience for everyone. Maybe they're even looking at Tableau and ThoughtSpot together. Here we have both, where I can have a high-level dashboard for my execs, and if I want to drill down and say, tell me more about sales by, let's say, year, and let's say, I don't know, product, I can drill down directly from that top view into this ThoughtSpot view, because I happen to have Tableau and ThoughtSpot in a single place. The nice thing is, I didn't have to remember the URL for either of those servers, and I didn't have to leave my experience. And if I want to take this dashboard, or this experience, and save it or download it as an image and put it in my PowerPoint, I can do that directly from the same platform. So I can jump into Box, which is the storage system we use here at InterWorks, take that image from the analytics I just did, pull it into this presentation I'm giving you right now, and insert it right into the slide we're working on. The value is that I didn't have to remember twelve different URLs or log in to twelve different places. I got access to however many endpoint BI tools I have, in one place. And if I were curious about what else is available, I could just go to navigation I understand and access it. This isn't the perfect implementation, just a quick example, but the point is: if I can do all this in one place, your users can do it all in one place, and they will.

I'm going to jump back into the last couple of slides of the presentation here. I'd invite everyone, by the way, Ben can certainly handle any of the Q&A or follow-up on this, to try Curator: free demos are available. If you don't have access to an analytics hub or data hub, I think ours is incredible value and a great user experience.

To sum up: the modern BI landscape is complex. You can fight the complexity or you can embrace it, and I would choose the latter. You should assume you're going to have storage for all the data you'll ever want. You should assume compute is where your costs will be focused, and you should optimize for eighty-twenty. Your biggest cost is going to be doing the same thing two or five or five thousand times and not knowing it. You should expect consumption to come from everywhere: from one BI tool or another, on a desktop, in Excel, in AI/ML, and most of it at the last minute. So to get the most data to the most people, you need to make it very easy to get to; think two or three clicks to the most recent data. And then, as an analytics organization, your biggest priority should be mapping, discovering and rationalizing what this picture looks like.
You really can't completely control it, because two or three years is a long time and new tools will come up. What you need to understand is: what are the key functions in our business, and what tools do we use today to do them? If everyone in your organization understood that, even if just seventy percent of your data consumers knew that answer, you would save yourself huge amounts of time and money.

So the five tactical solutions I would suggest: one, build your own data map. Two, define and publish to your teams what tool selection looks like for your company; I bet most of them don't know, or they have their own siloed version of it. Three, cross-train them, and maybe the data hub is the first step in cross-training, since at least it gets everyone reading from the same book. And lastly, modern data tools generate huge amounts of value, and they generate huge amounts of technical debt, so be deliberate about paying that debt back. These are five pieces of advice we give our clients all the time, and they will save you time and increase user satisfaction and the value you're bringing to your internal and external clients.

That's it, folks. Look, really appreciate the time. We're going to come back to all of you with the recording, presentation materials and some links. I'm happy to follow up on this conversation now through Q&A, as well as through any other conversations you'd like to have. Ben, I can see some thoughts up here in chat and Q&A. How would you like to start?

Yeah, so just so everyone knows, we do have the last poll up, about next steps, so please go ahead and answer that. We can start with Preston's question in the chat, since I saw that one first. He said, "Curator was an option for us, and frankly, I wanted it. It's an awesome tool. How do I pitch the idea to upper management?" So maybe we broaden that to pitching data hubs generally. Do you have any advice for persuading upper management?

I'm a big fan of cinema, and one of the key rules of cinematography, or film writing, is show, don't tell. In analytics, a picture speaks a thousand words, and I could probably jump through any number of other trade idioms here, but my point would be: spin up a trial, put five systems together, and challenge the payer: do you know all five of these URLs? Did you know we've already done this? Could you bring all of this up on your iPad in less than a minute any other way? And I think the answer is clearly no. I've been building content management applications for twenty-five years, whether in Drupal or Joomla or any number of other toolsets, and the best value is showing someone how it brings all of their workflows together. Then look at the value of saving half an hour per person per month. Add that up and you get a lot more zeros than the cost of Curator or SharePoint or any of these other toolsets. But Ben, you probably have a better answer for that; we have this conversation all day long.

Yeah, I mean, I'd start with: hey, what are the things you use every day? What are the things that matter to you? What data do you use? That can help you see where the pain points are and propose the solution. A lot of times people want to see things, so being able to do a prototype, whether that's drawing something in PowerPoint or FigJam or some other whiteboarding tool to illustrate the problem, is nice.
But we use trials a ton with Curator to show people: hey, this is what it could look like for you. With something like Curator, we can actually customize the experience per user, so we build a lot of executive experiences where the KPI dashboard or scorecard or whatever they want to look at on a daily basis is the first thing that launches, and then they have easy shortcuts to things. By tailoring it to them and saying, yes, we can do this for the rest of the business too, here's a more general experience, by targeting something where they can practically see how it improves their own day, that really resonates. So we see a lot of success with that.

I'll just add one more thing. We have any number of clients, and again, you can use Curator to solve this problem and it does a beautiful job, but you can use a lot of other tools to solve it too. We have a ton of clients who, when they want to present to the board, to the weekly management meeting, to the quarterly business review, go out and manually generate ten, fifteen, twenty pages of reports. Some of it may come from Tableau, some from Excel, some may be narrative, and they're spending hours of a person's time doing that. We have automated that process for so many clients, so that person can now show up and actually answer business questions rather than just generate the same thing they did last month. That effort alone is worth the cost savings.

Yeah. So I can answer this question from Robbie White: do you see greater consumption or activity from business end users who need access to the data when implementing Curator? We actually recently had a customer study this internally. They're a fairly large implementation with multiple instances, and they saw user adoption increase by three hundred fifty percent. So it was a very dramatic, steep increase in adoption and use of the tool. That really does come down to making it easier: people are there, they know where to go, they're confident, they don't have to remember what platform, what server, what Tableau site, what Tableau project. It's all in one place. Also, when you've spent the time using this as part of your governance process, making sure the dashboards are production-ready and trusted, it's not the Wild West that a lot of Tableau servers or other systems become. People feel they can trust it, so they turn to it more often. It also helps that it's a familiar place: when you have control over the design, the look and feel, it can be the same UI across multiple tools. That really helps make people feel comfortable, and that matters a lot.

I'll add one more thing: don't leave it to chance. One of the things I love about using a tool like Curator is that it gives me actual consumption-based information about what people are clicking on. The challenge with having five different BI tools is that you may only have visibility into three of them in terms of what people are actually doing. Curator gives me a layer over all of them, so I can actually understand where the real traffic is going. Who is engaged? Which groups, which teams? Who's clicking what, how many times, and when? Once we understand that, we can follow the value and keep building towards it. I think a lot of big data teams are flying blind.
They're building and hoping, and the Field of Dreams approach isn't a great strategy.

Yeah, and on a practical level, that can help companies make decisions around costs, right? If you look at the dashboards out there, a lot of times the most popular dashboard on Tableau Server or Power BI is something being used to download Excel files and data that's consumed elsewhere. That opens up ways to solve the same need in other, potentially cheaper ways. You can look at where your investment goes and how each tool is performing. So a lot of good stuff there.

An anonymous attendee asks: at what point in the process do you suggest building the data map? It's a good question, and I'll give you two alternative answers. If you want the most important but least comfortable answer: go ask your VP or CEO, or whoever makes decisions, what the top five data points they care about are, then work backwards from who gave them each number and how they got it. I think you would immediately identify a ton of manual work which, in most companies, exists outside of all the hard work you've done on the data side. That last mile is really critical to understanding what actually matters. So that's the ultimately valuable, really informative, but somewhat painful answer, because most folks don't love the answer they get.

Thinking about it another way, a very tactical approach is to look at the usage logs for your SQL Server, Snowflake or Tableau Server, find the top five users, user groups or assets, and work backwards from them: where is that coming from? How well do we understand the dependencies and failure points of that pathway? If you do the top ten or fifteen or twenty, depending on how big your company is, you'll get a really good sense of what that data-movement map looks like. You're going to find some idiosyncratic exceptions, report A, report B, or John, who comes in every Monday morning and has to do the same thing. And you're going to find some legacy toolsets stuck in there, like SaaS licenses you didn't know about that are still being used for ETL. But for the vast majority of the data-map discovery, you can start with just that top-end analysis, and I think you'll get a long way.

Another question in the chat: would we recommend learning higher-level math, such as linear algebra, to better understand and manage data? And I'll add to it: maybe there are other things we'd recommend learning around data. We work for hundreds of clients every year around the world, and the vast majority of what we do is what I would call meat-and-potatoes reporting. How many shoes did I sell yesterday is still a problem for most folks. Where we do need complicated math or statistical analysis, I'd point you to an argument a professor and I had in the nineties about me manually doing a p-test or a t-test in statistics. My argument at the time was: I have a computer, I don't ever need to do this by hand again. I need to understand when to do this, and I need to let the computer do it. I'd make the same argument to you today about advanced mathematics. Most of the time it's not about how you do the math; it's about when the math is appropriate, what decisions it's appropriate to support and what conclusions it does not support.
I would spend more time understanding when that math is useful than actually doing the mathematics.

Yeah, I'll just add, I have done a lot on the data science side of InterWorks, and for many years what I told our consultants internally is: if you want to go down this path and learn more, great, it's a very fun area. I have a math degree, I love math, and it's fun to know. But what the world is going to need is more engineers, more people who understand how things fit together and the business value we can translate from them, than people who know the models. The models are moving at a very fast pace, and things like GPT require tons of resources to build. We're not going to outperform Amazon, Microsoft, OpenAI and the like, but we can certainly leverage what they build. So I think studying things like product development is actually a really good place to look: strategy, how things fit together, how to think about business value and what consumers and customers need. Those things can really inform how you approach the tech, in good ways. For technical audiences, it's often underappreciated how much those softer skills matter in building the right thing. Nothing is less efficient than building the wrong thing, so learn what to build. I think that'll help a lot.

Ben, I think you're going to have the last word. Thanks, everyone, for joining us for this hour together. Ben and I, as you can tell, would love to continue the conversation, so don't be a stranger. Hopefully this was useful. Please give us feedback.

In this webinar, James Wright and Ben Bausili traced the evolution of the modern data landscape from single-vendor on-premises stacks to today's complex hybrid cloud environments. Wright emphasized the importance of organizations mapping their data flows and understanding dependencies across distributed systems including cloud data warehouses, ELT tools and business intelligence platforms. The presenters identified visibility as the primary challenge in modern analytics environments and offered five tactical solutions: building comprehensive data maps, defining repeatable tool selection approaches, paying technical debt, cross-training teams and implementing data hubs. They demonstrated InterWorks' Curator as a tool-agnostic portal for unifying analytics consumption and showed how centralized access can increase user adoption by 350 percent.
