Tableau Prep Essentials

Transcript
Alright. People have already joined in. Good afternoon, everyone. I see that people are finding that Zoom link and getting logged in. Welcome to Tableau Prep Essentials. Gonna give it just a couple minutes to let everyone else find that link and get started up here. I do wanna mention a few things that we have on the screen here. One, make sure that you guys are visiting the InterWorks blog. We have a ton of great pieces in there that, one, talk about Tableau Prep, which we're gonna be covering today. But also, we have a ton of other pieces on webinars that we've hosted in the past, and a lot of great information about Tableau and other tools that we support. You're gonna see a QR code here on the screen. That is for our next upcoming webinar, and it's On Cloud Nine, or On Tableau Cloud Nine. So that's gonna be on Wednesday, September twenty-seventh. Feel free to join in for that.

So, Rachel, I have a question for you. Now that summer's ending and we're getting into fall time, I start baking. And I know that you like to bake too. Yes. So much. Have you made anything recently? Because I did. I'm curious if you've made anything, for inspiration.

Well, actually, we were at a team meetup in Oregon a couple weeks ago. And so everybody had to sign up or volunteer for a meal kinda thing. So I got to make breakfast one time, and I baked two things. One, I made homemade cinnamon rolls the night before, so then they just kinda rose overnight and everything. And then the other one was this sheet pan quiche. Basically, what you do is you put down a pie crust and you bake that for a little bit, and then you put in, like, eggs and cheese and bacon and all the things that you want, and you bake it a little bit more. But what was funny was, we forgot to get pie crust at the grocery store. It's six thirty in the morning. I'm half awake.
Haven't even had my first cup of coffee, and I realized this, and instead of thinking maybe I'll pivot and make something else now, like, just make scrambled eggs and bacon, my response was, I guess I better figure out how to make pie crust. So we made a pie crust from scratch. So that was the last thing that I made, just a couple weeks ago: homemade cinnamon rolls and pie crust for a sheet pan quiche.

I love that. I make a crustless quiche all the time for breakfast because it's so easy. I actually made some cupcakes last week, and I had one just before we started today, and a cup of coffee. So I'm gonna be extra energized and have all the sugar. But it's from, there's this person that competed on The Great British Baking Show. Her name was Nadiya. I don't remember if she put it on. But she started her own little series called Nadiya Bakes. And she made these cupcakes that were like these strawberry shortcake cupcakes. So it's basically an Oreo on the bottom of the cupcake tin, a fresh strawberry put on top of that, and then we made this, like, shortcake batter that you pipe around the strawberry. You go ahead and bake those. And then the icing is made from melted strawberry Häagen-Dazs ice cream, confectioners' sugar, and stuff. And... Oh my god. They were absolutely delicious because the strawberries, like, melted on the inside. The Oreo at the bottom, a little crunchy. Actually, I have the recipe. That's so good. I'll put it here into the chat for everyone. They are delicious, and you can't stop at just one.

Yeah. And I see we've got a lot of people coming in from all over, which is really exciting. So hello, everybody. We're just talking about what we're baking and what we like to bake. And now I wanna make cupcakes. But if you guys have any things that you really enjoy baking, feel free to add that in the chat while we're talking. I also see that we seem to have somebody from Korea.
Cannot read Korean, but I can recognize it. So... oh, a hi from Chicago. Hi from Chicago. Go ahead. Hello, neighbors. I'm talking to you guys from the West Loop in Chicago. So happy to see that. Oh, key lime pie, German pancakes. I'm not sure, what's the difference between German pancakes and regular pancakes? Do you know, Maxwell? I don't know. I feel like they might be thicker, if I remember correctly. I don't remember. Yeah. Need to look that up. I think it's like Belgian waffles, right? Like, I'm trying to equate it that way. Oh, they puff up in the oven. Oh, okay. That sounds good. Well, I'm glad that, well, it's one o'clock here in Chicago, that I've had lunch and a cupcake already, because I would be starving watching our workshop today.

Well, thinking of that, we are right at time. So we're gonna go ahead and get things started this morn... or, sorry, this afternoon. Maybe morning to some of you on this call. Welcome. Today, we're gonna be talking about Tableau Prep Essentials. I'm so glad to have a lot of you back, and for our newcomers, welcome. I wanna mention a few things that are here on the screen. One, make sure that you guys are checking out the InterWorks blog. We have a lot of great blog pieces there that are gonna give you information about what we're talking about today, and about Tableau and all the other platforms that we support. Also, you're gonna see a QR code here on the screen. That is for our next upcoming webinar. It's On Cloud Nine, or we can say On Tableau Cloud Nine. That's gonna be on September twenty-seventh. Feel free to join in for that.

Before we do get started, I wanna give you guys a quick little intro about InterWorks. Again, if this is your first time being with us, welcome. We're so glad to have you here. InterWorks is an analytics consulting company. We're global. I saw that there were some people here from Chicago. That's where I'm talking to you guys from.
I know Rachel's in Raleigh, and we have people scattered all across the US, Europe, and Australia. We specialize in helping people and teams succeed in tech. Tableau is a great example. We've been a long-term partner of Tableau, and they've built a great analytics product that we love and that we're gonna be showcasing today. But what InterWorks does alongside our partners and customers is help you build strategies and solutions to succeed with the technology you've invested in. We can help you with anything in your analytics journey with data, such as analytics, data preparation, which we're gonna be talking about today, data management, architecture, cloud migration, and all kinds of things. We're gonna be talking about Tableau Prep today. But I recommend you also, again, look at the InterWorks blog, and you're gonna find out more about all the tech that we support and get some great advice from our global team. Feel free to check us out on LinkedIn. I know that we advertised this webinar on there as well. We'll also have a listing there of all the upcoming events that we have.

A few reminders: we are recording this session, and it'll be on our website in a few days. We'll send you an email, if you registered, to let you know whenever it's available. One quick thing that we're gonna ask is that you guys use the Q and A function in Zoom today to help us keep track of any questions that you have along the way. And Rachel is gonna nicely, or who knows, really, I don't care, interrupt me and try and see if any of those questions need to be answered during the session. If we don't answer them during the webinar itself, we'll make sure that we follow up and try and answer any of those questions that you guys have.

So with that said, who's talking to you guys today? My name is Maxwell Croft. I am a membership enablement lead at InterWorks.
So my goal is to help you guys succeed with learning all these different technologies and platforms that we're talking about. Rachel, I'll let you go ahead and introduce yourself.

Yeah. Hi. My name is Rachel Kurz. I'm an analytics architect based in Raleigh, North Carolina, where it is crazy hot, and I'm so ready for fall now. I've been working here for about six years. My background is in data science, but I do everything from data prep, like we're gonna talk about today, to data science, to data visualization as well.

And, Rachel, I'm so glad to have you because, just like you said, you've done a lot of data work, and that's what we're gonna be talking about today. So our agenda: we're first gonna introduce what Tableau Prep is and why we should be using it. We're also gonna demo the platform, look at what a standard workflow might look like, and also quickly try and make one ourselves. Then we're gonna talk about Prep in production, meaning that, yes, we can create a workflow on our local computer, but how can we share this with others and ensure that we're automating this process?

So before we jump into things today, let's first stop and take a little poll. I have two questions here. We're curious: how much time do you spend preparing data? And then, how would you describe your level of expertise with data preparation? Rachel, if you could go ahead and start that poll for us. Sorry. Good. Alright. So take just a second. You're gonna see the first question at the top, and then just scroll down a little bit and answer the second one for us.

While you're doing that, I see that somebody else is from Raleigh, North Carolina as well. So hello, hello, neighbor. So we've got somebody from Chicago with Maxwell and somebody from Raleigh with me. We love it. Love seeing that. It's not as hot here, Rachel. My watch says that it's sixty-eight, which is crazy because last week it was super hot. Ninety-four.
I'll let you keep those temperatures. I'll be a little cooler today. I'm done. We'll give it just a few more seconds. Yeah. I like this two-parter poll. That's fun. Yeah. It is fun. I'm used to just doing, like, one question per poll, but it's been good to see this. It seems like we've got a nice little spread of people. So I think I'll just give it one more second, and then we'll end the poll and share the results. Alright. I'll end the poll now and share those results.

So the first one was, how much time do we spend preparing data for analytics or dashboarding projects? And it looks like twenty-five to fifty percent is the highest answer here, which is not uncommon. I wish that we had perfect data, right? None of us would be on this call today. But we have to spend time making sure that this data is ready and clean before we start building those dashboards. It looks like the second answer was fifty to seventy-five percent. Oh, that's a scary slash truthful number right there. So truthful. Oh, and then the two percent: I don't, my data is always perfect. I wanna know your secret, because we would all love to have that. Very much.

And then the second one: how would you describe your level of expertise with data preparation? It looks like a lot of you are intermediate, which is great. That means that maybe you have been cleaning data in different ways throughout the years, and possibly you're wanting to learn more about this specific tool, or maybe about some other tools in particular. And I see that we do have a good amount of beginners as well. Rachel, any comments or callouts about that?

No. That all seems pretty good. I know that, having a background, like I said, in data science, on the data preparation side there is always a joke that, like, eighty percent of your time is preparing your data. And it seems like it's a little less now, which is nice. At that time, it was, you know, majority just Excel and blah blah blah.
Lots of data. But all this seems pretty accurate. So, yeah. Agreed. Well, let's continue on here.

Today, again, we're talking about Tableau Prep. So what is that? This right here is a picture of Tableau's overall landscape. Whenever I'm doing a Tableau training, I always like to start off comparing Tableau to art. We're connecting to a visual analytics software, and we're wanting to tell a story with our data. So let's say we're gonna paint our picture. We need a few things. One, we need paint. We need a canvas to paint on, and then we need a gallery to show it off in. Now, where we all start is making sure we have all the right colors of paint that we want, and that they're mixed to the right consistency before we take that brush to the canvas. In Tableau, those are gonna be all of our different data sources, and we can have a lot of different ones at our company. Tableau has eighty-plus native data connections available. So maybe you're working with Snowflake, Excel files, maybe Tableau files, all these different sources that we're wanting to bring together and, again, connect to in Desktop.

Now, I saw that there was that small percentage that doesn't have any data issues, but for the bulk of us on this call, we have to spend time cleaning and shaping that data so it works correctly for us. For those of you that have a Creator license, you're able to get Tableau Prep for free. This is where you can connect to all those different data sources. You can shape and combine the data, save those workflows, and possibly publish them to the server. So this is what we're gonna be talking about today, this data preparation piece. Once we have cleaned that data, we can push out that hopefully singular, clean data set and connect to it in Desktop.
This is where we're going to be painting that picture, drawing what's happening in the background, or that overall data story, and then we're gonna be publishing it directly to the server.

Now, what are the issues we're really trying to solve for? Again, we're wanting to clean that data so then we can start building those visualizations that we need. Maybe we have a lot of fragmented sources, meaning data in multiple platforms. Maybe we're trying to connect to Microsoft Excel, Google Sheets, and some other source, and we need to bring all of that together. Instead of trying to solve all that in Desktop, push that back to Tableau Prep and let it do the heavy lifting. Maybe we're trying to work with fragmented tables, tables from multiple data sources. Or, most likely, maybe the data is not ideal for Tableau, where we need to pivot the data or maybe aggregate it to different levels so we can connect and push that data out.

Tableau likes its data nice and tidy. You can see here there's a picture: some columns with data listed underneath them. This is how Tableau likes to quickly read and interpret that data. I'm sure that a lot of you have connected to an Excel workbook or something with data that looks like this. I have project name, project lead, and then some different dates with some numeric values underneath them. Now, Tableau could easily connect to this data. But for us as creators, if we had to go and start building with it, Q1 of twenty twenty, Q2 of twenty twenty, those would all be separate measures. And if I was trying to build a bar chart quickly with that, it'd be really frustrating to have to continue to pull out each one of those individually. So we wanna try and avoid that. Maybe we would wanna pivot that data so we could more easily build with it.

Another example might be like this, where, again, we have that project name and project lead, but look at the columns over to the right. First, I'm gonna see biology.
And then underneath it, I see January and February students. Now, I know if I go to connect to this data in Tableau, I'm gonna get a lot of null values, because Tableau is confused as to which column it should be reading. And since someone has merged the cells for biology here, Tableau's gonna bring back those null values for us. So, again, another instance of something we want to avoid.

So let's take another poll here. I'm curious: what is your biggest challenge when it comes to data preparation? We're gonna see a few options here. Maybe it's cleaning messy data, which I'm sure a lot of us deal with. Merging data from different sources. Dealing with large data sets. Creating a repeatable process, really important. Or other, feel free to let us know in the chat. Oh, I always feel like there could be an all of the above. Yep. See, somebody has responded, calling it all of the above. Yeah. Right? Like, that's definitely a... So it's kinda like, what is the one that is either the biggest or the one that is the most of a headache, I guess, is the thing? You might end up spending a lot of time doing one or the other, but, like, for some reason dealing with large data sets is really what is the most annoying in your day to day. Yeah. I see joining different levels of aggregation. That's a big one for sure. Or deduplicating data from multiple data sources or pulls. Yep. Mhmm. Slow workflows. Most definitely. Most definitely. Alright. In the database. Yeah. So, yeah, I think we're getting close to a quorum. So I'll end the poll.

Yeah. So it looks like the bulk of you guys are saying merging data from different sources.
I think that that's something that Prep is really good at handling, because in Tableau Desktop, sometimes, one, maybe we're having to deal with performance, and Desktop starts to perform slowly because we're merging all these different data sources. Or possibly we have multiple different sources that we're trying to bring together and it's not playing correctly. Prep can help solve that issue. It looks like the second answer was cleaning messy data. Absolutely. I even like using Prep whenever we connect, to get a quick overview of what the data looks like. I think it has a really nice interface where you can quickly scan and see where some issues might be, unlike Desktop. In Desktop, you have to start building worksheets to try and answer those questions. Mhmm. Rachel, any other comments about our results here?

No. Those all seem pretty much like what you would think. I think merging data from different data sources is a big one. It would be great if everything was in a single table or a single Excel file or whatever it might be, but it often is not. So I think that is definitely the one that I've seen the most. A project that I'm working on right now, actually, is working with survey results. So it's, you know, eighteen different surveys, trying to bring it all into one place. So I completely agree with that one.

Survey data is notorious for being dirty. Especially if you allow people to freely input answers, that is just a pitfall, because someone's gonna spell it wrong, they're gonna have different cases, all of the above. Yep. For sure.

Well, let's continue on here. So how does Tableau like its data? Maybe some of you feel like Homer Simpson right here, and you're like, I don't even know where to start. I have all these different data sources. My data is messy, and I'm trying to build in Desktop and everything sucks. So let's think about when we should be using Tableau Desktop as compared to maybe using a data cleaning software.
So ideally, Tableau Desktop is where we'd wanna do our basic joins and pivots. Let's say we've connected to a file that has a few different sheets that we wanna bring together. Great. We can do that in Tableau Desktop. Maybe we have some standard calculations that we need to add in. Some standard metadata. Absolutely, that is something that Tableau is capable of handling. Custom SQL, possibly.

Now, Tableau Prep can do all of the above. It can do all of our joins, our calculations, that custom SQL if needed. And what's great is we're taking the heavy lifting off of Desktop, all of making these connections and adding in this metadata, and allowing Prep to handle all that for us and push that singular, clean data set to Tableau Desktop. Again, Prep does a really good job with a lot of the data cleaning processes.

Now, there are still other options outside of Prep. Excel. We have some other tools like Alteryx, Matillion. Maybe you're trying to create some custom tables in SQL. And for instance, if you're working with really, really large data sets, maybe you should be working with Alteryx instead of Tableau Prep. That's a conversation we're more than happy to have. Possibly, you know that you have a data team, and the data set that you're trying to bring together is maybe a company-wide data set that everyone needs to use. If that's the case, maybe you shouldn't be creating a Prep workflow to push out that data. Possibly, you need to be going to the data team to ensure that they're creating a specific view in SQL for everyone to continue using. So know that Prep is a fix for a lot of your use cases. But if you're trying to make this a data set that is continually used by everyone, then you might wanna reconsider: does this make sense?

So let's take one more poll here. Now I'm curious: what tools do you predominantly use to clean and prepare your data?
Maybe it's a dedicated data preparation tool, like Alteryx or Tableau Prep. Hopefully not everyone answers this, but in spreadsheets or manual table manipulation. If you're going back to Excel and manually editing and creating that, I get it. I've been there before. It is sometimes a necessity because maybe some other options are not available. A built-in data prep feature in a visualization tool, like Tableau or Power BI. Maybe you have some options like Python, R, or other scripting languages. Or in a database or a warehouse with SQL.

I mean, before I started here, I think a lot of it was Excel. For sure. Mine too. Yep. Yeah. Or, like, in my grad school, we did SAS. Oh, somebody else just said SAS. Yeah. In grad school, we worked with SAS a lot. There is Enterprise Miner, so that you could actually, like, bring things together. It's kinda like the original Alteryx, if you look at that, or the original Tableau Prep, something that SAS created. I like the answer of, don't underestimate search and replace in Excel. I am amazed at Excel wizards. You all are just absolute beasts. I don't know how you do it. I've seen people, like, type super fast in Excel and never even touch the mouse. Meanwhile, I'm, like, trying to remember how to get a second line. I always have to look up whether it's, like, Alt+Enter or Alt+Shift+Enter to do a second line in a single cell. So I think we're hitting the end. So we'll end this poll and share the results.

So it looks like my guess is correct. Fifty percent of you are saying that in spreadsheets or manual tables, you're going back and manipulating the data. And Rachel and I can relate to you. In both of our previous jobs, that's what we had to do. Those were the only options that we had, or what we knew was the quickest way to go and manipulate that data. The second highest was in a database or warehouse with SQL. Absolutely.
I think that that's very common across most industries now, where we're going back and creating new views and other things to fit our needs. And that's actually a good answer. If you go back to the original source itself and edit that, then not only yourself, but others that have access to it are gonna be able to connect and have that data available and cleaned for them. So that is a good answer. Rachel, any other callouts before we continue?

No. I don't think so. I think, yeah, a lot of people have different tools that they're using: SAS, Stata, Talend, Databricks. A lot of people have made do with the things that we had available to us at the time. SPSS, like, if we're gonna go really far back. So, yeah, that's about it.

Perfect. So what do we like about Prep? Something that I really enjoy is that it has a quick, user-friendly interface. Prep came out a few years ago, and whenever I first opened it up, honestly, I didn't have to have a whole lot of training to understand what the buttons are and how to use it. It also gives us detailed documentation of everything that we do. It leaves an audit trail for all the changes that are made, which I love, because if I go and create a workflow today, publish that to the server, and then maybe have to come back eight months later and figure out why my data is not coming out the exact way that I'm expecting, I'm gonna forget what I did to build that workflow. But thank goodness, Prep is gonna have all those steps, all the changes that I made to that data, so I can go and review it. And this allows us to have that repeatable process as well, where we can create that workflow, and as data is continually coming in, it's automatically pushed out to the server, cleaned the way that we're expecting it. Also, as I mentioned earlier, it gives us insight into the data during the cleaning process.
A lot of the time, clients and even consultants will say to me, you know, I don't really know what's behind this data. And I agree with you. If you connect to a data set and you're trying to figure out what the data is using Tableau Desktop, it's gonna take you a while. You're gonna have to sit there and create a lot of different worksheets to try and figure out what's happening behind the data itself, rather than just opening up the raw data source. Tableau Prep has a great profile pane that's gonna give you a high-level overview of what those columns are, the data types, and the highest and lowest values and such that are included in each one of those columns.

So let's go ahead and look at the tool itself and see what a sample workflow would look like. I'm gonna be using a workflow that is called Sample Superstore. For those of you that have joined one of these workshops before or gone through a Tableau training, you've seen this data set. It's kinda like an Amazon store: we're selling different products across the US, and we're trying to figure out how much money we're making. In this workflow, it looks like there are a lot of steps. But I promise you, whenever we open this up, you're gonna see that it's not that overwhelming.

So on my dock down here, you're gonna see Tableau Prep Builder, and it has a similar icon to Tableau Desktop, except this one is blue. We're gonna open it up. And this is what Tableau Prep Builder looks like. I can see I have my Connections pane over to the left. Again, it has all the same data connections that we would have available in Tableau Desktop. In the middle, we're gonna have the most recent flows that we've looked at, sample ones down here at the bottom, and per usual, just like in Tableau Desktop, you're gonna have some blog pieces and also training videos over here to the right. Now, what I wanna look at is this Sample Superstore workflow. I'm gonna go ahead and open that up.
So the data set that we connected to is a combination of text files and Microsoft Excel. And I can see that I have orders from all across the US: south, east, west, and central. The very first icon right here is an input, where we're inputting that specific file. Whenever I click on the tool, I'm gonna see that it opens up the menu that tells me what's included behind the input. I see that I've brought in twenty-two columns. They're all listed down here in the bottom right.

Now, one thing that I really like about Prep as compared to Desktop: let's say that I'm looking at all these columns of data, and possibly I don't need all of these. Maybe I don't need a row ID. I can select to remove this column. What's different about removing a column here in Prep as compared to Tableau Desktop is that once it's removed here, it's gone from the rest of the flow, while in Tableau Desktop, it just hides that column in the background. So I can select, hey, I wanna get rid of a few of these different columns, and we're done.

Now let's say that a coworker, or even yourself, comes back a few months later and you're like, wait, how do I know if Maxwell removed some columns or something else? Again, Tableau Prep is always tracking our changes, which we love. Up here at the top, I'm gonna see a little Changes tab, where I can see that I've removed both of those fields. Maybe Rachel's thinking, Maxwell, why did you remove those? We're gonna need them later. Rachel could come in and say, well, I'm gonna bring those back so then we can continue using them. So this is one of the great things that I really like about Prep.

Once we've taken a second to select what data we wanna bring forward, then we would want to add in a clean step. A clean step is where we're gonna get a profile of all the columns, and we're gonna be able to edit and change the values if we need to. On the input tool, you will see a little plus sign. These are all the tools that Tableau Prep has included.
We're gonna be looking at this clean step, which is the one that you see listed here. Now, this is what I was talking about, using Tableau Prep as a quick way to review and get a snapshot of what your data looks like. Again, I can see that I brought in twenty-two columns, over two thousand rows here, and I have a great profile of what each one of these columns is. I can see that I'm selling a lot of items, and I'm typically selling between three and four items for the most part. My profit is typically between zero and a thousand. And quickly looking, I see that Florida most likely is my highest-selling state. So whenever I have client data, or even data that I'm working with on my own, I start here. I can quickly scroll and see what the data looks like and then decide to take action from there.

Also here, you have a lot of options available to you. This is where you would spend time cleaning each of these columns. You'll see a lot of options like filtering and cleaning. Maybe you wanna change the case, remove numbers or punctuation, get rid of spaces. Rachel mentioned survey data earlier. This is a great tool to go and clean up survey data. I do it all the time. Grouping values, splitting, and all sorts of functions are available. So if I'm honest with you, you're probably gonna spend a lot of your time in Tableau Prep in the clean steps themselves. And, again, it's gonna constantly track every single change that is made, and you can go back and review that as needed.

We have a quick question. Is there a limit on the number of fields that you can have within, like, a Tableau Prep flow?

There is not a strict number that Tableau puts out. I will say that if the data set is really, really large and you're connecting to it, what Tableau will do is automatically sample the data. So you'll see at the input tool, there's a data sample section, and it's already on automatic.
So if it sees that it exceeds the number of rows, then it's gonna go ahead and turn on a sample that you're working with throughout the entire cleaning process. Now, the data is only sampled whenever you're building the workflow itself. Whenever the data is outputted, you're gonna have the full number of rows outputted at the end. Awesome. Thank you. Yeah. Good question.

Alright. So here in this interface, again, we have four inputs. I have four clean steps. And even if you don't perform any cleaning, I do recommend sometimes having a clean step just to review the data. Down here in central, you can see we did a lot of steps. We cleaned a lot of fields. We removed a lot of fields and transformed some data types too.

Now, in Desktop, if I had four inputs of the same type of data, most likely I would want to union that data. We can do the same thing here in Prep. And I actually really like using the interface of Prep compared to Desktop, because in Prep, whenever I'm performing a union, it's gonna tell me which columns of data came from which input. Tableau Desktop doesn't allow us to see that. The important part of that is, what if some of those columns are mismatched? Notice, I can see that there's one field in particular that didn't match across all the inputs. I'm gonna show only the mismatched fields. So the data from the south had an extra column called file paths. Now, most likely, I wouldn't need this data. So I could select here to remove it if I wanted to. Or I could decide to split it and keep just the dates, whatever I needed out of this data set itself. I think something difficult whenever we perform this in Desktop is figuring out which column came from where, and do I need it or not? But Prep is going to allow us to quickly display that and fix it if needed.

Now, once I union this data together, I've made a singular source longer by appending more rows. Possibly then, I'd want to add more data. I know that I had all this order data.
People came and bought all these things from me. Hopefully, I'm making a lot of money. But we can imagine people are gonna be like, well, I also want to return stuff because I don't want everything. So possibly we have another dataset during this workflow that we need to then join to the original data itself. So we can perform a join, much like we would, again, in Tableau Desktop. Now, selfishly, again, I do like the interface of Prep for joining as compared to Desktop because we have a lot more options.
I'm gonna navigate back to our PowerPoint really quick. And just a quick little overview here: we do have our standard join types. A lot of us are probably used to this. We have four that we use in Desktop: an outer join, where we're bringing in everything from both sources; an inner join, where we're bringing in the matching rows from both sources; a left join, meaning we bring in everything from the left and the matching rows from the right; and a right join, being the opposite.
Now in Tableau Prep, we actually have a few more options. So here, we see that Venn diagram of what we're used to seeing for the different join types. What if we wanted to bring in rows of data that did not match? Possibly we wanted to do some data validation and figure out why these rows don't have corresponding ones in each of the datasets. I could select a join type that brings in just the rows that aren't matching between both. I could do non-matching from the left or non-matching from the right. I think that this is a really cool functionality that Prep gives us. Again, really good for data validation.
Now let's say that maybe I just wanted a standard inner join. Great. I have it. I also really like the summary pane down here. It's gonna tell me, from each source, how many rows are included, how many are excluded, and what the total join results were: six hundred and twenty-one. Now I think all of us know this is not what we get in Tableau Desktop.
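As a rough illustration of the join types and the summary counts described above, here is a minimal plain-Python sketch. It is an analogy only, not Tableau Prep's implementation; the function name `join_summary` and the sample orders and returns are hypothetical.

```python
def join_summary(left, right, key):
    """Classify rows the way a join summary does, for a single join key.

    Returns the inner-join rows plus the unmatched rows from each side.
    """
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)
    left_keys = {row[key] for row in left}

    inner, left_only = [], []
    for row in left:
        matches = right_by_key.get(row[key])
        if matches:
            # One output row per matching pair; fields from the right are
            # merged in (the shared key has the same value on both sides).
            inner.extend({**row, **match} for match in matches)
        else:
            left_only.append(row)
    right_only = [row for row in right if row[key] not in left_keys]
    return inner, left_only, right_only


# Hypothetical orders and returns keyed by "Order ID".
orders = [
    {"Order ID": "A-1", "Sales": 100},
    {"Order ID": "A-2", "Sales": 250},
    {"Order ID": "A-3", "Sales": 75},
]
returns = [
    {"Order ID": "A-2", "Returned": "Yes"},
    {"Order ID": "A-9", "Returned": "Yes"},
]

inner, left_only, right_only = join_summary(orders, returns, "Order ID")
print(len(inner), len(left_only), len(right_only))  # 1 2 1
```

The three lists map onto the Venn diagram: `inner` is the overlap, `left_only` and `right_only` are the non-matching rows from each side, a left join is `inner` plus `left_only`, and a full outer join is all three together. The list lengths correspond to the included and excluded counts in the summary pane.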
So it's really nice to see these results here in the tool. We could also select an option to show only mismatched values. Maybe we know that we have three rows that did not match in that dataset on the left. Possibly we want to fix them to match the ones on the right. You have the option to come in and fix these as needed. And, again, the whole time Tableau's gonna be tracking every single one of those changes that we make.
So now that we've connected to all this data and made the source longer and wider by performing a union and a join, possibly we're ready to output the data. That brings us to our output step. The output step gives us a few options as well. We can select to output this directly to a file. Maybe we're doing some ad hoc reporting where we wanna build a POC dashboard for a client or for an internal project to make sure that this makes sense before we build a production dashboard. Or we could select to publish this directly to the server, which, in most cases, is probably gonna be your use case. You can see where you can come in and select how you want that to be published. Let's say that I select Publish Data Source. You would then wanna make sure that you're signed in to the server for the specific site that you wanna publish to, select the project that you want it to go to, and also give it a name.
We do have a few write options as well. Initially, Tableau is going to let us create a new table. But let's say that we continue using this workflow time and time again. You probably don't wanna refresh the entire table every single time. Someone talked about a large dataset earlier. We don't wanna have to refresh that entire large dataset. Instead, we might wanna select to append data to the table and just bring in the newest records as needed. So that's gonna be your workflow, start to finish. But then you could even have people come to you and say, hey, that data is fine, but maybe I need it aggregated to a different level.
We could also add in an aggregation step. Maybe we know that the other data that we have is rolled up to the year level. The aggregation step will allow us to do that: we can select that we wanna roll this up to year and region and then have that at a different level of granularity, and then output that data to a different file so people can work with both as needed. So this gives us a lot of flexibility and allows us to answer a lot of our customer or internal needs. So I wanna pause here. Rachel, do we have any questions or comments that we wanna call out before we continue?
We've had a few questions. I think I've answered a couple of them. Some people are asking, when you're publishing the information later on, why does it show up as live on Tableau Server? It's actually an extract. It's a Hyper extract behind it. But you can't just, like, refresh the extract on Server to connect to the data. You have to actually run the workflow. So, in a way, it's kinda live in that the user, outside of you scheduling the workflow, cannot update it. So that's just one callout I wanted to make. Another question is, can it output to a TDS with an extract? I don't know that one. Do you know if we can do a .tds?
So if we're... Yeah. One of the questions is, what are all the formats it supports for output?
Yeah. So if we're outputting to a file, it's gonna allow us to do a Hyper, Excel, or CSV. Those are the three file types that we have.
Mhmm. So, yeah, we just had a question about whether it can be used to add and populate new sheets in an Excel workbook. It seems like it can output to Excel. So... Yep. I'm not sure about individual sheets, but it can output to Excel. All good questions.
And then one more question is, can you undo the changes at any time?
Mhmm. You sure can. We could come back to any of these clean steps.
We could remove some of these columns as needed. I will make one little comment. Let's say that I created a calculated field in this particular clean step here towards the beginning. If that field was used at any other point in the workflow and I remove it, Tableau Prep might be like, whoa, wait. Where did that field go? And give me a little warning indication. Now if you go back and undo a change that's never used anywhere else in the workflow, it's gonna be fine, and it's gonna allow you to continue on. So just know what you're removing and make sure that it's not being used anywhere else. Good question.
Well, I'm gonna go ahead and continue on.
Yes. We have a lot of questions. So I will continue. If anybody wants to also have the Q&A up, once they're answered, I believe you should be able to see them. So I'm gonna just be answering some on there as well.
So I wanna look at a quick little example, and this is a very small dataset, of just creating our own little workflow here during the call. So let's say that I was trying to connect to some sales data. I have two sheets. The one on the left, I can see I have sales data where someone in Excel has nicely decided to put in a title, their little date range, and then the columns of data themselves: employee, month, and sales. Very common. I know that all of us have seen this before. The other sheet is what we might expect to connect to typically in Tableau: employee and dates, with all the data underneath. So we have a few issues that we need to solve here. One, we need to make sure that we're not getting null values whenever we connect to the data on the left. And then on the right, we're gonna need to transform that data so it matches the data here on the left. So in Prep, I'm gonna go ahead and close out that workflow that we were just looking at. And I wanna connect to this Excel file that I was just showing you guys. Alright.
So I'm gonna go ahead and drag this table into the view, and it's gonna create an input step. And in this input step, I'm already reviewing it, and I see that I have some nulls here, and the column headers are F2 and F3. Now for those of us that have connected to a file like this in Tableau Desktop, we know what's happening here. We're getting null values in F1 and F2 because someone put this title and this null information up here at the top. So in Tableau Prep, if we're using Excel, we can select Use Data Interpreter, just like we would in Tableau Desktop. Once I select that, Tableau's then gonna give me the columns that I was expecting: employee, month, and sales. So now I can add in that clean step, and I can view that data, and it looks good. So now I'm happy with this; it's what I was expecting.
Now I also have a bonus sheet that I wanted to connect to this dataset as well. I'm gonna go ahead and add in the bonus sheet. Now I did a little trick. I double-clicked on the sheet, or on the table, and it automatically went into the workflow. Now this one, I can see that the columns came in just fine because, again, someone didn't put in random information at the top of that Excel sheet. Now what's different about this one is I need to transform this data. I need to have a column for month and I need to have a column for sales. So in Tableau Prep, this is where we would want to add in what is called a pivot, much like what a lot of you do in Tableau Desktop. I'm gonna select the pivot step here, and I just need to drag in the fields that I wanna pivot. It's gonna be these different months that I have selected. So I'm gonna go ahead and select the three months, and I do want this to be from columns to rows. And I'm gonna drag them into the pivot fields. So now I'm gonna have my pivot names, which is month, and then my values are going to be sales. Perfect.
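The columns-to-rows pivot just described can be sketched in a few lines of plain Python. This is an analogy for what the pivot step produces, not Tableau Prep's own code; the function name `pivot_columns_to_rows`, the employee names, and the numbers are hypothetical.

```python
def pivot_columns_to_rows(rows, pivot_fields, names_field, values_field):
    """Turn one row with several value columns into one row per column."""
    pivoted = []
    for row in rows:
        # Fields that are not being pivoted are repeated on every output row.
        kept = {k: v for k, v in row.items() if k not in pivot_fields}
        for field in pivot_fields:
            pivoted.append({**kept, names_field: field, values_field: row[field]})
    return pivoted


# Hypothetical wide sheet: one column per month.
bonus = [
    {"Employee": "Ana", "January": 500, "February": 650, "March": 700},
    {"Employee": "Ben", "January": 300, "February": 450, "March": 400},
]
long_rows = pivot_columns_to_rows(
    bonus, ["January", "February", "March"], "Month", "Sales"
)
print(len(long_rows))   # 6
print(long_rows[0])     # {'Employee': 'Ana', 'Month': 'January', 'Sales': 500}
```

Two wide rows with three month columns become six long rows, each with an employee, a month (the pivot names), and a sales value (the pivot values), which is the same shape as the first sheet.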
So now I can add in my clean step, and I can see that now the data between both of these sources matches. I have sales, month, employee. If I look at the top clean step, I also have employee, month, and sales. So now since I've cleaned and transformed this data, I can join both of these together to make one singular dataset. Now our join step will have that Venn diagram that we were seeing, and I will need to drag the clean step over the join and select Add. We can then select how we'd want this to be joined. I'd want to add a join clause. And this would be on the column of employee, and I've selected an inner join, where I'm bringing in the matching rows from both. So this is what we want to use Tableau Prep for: connect to two sources, then quickly clean and manipulate that data so we can output a singular clean source of data. Now this dataset's very small and also didn't have a lot of columns included. But I know that you guys are gonna work with large datasets. I want you to spend the majority of your time first trying to figure out what the data is that you're working with. You're probably gonna stay in this cleaning profile to interpret the data, edit those different fields, and then connect and push out that data itself.
Now I've talked about using Prep Builder to create these different workflows. And right now, these are just living on my local computer. The next step, which is really important, is productionizing our data and our workflows. Prep has an option called Prep Conductor. Prep Conductor allows you to publish workflows so others can access and leverage them. And, also, we can schedule these workflows to run at certain times. Now to use Prep Conductor, you do have to have the Data Management Add-on. So not all of you on the call or your companies might have that. But if you do, then you can start using this today. And, honestly, this is a game changer whenever you do have this add-on because it makes your lives so much easier.
So this is what Prep Conductor will look like. Honestly, it would look like you're going to open up a dashboard in Server, but instead, you're gonna see that overall workflow. You're gonna see the different output steps that are included. And you'll see where you can also select to run these on a schedule. I will share a link here that's gonna open up. This is all the information behind Prep Conductor. So you guys are more than welcome to review that if you would like to, if you have any questions.
But I'm gonna come back to our sample dataset here, and I'm gonna close out this one that we just made and open up that sample Superstore one again. Now before I publish this to Server, I would need to go ahead and change both of these output steps to publish the data source to the server itself, again selecting a project and giving it a name. So give me just a second as I am setting this up. Perfect. Alright. So now I have set up both of these outputs to publish to the server. Now there's a little difference here. Right now, if I just press Run, what Tableau's gonna do is send both of those outputs, just the datasets themselves, to the server, which is fine. Maybe from my desktop, I wanna go ahead and just share both of those datasets. Now, ideally, I'd want to share this entire workflow with Rachel or others so they can go back and review it and edit it as needed. So much like in Desktop if you were gonna publish a dashboard, in Prep you're gonna go up to Server, and you're gonna select Publish Flow. Whenever I publish the flow, it's gonna take a second to load, and it's going to ask me again: where do I wanna publish this? Give it some kind of name, and then a description, much like you would with a dashboard, and then select Publish. Now this one is not very big, so it should publish pretty quickly to the server. And once it does, it's gonna go ahead and open up the window just like it would if we published a dashboard.
So tada, there is our workflow. Now here's where Tableau Prep Conductor comes into play. Notice that up here at the top, it's going to ask me about a run schedule. When do we want to clean these datasets and refresh this data? So I can see that no tasks have been set up. I can create a schedule. So let's say that I wanna create a new task. It's first gonna ask me what schedule we want this to run on. This is gonna depend on your server admin and how many of these have been added, but maybe I want to refresh this on the first of the month at one AM. You can automatically include all the output steps, or you can run them independently if you want to. And we can go ahead and create the task. Now once this task is created, it knows that we need to refresh on the first of the month. I also have the option to go ahead and run this now if I want to. So if I press Run All, it's gonna say, hey, we're gonna go ahead and run this whenever resources are available. And then it's gonna go ahead and refresh that data and have everything ready for us. So I'm gonna see a pending status, and then in a few seconds or a few minutes, it will say complete.
Again, what I really like about this is that I also have the entire workflow created and uploaded here. So let's say that Rachel runs those workflows, she connects to that data in Tableau Desktop, and then she notices that I missed something. Rachel then has the ability to come here, select Edit Flow, and make live changes to this workflow. She doesn't have to go back to Prep Builder, download the workflow, edit it, and publish it back up. She can make those changes here in the server environment, publish it, and then make that available to everyone else. This is awesome, and I love using Prep Conductor for it. So, again, you do have to have the Data Management Add-on for this. If you're unsure, feel free to reach out to your server admin or your internal teams and ask them.
And if you do, you can go ahead and start using that today. So what's next? We've learned about what Tableau Prep is. How can we start utilizing some of those different tools, like the input tool, the clean step, and everything else? And then how can we productionize that data, using the Data Management Add-on and publishing it to the server? Now I'm sure a lot of you are gonna have questions. Rachel brought up a very good one at the start. Maybe you're working with survey data. Maybe you're trying to map some different tables or combine different files and structures, or possibly you're working with some complex calculations or cleaning steps. We have a ton of great resources available to you, and you'll get a listing of these in your follow-up email as well.
So, again, I do wanna thank you guys for joining us today for Tableau Prep Essentials. Again, this session was recorded. So you will get a follow-up email with the link in a few days, and you'll be able to watch that replay anytime you need to. Also, feel free to check out the InterWorks blog. We have a lot of great content about Tableau Prep specifically, and also about data in general. Maybe you just have questions about performance, or how should I be setting up my data in some other kind of format? We have a lot of great pieces on that as well. Also, next week, we're gonna be hosting On Tableau Cloud Nine. So feel free to come and join us. And I will look forward to seeing you guys in the next workshop. So with that said, we do have some time for Q&A. And, again, I do wanna thank you guys for joining us on this Thursday. Rachel, were there any questions or callouts that anyone had?
Yeah. But before we get into that, I just wanna do one more survey for everyone, just to see how we can help you out with further expanding your knowledge. So I'm not gonna share these results.
These are gonna be just for us. So feel free to answer as you would like. We have a few more questions in the Q&A, I think. A lot of them have to do with whether Prep comes packaged with Desktop or if it's a separate thing, and a couple of different questions about that. So as far as licensing goes, do you know, if they have Desktop, do they also get Prep?
Correct. So if you have a Creator license, much like whenever you download Desktop, most likely you're logging in through a server. In the past, they'd give you a specific key, but a lot of times companies now have it set up where they just say, hey, here's the server link, and then you use your standard company credentials to log in to Server or to log in to Desktop. It's the same thing for Prep. Download Prep, and it's gonna be like, hey, either what's that same key you use for Desktop, or what are the credentials that you use to log in to the server, and then it's gonna allow you to use it. So, absolutely.
Yeah. It's not Conductor or the add-on, but... Correct. You should be able to get it as is.
Couple more questions. Is there version control for flows, like when you're uploading them?
Yeah. That is a great callout. You need to be aware of the server version, and typically you want to be on the same version as the server if you're using Prep Conductor whenever you are publishing that workflow. So, yes.
And yeah, the ability to revert back to a previous version. I think it's similar to when you have workbooks on Server; you can always revert back. Yeah. Correct.
How does this work in Tableau Cloud versus Tableau Server? There is documentation online about how you use it with each of those. And it's available on both. So yeah, I would just say look online to see the differences between them. And then I think that's a lot of it. There's a couple other questions.
I realize that I did not get to some of your questions; some of them are a little bit more detailed. So we might reach out to you separately from there. I know that I've been focusing on the Q&A, but I think I saw a few questions come up in the chat as well. So apologies again. We have those transcripts. So if there's anything that we didn't get to, we'll be sure to try to bring that together.
Yeah. We'll make sure that we reach out by email, answer your questions, and see if you guys need any help with follow-up. We are here to support you and help you be successful using Tableau Prep. Well, I see all your thank-yous in the chat. We love doing this, so we're glad to see you guys coming back to listen to us. Again, there's a workshop coming up next week, September twenty-seventh, and we also have a Tableau Dashboarding Essentials coming up in October. So go to the InterWorks events page, and you can sign up for any of those there. So, again, I wanna thank all of you for joining, and Rachel, thank you for being my cohost.
Love having you here too.
I hope all of you have a great rest of your Wednesday or, sorry, Thursday. We're not trying to... You know? We sure get you every time. They get me every time. Absolutely. But we'll make sure that we catch all of you next time. So enjoy it, and we'll see you soon. Thank you all. Thanks, guys.

In this webinar, Maxwell Croft, Membership Enablement Lead, and Rachel Kurtz, Analytics Architect, presented Tableau Prep Essentials for analytics professionals. This session covered the fundamentals of Tableau Prep, including best practices for cleaning, combining and shaping data for dashboarding. Maxwell and Rachel demonstrated hands-on workflows for data preparation (such as input, clean, union, join, pivot and output steps), showing how to transform messy or fragmented sources into tidy, analysis-ready datasets. Attendees also learned about integrating Prep workflows with Tableau Server and Prep Conductor for scheduling and collaboration, with practical advice and Q&A to help users establish repeatable, documented processes in their daily analytics work.
