Data users rejoice! Dataiku 5.0 has been released, and it brings with it fresh new features. Dataiku has always been an amazing tool for building data products as a team. With the release of Dataiku 5.0, they’ve made the tool even better. Their focus in this release was on enabling your team to communicate better while still providing new tools that allow you to do more with your data. You can dive into the complete list in Dataiku’s release notes, but here is the rundown of our top five features.
1. Data Together with New Collaboration Tools
Say goodbye to comments and hello to discussions! The new discussion feature in Dataiku lets you and your team improve every part of your data project. Whatever you’re looking at, from datasets to recipes to models, it’s simple to spin up a new discussion and get to the right solution. All these discussions are rolled up in a single location (your discussions inbox) so that it’s easy to track the things you care about. Discussions feature rich editing capabilities, notifications and integrations, so it’s never been easier to collaborate with your team.
2. Easier Deep Learning in Keras & TensorFlow
Dataiku’s machine learning interface has always brought data science to the masses. It enables citizen data scientists to provide value without mastering Python, and it also helps your data science experts create the perfect model faster (and handles all the nasty documentation to boot). In Dataiku 5.0, it gets a significant upgrade: new deep learning capabilities that let these same groups build powerful, state-of-the-art models.
Deep learning in DSS is “semi-visual,” so it greatly reduces the amount of code that needs to be written. To start, you need to write the basic Keras code that defines the architecture of your model. From there, Dataiku DSS will handle all the other tasks, including preprocessing the data, feeding the data into the model, training the model, creating charts to evaluate the model and integrating with TensorBoard. Dataiku’s deep learning is built on the combination of Keras and TensorFlow.
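To give a feel for how little code “the basic Keras code that defines the architecture” can be, here is a minimal sketch of a small feed-forward classifier. The function names `build_model` and `compile_model`, the `input_shapes` dictionary and its `"main"` key are assumptions made for illustration, not a statement of the exact interface DSS expects, so check the Dataiku documentation for the real signatures. The layer sizes are arbitrary.

```python
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model


def build_model(input_shapes, n_classes):
    """Define the architecture only; no data loading or training here.

    `input_shapes` (assumed) maps an input name to its shape tuple,
    e.g. {"main": (10,)} for ten preprocessed numeric features.
    """
    inputs = Input(shape=input_shapes["main"], name="main")
    # Two small hidden layers; the sizes are illustrative, not tuned.
    x = Dense(64, activation="relu")(inputs)
    x = Dense(64, activation="relu")(x)
    # One output unit per class, normalized to probabilities.
    outputs = Dense(n_classes, activation="softmax")(x)
    return Model(inputs=inputs, outputs=outputs)


def compile_model(model):
    """Attach an optimizer and a loss suited to multiclass classification."""
    model.compile(optimizer="rmsprop", loss="categorical_crossentropy")
    return model


if __name__ == "__main__":
    model = compile_model(build_model({"main": (10,)}, n_classes=3))
    model.summary()
```

Everything outside these two functions (preprocessing, batching, the training loop and the evaluation charts) is the part the article says DSS takes care of for you.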
3. Enabling Production with Containers on Docker & Kubernetes
If you’ve been following DevOps in recent years, you know that containers provide an excellent solution to bring your project to production. In Dataiku 5.0, you can now run parts of the processing tasks of the Design and Automation nodes in containers on one or several hosts, using either Docker or Kubernetes. This includes:
- Python and R recipes
- Plugin recipes
- In-memory machine learning
DSS deep learning supports training on CPU and GPU, including multiple GPUs. Through container deployment capabilities, you can train and deploy models on cloud-enabled dynamic GPU clusters.
4. A Huge Interface Upgrade
I’ve always thought Dataiku had industry-leading and thoughtful design, but the latest upgrade takes huge strides in making it even better. The new homepage has been reimagined to show you the most relevant and recent items. The new addition of drop-down menus means you can get to your tools in fewer clicks, and it’s much easier to find exactly what you need. Finally, the way they handle wikis and being able to reference them while you work is simply fantastic. Speaking of wikis …
5. Spreading Knowledge with Project Wikis
Every Dataiku project now has a knowledge base in the form of a wiki. You can use the wikis for numerous purposes, including tracking project goals, sharing insights and maintaining standard project documentation.
The wiki is based on the well-known Markdown language, includes the standard features you’d expect (such as uploading files) and has a host of features to help you organize your knowledge articles. If there’s information that needs to exist beyond a single project, you can also promote wikis so they appear in a global “wiki list” that all Dataiku users can see.