Introducing Dremio, a Gnarly Data Solution


InterWorks recently joined Dremio’s Global Partner Network, so naturally we’re here to give you a rundown of what Dremio is and when you’d look to use it. Check out our post announcing this new partnership.

Dremio is a SQL lakehouse platform that features a shared semantic layer and provides high-performance queries for interactive analytics on data lakes. It is built around an open data architecture, which separates compute from data. In this paradigm, the customer’s data remains in their own cloud storage, and they can adopt best-of-breed technologies on top of it more flexibly than ever.

Sweet Dremios Are Made of This

Dremio utilizes several open-source projects for the internals:

  • Apache Arrow as a columnar memory format
  • Apache Arrow Flight for performant data transport
  • Apache Iceberg as an open table format (Delta Lake is also supported)
  • Project Nessie for atomic data changes
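To make the "columnar memory format" idea a little more concrete, here is a minimal sketch in plain Python. It does not use Apache Arrow itself; ordinary lists stand in for Arrow's typed buffers, and the sample rows and the to_columns helper are made up for illustration.

```python
# Toy illustration of row-oriented vs. column-oriented layout.
# Plain Python lists stand in for Arrow's typed, contiguous buffers.

rows = [
    {"region": "east", "sales": 120},
    {"region": "west", "sales": 80},
    {"region": "east", "sales": 50},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one field reads a single list
# instead of walking every field of every row.
total_sales = sum(columns["sales"])
```

Analytical queries usually touch a few columns across many rows, which is why a column-oriented layout tends to suit them.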

As for the pieces that developers will see, Dremio mainly focuses on physical datasets and virtual datasets. Physical datasets are the data as it sits in the connected data lakes and other sources. Virtual datasets are built on top of physical datasets or other virtual datasets, and together they make up the semantic layer. This can be thought of as the presentation layer where users access the available datasets.
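The layering of virtual datasets can be sketched with ordinary SQL views. The example below uses Python's built-in sqlite3 module purely as a stand-in, not Dremio; the table name, view names and sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for a data source.
conn = sqlite3.connect(":memory:")

# "Physical dataset": the raw table as it exists in the source.
conn.execute(
    "CREATE TABLE raw_orders (order_id INTEGER, region TEXT, amount REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
    [
        (1, "east", 120.0, "complete"),
        (2, "west", 80.0, "complete"),
        (3, "east", 50.0, "cancelled"),
        (4, "east", 30.0, "complete"),
    ],
)

# "Virtual dataset" built on the physical one: no data is copied.
conn.execute(
    "CREATE VIEW completed_orders AS "
    "SELECT order_id, region, amount FROM raw_orders WHERE status = 'complete'"
)

# A second virtual dataset built on top of the first.
conn.execute(
    "CREATE VIEW sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM completed_orders GROUP BY region"
)

result = conn.execute(
    "SELECT region, total FROM sales_by_region ORDER BY region"
).fetchall()
```

End users only need to know about sales_by_region; the cleanup logic lives once, in the layer beneath it.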

Another important feature of Dremio is the ability to create “reflections”. These reflections are pre-aggregated copies of data that can be used in place of the physical datasets if the query optimizer determines the reflection would be more efficient. These reflections can also include sorting, partitioning and various levels of date/time roll-up. Multiple reflections can be set up to cover a variety of scenarios, and Dremio’s query optimizer will use the lowest-cost option that provides the data required.
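The selection idea can be illustrated with a toy model. This is not Dremio's actual optimizer: it assumes a reflection is usable whenever it contains every column the query needs, and it uses row count as a stand-in for cost. The Candidate class, pick_source function and all dataset names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A dataset a query could be answered from (toy model)."""
    name: str
    columns: frozenset
    row_count: int  # stand-in for cost; a real optimizer uses richer estimates


def pick_source(query_columns, physical, reflections):
    """Return the cheapest candidate that covers every requested column.

    The physical dataset is always a valid fallback.
    """
    needed = frozenset(query_columns)
    usable = [r for r in reflections if needed <= r.columns]
    return min(usable + [physical], key=lambda c: c.row_count)


physical = Candidate(
    "orders",
    frozenset({"order_id", "region", "order_date", "amount"}),
    1_000_000,
)
reflections = [
    Candidate("by_region", frozenset({"region", "amount"}), 50),
    Candidate("by_region_day", frozenset({"region", "order_date", "amount"}), 18_000),
]

# Covered by the smallest reflection.
choice_a = pick_source(["region", "amount"], physical, reflections)
# Needs the date column, so only the larger reflection qualifies.
choice_b = pick_source(["region", "order_date", "amount"], physical, reflections)
# No reflection has order_id, so the physical dataset is used.
choice_c = pick_source(["order_id", "amount"], physical, reflections)
```

In this sketch, adding a reflection never changes query results, only which source is read. That matches the article's description of reflections as something the optimizer chooses on the user's behalf.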

Who Am I to Disagree?

Dremio isn’t for every customer, and it won’t fit everyone’s needs. It also doesn’t eliminate the need for data warehouses across the data ecosystem. In general, Dremio is going to best fit customers who have already made significant investments in large or multiple on-prem/cloud data lakes.

Many clients have data in cloud data lakes whose potential goes untapped because of more traditional query engines, which are typically slow and require users with very specific technical experience. Dremio can help these data lakes and interactive BI efforts flourish by providing extremely high performance and a user-friendly development experience.

That’s not to say that this is the only fit for Dremio. It can also help with offloading EDWs or migrating clients to the cloud. If a customer has overwhelming amounts of data in their EDW and the pricing no longer makes sense, they could be a good candidate for Dremio. Likewise, if their BI capabilities are being limited by a traditional RDBMS and they’d like to explore migration to a cloud data lake, Dremio could help enable their success.

I Travel the World and the Seven Seas

If your data landscape is a good fit for Dremio, there are plenty of data lake integration options, including:

  • Azure Data Lake Store
  • Amazon S3
  • AWS Glue
  • Google Cloud Storage
  • HDFS
  • NAS
  • MapR-FS

Dremio also supports many relational database systems, as well as several NoSQL databases.

Everybody’s Looking for Something

Want to learn more? Visit the Dremio docs, check out Dremio’s list of resources, start your path to proficiency at Dremio University, or reach out to us here at InterWorks!

More About the Author

Justin Lemmon

Data Architect