Dremio’s new functionality simplifies query construction, optimizes performance and storage use, and enhances compatibility for a unified data lake experience
SANTA CLARA, Calif.–(BUSINESS WIRE)–Dremio, the easy and open data lakehouse, today announced robust new features that enhance the performance and versatility of its data platform. These new capabilities empower organizations to accelerate their data analytics and enable faster, more efficient decision-making. Dremio is ensuring easy self-service analytics—with data warehouse functionality and data lake flexibility—across customer data.
Among the key features unveiled by Dremio are querying, performance, and compatibility enhancements that include:
- Effortless Iceberg table optimization: Data teams no longer need to be concerned about how a table is physically stored on object storage, including file counts, file sizes, statistics, repartitioning and more. Dremio now offers the SQL commands OPTIMIZE, ROLLBACK, and VACUUM, which improve performance and streamline data lake management. The OPTIMIZE command improves query performance by optimizing data layout and statistics, while the ROLLBACK command enables users to revert any unintended changes made to their data. Additionally, the VACUUM command reclaims storage space by removing unnecessary data files.
- Up to 40% better data compression: Dremio now supports native Zstandard (zstd) compression, offering an improvement of up to 40% on compression ratios and decompression speeds. This feature enables users to optimize storage utilization and improve query performance, all while reducing operational costs.
- Tabular UDFs: Tabular User-Defined Functions enable users to extend the native capabilities of Dremio SQL and provide a layer of abstraction to simplify query construction. This allows users to create functions that can serve as native row and column policies, empowering data analysts and engineers to easily build complex calculations, transformations and advanced analytics that unlock new possibilities for data-driven insights.
- New mapping SQL functions: CARDINALITY returns the number of elements in a map or list and helps customers moving array workloads from Presto and Athena; ST_GEOHASH returns the corresponding geohash for the given latitude and longitude coordinates; FROM_GEOHASH returns the latitude and longitude coordinates of the center of the given geohash. Both geohash functions help customers move workloads from Snowflake, Amazon Redshift, Databricks, and Vertica. The longer the prefix two geohashes share, the smaller the cell that contains both locations, and so the closer together they are guaranteed to be.
- Enhanced Delta Lake support: Dremio now supports multiple Delta Lake catalogs including Hive Metastore and AWS Glue. This allows seamless integration with existing Delta Lake-based workflows, providing a unified data lake experience across the organization.
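To illustrate the geohash behavior described in the list above, here is a minimal Python sketch of the standard public geohash scheme. It is not Dremio's implementation. The function names `geohash_encode`, `geohash_center`, and `shared_prefix_len` are hypothetical stand-ins for what ST_GEOHASH and FROM_GEOHASH are described as doing, and the default precision of 9 characters is an assumption made for the example.

```python
# Standard geohash alphabet: digits plus lowercase letters, omitting a, i, l, o.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Encode latitude/longitude as a geohash string (illustrative only)."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits, nbits = 0, 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1  # upper half of the interval
            rng[0] = mid
        else:
            bits <<= 1              # lower half of the interval
            rng[1] = mid
        use_lon = not use_lon
        nbits += 1
        if nbits == 5:              # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, nbits = 0, 0
    return "".join(chars)


def geohash_center(geohash):
    """Return (lat, lon) of the center of the cell named by a geohash."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    use_lon = True
    for ch in geohash:
        idx = BASE32.index(ch)
        for shift in range(4, -1, -1):
            rng = lon_range if use_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if (idx >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            use_lon = not use_lon
    return (sum(lat_range) / 2, sum(lon_range) / 2)


def shared_prefix_len(a, b):
    """Count the leading characters two geohashes have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


if __name__ == "__main__":
    print(geohash_encode(57.64911, 10.40744, 5))  # u4pru
    print(geohash_center("u4pru"))
```

Each additional character narrows the cell by five more bits, which is why two hashes that agree on a long prefix must fall inside the same small cell. The reverse is not guaranteed: two nearby points on opposite sides of a cell boundary can share only a short prefix.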
“With these new key features, Dremio continues to provide the most powerful and flexible data lakehouse engine on the market,” said Tomer Shiran, founder and CPO at Dremio. “We are excited to empower our customers with capabilities that make lakehouses easier than ever, and allow companies to replace their expensive and proprietary cloud data warehouses with modern and open data architectures.”
These key features further solidify Dremio’s position as a leader in the data lakehouse engine space, enabling organizations to efficiently analyze, transform, and derive insights from their data at scale.
Dremio is the easy and open data lakehouse, providing self-service analytics with data warehouse functionality and data lake flexibility across all of your data. Use Dremio’s lightning-fast SQL query service and any other processing engine on the same data. Dremio increases agility with a revolutionary data-as-code approach that enables Git-like data experimentation, version control, and governance. In addition, Dremio eliminates data silos by enabling queries across data lakes, databases, and data warehouses, and by simplifying ingestion into the lakehouse. Dremio’s fully managed service helps organizations get started with analytics in minutes, and automatically optimizes data for every workload. As the original creator of Apache Arrow and committed to Arrow and Iceberg’s community-driven standards, Dremio is on a mission to reinvent SQL for data lakes and meet customers where they are on their lakehouse journey.
Hundreds of global enterprises like JPMorgan Chase, Microsoft, Regeneron, and Allianz Global Investors use Dremio to deliver self-service analytics on the data lakehouse. Founded in 2015, Dremio is headquartered in Santa Clara. CNBC recognized Dremio as a Top Startup for the Enterprise and Deloitte named Dremio to its 2022 Technology Fast 500. To learn more, follow the company on GitHub, LinkedIn, Twitter, and Facebook, or visit www.dremio.com.