Highlights

| Platform | Version | Highlights |
| --- | --- | --- |
| Tecnisys Data Platform | TDP 3.0.0 | Addition of JupyterLab: interactive environment for data analysis, notebooks, and visualizations; support for multiple languages and extensions; modern tab-based interface integrating code editing, execution, and charts. |
| | | Addition of OpenMetadata: data governance and catalog platform; enables lineage (data provenance), data discovery, and data quality definition; integration with various ecosystem services (databases, lakes, pipelines). |

Table A. Platform Highlights

See also the highlights of the updated components:

| Service | Version | Category | Feature/Highlight |
| --- | --- | --- | --- |
| Airflow | 2.10.5 | Orchestration | Cleanup tasks are guaranteed to run even when a run is manually marked as failed or successful, which prevents resource leakage. |
| | | | Several bug fixes, including mapped trigger rules, ID validation, XCom with “/”, event logs, and more. |
| | | | Helm Chart images updated to more recent versions for Airflow, PgBouncer, and exporters. |
| Delta Lake | 3.3.0 | Optimized table format | Identity Columns: automatic generation of unique values for new rows (in Python and Scala), supporting two modes: generatedAlwaysAs (always generated) and generatedByDefaultAs (optional, with automatic fallback). |
| | | | VACUUM LITE: faster cleanup of obsolete files (5–10×) using only the transaction log, which is more efficient than traditional VACUUM. |
| | | | Row Tracking Backfill: enables row-level lineage on existing tables, including metadata such as Row Id and Row Commit Version. |
| | | | Version Checksums: adds per-commit checksums, strengthening consistency, performance, and debugging, plus detailed metrics and checkpoint validation. |
| | | | UniForm ALTER: allows enabling the UniForm format on existing tables without rewriting data. |
| | | | Type Widening: support for type widening in Delta Kernel (Java and Rust). |
| Druid | 32.0.0 | Real-Time Data Analytics | Focus on stability and performance: a substantial release with active community participation, with over 341 commits from 52 contributors (~30% more than previous versions). |
| | | | Migration to standard SQL (ANSI SQL compliance): legacy null-handling settings such as useDefaultValueForNull, useStrictBooleans, and useThreeValueLogicForNativeFilters were removed; ANSI SQL behavior is now fixed and not configurable. |
| | | | Introduction of “Projection” functionality, which precomputes aggregations to significantly improve query performance; one of the key new features of the release. |
| Great Expectations | 1.3.5 | Data Quality | New features: CheckpointFactory.add_or_update and support for strict_min/strict_max in ExpectTableRowCountToBeBetween. |
| | | | Maintenance and technical adjustments: support for row_condition with datetimes in Pandas and Spark; scheduled cleanup in BigQuery (“cleanup every 3 hours”); strict parameter added to the Window type; clear error when cloud mode is requested without environment variables; guaranteed run_id in ValidationDefinition.run. |
| HBase | 2.5.7 | NoSQL Database | Switch to avoid reopening regions when editing tables: improves operational stability by preventing a storm of regions in transition (RIT). |
| | | | isolate_regions command in RegionMover: allows isolating and relocating regions precisely and in a controlled manner. |
| Hive | 4.0.0 | Data Exploration and Analytics | A major release, with around 5,000 commits since version 3.1.3. |
| | | | Integration with Apache Iceberg: enhanced support for Iceberg tables, including compaction via OPTIMIZE TABLE. |
| | | | Improvements to Metastore and transactions: enhanced transactions and locking mechanisms to reinforce ACID compliance; compaction for ACID and Iceberg tables, improving performance and storage. |
| | | | Docker support: official Hive images on Docker Hub to facilitate deployment. |
| | | | Compilation and execution optimizations: anti-join, branch pruning, column histogram statistics, HPL/SQL, support for scheduled queries, and refined CBO (cost-based optimizer) rules; materialized views to speed up queries; high performance with Tez and LLAP. |
| | | | Replication and compatibility: improved replication features for external and ACID tables; support for Apache Ozone as a scalable storage system. |
| | | | Advanced features extracted from the changelog (partial list): native GeoSpatial in Hive, support for Iceberg compaction, metadata summary in HMS, discovery threads via ZooKeeper, JWT authentication over HTTP for the Metastore, optimized HMS API, and SAML 2 support in HiveServer2. |
| Iceberg | 1.8.0 | Table Formats | End of support for Spark 3.3 and Hive Runtime. |
| | | | Deletion Vectors: new spec, APIs, and read/write support. |
| | | | Variant type and UnknownType: new types supported in the spec and API. |
| | | | Operational improvements: fast append, removal of unused specs, useful procedures in Spark. |
| | | | Enhanced integration with AWS/Azure and extended compatibility with Spark, Flink, and Hive. |
| | | | Important dependency updates for better performance and security. |
| Kafka | 3.4.1 | Data Streaming | Migration from ZooKeeper to KRaft (initial version, not recommended for production): allows moving cluster metadata to the new KRaft mode with no downtime. |
| | | | New generation field in the consumer protocol: helps manage partition claims and detect more recent consumers. |
| | | | Option to disable the JMX Reporter in environments that don’t use it. |
| | | | New configuration options for the console Producer/Consumer: --reader-config and --formatter-config parameters for better customization. |
| | | | Optimized Producer ID expiration: separates control of producer ID expiration from transaction ID expiration (new configurable timeout). |
| | | | Time-based metadata snapshots: automatic generation of snapshots based on time (e.g., hourly). |
| | | | Rack-aware consumers: improves distribution and allows consuming from replicas that are geographically closer (local AZ). |
| | | | Kafka Streams (KIP-770 & KIP-837): update of configs and internal cache metrics; ability to broadcast output records to all partitions or drop them. |
| | | | Kafka Connect / MirrorMaker2 (KIP-787): allows running MirrorMaker2 with custom resource manager implementations, which facilitates integration in custom infrastructure. |
| | | | Removal of the quota node in ZooKeeper when settings are empty, which cleans up old configurations. |
| | | | Fixed a resource leak in interceptors. |
| | | | MirrorMaker2 now reads all offset syncs at startup, which increases consistency during initialization. |
| | | | MirrorMaker2 also publishes offset syncs during task commit, which improves offset traceability. |
| | | | MirrorMaker2 now translates consumer group offsets through the replication flow, for more accurate synchronization. |
| NiFi | 1.28.1 | Dataflow Management and Automation | Security and stability: sensitive parameter values are no longer written to flow synchronization logs, even with debug logging enabled (fixed in 1.28.1); parameter descriptions are now properly sanitized to protect against cross-site scripting (XSS) (fixed in 1.28.0). |
| | | | End of support for NiFi 1.x (support ended on December 8, 2024). The Apache team strongly recommends migrating to the 2.x series. |
| Ozone | 1.4.1 | Scalable Object Storage | Prevents a race condition in the datanode when creating VERSION, which increases operational reliability. |
| | | | Refined SCM logs: reduces noise and avoids false errors when handling sequence ID in closed containers. |
| | | | Security hardening in S3 Gateway: the secret-handling endpoint is now restricted to admins only. |
| Spark | 3.5.3 | Distributed Computing Platform | Third maintenance update in the 3.5 series, focused on security and bug fixes. Recommended for environments already on 3.5. |
| Superset | 4.1.2 | Data Visualization | New charts and visualizations: Big Number and time comparisons, Heatmap, Histogram, and Sankey. |
| | | | Dynamic catalog in connected databases. |
| | | | More intuitive upload UI with validations. |
| | | | More visual Slack integration. |
| | | | Time filters and Jinja macro for dynamic dashboards. |
| | | | UX improvements in dashboards, SQL Lab, and permissions. |
| | | | Important security fixes in 4.1.2: prevents resource takeover and access-control bypass. |
| Apache Tez | 0.10.4 | | ~28 fixes and improvements in total, focused on observability, stability, and security. |
| Apache ZooKeeper | 3.8.4 | Distributed Services Coordination | Support for limiting the maximum number of connections/clients to a ZooKeeper server. |
| | | | Better experience in zkCli: new option to wait for the connection before executing commands (avoids immediate failures in unstable environments). |
| | | | Improved log messages: consistent inclusion of the time unit in server startup logs; network addresses appear more clearly when listener ports are bound; more precise, less misleading error messages. |
| | | | Code cleanup and optimization: removal of unused snippets in internal components (improves maintainability and reduces noise). |
| | | | Documentation and website: various fixes to formatting, typos, and clarity in manuals and admin pages. |

Table B. Component highlights
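
The two identity-column modes listed for Delta Lake can be easier to tell apart with a small model. The sketch below is plain Python, not Delta Lake code: the `IdentityColumn` class, its `allow_explicit` flag, and the `start`/`step` defaults are hypothetical names chosen for illustration. It only mirrors the behavior described in the table, where generatedAlwaysAs always generates the value and generatedByDefaultAs generates one only when the writer supplies none.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IdentityColumn:
    """Toy model of an identity column (hypothetical, not the Delta Lake API)."""

    allow_explicit: bool          # False ~ generatedAlwaysAs, True ~ generatedByDefaultAs
    start: int = 1                # assumed first generated value
    step: int = 1                 # assumed increment between generated values
    _next: Optional[int] = None   # next value to hand out, set lazily

    def value_for(self, explicit: Optional[int] = None) -> int:
        """Return the value stored in this column for a new row."""
        if explicit is not None:
            if not self.allow_explicit:
                # "Always" mode: the writer may not provide its own value.
                raise ValueError("explicit values are rejected in 'always' mode")
            return explicit
        # No value supplied: fall back to the generated sequence.
        if self._next is None:
            self._next = self.start
        generated = self._next
        self._next += self.step
        return generated


always = IdentityColumn(allow_explicit=False)
by_default = IdentityColumn(allow_explicit=True)
print([always.value_for() for _ in range(3)])
print(by_default.value_for(explicit=42), by_default.value_for())
```

In this model an explicit value in the "by default" mode bypasses the generator entirely, so uniqueness there depends on what the writer supplies.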
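
The strict_min/strict_max flags mentioned for Great Expectations' ExpectTableRowCountToBeBetween can be illustrated the same way. This is a minimal sketch of the assumed semantics, not the library's implementation: it assumes that a strict flag turns the corresponding bound into an exclusive one, as in other "between" expectations. The function name `row_count_in_range` is hypothetical.

```python
from typing import Optional


def row_count_in_range(
    row_count: int,
    min_value: Optional[int] = None,
    max_value: Optional[int] = None,
    strict_min: bool = False,
    strict_max: bool = False,
) -> bool:
    """Check a row count against optional bounds (hypothetical helper).

    Assumption: strict_min / strict_max make the bound exclusive,
    while the default is an inclusive comparison.
    """
    if min_value is not None:
        if row_count < min_value or (strict_min and row_count == min_value):
            return False
    if max_value is not None:
        if row_count > max_value or (strict_max and row_count == max_value):
            return False
    return True


# A table with exactly 100 rows against the range 100..200:
print(row_count_in_range(100, min_value=100, max_value=200))
print(row_count_in_range(100, min_value=100, max_value=200, strict_min=True))
```

Under this assumption the first call passes because the lower bound is inclusive, and the second fails because strict_min excludes the boundary value itself.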