Releases: StarRocks/starrocks

3.2.13

13 Dec 06:58
f0965dc

Release date: December 13, 2024

Improvements

  • Supports setting a time range within which Base Compaction is forbidden for a specific table. #50120
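
As a rough illustration of what a forbidden time range means for scheduling, the Python sketch below checks whether a given moment falls inside a daily window during which Base Compaction would be skipped. The function name, the whole-hour window, and the example hours are assumptions made for illustration only; these notes do not show how the range is actually specified on a table.

```python
from datetime import datetime


def base_compaction_allowed(now: datetime,
                            forbidden_start_hour: int,
                            forbidden_end_hour: int) -> bool:
    """Return False while `now` is inside the forbidden daily window.

    Hypothetical model: the window is [start, end) in whole hours and may
    wrap past midnight (for example 22 -> 6).
    """
    hour = now.hour
    if forbidden_start_hour == forbidden_end_hour:
        # Empty window: nothing is forbidden.
        return True
    if forbidden_start_hour < forbidden_end_hour:
        in_window = forbidden_start_hour <= hour < forbidden_end_hour
    else:
        # The window wraps around midnight.
        in_window = hour >= forbidden_start_hour or hour < forbidden_end_hour
    return not in_window
```

With a hypothetical window of 8 to 18, a call at 09:00 returns False and a call at 18:00 returns True.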

Bug Fixes

Fixed the following issues:

  • The loadRowsRate field returned 0 after executing SHOW ROUTINE LOAD. #52151
  • The Files() function read columns that were not queried. #52210
  • Prometheus failed to parse materialized view metrics with special characters in their names. (Now materialized view metrics support tags.) #52782
  • The array_map function caused BE to crash. #52909
  • Metadata Cache issues caused BE to crash. #52968
  • Routine Load tasks were canceled due to expired transactions. (Now tasks are canceled only if the database or table no longer exists.) #50334
  • Stream Load failures when submitted using HTTP 1.0. #53010 #53008
  • Issues related to Glue and S3 integration: #48433
    • Some error messages did not display the root cause.
    • Errors were returned when writing to a Hive partitioned table whose partition column was of type STRING and Glue was used as the metadata service.
    • Dropping Hive tables failed without proper error messages when the user lacked sufficient permissions.
  • The storage_cooldown_time property for materialized views did not take effect when set to maximum. #52079

3.3.7

12 Dec 16:13
00177de

Release date: November 29, 2024

New Features

  • Added a new materialized view parameter, excluded_refresh_tables, which excludes the specified tables from those that need to be refreshed. #50926

Improvements

  • Rewrote unnest(bitmap_to_array) as unnest_bitmap to improve performance. #52870
  • Reduced the write and delete operations of Txn logs. #42542

Bug Fixes

Fixed the following issues:

  • Failure to connect Power BI to external tables. #52977
  • Misleading FE Thrift RPC failure messages in logs. #52706
  • Routine Load tasks were canceled due to expired transactions (now tasks are canceled only if the database or table no longer exists). #50334
  • Stream Load failures when submitted using HTTP 1.0. #53010 #53008
  • Integer overflow of partition IDs. #52965
  • Hive Text Reader failed to recognize the last empty element. #52990
  • Issues caused by array_map in Join conditions. #52911
  • Metadata cache issues under high concurrency scenarios. #52968
  • The whole materialized view was refreshed when a partition was dropped from the base table. #52740

3.3.6

20 Nov 06:38
8f01cfa

Release date: November 18, 2024

Improvements

  • Optimized internal repair logic for Primary Key tables. #52707
  • Optimized the internal implementation of histograms of statistics. #52400
  • Supports adjusting log level via the FE configuration item sys_log_warn_modules to reduce Hudi Catalog logging. #52709
  • Supports constant folding in the yearweek function. #52714
  • Avoided push-down for Lambda functions. #52655
  • Divided the Query Error metric into three: Internal Error Rate, Analysis Error Rate, and Timeout Rate. #52646
  • Avoided constant expressions being extracted as common expressions within array_map. #52541
  • Optimized the Text-based Rewrite of materialized views. #52498

Bug Fixes

Fixed the following issues:

  • The unique_constraints and foreign_constraints parameters were incomplete in SHOW CREATE TABLE for cloud-native tables in shared-data clusters. #52804
  • Some materialized views were activated even when enable_mv_automatic_active_check was set to false. #52799
  • Memory usage was not reduced after a stale memory flush. #52613
  • Resource leak caused by Hudi file-system views. #52738
  • Concurrent Publish and Update operations on Primary Key tables may cause issues. #52687
  • Failures to terminate queries on clients. #52185
  • Multi-column List partitions cannot be pushed down. #51036
  • Incorrect result due to the lack of hasnull property in ORC files. #52555
  • An issue caused by using uppercase column names in ORDER BY during table creation. #52513
  • An error was returned after running ALTER TABLE PARTITION (*) SET ("storage_cooldown_ttl" = "xxx"). #52482

Behavior Changes

  • In earlier versions, scale-in operations would fail if there were insufficient replicas for views in the _statistics_ database. Starting from v3.3.6, if nodes are scaled in to 3 or more, view replicas are set to 3; if there is only 1 node after the scale-in, view replicas are set to 1, allowing for successful scale-in. #51799

    Affected views include:

    • column_statistics
    • histogram_statistics
    • table_statistic_v1
    • external_column_statistics
    • external_histogram_statistics
    • pipe_file_list
    • loads_history
    • task_run_history
  • New Primary Key tables no longer allow __op as a column name, even if allow_system_reserved_names is set to true. Existing tables are unaffected. #52621

  • Expression-partitioned tables cannot have partition names modified. #52557

  • Deprecated FE parameters heartbeat_mgr_blocking_queue_size and profile_process_threads_num. #52236

  • Enabled persistent index on object storage by default for Primary Key tables in shared-data clusters. #52209

  • Disallowed manual changes to bucketing methods for tables with the random bucketing method. #52120

  • Backup and Restore-related parameter changes: #52111

    • make_snapshot_worker_count supports dynamic configuration.
    • release_snapshot_worker_count supports dynamic configuration.
    • upload_worker_count supports dynamic configuration. Its default value is changed from 1 to the number of CPU cores on the machine where the BE resides.
    • download_worker_count supports dynamic configuration. Its default value is changed from 1 to the number of CPU cores on the machine where the BE resides.
  • The return type of SELECT @@autocommit has changed from BOOLEAN to BIGINT. #51946

  • Added a new FE configuration item, max_bucket_number_per_partition, to control the maximum number of buckets per partition. #47852

  • Enabled memory usage checks by default for Primary Key tables. #52393
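
The scale-in rule for the views in the _statistics_ database (first item of the Behavior Changes above) can be summarized in a few lines of Python. This is only a sketch of the rule as worded in these notes, not StarRocks code; the function name is hypothetical, and the two-node case returns None because the notes do not describe it.

```python
from typing import Optional


def statistics_view_replicas(nodes_after_scale_in: int) -> Optional[int]:
    """Replica count for the affected views after a scale-in (v3.3.6 wording).

    Returns None when the notes do not state the outcome.
    """
    if nodes_after_scale_in < 1:
        raise ValueError("a cluster needs at least one node")
    if nodes_after_scale_in >= 3:
        return 3
    if nodes_after_scale_in == 1:
        return 1
    # Two remaining nodes: not covered by the release notes.
    return None
```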

3.2.12

14 Nov 07:15
5f81e3e

Release date: October 23, 2024

Improvements

  • Optimized memory allocation and statistics in BE for certain complex query scenarios to avoid OOM. #51382
  • Optimized memory usage in FE in Schema Change scenarios. #48569
  • Optimized the job status display when querying the system-defined view information_schema.routine_load_jobs from Follower FE nodes. #51763
  • Supports Backup and Restore of List partitioned tables. #51993

Bug Fixes

Fixed the following issues:

  • The error message was lost after writing to Hive failed. #33167
  • The array_map function causes a crash when excessive constant parameters are used. #51244
  • Special characters in the PARTITION BY columns of expression partitioned tables cause FE CheckPoint failures. #51677
  • Accessing the system-defined view information_schema.fe_locks causes a crash. #51742
  • Querying generated columns causes an error. #51755
  • Optimize Table fails when the table name contains special characters. #51755
  • Tablets could not be balanced in certain scenarios. #51828

Behavior Changes

  • Supports dynamic modification of Backup and Restore-related parameters. #52111

3.3.5

24 Oct 03:08
6d81f75

Release date: October 23, 2024

New Features

  • Supports millisecond and microsecond precision in the DATETIME type.
  • Resource groups support CPU hard isolation.

Improvements

  • Optimized performance and extraction strategy for Flat JSON. #50696
  • Reduced memory usage for the following ARRAY functions:
  • Optimized error messages when loading Null values into List partition keys with the Not Null attribute. #51086
  • Optimized error messages for Files() when authentication fails. #51697
  • Optimized internal statistics for INSERT OVERWRITE. #50417
  • Shared-data clusters support garbage collection (GC) for persistent index files. #51684
  • Added FE logs to help diagnose FE out-of-memory (OOM) issues. #51528
  • Supports recovering metadata from the metadata directory of FE. #51040

Bug Fixes

Fixed the following issues:

  • A deadlock issue caused by PIPE exceptions. #50841
  • Dynamic partition creation failures block subsequent partition creation. #51440
  • An error is returned for UNION ALL queries with ORDER BY. #51647
  • CTE in UPDATE statements causes hints to be ignored. #51458
  • The load_finish_time field in the system-defined view statistics.loads_history does not update as expected after a loading task is completed. #51174
  • UDTF mishandles multibyte UTF-8 characters. #51232

Behavior Changes

  • Modified the return content of the EXPLAIN statement. After the change, the return content is equivalent to EXPLAIN COST. You can configure the level of details returned by EXPLAIN using the dynamic FE parameter query_detail_explain_level. The default value is COSTS, with other valid values being NORMAL and VERBOSE. #51439
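
Because query_detail_explain_level is a dynamic FE parameter with three valid values, a small helper can validate the value before it is applied. The Python sketch below builds the statement text only; the helper name is hypothetical, and the ADMIN SET FRONTEND CONFIG form is assumed here as the usual way to change a dynamic FE parameter, since these notes do not quote the statement.

```python
VALID_EXPLAIN_LEVELS = ("COSTS", "NORMAL", "VERBOSE")


def explain_level_statement(level: str) -> str:
    """Build a statement that sets query_detail_explain_level.

    Raises ValueError for values other than the three listed in the notes.
    """
    normalized = level.strip().upper()
    if normalized not in VALID_EXPLAIN_LEVELS:
        raise ValueError(
            f"unsupported level {level!r}; expected one of {VALID_EXPLAIN_LEVELS}"
        )
    # Assumed statement form for a dynamic FE parameter.
    return (
        'ADMIN SET FRONTEND CONFIG '
        f'("query_detail_explain_level" = "{normalized}");'
    )
```

For example, passing "normal" would produce a statement that restores the less detailed EXPLAIN output, while an unknown value is rejected before anything is sent to the cluster.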

3.3.4

30 Sep 08:24
56bcf6f

Release date: September 30, 2024

New Features

  • Supports creating asynchronous materialized views on List Partition tables. #46680 #46808
  • List Partition tables now support Nullable partition columns. #47797
  • Supports viewing external file schema information using DESC FILES(). #50527
  • Supports viewing replication task metrics via SHOW PROC '/replications'. #50483

Improvements

  • Optimized data recycling performance for TRUNCATE TABLE in shared-data clusters. #49975
  • Supports intermediate result spilling for CTE operators. #47982
  • Supports adaptive phased scheduling to alleviate OOM issues caused by complex queries. #47868
  • Supports predicate pushdown for STRING-type date or datetime columns in specific scenarios. #50643
  • Supports COUNT DISTINCT computation on constant semi-structured data. #48273
  • Added a new FE parameter lake_enable_balance_tablets_between_workers to enable tablet balancing for tables in shared-data clusters. #50843
  • Enhanced query rewrite capabilities for generated columns. #50398
  • Partial Update now supports automatically populating columns with default values of CURRENT_TIMESTAMP. #50287

Bug Fixes

Fixed the following issues:

  • The error "version has been compacted" caused by an infinite loop on the FE side during Tablet Clone. #50561
  • ISO-formatted DATETIME types cannot be pushed down. #49358
  • In concurrent scenarios, data still existed after the tablet was deleted. #50382
  • Incorrect results returned by the yearweek function. #51065
  • An issue with low cardinality dictionaries in ARRAY during CTE queries. #51148
  • After FE restarts, partition TTL-related parameters were lost for materialized views. #51028
  • Data loss in columns defined with CURRENT_TIMESTAMP after upgrading. #50911
  • A stack overflow caused by the array_distinct function. #51017
  • Activation failures for materialized views after upgrading due to changes in default field lengths. You can avoid such issues by setting enable_active_materialized_view_schema_strict_check to false. #50869
  • Resource group property cpu_weight can be set to a negative value. #51005
  • Incorrect statistics for disk capacity information. #50669
  • An issue with constant folding in the replace function. #50828

Behavior Changes

  • Changed the default replica number for external catalog-based materialized views from 1 to the value of the FE parameter default_replication_num (Default value: 3). #50931

3.2.11

09 Sep 08:28
10a5f0e

Release date: September 9, 2024

Improvements

  • Supports masking authentication information for Files() and PIPE. #47629
  • Supports automatic inference for the STRUCT type when reading Parquet files through Files(). #50481

Bug Fixes

Fixed the following issues:

  • An error is returned for equi-join queries because they failed to be rewritten by the global dictionary. #50690
  • The error "version has been compacted" caused by an infinite loop on the FE side during Tablet Clone. #50561
  • Incorrect scheduling for unhealthy replica repairs after distributing data based on labels. #50331
  • An error in the statistics collection log: "Unknown column '%s' in '%s'." #50785
  • Incorrect timezone usage when reading complex types like TIMESTAMP from Parquet files via Files(). #50448

Behavior Changes

  • When StarRocks is downgraded from v3.3.x to v3.2.11, the system ignores any incompatible metadata. #49636

3.3.3

05 Sep 05:55
312ed45

Release date: September 5, 2024

New Features

  • Supports user-level variables. #48477
  • Supports Delta Lake Catalog metadata cache with manual and periodic refresh strategies. #46526 #49069
  • Supports loading JSON types from Parquet files. #49385
  • JDBC SQL Server Catalog supports queries with LIMIT. #48248
  • Shared-data clusters support Partial Updates with INSERT INTO. #49336

Improvements

  • Optimized error messages for loading:
    • When memory limits are reached during loading, the IP of the corresponding BE node is returned for easier troubleshooting. #49335
    • Detailed messages are provided when CSV data is loaded to target table columns that are not long enough. #49713
    • Specific node information is provided when Kerberos authentication fails in Broker Load. #46085
  • Optimized the partitioning mechanism during data loading to reduce memory usage in the initial stage. #47976
  • Optimized memory usage for shared-nothing clusters by limiting metadata memory usage to avoid issues when there are too many Tablets or Segment files. #49170
  • Optimized the performance of queries using max(partition_column). #49391
  • Partition pruning is used to optimize query performance when the partition column is a generated column (a column that is calculated based on a native column in the table), and the query predicate filter condition includes the native column. #48692
  • Supports masking authentication information for Files() and PIPE. #47629
  • Introduced a new statement show proc '/global_current_queries' to view queries running on all FE nodes. show proc '/current_queries' only shows queries running on the current FE node. #49826
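
The partition pruning improvement for generated partition columns listed above can be pictured with a toy model: when the generated expression is monotonic in the native column, a range predicate on the native column translates into a range of partition keys. The Python sketch below assumes a hypothetical generated column that takes the calendar day of a timestamp; the names and data are illustrative and not taken from StarRocks.

```python
from datetime import date, datetime
from typing import List


def generated_partition_key(event_time: datetime) -> date:
    """Hypothetical generated column: the calendar day of the native column."""
    return event_time.date()


def prune_partitions(partitions: List[date],
                     lower: datetime,
                     upper: datetime) -> List[date]:
    """Keep partitions that can hold rows with lower <= event_time <= upper.

    This works because the generated expression is monotonic in event_time,
    so the bounds on the native column map directly to bounds on the key.
    """
    first = generated_partition_key(lower)
    last = generated_partition_key(upper)
    return [p for p in partitions if first <= p <= last]


# Five daily partitions; the predicate on the native column spans two days.
partitions = [date(2024, 9, day) for day in range(1, 6)]
kept = prune_partitions(
    partitions,
    datetime(2024, 9, 2, 10, 0),
    datetime(2024, 9, 3, 23, 0),
)
```

Here only the partitions for September 2 and September 3 remain in `kept`, so the other three would not need to be scanned.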

Bug Fixes

Fixed the following issues:

  • The source cluster's BE nodes were mistakenly added to the current cluster when exporting data to the destination cluster via StarRocks external tables. #49323
  • TINYINT values were returned as NULL when StarRocks read ORC files using select * from files in clusters deployed on aarch64 machines. #49517
  • Stream Load fails when loading JSON files containing large Integer types. #49927
  • Incorrect schema is returned due to improper handling of invisible characters when users load CSV files with Files(). #49718
  • An issue with temporary partition replacement in tables with multiple partition columns. #49764

Behavior Changes

  • Introduced a new parameter object_storage_rename_file_request_timeout_ms to better accommodate backup scenarios with cloud object storage. This parameter will be used as the backup timeout, with a default value of 30 seconds. #49706
  • to_json, CAST(AS MAP), and STRUCT AS JSON will return NULL instead of throwing an error by default when the conversion fails. You can allow errors by setting the system variable sql_mode to ALLOW_THROW_EXCEPTION. #50157
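
The change to failed conversions (NULL by default, an error only when ALLOW_THROW_EXCEPTION is set) follows a common pattern that is easy to show outside SQL. The Python sketch below is an analogy rather than StarRocks code: the function name and flag are hypothetical, and Python's json module stands in for the conversion.

```python
import json
from typing import Any, Optional


def to_json_text(value: Any, allow_throw_exception: bool = False) -> Optional[str]:
    """Convert a value to JSON text.

    By default a failed conversion yields None (the analogue of NULL).
    With allow_throw_exception=True the original error propagates instead.
    """
    try:
        return json.dumps(value)
    except (TypeError, ValueError):
        if allow_throw_exception:
            raise
        return None
```

A caller that relied on the old behavior to detect bad data would need the strict mode; with the default, failures have to be detected by checking for NULL results.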

3.1.15

04 Sep 09:04
5625961

Release date: September 4, 2024

Bug Fixes

Fixed the following issues:

  • During query rewrite with asynchronous materialized views, count(*) on certain tables returns NULL. #49288
  • partition_live_number does not take effect. #49213
  • FE throws a tablet exception: BE disk offline, and cannot migrate tablets. #47833

3.2.10

23 Aug 06:13
f61f51a

Release date: August 23, 2024

Improvements

  • Files() will automatically convert BYTE_ARRAY data with a logical_type of JSON in Parquet files to the JSON type in StarRocks. #49385
  • Optimized error messages for Files() when Access Key ID and Secret Access Key are missing. #49090
  • information_schema.columns supports the GENERATION_EXPRESSION field. #49734

Bug Fixes

Fixed the following issues:

  • Downgrading a v3.3 shared-data cluster to v3.2 after setting the Primary Key table property "persistent_index_type" = "CLOUD_NATIVE" causes a crash. #48149
  • Exporting data to CSV files using SELECT INTO OUTFILE may cause data inconsistency. #48052
  • Queries encounter failures during concurrent query execution. #48180
  • Queries would hang due to a timeout in the Plan phase without exiting. #48405
  • After disabling index compression for Primary Key tables in older versions and then upgrading to v3.2.9, accessing page_off information causes an array out-of-bounds crash. #48230
  • BE crash caused by concurrent execution of ADD/DROP COLUMN operations. #49355
  • Queries against negative TINYINT values in ORC format files return None on the aarch64 architecture. #49517
  • If the disk write operation fails, failures of l0 snapshots for Primary Key Persistent Index may cause data loss. #48045
  • Partial Update in Column mode for Primary Key tables fails under scenarios with large-volume data updates. #49054
  • BE crash caused by Fast Schema Evolution when downgrading a v3.3.0 shared-data cluster to v3.2.9. #42737
  • partition_live_number does not take effect. #49213
  • The conflict between index persistence and compaction in Primary Key tables could cause clone failures. #49341
  • Modifications of partition_live_number using ALTER TABLE do not take effect. #49437
  • Rewrite of CTE distinct grouping sets generates an invalid plan. #48765
  • RPC failures polluted the thread pool. #49619
  • Authentication failures when loading files from AWS S3 via PIPE. #49837

Behavior Changes

  • Added a check for the meta directory in the FE startup script. If the directory does not exist, it will be automatically created. #48940
  • Added a memory limit parameter load_process_max_memory_hard_limit_ratio for data loading. If memory usage exceeds the limit, subsequent loading tasks will fail. #48495
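
The meta directory check added to the FE startup script amounts to a "create if missing" step. The real check lives in a startup script that is not shown in these notes; the Python sketch below only mirrors the described behavior, and the function name and example path are hypothetical.

```python
from pathlib import Path


def ensure_meta_dir(meta_dir: str) -> bool:
    """Create the meta directory if it does not exist.

    Returns True when the directory had to be created and False when it
    was already present.
    """
    path = Path(meta_dir)
    if path.is_dir():
        return False
    # parents=True also creates missing parent directories.
    path.mkdir(parents=True, exist_ok=True)
    return True
```

Calling `ensure_meta_dir("/path/to/fe/meta")` twice would create the directory on the first call and leave it untouched on the second.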