Releases: StarRocks/starrocks
3.2.13
Release date: December 13, 2024
Improvements
- Supports setting a time range within which Base Compaction is forbidden for a specific table. #50120
Bug Fixes
Fixed the following issues:
- The loadRowsRate field returned 0 after executing SHOW ROUTINE LOAD. #52151
- The Files() function read columns that were not queried. #52210
- Prometheus failed to parse materialized view metrics with special characters in their names. (Now materialized view metrics support tags.) #52782
- The array_map function caused BE to crash. #52909
- Metadata Cache issues caused BE to crash. #52968
- Routine Load tasks were canceled due to expired transactions. (Now tasks are canceled only if the database or table no longer exists). #50334
- Stream Load failures when submitted using HTTP 1.0. #53010 #53008
- Issues related to Glue and S3 integration: #48433
  - Some error messages did not display the root cause.
  - Error messages for writing to a Hive partitioned table with the partition column of type STRING when Glue was used as the metadata service.
  - Dropping Hive tables failed without proper error messages when the user lacked sufficient permissions.
- The storage_cooldown_time property for materialized views did not take effect when set to maximum. #52079
3.3.7
Release date: November 29, 2024
New Features
- Added a new Materialized View parameter, excluded_refresh_tables, to exclude tables that need to be refreshed. #50926
Improvements
- Rewrote unnest(bitmap_to_array) as unnest_bitmap to improve performance. #52870
- Reduced the write and delete operations of Txn logs. #42542
Bug Fixes
Fixed the following issues:
- Failure to connect Power BI to external tables. #52977
- Misleading FE Thrift RPC failure messages in logs. #52706
- Routine Load tasks were canceled due to expired transactions (now tasks are canceled only if the database or table no longer exists). #50334
- Stream Load failures when submitted using HTTP 1.0. #53010 #53008
- Integer overflow of partition IDs. #52965
- Hive Text Reader failed to recognize the last empty element. #52990
- Issues caused by array_map in Join conditions. #52911
- Metadata cache issues under high concurrency scenarios. #52968
- The whole materialized view was refreshed when a partition was dropped from the base table. #52740
3.3.6
Release date: November 18, 2024
Improvements
- Optimized internal repair logic for Primary Key tables. #52707
- Optimized the internal implementation of histograms of statistics. #52400
- Supports adjusting log level via the FE configuration item sys_log_warn_modules to reduce Hudi Catalog logging. #52709
- Supports constant folding in the yearweek function. #52714
- Avoided push-down for Lambda functions. #52655
- Divided the Query Error metric into three: Internal Error Rate, Analysis Error Rate, and Timeout Rate. #52646
- Avoided constant expressions being extracted as common expressions within array_map. #52541
- Optimized the Text-based Rewrite of materialized views. #52498
Bug Fixes
Fixed the following issues:
- The unique_constraints and foreign_constraints parameters were incomplete in SHOW CREATE TABLE for cloud-native tables in shared-data clusters. #52804
- Some materialized views were activated even when enable_mv_automatic_active_check was set to false. #52799
- Memory usage did not decrease after stale memory flush. #52613
- Resource leak caused by Hudi file-system views. #52738
- Concurrent Publish and Update operations on Primary Key tables may cause issues. #52687
- Failures to terminate queries on clients. #52185
- Multi-column List partitions cannot be pushed down. #51036
- Incorrect result due to the lack of hasnull property in ORC files. #52555
- An issue caused by using uppercase column names in ORDER BY during table creation. #52513
- An error was returned after running ALTER TABLE PARTITION (*) SET ("storage_cooldown_ttl" = "xxx"). #52482
Behavior Changes
- In earlier versions, scale-in operations would fail if there were insufficient replicas for views in the _statistics_ database. Starting from v3.3.6, if nodes are scaled in to 3 or more, view replicas are set to 3; if there is only 1 node after the scale-in, view replicas are set to 1, allowing for successful scale-in. #51799
  Affected views include:
  - column_statistics
  - histogram_statistics
  - table_statistic_v1
  - external_column_statistics
  - external_histogram_statistics
  - pipe_file_list
  - loads_history
  - task_run_history
- New Primary Key tables no longer allow __op as a column name, even if allow_system_reserved_names is set to true. Existing tables are unaffected. #52621
- Expression-partitioned tables cannot have partition names modified. #52557
- Deprecated FE parameters heartbeat_mgr_blocking_queue_size and profile_process_threads_num. #52236
- Enabled persistent index on object storage by default for Primary Key tables in shared-data clusters. #52209
- Disallowed manual changes to bucketing methods for tables with the random bucketing method. #52120
- Backup and Restore-related parameter changes: #52111
  - make_snapshot_worker_count supports dynamic configuration.
  - release_snapshot_worker_count supports dynamic configuration.
  - upload_worker_count supports dynamic configuration. Its default value is changed from 1 to the number of CPU cores on the machine where the BE resides.
  - download_worker_count supports dynamic configuration. Its default value is changed from 1 to the number of CPU cores on the machine where the BE resides.
- The return type of SELECT @@autocommit has changed from BOOLEAN to BIGINT. #51946
- Added a new FE configuration item, max_bucket_number_per_partition, to control the maximum number of buckets per partition. #47852
- Enabled memory usage checks by default for Primary Key tables. #52393
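The scale-in rule for views in the _statistics_ database (#51799) can be read as a small decision function. The following is only an illustrative sketch, not StarRocks code: the function name statistics_view_replicas is hypothetical, and because the note does not say what happens when exactly 2 nodes remain, the sketch returns None for that case rather than guessing.

```python
from typing import Optional


def statistics_view_replicas(nodes_after_scale_in: int) -> Optional[int]:
    """Replica count for _statistics_ views after a scale-in (hypothetical helper)."""
    if nodes_after_scale_in < 1:
        raise ValueError("a cluster needs at least one node")
    if nodes_after_scale_in >= 3:
        # Scaled in to 3 or more nodes: view replicas are set to 3.
        return 3
    if nodes_after_scale_in == 1:
        # Only one node left: view replicas are set to 1 so the scale-in can succeed.
        return 1
    # Two nodes: the release note does not state the outcome, so no value is assumed.
    return None


for nodes in (5, 3, 2, 1):
    print(nodes, statistics_view_replicas(nodes))
```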
3.2.12
Release date: October 23, 2024
Improvements
- Optimized memory allocation and statistics in BE for certain complex query scenarios to avoid OOM. #51382
- Optimized memory usage in FE in Schema Change scenarios. #48569
- Optimized the job status display when querying the system-defined view information_schema.routine_load_jobs from Follower FE nodes. #51763
- Supports Backup and Restore of List partitioned tables. #51993
Bug Fixes
Fixed the following issues:
- The error message was lost after writing to Hive failed. #33167
- The array_map function causes a crash when excessive constant parameters are used. #51244
- Special characters in the PARTITION BY columns of expression partitioned tables cause FE CheckPoint failures. #51677
- Accessing the system-defined view information_schema.fe_locks causes a crash. #51742
- Querying generated columns causes an error. #51755
- Optimize Table fails when the table name contains special characters. #51755
- Tablets could not be balanced in certain scenarios. #51828
Behavior Changes
- Supports dynamic modification of Backup and Restore-related parameters. #52111
3.3.5
Release date: October 23, 2024
New Features
- Supports millisecond and microsecond precision in the DATETIME type.
- Resource groups support CPU hard isolation.
Improvements
- Optimized performance and extraction strategy for Flat JSON. #50696
- Reduced memory usage for the following ARRAY functions:
- Optimized error messages when loading Null values into List partition keys with the Not Null attribute. #51086
- Optimized error messages for Files() when authentication fails. #51697
- Optimized internal statistics for INSERT OVERWRITE. #50417
- Shared-data clusters support garbage collection (GC) for persistent index files. #51684
- Added FE logs to help diagnose FE out-of-memory (OOM) issues. #51528
- Supports recovering metadata from the metadata directory of FE. #51040
Bug Fixes
Fixed the following issues:
- A deadlock issue caused by PIPE exceptions. #50841
- Dynamic partition creation failures block subsequent partition creation. #51440
- An error is returned for UNION ALL queries with ORDER BY. #51647
- CTE in UPDATE statements causes hints to be ignored. #51458
- The load_finish_time field in the system-defined view statistics.loads_history does not update as expected after a loading task is completed. #51174
- UDTF mishandles multibyte UTF-8 characters. #51232
Behavior Changes
- Modified the return content of the EXPLAIN statement. After the change, the return content is equivalent to EXPLAIN COST. You can configure the level of details returned by EXPLAIN using the dynamic FE parameter query_detail_explain_level. The default value is COSTS, with other valid values being NORMAL and VERBOSE. #51439
3.3.4
Release date: September 30, 2024
New Features
- Supports creating asynchronous materialized views on List Partition tables. #46680 #46808
- List Partition tables now support Nullable partition columns. #47797
- Supports viewing external file schema information using DESC FILES(). #50527
- Supports viewing replication task metrics via SHOW PROC '/replications'. #50483
Improvements
- Optimized data recycling performance for TRUNCATE TABLE in shared-data clusters. #49975
- Supports intermediate result spilling for CTE operators. #47982
- Supports adaptive phased scheduling to alleviate OOM issues caused by complex queries. #47868
- Supports predicate pushdown for STRING-type date or datetime columns in specific scenarios. #50643
- Supports COUNT DISTINCT computation on constant semi-structured data. #48273
- Added a new FE parameter lake_enable_balance_tablets_between_workers to enable tablet balancing for tables in shared-data clusters. #50843
- Enhanced query rewrite capabilities for generated columns. #50398
- Partial Update now supports automatically populating columns with default values of CURRENT_TIMESTAMP. #50287
Bug Fixes
Fixed the following issues:
- The error "version has been compacted" caused by an infinite loop on the FE side during Tablet Clone. #50561
- ISO-formatted DATETIME types cannot be pushed down. #49358
- In concurrent scenarios, data still existed after the tablet was deleted. #50382
- Incorrect results returned by the yearweek function. #51065
- An issue with low cardinality dictionaries in ARRAY during CTE queries. #51148
- After FE restarts, partition TTL-related parameters were lost for materialized views. #51028
- Data loss in columns defined with CURRENT_TIMESTAMP after upgrading. #50911
- A stack overflow caused by the array_distinct function. #51017
- Activation failures for materialized views after upgrading due to changes in default field lengths. You can avoid such issues by setting enable_active_materialized_view_schema_strict_check to false. #50869
- Resource group property cpu_weight can be set to a negative value. #51005
- Incorrect statistics for disk capacity information. #50669
- Constant folding issues in the replace function. #50828
Behavior Changes
- Changed the default replica number for external catalog-based materialized views from 1 to the value of the FE parameter default_replication_num (Default value: 3). #50931
3.2.11
Release date: September 9, 2024
Improvements
- Supports masking authentication information for Files() and PIPE. #47629
- Supports automatic inference for the STRUCT type when reading Parquet files through Files(). #50481
Bug Fixes
Fixed the following issues:
- An error is returned for equi-join queries because they failed to be rewritten by the global dictionary. #50690
- The error "version has been compacted" caused by an infinite loop on the FE side during Tablet Clone. #50561
- Incorrect scheduling for unhealthy replica repairs after distributing data based on labels. #50331
- An error in the statistics collection log: "Unknown column '%s' in '%s." #50785
- Incorrect timezone usage when reading complex types like TIMESTAMP from Parquet files via Files(). #50448
Behavior Changes
- When downgrading StarRocks from v3.3.x to v3.2.11, the system ignores any incompatible metadata. #49636
3.3.3
Release date: September 5, 2024
New Features
- Supports user-level variables. #48477
- Supports Delta Lake Catalog metadata cache with manual and periodic refresh strategies. #46526 #49069
- Supports loading JSON types from Parquet files. #49385
- JDBC SQL Server Catalog supports queries with LIMIT. #48248
- Shared-data clusters support Partial Updates with INSERT INTO. #49336
Improvements
- Optimized error messages for loading:
  - When memory limits are reached during loading, the IP of the corresponding BE node is returned for easier troubleshooting. #49335
  - Detailed messages are provided when CSV data is loaded to target table columns that are not long enough. #49713
  - Specific node information is provided when Kerberos authentication fails in Broker Load. #46085
- Optimized the partitioning mechanism during data loading to reduce memory usage in the initial stage. #47976
- Optimized memory usage for shared-nothing clusters by limiting metadata memory usage to avoid issues when there are too many Tablets or Segment files. #49170
- Optimized the performance of queries using max(partition_column). #49391
- Partition pruning is used to optimize query performance when the partition column is a generated column (a column that is calculated based on a native column in the table), and the query predicate filter condition includes the native column. #48692
- Supports masking authentication information for Files() and PIPE. #47629
- Introduced a new statement show proc '/global_current_queries' to view queries running on all FE nodes. show proc '/current_queries' only shows queries running on the current FE node. #49826
Bug Fixes
Fixed the following issues:
- The source cluster's BE nodes were mistakenly added to the current cluster when exporting data to the destination cluster via StarRocks external tables. #49323
- TINYINT data type returned NULL when StarRocks reads ORC files using select * from files from clusters deployed on aarch64 machines. #49517
- Stream Load fails when loading JSON files containing large Integer types. #49927
- Incorrect schema is returned due to improper handling of invisible characters when users load CSV files with Files(). #49718
- An issue with temporary partition replacement in tables with multiple partition columns. #49764
Behavior Changes
- Introduced a new parameter object_storage_rename_file_request_timeout_ms to better accommodate backup scenarios with cloud object storage. This parameter will be used as the backup timeout, with a default value of 30 seconds. #49706
- to_json, CAST(AS MAP), and STRUCT AS JSON will return NULL instead of throwing an error by default when the conversion fails. You can allow errors by setting the system variable sql_mode to ALLOW_THROW_EXCEPTION. #50157
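The conversion behavior change in #50157 (return NULL by default, raise only when errors are explicitly allowed) follows a common pattern that can be sketched outside StarRocks. The Python below is an analogy only, not the engine's implementation: the function to_json_or_null and its allow_throw_exception flag are hypothetical stand-ins for to_json and the ALLOW_THROW_EXCEPTION setting of sql_mode, and Python's None plays the role of NULL.

```python
import json
from typing import Any, Optional


def to_json_or_null(value: Any, allow_throw_exception: bool = False) -> Optional[str]:
    """Convert a value to a JSON string; on failure return None unless errors are allowed.

    Hypothetical helper mirroring the documented default: failed conversions
    yield NULL (None here) instead of raising.
    """
    try:
        return json.dumps(value)
    except (TypeError, ValueError):
        if allow_throw_exception:
            # Analogous to setting sql_mode to ALLOW_THROW_EXCEPTION: surface the error.
            raise
        # Default behavior: swallow the conversion failure and return NULL.
        return None


print(to_json_or_null({"a": 1}))   # a dict converts successfully
print(to_json_or_null({1, 2}))     # a set is not JSON-serializable, so None is returned
```

With allow_throw_exception=True, the second call would raise TypeError instead of returning None.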
3.1.15
Release date: September 4, 2024
Bug Fixes
Fixed the following issues:
3.2.10
Release date: August 23, 2024
Improvements
- Files() will automatically convert BYTE_ARRAY data with a logical_type of JSON in Parquet files to the JSON type in StarRocks. #49385
- Optimized error messages for Files() when Access Key ID and Secret Access Key are missing. #49090
- information_schema.columns supports the GENERATION_EXPRESSION field. #49734
Bug Fixes
Fixed the following issues:
- Downgrading a v3.3 shared-data cluster to v3.2 after setting the Primary Key table property "persistent_index_type" = "CLOUD_NATIVE" causes a crash. #48149
- Exporting data to CSV files using SELECT INTO OUTFILE may cause data inconsistency. #48052
- Queries encounter failures during concurrent query execution. #48180
- Queries would hang due to a timeout in the Plan phase without exiting. #48405
- After disabling index compression for Primary Key tables in older versions and then upgrading to v3.2.9, accessing page_off information causes an array out-of-bounds crash. #48230
- BE crash caused by concurrent execution of ADD/DROP COLUMN operations. #49355
- Queries against negative TINYINT values in ORC format files return None on the aarch64 architecture. #49517
- If the disk write operation fails, failures of l0 snapshots for Primary Key Persistent Index may cause data loss. #48045
- Partial Update in Column mode for Primary Key tables fails under scenarios with large-volume data updates. #49054
- BE crash caused by Fast Schema Evolution when downgrading a v3.3.0 shared-data cluster to v3.2.9. #42737
- partition_live_number does not take effect. #49213
- The conflict between index persistence and compaction in Primary Key tables could cause clone failures. #49341
- Modifications of partition_live_number using ALTER TABLE do not take effect. #49437
- Rewrite of CTE distinct grouping sets generates an invalid plan. #48765
- RPC failures polluted the thread pool. #49619
- Authentication failure issues when loading files from AWS S3 via PIPE. #49837
Behavior Changes
- Added a check for the meta directory in the FE startup script. If the directory does not exist, it will be automatically created. #48940
- Added a memory limit parameter load_process_max_memory_hard_limit_ratio for data loading. If memory usage exceeds the limit, subsequent loading tasks will fail. #48495