[Feature] bulk load optimization #53954

Open
3 tasks
luohaha opened this issue Dec 15, 2024 · 0 comments · May be fixed by #53921

Feature request

Is your feature request related to a problem? Please describe.

During large data ingestions, memory limits in the ingestion process can leave the final Rowset with a large number of segment files. This degrades query performance and increases CPU and memory consumption during subsequent compaction.

Describe the solution you'd like

To address this, we propose a new strategy for large data ingestion: spill data while the ingestion is running, then assemble the spilled data into the final segment files in the last stage of the ingestion. This reduces the number of segment files in the Rowset.
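
To make the idea concrete, here is a minimal sketch of a spill-then-merge load in Python. It is not the StarRocks implementation. The row shape, the function names (`bulk_load`, `_spill`, `_read_spill`), the row-count limits standing in for a memory budget, and the use of sorted runs with a k-way merge are all assumptions made for illustration. In-memory lists stand in for the final segment files.

```python
import heapq
import os
import pickle
import tempfile
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, str]  # (sort key, payload); hypothetical row shape


def _spill(rows: List[Row], spill_dir: str) -> str:
    """Sort one in-memory batch and write it to a temporary spill file."""
    rows.sort()
    fd, path = tempfile.mkstemp(dir=spill_dir, suffix=".spill")
    with os.fdopen(fd, "wb") as f:
        for row in rows:
            pickle.dump(row, f)
    return path


def _read_spill(path: str) -> Iterator[Row]:
    """Stream rows back from one spill file, one row at a time."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def bulk_load(
    rows: Iterable[Row],
    max_buffered_rows: int,   # assumed stand-in for the memory limit
    rows_per_segment: int,    # assumed target size of a final segment
    spill_dir: str,
) -> List[List[Row]]:
    """Spill sorted runs during ingestion, then merge them into few segments."""
    spill_paths: List[str] = []
    buffer: List[Row] = []

    # Ingestion stage: buffer rows and spill whenever the budget is reached.
    for row in rows:
        buffer.append(row)
        if len(buffer) >= max_buffered_rows:
            spill_paths.append(_spill(buffer, spill_dir))
            buffer = []
    if buffer:
        spill_paths.append(_spill(buffer, spill_dir))

    # Final stage: k-way merge of all spilled runs into full-size segments.
    segments: List[List[Row]] = []
    current: List[Row] = []
    for row in heapq.merge(*(_read_spill(p) for p in spill_paths)):
        current.append(row)
        if len(current) >= rows_per_segment:
            segments.append(current)
            current = []
    if current:
        segments.append(current)

    # Spill files are temporary and are dropped once the segments exist.
    for p in spill_paths:
        os.remove(p)
    return segments
```

In this sketch, loading 1,000 rows with `max_buffered_rows=50` and `rows_per_segment=400` produces 20 spill files but only 3 final segments, whereas writing one segment per full buffer would produce 20.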

