
4 posts tagged with "weekly"


· 4 min read

Databend is a powerful cloud data warehouse. Built for elasticity and efficiency. Free and open. Also available in the cloud: https://app.databend.com .

What's New

Check out what we've done this week to make Databend even better for you.

Features & Improvements ✨

Multiple Catalogs

  • implement show tables (from|in catalog.database) (#9153)

Planner

  • introduce histogram in column statistics (#9310)

Query

  • support attaching stage for insert values (#9249)
  • add native format in fuse table (#9279)
  • add internal_enable_sandbox_tenant config and sandbox_tenant (#9277)

Sqllogictest

  • introduce rust native sqllogictest framework (#9150)

Code Refactoring 🎉

*

  • unify apply_file_format_options for copy & insert (#9323)

IO

  • remove unused code (#9266)

meta

  • test watcher count (#9324)

Planner

  • replace TableContext in planner with PlannerContext (#9290)

Bug Fixes 🔧

Base

  • try fix SIGABRT when catch unwind (#9269)
  • replace #[thread_local] to thread_local macro (#9280)

Query

  • fix unknown database in query without relation to this database (#9250)
  • fix wrong current_role when drop the role (#9276)

What's On In Databend

Stay connected with the latest news about Databend.

Introduced a Rust Native Sqllogictest Framework

Sqllogictest verifies the results returned from a SQL database engine by comparing them with the results of other engines for the same queries.

In the past, Databend ran such tests with a program written in Python and migrated a large number of test cases from other popular databases. We have recently reimplemented the test program with sqllogictest-rs.
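
For reference, a minimal sqllogictest case looks roughly like the sketch below (table name and values are illustrative): each DDL/DML statement is declared with "statement ok", and a query's results are checked against the expected block that follows the ---- separator.

statement ok
CREATE TABLE t(a INT)

statement ok
INSERT INTO t VALUES (1), (2)

query I
SELECT count(*) FROM t
----
2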

Learn More

Experimental: Native Format

PA is a native storage format based on Apache Arrow. Similar to Arrow IPC, PA aims at optimizing the storage layer.

Databend is introducing PA as a native storage format in the hope of getting a performance boost, though it's still at an early stage of development.

create table tmp (a int) ENGINE=FUSE STORAGE_FORMAT='native';

Learn More

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Checking File Existence Before Returning Presigned URL

When presigning a file, Databend currently returns a potentially valid URL derived from the file name without checking whether the file actually exists. As a result, a 404 error may occur if the file doesn't exist at all.
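
As a rough sketch of the current behavior, assuming an internal stage named my_stage and a file name that may or may not have been uploaded:

-- Returns a presigned URL derived from the file name, even if the file was never uploaded
PRESIGN DOWNLOAD @my_stage/ontime_200.csv;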

Issue 8702: Before return presign url add file exist judgement

Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Contributors

Thanks a lot to the contributors for their excellent work this week.

ariesdevil, b41sh, BohuTANG, ClSlaid, drmingdrmer, everpcpc,
leiysky, mergify[bot], PsiACE, sandflee, soyeric128, sundy-li,
Xuanwo, xudong963, youngsofun, zhang2014, ZhiHanZ, zhyass

Connect With Us

We'd love to hear from you. Feel free to run the code and see if Databend works for you. Submit an issue with your problem if you need help.

DatafuseLabs Community is open to everyone who loves data warehouses. Please join the community and share your thoughts.

· 5 min read

Databend is a powerful cloud data warehouse. Built for elasticity and efficiency. Free and open. Also available in the cloud: https://app.databend.com .

What's New

Check out what we've done this week to make Databend even better for you.

Features & Improvements ✨

Multiple Catalogs

  • extends show databases SQL (#9152)

Stage

  • support select from URI (#9247)

Streaming Load

  • support file_format syntax in streaming load insert sql (#9063)

Planner

  • push down limit to union (#9210)

Query

  • use analyze table instead of optimize table statistic (#9143)
  • fast parse insert values (#9214)

Storage

  • use distinct count calculated by the xor hash function (#9159)
  • read_parquet read meta before read data (#9154)
  • push down filter to parquet reader (#9199)
  • prune row groups before reading (#9228)

Open Sharing

  • add prototype open sharing and add sharing stateful tests (#9177)

Code Refactoring 🎉

*

  • simplify the global data registry logic (#9187)

Storage

  • refactor deletion (#8824)

Build/Testing/CI Infra Changes 🔌

  • release databend deb package and databend with hive (#9138, #9241, etc.)

Bug Fixes 🔧

Format

  • support ASCII control code hex as format field delimiter (#9160)

Planner

  • prewhere_column empty and predicate is not const will return empty (#9116)
  • don't push down topk to Merge when its child is Aggregate (#9183)
  • fix nullable column validity not equal (#9220)

Query

  • address unit test hang on test_insert (#9242)

Storage

  • too many io requests for read blocks during compact (#9128)
  • collect orphan snapshots (#9108)

What's On In Databend

Stay connected with the latest news about Databend.

Breaking Change: Unified File Format Options

To simplify things, we're rolling out a unified set of file format options for the COPY INTO command, the Streaming Load API, and all other cases where users need to describe their file formats:

[ FILE_FORMAT = ( TYPE = { CSV | TSV | NDJSON | PARQUET | XML} [ formatTypeOptions ] ) ]
  • Please note that the current format options starting with format_* will be deprecated.
  • ... FORMAT CSV ... will still be accepted by the ClickHouse handler.
  • Support for customized formats created by CREATE FILE FORMAT ... will be added in a future release: ... FILE_FORMAT = (format_name = 'MyCustomCSV') ....
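
As a quick illustration, here is a hedged COPY INTO sketch using the unified options above (the table, stage, and file names are hypothetical, and SKIP_HEADER stands in for one of the formatTypeOptions):

-- Load a CSV file from a stage, skipping its header row
COPY INTO ontime
FROM @my_stage
FILES = ('ontime_200.csv')
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);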

Learn More

Open Sharing

Open Sharing is a simple and secure data-sharing protocol designed for databend-query nodes running in a multi-cloud environment.

  • Simple & Free: Open Sharing is open-source and basically a RESTful API implementation.
  • Secure: Open Sharing verifies incoming requesters' identities and access permissions, and provides an audit log.
  • Multi-Cloud: Open Sharing supports a variety of public cloud platforms, including AWS, Azure, GCP, etc.

Learn More

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

We're about to run stage-related tests again using the Streaming Load API to move files to a stage instead of an AWS command like this:

aws --endpoint-url ${STORAGE_S3_ENDPOINT_URL} s3 cp s3://testbucket/admin/data/ontime_200.csv s3://testbucket/admin/stage/internal/s1/ontime_200.csv >/dev/null 2>&1

This is because Databend users do not need to know, let alone manage, the stage paths that the AWS command requires.

Issue 8528: refactor stage related tests

Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Contributors

Thanks a lot to the contributors for their excellent work this week.

ariesdevil, b41sh, BohuTANG, Chasen-Zhang, ClSlaid, dantengsky,
drmingdrmer, hantmac, lichuang, mergify[bot], PsiACE, RinChanNOWWW,
soyeric128, sundy-li, wubx, Xuanwo, xudong963, youngsofun,
ZhiHanZ, zhyass, zzzdong

Connect With Us

We'd love to hear from you. Feel free to run the code and see if Databend works for you. Submit an issue with your problem if you need help.

DatafuseLabs Community is open to everyone who loves data warehouses. Please join the community and share your thoughts.

· 4 min read

Databend is a powerful cloud data warehouse. Built for elasticity and efficiency. Free and open. Also available in the cloud: https://app.databend.com .

What's New

Check out what we've done this week to make Databend even better for you.

Features & Improvements ✨

Planner

  • optimize topk in cluster mode (#9092)

Query

  • support select * exclude [column_name | (col_name, col_name,...)] (#9009), see the sketch after this list
  • alter table flashback (#8967)
  • new table function read_parquet to read parquet files as a table (#9080)
  • support select * from @stage (#9123)
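
A short sketch of the EXCLUDE syntax listed above (table and column names are hypothetical):

-- Return every column of employees except the two listed ones
SELECT * EXCLUDE (created_at, updated_at) FROM employees;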

Storage

  • cache policy (#9062)
  • support hive nullable partition (#9064)

Code Refactoring 🎉

Memory Tracker

  • keep tracker state consistent (#8973)

REST API

  • drop ctx after query finished (#9091)

Bug Fixes 🔧

Configs

  • add more tests for hive config loading (#9074)

Planner

  • try to fix table name case sensibility (#9055)

Functions

  • vector_const like bug fix (#9082)

Storage

  • update last_snapshot_hint file when purge (#9060)

Cluster

  • try fix broken pipe or connect reset (#9104)

What's On In Databend

Stay connected with the latest news about Databend.

RESTORE TABLE

Databend restores a table to a prior state using the snapshot ID or timestamp you specify in the command: the table returns to the state it was in when that snapshot was created. To retrieve the snapshot IDs and timestamps of a table, use FUSE_SNAPSHOT.

-- Restore with a snapshot ID
ALTER TABLE <table> FLASHBACK TO (SNAPSHOT => '<snapshot-id>');
-- Restore with a snapshot timestamp
ALTER TABLE <table> FLASHBACK TO (TIMESTAMP => '<timestamp>'::TIMESTAMP);
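
For example, snapshot IDs and timestamps can be looked up like this (database and table names are illustrative):

-- List the snapshots recorded for the table
SELECT snapshot_id, timestamp FROM FUSE_SNAPSHOT('default', 'mytable');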

Learn More

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Adding Build Information to Error Report

An error report currently contains only an error code and some information about why the error occurred. Including build information in the report will make troubleshooting easier.

"Code: xx. Error: error msg... (version ...)"

Issue 9117: Add Build Information to the Error Report

Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Contributors

Thanks a lot to the contributors for their excellent work this week.

andylokandy, b41sh, BohuTANG, dantengsky, drmingdrmer, everpcpc,
lichuang, mergify[bot], PsiACE, RinChanNOWWW, sandflee, soyeric128,
sundy-li, TCeason, Xuanwo, xudong963, youngsofun, zhang2014,
ZhiHanZ

Connect With Us

We'd love to hear from you. Feel free to run the code and see if Databend works for you. Submit an issue with your problem if you need help.

DatafuseLabs Community is open to everyone who loves data warehouses. Please join the community and share your thoughts.

· 5 min read

Databend is a powerful cloud data warehouse. Built for elasticity and efficiency. Free and open. Also available in the cloud: https://app.databend.com .

What's New

Check out what we've done this week to make Databend even better for you.

Features & Improvements ✨

Format

  • better checking of format options (#8981)
  • add basic schema infer for parquet (#9043)

Query

  • QualifiedName support 'db.table.' and 'table.' (#8965)
  • support bulk insert without expression (#8966)

Storage

  • add cache layer for fuse engine (#8830)
  • add system table system.memory_statistics (#8945)
  • add optimize statistic ddl support (#8891)

Code Refactoring 🎉

Base

  • remove common macros (#8936)

Format

  • TypeDeserializer get rid of FormatSetting (#8950)

Planner

  • refactor extract or predicate (#8951)

Processors

  • optimize join by merging build data block (#8961)

New Expression

  • allow sparse column id in chunk, redo #8789 with a new approach. (#9008)

Documentation 📔

Bug Fixes 🔧

Base

  • try fix lost tracker (#8932)

Meta

  • fix share db bug, create DatabaseIdToName if need (#9006)

Mysql handler

  • fix mysql conns leak (#8894)

Processors

  • try fix update list memory leak (#9023)

Storage

  • read and write block in parallel when compact (#8921)

What's On In Databend

Stay connected with the latest news about Databend.

Infer Schema at a Glance

You usually need to create a table before loading data from a file stored in a stage or elsewhere. Unfortunately, you might not know the file's schema in advance, or the schema might be too complex to write out by hand.

Introducing the ability to infer the schema from an existing file makes this much easier. You will even be able to query data directly from a stage with a SELECT statement such as select * from @my_stage.

INFER 's3://mybucket/data.csv' FILE_FORMAT = ( TYPE = CSV );
+-------------+---------+----------+
| COLUMN_NAME | TYPE    | NULLABLE |
|-------------+---------+----------|
| CONTINENT   | TEXT    | True     |
| COUNTRY     | VARIANT | True     |
+-------------+---------+----------+

We've added support for inferring the basic schema from parquet files in #9043, and we're now working on #7211 to implement select from @stage.

Learn More

What's Up Next

We're always open to cutting-edge technologies and innovative ideas. You're more than welcome to join the community and bring them to Databend.

Add TLS Support for MySQL Handler

opensrv-mysql v0.3.0, released recently, includes support for TLS. It sounds like a good idea to introduce it to Databend.

let (is_ssl, init_params) = opensrv_mysql::AsyncMysqlIntermediary::init_before_ssl(
    &mut shim,
    &mut r,
    &mut w,
    &Some(tls_config.clone()),
)
.await
.unwrap();

opensrv_mysql::secure_run_with_options(shim, w, ops, tls_config, init_params).await

Issue 8983: Feature: tls support for mysql handler

Please let us know if you're interested in contributing to this issue, or pick up a good first issue at https://link.databend.rs/i-m-feeling-lucky to get started.

Changelog

You can check the changelog of Databend Nightly for details about our latest developments.

Contributors

Thanks a lot to the contributors for their excellent work this week.

andylokandy, ariesdevil, b41sh, BohuTANG, dantengsky, drmingdrmer,
everpcpc, flaneur2020, leiysky, lichuang, mergify[bot], PsiACE,
sandflee, soyeric128, sundy-li, TCeason, TracyZYJ, Xuanwo,
xudong963, youngsofun, yufan022, zhang2014, zhyass

Connect With Us

We'd love to hear from you. Feel free to run the code and see if Databend works for you. Submit an issue with your problem if you need help.

DatafuseLabs Community is open to everyone who loves data warehouses. Please join the community and share your thoughts.