spec: create table #55
Conversation
    location:
      type: string
    Schema:
I think we might be able to use this to share the same library and achieve everything we have discussed.
This is basically aligned with the model described in the Lance format.
To use it properly, a conversion is needed at the client side (e.g. in Spark, converting a Spark StructType schema into a schema of this shape), and another at the server side to convert this schema into the Arrow schema shape (the latter could be provided by this repository, e.g. as lance-catalog-apache-client-utils).
Then there is enough information to create the dataset with something like:

import com.lancedb.lance.catalog.client.apache.utils.ArrowUtils;

Dataset.create(allocator, location, ArrowUtils.toArrowSchema(restSchema), params)
    .close();
Thoughts? @yanghua
You mean using a string to represent the type, right? Then the adapter parses it and maps it to the Arrow type?
yes
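To make the "string type plus adapter" idea concrete, here is a minimal, dependency-free sketch of the parsing step. The class name, the supported type strings, and the returned type names are all assumptions for illustration; a real adapter in this repository would return org.apache.arrow.vector.types.pojo.ArrowType instances instead of strings.

```java
import java.util.Map;

/**
 * Hypothetical sketch of the client/server type adapter discussed above:
 * parse a REST-schema type string and map it to an Arrow type.
 * Names and mappings here are illustrative, not the actual spec.
 */
public class TypeAdapter {
    // Assumed set of simple type strings; a real adapter would also handle
    // parameterized types such as decimal(p, s) and nested struct/list types.
    private static final Map<String, String> SIMPLE_TYPES = Map.of(
            "boolean", "Bool",
            "int", "Int(32, signed)",
            "long", "Int(64, signed)",
            "float", "FloatingPoint(SINGLE)",
            "double", "FloatingPoint(DOUBLE)",
            "string", "Utf8",
            "binary", "Binary");

    /** Maps a REST type string to an Arrow type name, rejecting unknown types. */
    public static String toArrowTypeName(String restType) {
        String mapped = SIMPLE_TYPES.get(restType.trim().toLowerCase());
        if (mapped == null) {
            throw new IllegalArgumentException("Unsupported type string: " + restType);
        }
        return mapped;
    }

    public static void main(String[] args) {
        System.out.println(toArrowTypeName("string")); // Utf8
    }
}
```

The same table-driven mapping can be run in reverse on the client side (Arrow or Spark type to string), which keeps the two conversions symmetric.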
    additionalProperties:
      type: string
    WriterVersion:
This is basically the guardrail that lets the server know whether there is any issue with creating a table for the user's environment. For example, if the user's library is too far out of date, the server should reject the table creation request rather than create a table that the client cannot consume.
You mean the SDK version?
I think it's more of the format version https://lancedb.github.io/lance/format.html#file-version
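The guardrail described above can be sketched as a simple version comparison on the server side. This is an assumption about how the check might work: the class name, the major.minor version-string format, and the comparison rule are illustrative, not taken from the Lance spec.

```java
/**
 * Hypothetical sketch of the server-side writer-version guardrail:
 * reject a create-table request if the client's writer (format) version
 * is older than the minimum version the server supports.
 * The "major.minor" scheme and the comparison rule are assumptions.
 */
public class WriterVersionGuard {
    /** Returns true if clientVersion >= minVersion under major.minor ordering. */
    public static boolean isSupported(String clientVersion, String minVersion) {
        int[] c = parse(clientVersion);
        int[] m = parse(minVersion);
        return c[0] > m[0] || (c[0] == m[0] && c[1] >= m[1]);
    }

    private static int[] parse(String version) {
        String[] parts = version.split("\\.");
        return new int[] {Integer.parseInt(parts[0]), Integer.parseInt(parts[1])};
    }

    public static void main(String[] args) {
        // An up-to-date writer passes; an outdated one is rejected.
        System.out.println(isSupported("2.1", "2.0")); // true
        System.out.println(isSupported("0.3", "2.0")); // false
    }
}
```

On rejection, the server would return an error to the client instead of creating a table the client cannot read back.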
      $ref: '#/components/schemas/Schema'
    writerVersion:
      $ref: '#/components/schemas/WriterVersion'
    config:
We might not be able to pass anything here yet with the current shape of the Dataset API. But I remember we still had a discussion about storage options last time. This is different from that, since these values will be persisted in the Lance table metadata.
Do you still think it is necessary to pass storage options as part of the request? @yanghua My understanding is that the server should decide the storage options based on the location. Could you provide a specific use case where a storage option needs to be passed in?
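For reference, a create-table request body under the spec fragments quoted in this thread might look like the YAML below. Only `location`, `schema`, `writerVersion`, and `config` appear in the fragments; the inner shape of `schema` (a `fields` list with string `type` values) and all concrete values are assumptions for illustration.

```yaml
# Illustrative request body; field nesting and values are assumed, not from the spec.
location: s3://bucket/path/to/table
schema:
  fields:
    - name: id
      type: long
    - name: name
      type: string
writerVersion: "2.0"
config:            # free-form string map (additionalProperties: type: string)
  some.key: some-value
```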
> My understanding is that the server should decide the storage options used based on the location.

Our commercial version also uses this mode.
My original thought was that maybe we could open an entry point for users: they can pass their AK/SK or not, and that is up to them.
OK, let's ignore it for now.
Passing the SK is definitely a no-go, I think, because the server could easily keep the SK and use it outside the duration of the request, which is a huge security red flag. We probably need to solve that when implementing AuthN for the client and server.
> passing sk is definitely a no go I think, because the server can easily just keep the sk to use outside the duration of the request and it is a huge security red flag

In our case, that depends on how we define the client. A client may always be on a cloud vendor's intranet.
Or, in some cases, the customer may give us this information so we can handle emergency situations.
Keeping it doesn't necessarily mean using it or encouraging it.
Anyway, let's ignore it. For now, we close this entry point for passing storage options.
Based on the discussion on #5.