Reorganize the project folders #2081
yjshen merged 3 commits into apache:master from yahoNanJing:issue-2080
This module contains an `async` API for the [DataFusion][df] to access data, either remotely or locally.

[df]: https://crates.io/crates/datafusion
This module contains an `async` API for accessing data based on object store interfaces, either remotely or locally.
(No newline at end of file)
If I understand the intent of this correctly, I think the data access layer could expose data access methods other than just object store (although they may not exist yet), for example a streaming provider or a database provider. If that is correct, I think we would just want to make the description more generic.
Agree. Although we currently only support object store, we can make the description not tied to the object store.
I think it's very clean and organized, and I do prefer this proposed structure. But I'm curious why an intermediate rust directory is needed. Do you have an idea or plans for code that would fall outside of it?
Do we need to align with #1750?
"datafusion/rust/jit",
"datafusion/rust/physical-expr",
"datafusion/rust/cli",
"datafusion/rust/proto",
The rust directory here seems a bit redundant to me. Do you expect us to have other language implementations here?
[package]
name = "datafusion-storage"
description = "Storage for DataFusion query engine"
name = "data-access"
data-access feels a bit too generic to me. To be more specific, how about we name this crate and the folder object-store? The object-store crate name hasn't been claimed on crates.io yet.
I had the idea that other data access providers / crates could exist within this, for example a stream provider or a database provider, and that's why the name was generic.
As @matthewmturner mentioned, object-store may be too specific and could limit future extensions of the data access abilities.
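To make the "generic data access layer" argument concrete, here is a minimal sketch of what such a layer could look like. Everything in it is hypothetical: the `DataAccess` trait, `InMemoryObjectStore`, `StaticStream`, and `describe` are illustrative names, not types from this PR or from DataFusion. It is also synchronous and std-only to stay self-contained, whereas the module under review exposes an `async` API.

```rust
use std::collections::HashMap;
use std::io;

/// Hypothetical provider-agnostic interface (not DataFusion's actual API).
/// An object store is just one possible implementation.
trait DataAccess {
    /// Short label for the kind of provider.
    fn kind(&self) -> &'static str;
    /// Read all bytes available at `location`.
    fn read(&self, location: &str) -> io::Result<Vec<u8>>;
}

/// Toy object store backed by an in-memory map (illustrative only).
struct InMemoryObjectStore {
    objects: HashMap<String, Vec<u8>>,
}

impl DataAccess for InMemoryObjectStore {
    fn kind(&self) -> &'static str {
        "object-store"
    }

    fn read(&self, location: &str) -> io::Result<Vec<u8>> {
        self.objects.get(location).cloned().ok_or_else(|| {
            io::Error::new(
                io::ErrorKind::NotFound,
                format!("no object at {}", location),
            )
        })
    }
}

/// Toy stream provider that replays fixed chunks (illustrative only).
struct StaticStream {
    chunks: Vec<Vec<u8>>,
}

impl DataAccess for StaticStream {
    fn kind(&self) -> &'static str {
        "stream"
    }

    fn read(&self, _location: &str) -> io::Result<Vec<u8>> {
        // A stream has no object path; concatenate whatever was buffered.
        Ok(self.chunks.concat())
    }
}

/// Callers depend only on the trait, so new providers need no caller changes.
fn describe(providers: &[Box<dyn DataAccess>]) -> Vec<&'static str> {
    providers.iter().map(|p| p.kind()).collect()
}
```

Under this assumption, naming the crate after one provider (object-store) would describe only one of the implementations, which is the concern raised above.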
name = "datafusion-storage"
description = "Storage for DataFusion query engine"
name = "data-access"
description = "General data access layer based on object store"
👍 This is certainly a more accurate description :D
Thanks @matthewmturner, @houqp, I'm good with removing the rust folder. Previously I was thinking there might be something DataFusion-specific, like Ballista did.
I think we need to be careful if we make a crate with a name like
Agree. Will change "data-access" to "datafusion-data-access".
Rebased and added refinements according to the PR review comments.
houqp left a comment
LGTM, the next datafusion release is going to be fun since we will have a lot of versions to update :D
Hi @alamb, any chance to merge this PR this week? 😂
@yahoNanJing sounds good to me! I haven't been following it closely, but given that @yjshen and @houqp have given it ✅, I'll plan to merge it in when the tests pass. Thanks for sticking with it.
Interestingly, the failure in ballista also fails for me locally: https://github.com/apache/arrow-datafusion/runs/5726864404?check_suite_focus=true Maybe something about moving the code into different directories has caused the test to start running. Fixed in 463de7f.
Thanks @alamb for fixing the issue.
Thanks again @yahoNanJing and everyone else who helped!
Which issue does this PR close?
Closes #2080.
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?