Add IMDB queries (a.k.a. JOB - Join Order Benchmark) to DataFusion benchmark suite

### Is your feature request related to a problem or challenge?

JOB (Join Order Benchmark) was proposed by a research team from TUM in the paper ["How Good Are Query Optimizers, Really?"](https://www.vldb.org/pvldb/vol9/p204-leis.pdf).

It is also used by HyPer, DuckDB, and CedarDB, and it is part of DuckDB's regression test suite. It is a good benchmark for testing join ordering and join operators.

I think if we add this test suite, it will also help with improvements like those discussed in https://github.com/apache/datafusion/issues/7955.

### Describe the solution you'd like

JOB uses the [IMDB datasets](https://developer.imdb.com/non-commercial-datasets/). These datasets are provided in `csv.gz` format and represent real-world data, making them ideal for testing DataFusion.


Tasks:

- [ ] Convert the dataset from `csv.gz` format to `Parquet`.
- [ ] Add the IMDB license to the `LICENSE` file.
- [ ] Add the benchmark queries.
- [ ] Integrate the benchmark suite into `dfbench`. 
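
For the conversion step, here is a rough sketch of what I have in mind. It uses Python with `pyarrow` rather than DataFusion itself, only to illustrate the idea; the real implementation would probably live in the benchmark tooling. The function names and directory layout are made up for this example, and it assumes every file has a header row and uses the default CSV dialect, which may not hold for the actual dataset.

```python
from pathlib import Path


def parquet_path_for(csv_gz: Path, out_dir: Path) -> Path:
    """Map e.g. data/title.csv.gz to out_dir/title.parquet (hypothetical layout)."""
    name = csv_gz.name
    if not name.endswith(".csv.gz"):
        raise ValueError(f"expected a .csv.gz file, got {name}")
    return out_dir / (name[: -len(".csv.gz")] + ".parquet")


def convert_csv_gz_to_parquet(csv_gz: Path, out_dir: Path) -> Path:
    """Read one gzip-compressed CSV file and write it out as Parquet."""
    # Imported lazily so the path helper above works without pyarrow installed.
    import pyarrow.csv as pv
    import pyarrow.parquet as pq

    out_dir.mkdir(parents=True, exist_ok=True)
    target = parquet_path_for(csv_gz, out_dir)
    # pyarrow decompresses based on the ".gz" extension and infers column
    # types; the first row is treated as the header (an assumption here).
    table = pv.read_csv(str(csv_gz))
    pq.write_table(table, str(target))
    return target


def convert_all(src_dir: Path, out_dir: Path) -> list[Path]:
    """Convert every *.csv.gz file in src_dir, in a stable order."""
    return [
        convert_csv_gz_to_parquet(path, out_dir)
        for path in sorted(src_dir.glob("*.csv.gz"))
    ]
```

Calling `convert_all(Path("imdb_csv"), Path("imdb_parquet"))` would then produce one Parquet file per input table. If the files have no header or use a different escape character, `pyarrow.csv.ReadOptions` and `ParseOptions` would need to be passed to `read_csv`.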

Once everything is set up, we will be able to run the benchmark with a command like the following:
```
cargo run --bin dfbench -- imdb --query=5
```


I would like to work on this!
Can someone help me understand the usual process for adding a third-party license in an Apache project?

cc @jayzhan211 @alamb

### Describe alternatives you've considered

_No response_

### Additional context

_No response_
