Reading spilled file rarely fails #7537

@sarutak

Describe the bug

I noticed that the test sort::tests::test_sort_fetch_memory_calculation fails intermittently (approximately once in a hundred runs).

[2023-09-13T05:07:29Z ERROR datafusion::physical_plan::sorts::sort] Failure while reading spill file: NamedTempFile("/tmp/.tmp9wH6tP/.tmptarqTV"). Error: IO error: No such file or directory (os error 2)
test physical_plan::sorts::sort::tests::test_sort_fetch_memory_calculation ... FAILED

failures:

---- physical_plan::sorts::sort::tests::test_sort_fetch_memory_calculation stdout ----
Error: Execution("Spawned Task error: IO error: No such file or directory (os error 2)")


failures:
    physical_plan::sorts::sort::tests::test_sort_fetch_memory_calculation

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 1702 filtered out; finished in 1.04s

This looks like a race condition.

To Reproduce

You can reproduce this with the following script.

for i in {1..500}; do
  if ! cargo test sort::tests::test_sort_fetch_memory &> /tmp/err.out; then
    cat /tmp/err.out
    break
  fi
done

Alternatively, insert a sleep into sort.rs as follows.

diff --git a/datafusion/core/src/physical_plan/sorts/sort.rs b/datafusion/core/src/physical_plan/sorts/sort.rs
index 82badb7d8..3feedfd71 100644
--- a/datafusion/core/src/physical_plan/sorts/sort.rs
+++ b/datafusion/core/src/physical_plan/sorts/sort.rs
@@ -616,6 +616,7 @@ fn read_spill_as_stream(
     let sender = builder.tx();
 
     builder.spawn_blocking(move || {
+        std::thread::sleep(std::time::Duration::from_secs(1));
         let result = read_spill(sender, path.path());
         if let Err(e) = &result {
             error!("Failure while reading spill file: {:?}. Error: {}", path, e);
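
For illustration only, here is a minimal standalone sketch of one way a delayed read can hit "No such file or directory": the directory holding the file is cleaned up while a background reader is still waiting to start. This is an assumption about the general failure mode, not a confirmed diagnosis of the DataFusion code. `ScratchDir` and `delayed_read_after_cleanup` are hypothetical names that use only the standard library and are not part of DataFusion or the tempfile crate.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a spill directory that deletes itself on drop.
struct ScratchDir {
    path: PathBuf,
}

impl ScratchDir {
    fn new(name: &str) -> io::Result<Self> {
        // The process id keeps concurrent runs from sharing a directory.
        let path = std::env::temp_dir().join(format!("{}-{}", name, std::process::id()));
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored in this sketch.
        let _ = fs::remove_dir_all(&self.path);
    }
}

/// Writes a file, starts a reader that sleeps before reading (mirroring the
/// sleep in the diff above), and drops the directory before the read happens.
fn delayed_read_after_cleanup() -> io::Result<Vec<u8>> {
    let dir = ScratchDir::new("spill-race-sketch")?;
    let file = dir.path.join("spill.bin");
    fs::write(&file, b"batch")?;

    let reader = thread::spawn(move || {
        // The delay gives the owner time to clean up first.
        thread::sleep(Duration::from_millis(200));
        fs::read(&file)
    });

    // The owner goes away while the reader is still sleeping.
    drop(dir);

    reader.join().expect("reader thread panicked")
}

fn main() {
    match delayed_read_after_cleanup() {
        Ok(bytes) => println!("read {} bytes", bytes.len()),
        Err(e) => println!("read failed: {e}"),
    }
}
```

In this sketch the read is expected to fail with a not-found error, because the directory is removed well before the reader wakes up. Without the sleep, the outcome would depend on thread scheduling, which would be consistent with a failure that only shows up occasionally.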

Expected behavior

No response

Additional context

No response
