Is your feature request related to a problem? Please describe.
Today, PyDeequ is a PySpark binding for Deequ, which is written in Scala and runs only on Spark. While that is a good fit for data engineers (DEs), Spark is not a great fit for many data science (DS) use cases, where the datasets fit in memory and users do not want to set up Spark.
See initial ideas here https://youtu.be/fvKFOfaLwBA?t=1393 from @sscdotopen.
Describe the solution you'd like
As discussed in the video above, it would be good to create the proper abstractions to support another analytic engine. DuckDB, which has gained popularity recently, could be one such engine. The design needs more thought and discussion.
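To make the idea of an engine abstraction more concrete, here is a rough sketch of what it might look like. Everything in it is an assumption for discussion: `AnalyticEngine`, `SqlEngine`, and the two metrics are hypothetical names, not part of PyDeequ or Deequ today. The standard-library `sqlite3` module stands in for an in-process SQL engine so the sketch runs without extra dependencies; a DuckDB connection would presumably be wired in the same way, but that is untested here.

```python
import sqlite3
from abc import ABC, abstractmethod


class AnalyticEngine(ABC):
    """Hypothetical interface each backend (Spark, DuckDB, ...) would implement."""

    @abstractmethod
    def row_count(self, table: str) -> int:
        """Number of rows in `table`."""

    @abstractmethod
    def completeness(self, table: str, column: str) -> float:
        """Fraction of rows in `table` where `column` is not NULL."""


class SqlEngine(AnalyticEngine):
    """Hypothetical backend that computes metrics with plain SQL.

    It only assumes a connection object with `execute(...).fetchone()`.
    Identifiers are interpolated directly, which is fine for a sketch
    but would need validation in real code.
    """

    def __init__(self, connection):
        self._conn = connection

    def _scalar(self, query: str):
        # Run a query that returns a single value and unwrap it.
        return self._conn.execute(query).fetchone()[0]

    def row_count(self, table: str) -> int:
        return self._scalar(f"SELECT COUNT(*) FROM {table}")

    def completeness(self, table: str, column: str) -> float:
        total = self.row_count(table)
        if total == 0:
            return 1.0  # assumption: an empty table counts as fully complete
        # COUNT(column) skips NULLs, COUNT(*) does not.
        non_null = self._scalar(f"SELECT COUNT({column}) FROM {table}")
        return non_null / total


# Small in-memory dataset: one of the four ratings is missing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (id INTEGER, rating REAL)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [(1, 4.5), (2, None), (3, 3.0), (4, 5.0)],
)

engine = SqlEngine(conn)
print(engine.row_count("reviews"))
print(engine.completeness("reviews", "rating"))
```

With an interface along these lines, checks and analyzers could be written once against `AnalyticEngine`, and the Spark-backed implementation would become one backend among several.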