[feat](skew & kurt) New aggregate function skew & kurt #40945 #41277
run buildall |
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website. |
| #pragma once | ||
|
|
||
| #include <stddef.h> |
There was a problem hiding this comment.
warning: inclusion of deprecated C++ header 'stddef.h'; consider using 'cstddef' instead [modernize-deprecated-headers]
| #include <stddef.h> | |
| #include <cstddef> |
| ++m[0]; | ||
| m[1] += x; | ||
| m[2] += x * x; | ||
| if constexpr (_level >= 3) m[3] += x * x * x; |
There was a problem hiding this comment.
warning: statement should be inside braces [readability-braces-around-statements]
| if constexpr (_level >= 3) m[3] += x * x * x; | |
| if constexpr (_level >= 3) { m[3] += x * x * x; | |
| } |
| m[1] += x; | ||
| m[2] += x * x; | ||
| if constexpr (_level >= 3) m[3] += x * x * x; | ||
| if constexpr (_level >= 4) m[4] += x * x * x * x; |
There was a problem hiding this comment.
warning: statement should be inside braces [readability-braces-around-statements]
| if constexpr (_level >= 4) m[4] += x * x * x * x; | |
| if constexpr (_level >= 4) { m[4] += x * x * x * x; | |
| } |
| m[0] += rhs.m[0]; | ||
| m[1] += rhs.m[1]; | ||
| m[2] += rhs.m[2]; | ||
| if constexpr (_level >= 3) m[3] += rhs.m[3]; |
There was a problem hiding this comment.
warning: statement should be inside braces [readability-braces-around-statements]
| if constexpr (_level >= 3) m[3] += rhs.m[3]; | |
| if constexpr (_level >= 3) { m[3] += rhs.m[3]; | |
| } |
| m[1] += rhs.m[1]; | ||
| m[2] += rhs.m[2]; | ||
| if constexpr (_level >= 3) m[3] += rhs.m[3]; | ||
| if constexpr (_level >= 4) m[4] += rhs.m[4]; |
There was a problem hiding this comment.
warning: statement should be inside braces [readability-braces-around-statements]
| if constexpr (_level >= 4) m[4] += rhs.m[4]; | |
| if constexpr (_level >= 4) { m[4] += rhs.m[4]; | |
| } |
| ErrorCode::INTERNAL_ERROR, | ||
| "Variation moments should be obtained by 'get_population' method"); | ||
| } else { | ||
| if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN(); |
There was a problem hiding this comment.
warning: statement should be inside braces [readability-braces-around-statements]
| if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN(); | |
| if (m[0] == 0) { return std::numeric_limits<T>::quiet_NaN(); | |
| } |
| } else { | ||
| if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN(); | ||
| // to avoid accuracy problem | ||
| if (m[0] == 1) return 0; |
There was a problem hiding this comment.
warning: statement should be inside braces [readability-braces-around-statements]
| if (m[0] == 1) return 0; | |
| if (m[0] == 1) { return 0; | |
| } |
| ErrorCode::INTERNAL_ERROR, | ||
| "Variation moments should be obtained by 'get_population' method"); | ||
| } else { | ||
| if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN(); |
There was a problem hiding this comment.
warning: statement should be inside braces [readability-braces-around-statements]
| if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN(); | |
| if (m[0] == 0) { return std::numeric_limits<T>::quiet_NaN(); | |
| } |
| } else { | ||
| if (m[0] == 0) return std::numeric_limits<T>::quiet_NaN(); | ||
| // to avoid accuracy problem | ||
| if (m[0] == 1) return 0; |
There was a problem hiding this comment.
warning: statement should be inside braces [readability-braces-around-statements]
| if (m[0] == 1) return 0; | |
| if (m[0] == 1) { return 0; | |
| } |
| return; | ||
| } |
There was a problem hiding this comment.
warning: redundant return statement at the end of a function with a void return type [readability-redundant-control-flow]
| return; | |
| } | |
| } |
`skew`, `skew_pop`, and `skewness` calculate the [skewness](https://en.wikipedia.org/wiki/Skewness#Pearson.27s_moment_coefficient_of_skewness) of a data distribution.
`kurt`, `kurt_pop`, and `kurtosis` calculate the [kurtosis](https://en.wikipedia.org/wiki/Kurtosis) of a data distribution.
The implementation is based on ClickHouse/ClickHouse#5200, with the result type changed to AlwaysNullable because Doris does not support NaN.
The formula used to calculate skew is `third central moment / variance^{1.5}`.
The formula used to calculate kurt is `fourth central moment / variance^{2} - 3`.
When a result would be NaN, Doris returns NULL.
doc: apache/doris-website#1127
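The two formulas above can be illustrated with a small standalone sketch. It is not the code from this PR: `PowerSums`, `nan_to_null`, `skew_pop_from_sums`, and `kurt_pop_from_sums` are hypothetical names, `std::optional` stands in for a nullable result, and the sketch assumes population moments derived from running power sums like the `m[0]`..`m[4]` accumulators shown in the review snippets.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <optional>

// Hypothetical accumulator for this sketch (not the PR's class).
// m[0] is the row count; m[k] is the running sum of x^k for k = 1..4.
struct PowerSums {
    std::array<double, 5> m{};

    void add(double x) {
        m[0] += 1;
        m[1] += x;
        m[2] += x * x;
        m[3] += x * x * x;
        m[4] += x * x * x * x;
    }

    // Partial states combine by adding the sums element-wise.
    void merge(const PowerSums& rhs) {
        for (std::size_t i = 0; i < m.size(); ++i) {
            m[i] += rhs.m[i];
        }
    }
};

// Assumption: an empty optional plays the role of SQL NULL.
inline std::optional<double> nan_to_null(double v) {
    if (std::isnan(v)) {
        return std::nullopt;
    }
    return v;
}

// skew = third central moment / variance^1.5 (population form).
inline std::optional<double> skew_pop_from_sums(const PowerSums& s) {
    if (s.m[0] == 0) {
        return std::nullopt;
    }
    const double n = s.m[0];
    const double mean = s.m[1] / n;
    const double var = s.m[2] / n - mean * mean;
    const double mu3 = s.m[3] / n - 3 * mean * (s.m[2] / n) + 2 * mean * mean * mean;
    return nan_to_null(mu3 / std::pow(var, 1.5));
}

// kurt = fourth central moment / variance^2 - 3 (population form).
inline std::optional<double> kurt_pop_from_sums(const PowerSums& s) {
    if (s.m[0] == 0) {
        return std::nullopt;
    }
    const double n = s.m[0];
    const double mean = s.m[1] / n;
    const double var = s.m[2] / n - mean * mean;
    const double mu4 = s.m[4] / n - 4 * mean * (s.m[3] / n) +
                       6 * mean * mean * (s.m[2] / n) - 3 * mean * mean * mean * mean;
    return nan_to_null(mu4 / (var * var) - 3);
}

// Usage: a symmetric sample has zero skew; a constant sample has zero
// variance, so the ratio is 0/0 = NaN and the result is empty (NULL).
inline void example_usage() {
    PowerSums symmetric;
    const double values[] = {1.0, 2.0, 3.0, 4.0, 5.0};
    for (double x : values) {
        symmetric.add(x);
    }
    assert(std::fabs(*skew_pop_from_sums(symmetric)) < 1e-9);

    PowerSums constant;
    constant.add(2.0);
    constant.add(2.0);
    assert(!skew_pop_from_sums(constant).has_value());
}
```

Summing raw powers keeps `merge` trivial, at the cost of possible cancellation error when the mean is large relative to the spread; the sketch makes no attempt to address that.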
cherry pick from #40945