Align melt with docs for list measure.vars#7257
Align melt with docs for list measure.vars#7257Mukulyadav2004 wants to merge 11 commits intomasterfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #7257 +/- ##
=========================================
Coverage ? 99.00%
=========================================
Files ? 81
Lines ? 15252
Branches ? 0
=========================================
Hits ? 15100
Misses ? 152
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Generated via commit f552d56 Download link for the artifact containing the test results: ↓ atime-results.zip
|
src/fmelt.c
Outdated
| SET_VECTOR_ELT(ansvars, 0, target=allocVector(STRSXP, data->totlen)); | ||
| if (data->lvalues == 1) { | ||
| const int *thisvaluecols = INTEGER(VECTOR_ELT(data->valuecols, 0)); | ||
| for (int j=0, ansloc=0; j<data->lmax; ++j) { |
There was a problem hiding this comment.
there should be no such addition here, why did you do this?
There was a problem hiding this comment.
You’re right the non-list path in getvarcols should not change. I’ve reverted that addition so the only functional change is confined to the list branch
|
can you please separate dcast and melt into two separate PRs? (currently they are combined in this PR) That would be much easier to review. |
Sure |
| if (data->lvalues == 1) {//one value column to output. TODO #5247 change to !data->measure_is_list | ||
| SEXP thisvaluecols = VECTOR_ELT(data->valuecols, 0); | ||
| int len = length(thisvaluecols); | ||
| levels = PROTECT(allocVector(STRSXP, len)); protecti++; |
There was a problem hiding this comment.
please revert these changes and instead use the approach here https://github.com/Rdatatable/data.table/pull/5247/files
|
Thanks! I reverted non-list changes and followed from PR #5247 - branch on data->measure_is_list inside getvarcols. For list measure.vars, variable returns integer indices and 1..lmax levels when variable.factor=TRUE. Non-list paths are unchanged. Does this seems right to you? |
|
@tdhock what all I'm understanding and doing here is that when variable.factor is FALSE, list measure.vars give INTSXP indices 1to lmax while non-list remains STRSXP of column names and when variable.factor is TRUE, list give a factor with integer codes 1 to lmax and levels "1" to "lmax", and non list remains the original name based factor logic, keeping the if (data->lvalues == 1). |
inst/tests/tests.Rraw
Outdated
| test(2182.6, melt(DT.wide, measure.vars=list(b=c("b1","b2"))), data.table(a2=2, variable=factor(c("b1","b2")), b=c(1,2)), warning="measure.vars is a list with length=1") # measure.vars named list length=1, #5065 | ||
| # consistency between measure.vars=list with length=1 and length>1, #5209 | ||
| test(2182.71, melt(DT.wide, measure.vars=list("a2"), variable.factor=TRUE), data.table(b1=1, b2=2, variable=factor("a2"), value=2), warning="measure.vars is a list with length=1") | ||
| test(2182.6, melt(DT.wide, measure.vars=list(b=c("b1","b2"))), data.table(a2=2, variable=factor(1:2), b=c(1,2))) # list yields indices |
There was a problem hiding this comment.
please add these tests from the other pr
test(2182.6, melt(DT.wide, measure.vars=list(b=c("b1","b2"))), data.table(a2=2, variable=factor(c("1","2")), b=c(1,2))) # measure.vars named list length=1, #5065
# consistency between measure.vars=list with length=1 and length>1, #5209
test(2182.71, melt(DT.wide, measure.vars=list("a2"), variable.factor=TRUE), data.table(b1=1, b2=2, variable=factor(1), value=2))
test(2182.72, melt(DT.wide, measure.vars=c("a2"), variable.factor=TRUE), data.table(b1=1, b2=2, variable=factor("a2"), value=2))
test(2182.73, melt(DT.wide, measure.vars=list("a2"), variable.factor=FALSE), data.table(b1=1, b2=2, variable="1", value=2))
test(2182.74, melt(DT.wide, measure.vars=c("a2"), variable.factor=FALSE), data.table(b1=1, b2=2, variable="a2", value=2))
test(2182.75, melt(data.table(a=10, b=20), measure.vars=list(n="a"), variable.factor=FALSE), data.table(b=20, variable="1", n=10))#thanks @mnazarov
src/fmelt.c
Outdated
| } else {//multiple output columns. | ||
| int nlevel=0; | ||
| levels = PROTECT(allocVector(STRSXP, data->lmax)); protecti++; | ||
| for (int j=0, ansloc=0; j<data->lmax; ++j) { |
There was a problem hiding this comment.
all of this code is un-necessary. where did it come from?
all you need to do is change the test in the if statements, as in https://github.com/Rdatatable/data.table/pull/5247/files#diff-1727bfee28bf4521c354c6f4463e6fcaca14a4d62eda781552a489a749b5ec59
|
this pr is confusing to me, because it has a lot of irrelevant changes/commits. |

closes #6629
In melt, docs stated that when measure.vars is a list, the variable column contains integer indices. Actual behavior was inconsistent when the list length was 1 (character name), with a “warn now, break later” warning #5209.
So in this PR
melt breaking change (consistency with docs)(fmelt.c)
In test 2182.6, 2182.71, 2182.73, 2182.75, expectations no swiched from names to integer and also removed the warning. And in 1037.404 and 1037.407 - Int
@tdhock @joshhwuu @Anirban166 please review.