Skip to content

fix docs/errors for melt, measure, patterns, eval_with_cols: cols arg should not be provided by user#5115

Closed
tdhock wants to merge 26 commits intomasterfrom
fix-measure-cols-doc
Closed

fix docs/errors for melt, measure, patterns, eval_with_cols: cols arg should not be provided by user#5115
tdhock wants to merge 26 commits intomasterfrom
fix-measure-cols-doc

Conversation

@tdhock
Copy link
Member

@tdhock tdhock commented Aug 27, 2021

closes #5063

Beta testing new measure() function revealed that users were confused about how to use cols arg.

Consensus from issue seems to be that docs for cols arg need to be clarified to indicate that users should NOT provide this argument.

@codecov
Copy link

codecov bot commented Aug 27, 2021

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.49%. Comparing base (53df7e5) to head (8dbbe0c).
Report is 21 commits behind head on master.

❗ Current head 8dbbe0c differs from pull request most recent head 46dc1b4. Consider uploading reports for the commit 46dc1b4 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5115      +/-   ##
==========================================
- Coverage   97.51%   97.49%   -0.03%     
==========================================
  Files          80       80              
  Lines       14920    14864      -56     
==========================================
- Hits        14549    14491      -58     
- Misses        371      373       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mattdowle

This comment was marked as resolved.

@tdhock

This comment was marked as resolved.

@mattdowle

This comment was marked as resolved.

@mattdowle mattdowle added this to the 1.14.1 milestone Aug 27, 2021
@tdhock

This comment was marked as resolved.

@jangorecki jangorecki modified the milestones: 1.14.3, 1.14.5 Jul 19, 2022
@tdhock
Copy link
Member Author

tdhock commented Jul 1, 2023

So rather than adding logic to measure/measurev to try to detect the missing cols arg (which would not be possible, given that it is filled in via NSE), I implemented an error message in eval_with_cols, if the user provides the cols argument. I also clarified the documentation of the cols arg and added an example of usage with .SDcols in ?patterns man page.

@tdhock tdhock changed the title measure() doc fix: cols arg should not be provided by user fix docs/errors for measure, patterns, eval_with_cols: cols arg should not be provided by user Jul 1, 2023
@tdhock
Copy link
Member Author

tdhock commented Jul 1, 2023

The new behavior is the error message below, which was previously not present, but now clearly indicates that the user should remove the cols argument in their code.
Previously, other less informative warnings could result (confusing for user), like in #5063, if user provided cols which are not equal to actual column names.

> i1=data.table(iris)[1]
> melt(i1, measure=patterns("Sepal"))
   Petal.Length Petal.Width Species     variable value
          <num>       <num>  <fctr>       <fctr> <num>
1:          1.4         0.2  setosa Sepal.Length   5.1
2:          1.4         0.2  setosa  Sepal.Width   3.5
> melt(i1, measure=patterns("Sepal", cols=names(iris)))
Erreur : user should not provide cols argument to patterns

@tdhock tdhock changed the title fix docs/errors for measure, patterns, eval_with_cols: cols arg should not be provided by user fix docs/errors for melt, measure, patterns, eval_with_cols: cols arg should not be provided by user Oct 12, 2023
@jangorecki jangorecki modified the milestones: 1.14.11, 1.15.1 Oct 29, 2023
@tdhock tdhock added the reshape dcast melt label Jan 5, 2024
@tdhock tdhock requested a review from mattdowle as a code owner February 15, 2024 22:10
@tdhock tdhock requested review from MichaelChirico and jangorecki and removed request for mattdowle February 15, 2024 22:26
@jangorecki jangorecki added the breaking-change issues whose solution would require breaking existing behavior label Feb 22, 2024
if ("cols" %in% names(formals(fun)) && !"cols" %in% names(named_call)) {
if ("cols" %in% names(formals(fun))) {
if ("cols" %in% names(named_call)) {
stopf("user should not provide cols argument to %s, when specifying the columns for melt or .SDcols; in this context, non-standard evaluation is used internally to set cols to all data table column names, so please fix by removing cols argument", as.character(fun_uneval))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a much more serious breaking change than #5247 -- just glancing at GitHub shows lots of people using cols=:

https://github.com/search?q=lang%3AR%20%2Fpatterns%5B(%5D%5B%5E)%5D*cols%5Cs*%3D%2F&type=code

Even though I almost exclusively see it done with cols=names(x), such usage is now an error. We need to remove the cols= argument more gradually.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok change to warning?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's map out a deprecation plan, e.g.

1.16.0: warning() if cols is used for a subset of columns, message() if it's used for the full set of columns. The difference because the latter case should simply drop cols=, the former case can actually get different behavior.
1.17.0: Upgrade to error to provide cols=
1.18.0: Remove code related to cols=

We can consider an option to allow the old behavior to keep working as well, in which case we probably want to lengthen the deprecation cycle by one more release.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another option is that we could continue to support user-provided cols and modify measure() to support that (would be possible I think if we make measure return character column names instead of integer column indices). However I think it would be better to do this warning/error, and encourage users to specify the right regex/pattern arg to patterns/measure. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong prior here, though it does sound workable; do you think that would be useful and worth maintaining?

One thought is that it's sometimes clunky to jam everything into a pattern-based approach, e.g. it's not uncommon to have a vector of column names handy already, in which case just plugging it into cols= can be convenient as compared to turning that into a regex (esp if the column names might have special regex characters like ( { . etc). That said we haven't seen a request to this end.

@tdhock tdhock closed this Apr 12, 2024
@MichaelChirico MichaelChirico deleted the fix-measure-cols-doc branch July 8, 2025 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change issues whose solution would require breaking existing behavior reshape dcast melt

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Odd behaviour of melt() when using the cols argument in measure()

4 participants