fix: allow retrieve of public and owned ds #2712
Open
minottic wants to merge 2 commits into
Hey - I've found 1 issue, and left some high level feedback:
- In `buildOwnerGroup`, the `findAll` query uses `{ _id: { $in: datasetPid } }`, but `datasetPid` is built from `datasetDto.pid`; this likely should query on `pid` instead of `_id` to avoid silently returning no datasets and miscomputing ownership.
- The `#datasetOwner` branch in `buildOwnerGroup` does `const [firstOwner] = datasetOwners;` on a `Set`, which will always yield `undefined`; consider converting the set to an array first (e.g. `const [firstOwner] = [...datasetOwners];`).
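A note on the second point, offered as a quick check rather than a verdict on the PR: in standard JavaScript, array destructuring works on any iterable, so destructuring a non-empty `Set` yields its first inserted element rather than `undefined`. The sketch below uses a stand-in `datasetOwners` set with invented values and assumes an ES2015 or later compile target; under an ES5 target without `downlevelIteration`, TypeScript reports an error for this pattern.

```typescript
// Stand-in for the owner groups collected in buildOwnerGroup (values are invented).
const datasetOwners = new Set<string>(["group-a", "group-b"]);

// Array destructuring consumes the Set's iterator (assumes an ES2015+ target).
const [firstFromSet] = datasetOwners;

// The form suggested in the review: copy into an array first.
const [firstFromArray] = [...datasetOwners];

// An empty Set is the case where destructuring yields undefined.
const [firstFromEmpty] = new Set<string>();

console.log(firstFromSet, firstFromArray, firstFromEmpty);
```

Both forms pick the same element, so the array copy is a matter of readability or compiler settings rather than correctness.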
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `buildOwnerGroup`, the `findAll` query uses `{ _id: { $in: datasetPid } }`, but `datasetPid` is built from `datasetDto.pid`; this likely should query on `pid` instead of `_id` to avoid silently returning no datasets and miscomputing ownership.
- The `#datasetOwner` branch in `buildOwnerGroup` does `const [firstOwner] = datasetOwners;` on a `Set`, which will always yield `undefined`; consider converting the set to an array first (e.g. `const [firstOwner] = [...datasetOwners];`).
## Individual Comments
### Comment 1
<location path="src/jobs/interceptors/create-job-v3-mapping.interceptor.ts" line_range="113-114" />
<code_context>
+ const datasetList = jobParams.datasetList;
+ if (datasetList.length === 0) return undefined;
+ const datasetPid = datasetList.map((datasetDto) => datasetDto.pid);
+ const datasets = await this.datasetsService.findAll({
+ where: { _id: { $in: datasetPid } },
+ fields: ["isPublished", "accessGroups", "ownerGroup"],
+ });
</code_context>
<issue_to_address>
**🚨 issue (security):** Dataset lookup now uses `_id` with a list of PIDs, which likely changes semantics and may break owner-group resolution.
The previous implementation queried by `pid` via `findOne({ where: { pid: datasetDto.pid } })`, but the new code collects PIDs into `datasetPid` and queries with `where: { _id: { $in: datasetPid } }`. Unless `_id` is guaranteed to equal `pid`, this will return wrong or empty results and lead to incorrect `ownerGroup` and authorization decisions. Please either query by `pid` again or ensure you collect and filter on the same identifier field.
</issue_to_address>
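To make the concern in Comment 1 concrete, here is a minimal in-memory sketch of what happens when a list of PIDs is matched against the wrong field. The `DatasetRecord` shape, the sample records, and the `findWhereIn` helper are all hypothetical and only imitate a `{ field: { $in: values } }` filter; they are not the project's `datasetsService` API, and the real schema may well keep `_id` and `pid` in sync.

```typescript
// Hypothetical, simplified dataset shape; the real schema may differ.
interface DatasetRecord {
  _id: string;
  pid: string;
  isPublished: boolean;
  ownerGroup: string;
}

// Invented sample records in which _id and pid deliberately differ.
const records: DatasetRecord[] = [
  { _id: "64a1", pid: "20.500.12345/abc", isPublished: true, ownerGroup: "group-a" },
  { _id: "64a2", pid: "20.500.12345/def", isPublished: false, ownerGroup: "group-b" },
];

// Minimal in-memory analogue of a `{ field: { $in: values } }` filter.
function findWhereIn(
  data: DatasetRecord[],
  field: "_id" | "pid",
  values: string[],
): DatasetRecord[] {
  return data.filter((record) => values.indexOf(record[field]) !== -1);
}

// PIDs as they would be collected from the job's datasetList.
const datasetPids = ["20.500.12345/abc", "20.500.12345/def"];

const matchedById = findWhereIn(records, "_id", datasetPids); // matches nothing here
const matchedByPid = findWhereIn(records, "pid", datasetPids); // matches both records

console.log(matchedById.length, matchedByPid.length);
```

If `_id` is not guaranteed to equal `pid`, the first query silently returns an empty list, which is the failure mode the comment warns about.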
ee4e19c to aa90928
Junjiequan
reviewed
May 12, 2026
Comment on lines +144 to +149
+ if (
+   jobConfigCreateAuth === CreateJobAuth.DatasetAccess &&
+   datasets.every((dataset) => dataset?.isPublished)
+ )
+   return jobUserCurrentGroups?.[0];
+ const nonPublishedDatasets = datasets.filter(
Member
I'm not very familiar with the job concept, but this logic looks slightly off to me.
If each job requires an ownerGroup, then instead of picking an arbitrary one from the userGroups array, how about requiring the user to provide one, if possible?
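One way this suggestion could look, as a rough sketch only: the helper below requires an explicit owner group and checks it against the user's groups instead of falling back to the first entry. The function name `resolveOwnerGroup`, its parameters, and the error messages are hypothetical and do not come from the PR.

```typescript
// Hypothetical helper; names and error text are illustrative, not from the PR.
function resolveOwnerGroup(
  requestedOwnerGroup: string | undefined,
  userGroups: string[],
): string {
  // Refuse to guess: the caller must state which group owns the job.
  if (!requestedOwnerGroup) {
    throw new Error("ownerGroup must be provided explicitly");
  }
  // Only accept a group the submitting user actually belongs to.
  if (userGroups.indexOf(requestedOwnerGroup) === -1) {
    throw new Error(`User is not a member of group '${requestedOwnerGroup}'`);
  }
  return requestedOwnerGroup;
}

// Usage with invented group names.
const chosenGroup = resolveOwnerGroup("group-b", ["group-a", "group-b"]);
console.log(chosenGroup);
```

The trade-off is that job submission becomes stricter for users who currently rely on an implicit default.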
Description
The computation of job.ownerGroup, which throws an error if the result is
empty, failed for jobs that access both public and owned datasets, because
public datasets might not be owned by the user submitting the job. This PR
also includes a refactor to simplify maintenance, as well as more verbose
error messages.
Fixes
Changes:
Tests included
Documentation
official documentation info
Summary by Sourcery
Adjust job creation mapping and validation to correctly handle public and owned datasets while improving ownership checks and error reporting.
Bug Fixes:
Enhancements:
Tests: