⚡ Bolt: Parallelize data fetching in sitemap generation#149
⚡ Bolt: Parallelize data fetching in sitemap generation#149
Conversation
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
📝 WalkthroughWalkthroughRefactored sitemap generation to run per-year URL construction concurrently and to fetch speakers and talks in parallel with per-call failure handling. Added documentation guidance in Changes
Sequence Diagram(s)sequenceDiagram
participant SG as SitemapGenerator
participant YT as YearTask (per-year)
participant SP as SpeakersService
participant TK as TalksService
SG->>YT: start all YearTasks via Promise.all(years.map)
par YearTask for each year
YT->>SP: getSpeakers(year) (async)
YT->>TK: getTalks(year) (async)
alt both resolve
SP-->>YT: speakers[]
TK-->>YT: talks[]
else any fails
SP-->>YT: [] (catch)
TK-->>YT: [] (catch)
end
YT->>YT: build year URLs (root, year pages, speaker/talk detail, jobs)
YT-->>SG: return year URLs
end
SG->>SG: flatten/aggregate URLs
SG-->>Browser/CI: emit sitemap
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Review Summary by QodoParallelize data fetching in sitemap generation
WalkthroughsDescription• Parallelize data fetching across years using Promise.all • Replace sequential for...of loops with concurrent .map() operations • Add error handling with .catch() fallbacks for API failures • Reduce sitemap generation build time by eliminating blocking waterfalls Diagramflowchart LR
A["Sequential for loops<br/>years iteration"] -->|"Refactor"| B["Promise.all with<br/>years.map()"]
C["Sequential API calls<br/>getSpeakers + getTalks"] -->|"Parallelize"| D["Concurrent Promise.all<br/>with error handling"]
B --> E["Concurrent execution<br/>all years in parallel"]
D --> E
E --> F["Reduced build time"]
File Changes1. app/sitemap.ts
|
Code Review by Qodo
|
There was a problem hiding this comment.
Code Review
This pull request parallelizes data fetching in the sitemap generation process to improve build performance and updates the project's learning documentation to reflect this new practice. Additionally, it performs a significant cleanup of the package-lock.json file by removing numerous unused dependencies. The review feedback suggests avoiding side effects, such as mutating a shared array, within concurrent map iterations to ensure deterministic ordering and better predictability; instead, it is recommended to return individual result arrays and flatten them at the end.
| await Promise.all( | ||
| years.map(async (year) => { | ||
| urls.push({ | ||
| url: `${baseUrl}/${year}/${page}`, | ||
| url: `${baseUrl}/${year}`, | ||
| lastModified: BUILD_TIME, | ||
| changeFrequency: "weekly", | ||
| priority: 0.8, | ||
| changeFrequency: "daily", | ||
| priority: 0.9, | ||
| }); | ||
| } | ||
|
|
||
| const speakers = await getSpeakers(year); | ||
| for (const speaker of speakers) { | ||
| urls.push({ | ||
| url: `${baseUrl}/${year}/speakers/${speaker.id}`, | ||
| lastModified: BUILD_TIME, | ||
| changeFrequency: "weekly", | ||
| priority: 0.7, | ||
| }); | ||
| } | ||
| const yearPages = ["speakers", "talks", "schedule", "job-offers", "cfp", "diversity", "sponsorship", "travel"]; | ||
| for (const page of yearPages) { | ||
| urls.push({ | ||
| url: `${baseUrl}/${year}/${page}`, | ||
| lastModified: BUILD_TIME, | ||
| changeFrequency: "weekly", | ||
| priority: 0.8, | ||
| }); | ||
| } | ||
|
|
||
| const sessionGroups = await getTalks(year); | ||
| for (const group of sessionGroups) { | ||
| for (const talk of group.sessions) { | ||
| const [speakers, sessionGroups] = await Promise.all([ | ||
| getSpeakers(year).catch(() => []), | ||
| getTalks(year).catch(() => []), | ||
| ]); | ||
|
|
||
| for (const speaker of speakers) { | ||
| urls.push({ | ||
| url: `${baseUrl}/${year}/talks/${talk.id}`, | ||
| url: `${baseUrl}/${year}/speakers/${speaker.id}`, | ||
| lastModified: BUILD_TIME, | ||
| changeFrequency: "weekly", | ||
| priority: 0.7, | ||
| }); | ||
| } | ||
| } | ||
|
|
||
| const companies = getJobOffersByYear(year); | ||
| for (const company of companies) { | ||
| urls.push({ | ||
| url: `${baseUrl}/${year}/job-offers/${slugify(company.name)}`, | ||
| lastModified: BUILD_TIME, | ||
| changeFrequency: "monthly", | ||
| priority: 0.5, | ||
| }); | ||
| } | ||
| } | ||
| for (const group of sessionGroups) { | ||
| for (const talk of group.sessions) { | ||
| urls.push({ | ||
| url: `${baseUrl}/${year}/talks/${talk.id}`, | ||
| lastModified: BUILD_TIME, | ||
| changeFrequency: "weekly", | ||
| priority: 0.7, | ||
| }); | ||
| } | ||
| } | ||
|
|
||
| const companies = getJobOffersByYear(year); | ||
| for (const company of companies) { | ||
| urls.push({ | ||
| url: `${baseUrl}/${year}/job-offers/${slugify(company.name)}`, | ||
| lastModified: BUILD_TIME, | ||
| changeFrequency: "monthly", | ||
| priority: 0.5, | ||
| }); | ||
| } | ||
| }) | ||
| ); |
There was a problem hiding this comment.
While parallelizing data fetching is a significant performance improvement for static generation, mutating a shared urls array from within concurrent async functions leads to non-deterministic ordering of the sitemap entries. Although the sitemap protocol does not strictly require a specific order, it is a best practice to avoid side effects inside Promise.all to ensure predictability and easier debugging.
A more robust approach is to have each iteration of the map return its own array of URLs and then flatten the results into the main urls array at the end.
const yearUrls = await Promise.all(
years.map(async (year) => {
const results: MetadataRoute.Sitemap = [
{
url: `${baseUrl}/${year}`,
lastModified: BUILD_TIME,
changeFrequency: "daily",
priority: 0.9,
},
];
const yearPages = ["speakers", "talks", "schedule", "job-offers", "cfp", "diversity", "sponsorship", "travel"];
for (const page of yearPages) {
results.push({
url: `${baseUrl}/${year}/${page}`,
lastModified: BUILD_TIME,
changeFrequency: "weekly",
priority: 0.8,
});
}
const [speakers, sessionGroups] = await Promise.all([
getSpeakers(year).catch(() => []),
getTalks(year).catch(() => []),
]);
for (const speaker of speakers) {
results.push({
url: `${baseUrl}/${year}/speakers/${speaker.id}`,
lastModified: BUILD_TIME,
changeFrequency: "weekly",
priority: 0.7,
});
}
for (const group of sessionGroups) {
for (const talk of group.sessions) {
results.push({
url: `${baseUrl}/${year}/talks/${talk.id}`,
lastModified: BUILD_TIME,
changeFrequency: "weekly",
priority: 0.7,
});
}
}
const companies = getJobOffersByYear(year);
for (const company of companies) {
results.push({
url: `${baseUrl}/${year}/job-offers/${slugify(company.name)}`,
lastModified: BUILD_TIME,
changeFrequency: "monthly",
priority: 0.5,
});
}
return results;
})
);
urls.push(...yearUrls.flat());Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/sitemap.ts (1)
1-89:⚠️ Potential issue | 🟡 MinorFix Prettier formatting to resolve CI failure.
The pipeline indicates Prettier check failed for this file. Run
prettier --write app/sitemap.tsto fix the formatting issues before merging.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/sitemap.ts` around lines 1 - 89, Prettier formatting errors in app/sitemap.ts are causing CI to fail; run the formatter (e.g., prettier --write app/sitemap.ts) or apply Prettier rules to reformat the file, then commit; ensure exports and functions like the default async sitemap(), the BUILD_TIME constant, and usages of getSpeakers/getTalks/getJobOffersByYear/slugify remain unchanged behaviorally after formatting so the file passes Prettier checks and CI.
🧹 Nitpick comments (2)
app/sitemap.ts (2)
33-87: Non-deterministic URL order due to concurrent array mutations.The
urlsarray is mutated from within concurrent async callbacks. While this is safe in JavaScript (no data corruption), the resulting URL order will be non-deterministic—it depends on which year's fetches complete first.If consistent sitemap ordering matters (e.g., for cache stability or diffing builds), consider collecting results per year and then flattening in a deterministic order:
♻️ Suggested refactor for deterministic ordering
- await Promise.all( - years.map(async (year) => { - urls.push({ + const yearResults = await Promise.all( + years.map(async (year) => { + const yearUrls: MetadataRoute.Sitemap = []; + yearUrls.push({ url: `${baseUrl}/${year}`, ... }); // ... rest of the year processing, pushing to yearUrls instead of urls + return yearUrls; }) ); + urls.push(...yearResults.flat());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/sitemap.ts` around lines 33 - 87, The current implementation mutates the shared urls array from concurrent async callbacks (inside the years.map Promise.all), causing non-deterministic ordering; change the approach in the sitemap generation so each year produces its own local list (collecting the static year pages, results from getSpeakers(year), getTalks(year), and getJobOffersByYear(year) into a per-year array) and then after awaiting Promise.all over those per-year promises, flatten or iterate the resulting array-of-arrays in the original years order to push into the final urls array (use BUILD_TIME, baseUrl, speaker.id, talk.id, and slugify(company.name) as before) to ensure deterministic sitemap ordering.
52-55: Silent error swallowing may violate coding guidelines.The
.catch(() => [])pattern silently swallows errors without logging. WhilegetSpeakersandgetTalksinternally log their own errors before returning empty arrays, if those functions throw unexpectedly (e.g., JSON parse errors not caught internally), this outer catch would hide the failure.Consider logging before returning the fallback, per the guideline requiring proper error handling:
🛡️ Suggested improvement
const [speakers, sessionGroups] = await Promise.all([ - getSpeakers(year).catch(() => []), - getTalks(year).catch(() => []), + getSpeakers(year).catch((error) => { + console.error(`Sitemap: Failed to fetch speakers for ${year}:`, error); + return []; + }), + getTalks(year).catch((error) => { + console.error(`Sitemap: Failed to fetch talks for ${year}:`, error); + return []; + }), ]);As per coding guidelines: "Do not use generic catch-all statements without logging or handling the error properly."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/sitemap.ts` around lines 52 - 55, The Promise.all fallback currently swallows any thrown errors from getSpeakers(year) and getTalks(year) via .catch(() => []); change those catches to log the caught error before returning the fallback (e.g., .catch((err) => { logger.error('getSpeakers failed', err); return []; }) ), applying analogous logging for getTalks so failures aren’t silently ignored while preserving the [] fallback; update the Promise.all line that calls getSpeakers and getTalks in sitemap.ts to use these logging-aware catch handlers (use the module's existing logger variable if available, otherwise console.error).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@app/sitemap.ts`:
- Around line 1-89: Prettier formatting errors in app/sitemap.ts are causing CI
to fail; run the formatter (e.g., prettier --write app/sitemap.ts) or apply
Prettier rules to reformat the file, then commit; ensure exports and functions
like the default async sitemap(), the BUILD_TIME constant, and usages of
getSpeakers/getTalks/getJobOffersByYear/slugify remain unchanged behaviorally
after formatting so the file passes Prettier checks and CI.
---
Nitpick comments:
In `@app/sitemap.ts`:
- Around line 33-87: The current implementation mutates the shared urls array
from concurrent async callbacks (inside the years.map Promise.all), causing
non-deterministic ordering; change the approach in the sitemap generation so
each year produces its own local list (collecting the static year pages, results
from getSpeakers(year), getTalks(year), and getJobOffersByYear(year) into a
per-year array) and then after awaiting Promise.all over those per-year
promises, flatten or iterate the resulting array-of-arrays in the original years
order to push into the final urls array (use BUILD_TIME, baseUrl, speaker.id,
talk.id, and slugify(company.name) as before) to ensure deterministic sitemap
ordering.
- Around line 52-55: The Promise.all fallback currently swallows any thrown
errors from getSpeakers(year) and getTalks(year) via .catch(() => []); change
those catches to log the caught error before returning the fallback (e.g.,
.catch((err) => { logger.error('getSpeakers failed', err); return []; }) ),
applying analogous logging for getTalks so failures aren’t silently ignored
while preserving the [] fallback; update the Promise.all line that calls
getSpeakers and getTalks in sitemap.ts to use these logging-aware catch handlers
(use the module's existing logger variable if available, otherwise
console.error).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 15882a5e-63d0-47f4-ae86-9fb46f0b81d4
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (2)
.jules/bolt.mdapp/sitemap.ts
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
app/sitemap.ts (1)
33-84: Return per-year entries fromPromise.allinstead of mutatingurlsdirectly.
Promise.allpreserves the originalyearsorder in its result array, but the sharedurls.push(...)calls after Line 52 run in fetch-completion order. That makes the returned sitemap timing-dependent across identical builds. Building a localMetadataRoute.Sitemapper year and flattening afterward keeps the parallelism and the output deterministic.♻️ Suggested refactor
- await Promise.all( - years.map(async (year) => { - urls.push({ + const yearUrls = await Promise.all( + years.map(async (year): Promise<MetadataRoute.Sitemap> => { + const entries: MetadataRoute.Sitemap = []; + + entries.push({ url: `${baseUrl}/${year}`, lastModified: BUILD_TIME, changeFrequency: "daily", priority: 0.9, }); const yearPages = ["speakers", "talks", "schedule", "job-offers", "cfp", "diversity", "sponsorship", "travel"]; for (const page of yearPages) { - urls.push({ + entries.push({ url: `${baseUrl}/${year}/${page}`, lastModified: BUILD_TIME, changeFrequency: "weekly", priority: 0.8, }); } const [speakers, sessionGroups] = await Promise.all([getSpeakers(year).catch(() => []), getTalks(year).catch(() => [])]); for (const speaker of speakers) { - urls.push({ + entries.push({ url: `${baseUrl}/${year}/speakers/${speaker.id}`, lastModified: BUILD_TIME, changeFrequency: "weekly", priority: 0.7, }); } for (const group of sessionGroups) { for (const talk of group.sessions) { - urls.push({ + entries.push({ url: `${baseUrl}/${year}/talks/${talk.id}`, lastModified: BUILD_TIME, changeFrequency: "weekly", priority: 0.7, }); } } const companies = getJobOffersByYear(year); for (const company of companies) { - urls.push({ + entries.push({ url: `${baseUrl}/${year}/job-offers/${slugify(company.name)}`, lastModified: BUILD_TIME, changeFrequency: "monthly", priority: 0.5, }); } - }) + return entries; + }) ); + urls.push(...yearUrls.flat());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/sitemap.ts` around lines 33 - 84, The current code mutates the shared urls array from within parallel tasks, causing nondeterministic order; change the years.map callback used in Promise.all to build and return a local array (e.g., yearUrls) containing that year's entries (use same logic for the year page entries, speaker URLs from getSpeakers(year), talk URLs from getTalks(year), and job-offer URLs from getJobOffersByYear(year) with slugify and BUILD_TIME/baseUrl), await the parallel fetches as before but push into the local yearUrls, then return yearUrls from the map; after Promise.all completes, flatten the resulting array of per-year arrays into the final urls variable so output order matches years while keeping parallel fetches.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/sitemap.ts`:
- Line 52: The Promise.all call is re-catching errors unnecessarily; remove the
`.catch(() => [])` wrappers so it becomes Promise.all([getSpeakers(year),
getTalks(year)]) and let the internal try/catch in getSpeakers and getTalks
return [] and log errors; update the assignment to const [speakers,
sessionGroups] = await Promise.all([getSpeakers(year), getTalks(year)]) so
unexpected errors are not silently masked and sitemap omissions remain
diagnosable.
---
Nitpick comments:
In `@app/sitemap.ts`:
- Around line 33-84: The current code mutates the shared urls array from within
parallel tasks, causing nondeterministic order; change the years.map callback
used in Promise.all to build and return a local array (e.g., yearUrls)
containing that year's entries (use same logic for the year page entries,
speaker URLs from getSpeakers(year), talk URLs from getTalks(year), and
job-offer URLs from getJobOffersByYear(year) with slugify and
BUILD_TIME/baseUrl), await the parallel fetches as before but push into the
local yearUrls, then return yearUrls from the map; after Promise.all completes,
flatten the resulting array of per-year arrays into the final urls variable so
output order matches years while keeping parallel fetches.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
| const sessionGroups = await getTalks(year); | ||
| for (const group of sessionGroups) { | ||
| for (const talk of group.sessions) { | ||
| const [speakers, sessionGroups] = await Promise.all([getSpeakers(year).catch(() => []), getTalks(year).catch(() => [])]); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Check the file structure and locate the relevant files
find . -type f -name "*.ts" -o -name "*.tsx" | grep -E "(sitemap|useSpeakers|useTalks)" | head -20Repository: anyulled/devbcn-nextjs
Length of output: 166
🏁 Script executed:
# Get content of app/sitemap.ts around line 52
if [ -f "app/sitemap.ts" ]; then
cat -n app/sitemap.ts | sed -n '45,60p'
fiRepository: anyulled/devbcn-nextjs
Length of output: 700
🏁 Script executed:
# Search for getSpeakers and getTalks imports/calls
rg -A 3 -B 3 "getSpeakers|getTalks" app/sitemap.tsRepository: anyulled/devbcn-nextjs
Length of output: 593
🏁 Script executed:
# Find and examine hooks/useSpeakers.ts
if [ -f "hooks/useSpeakers.ts" ]; then
cat -n hooks/useSpeakers.ts
fiRepository: anyulled/devbcn-nextjs
Length of output: 1837
🏁 Script executed:
# Find and examine hooks/useTalks.ts
if [ -f "hooks/useTalks.ts" ]; then
cat -n hooks/useTalks.ts
fiRepository: anyulled/devbcn-nextjs
Length of output: 5458
Remove the redundant .catch(() => []) wrappers.
getSpeakers() and getTalks() already log and return [] on errors in their internal try-catch blocks. Re-catching them on line 52 is redundant and will silently hide unexpected regressions, making missing sitemap entries much harder to diagnose.
🛠️ Minimal fix
- const [speakers, sessionGroups] = await Promise.all([getSpeakers(year).catch(() => []), getTalks(year).catch(() => [])]);
+ const [speakers, sessionGroups] = await Promise.all([getSpeakers(year), getTalks(year)]);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const [speakers, sessionGroups] = await Promise.all([getSpeakers(year).catch(() => []), getTalks(year).catch(() => [])]); | |
| const [speakers, sessionGroups] = await Promise.all([getSpeakers(year), getTalks(year)]); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/sitemap.ts` at line 52, The Promise.all call is re-catching errors
unnecessarily; remove the `.catch(() => [])` wrappers so it becomes
Promise.all([getSpeakers(year), getTalks(year)]) and let the internal try/catch
in getSpeakers and getTalks return [] and log errors; update the assignment to
const [speakers, sessionGroups] = await Promise.all([getSpeakers(year),
getTalks(year)]) so unexpected errors are not silently masked and sitemap
omissions remain diagnosable.
💡 What: Replaced sequential
for...ofloops withPromise.allinapp/sitemap.tsand added fallback error handling.🎯 Why: Sequential data fetching inside loops delays the static generation and increases Next.js build time by introducing blocking waterfalls.
📊 Impact: Significantly reduces sitemap generation execution time by fetching data concurrently for all years and API endpoints.
🔬 Measurement: Observe Next.js build times for the sitemap route (
app/sitemap.ts) before and after the optimization.PR created automatically by Jules for task 15589933959973737168 started by @anyulled
Summary by CodeRabbit
Documentation
Bug Fixes & Performance