Finding a Motif in DNA #18

Draft
danielle-pinto wants to merge 2 commits into main from 2026-02-12-subs

danielle-pinto (Collaborator) commented Feb 17, 2026

BioJulia solution for the haystack problem https://rosalind.info/problems/subs/

@github-actions

Once the build has completed, you can preview your PR at this URL: https://biojulia.dev/BiojuliaDocs/previews/PR18/

but since we want to find all matches,
we will use `findnext`.

Currently, there isn't a `findall` function that allows us to avoid a loop.
Collaborator Author

This is based on the exact-search documentation: https://biojulia.dev/BioSequences.jl/v2.0/sequence_search/#Exact-search-1

However, I'm eager to hear if I missed something!

Member

It's not documented there, but it works: you need to wrap the motif in an `ExactSearchQuery`. See https://github.com/BioJulia/BioSequences.jl/blob/b626dbcaad76217b248449e6aa2cc1650e95660c/src/BioSequences.jl#L261-L316

```julia
julia> findall(ExactSearchQuery(dna"ATCA"), dna"ATCATCA")
2-element Vector{UnitRange{Int64}}:
 1:4
 4:7

julia> findall(ExactSearchQuery(dna"ATCA"), dna"ATCATCA"; overlap=false)
1-element Vector{UnitRange{Int64}}:
 1:4
```
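For reference, here is a minimal sketch of how the `findall` call above could be turned into an answer for the Rosalind problem, which asks for the start position of every occurrence, overlaps included. The helper name `motif_positions` is hypothetical and not part of the PR, and the sketch assumes a BioSequences version in which `findall` accepts an `ExactSearchQuery` and the `overlap` keyword, as in the REPL session above.

```julia
using BioSequences

# Hypothetical helper (not from the PR): 1-based start of every occurrence of `motif`.
function motif_positions(motif, sequence)
    query = ExactSearchQuery(motif)
    # `overlap=true` is passed explicitly because the problem counts overlapping hits.
    ranges = findall(query, sequence; overlap=true)
    # Each hit is a UnitRange; only its first index is needed.
    return first.(ranges)
end

# Sample dataset from the Rosalind problem page.
positions = motif_positions(dna"ATAT", dna"GATATATGCATATACTT")
println(join(positions, " "))
```

With the sample dataset this should print `2 4 10`, which matches the sample output on the Rosalind page.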



```julia
function haystack_findnext(substring, string)
```
Collaborator Author

This solution is very similar to the one above. However, I thought it was worth keeping both, since this one introduces the reader to the `findnext` function.
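As a rough illustration of the `findnext` loop pattern being discussed, here is a sketch that uses plain ASCII strings from Base Julia as a stand-in for BioSequences types. It is not the PR's `haystack_findnext`; the name `motif_starts` is hypothetical. BioSequences extends the same generic `findnext` for its own sequence types, where the pattern would be wrapped in an `ExactSearchQuery`.

```julia
# Hypothetical helper (not the PR's implementation): collect the start index
# of every match, overlapping matches included. Assumes ASCII input.
function motif_starts(motif::AbstractString, sequence::AbstractString)
    starts = Int[]
    # An empty motif matches everywhere, so treat it as "no motif".
    isempty(motif) && return starts
    hit = findfirst(motif, sequence)
    while hit !== nothing
        push!(starts, first(hit))
        # Resume one position past the previous start so overlapping hits are kept.
        hit = findnext(motif, sequence, first(hit) + 1)
    end
    return starts
end

println(motif_starts("ATAT", "GATATATGCATATACTT"))
```

Restarting the search at `first(hit) + 1` rather than `last(hit) + 1` is what keeps overlapping occurrences; the second choice would give the non-overlapping behavior instead.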


### BioJulia solution

Lastly, we can leverage some functions in the Kmers BioJulia package to help us!
Collaborator Author

Of all the BioJulia packages, I wondered whether Kmers might offer a solution, but I wasn't able to find a relevant function. Technically, `findnext` is part of BioSequences, so perhaps that is the "BioJulia solution." Please let me know if there are any other Julia functions that could be helpful here!
