diff --git a/papers/_posts/2008-01-01-sample-align-d--a-high-performance-multiple-sequence-alignment-system-using-phylogenetic-sampling-and-domain-decomposition.md b/papers/_posts/2008-01-01-sample-align-d--a-high-performance-multiple-sequence-alignment-system-using-phylogenetic-sampling-and-domain-decomposition.md new file mode 100644 index 00000000..dd7da73b --- /dev/null +++ b/papers/_posts/2008-01-01-sample-align-d--a-high-performance-multiple-sequence-alignment-system-using-phylogenetic-sampling-and-domain-decomposition.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Sample-align-d: A high performance multiple sequence alignment system using phylogenetic sampling and domain decomposition" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Khokhar, Ashfaq; " +year: "2008" +journal: IEEE +volume: +issue: +pages: 1-9 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/IPDPS.2008.4536174" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Multiple sequence alignment (MSA) is one of the most computationally intensive tasks in Computational Biology. Existing best known solutions for multiple sequence alignment take several hours (in some cases days) of computation time to align, for example, 2000 homologous sequences of average length 300. Inspired by the Sample Sort approach in parallel processing, in this paper we propose a highly scalable multiprocessor solution for the MSA problem in phylogenetically diverse sequences. Our method employs an intelligent scheme to partition the set of sequences into smaller subsets using a k-mer count based similarity index, referred to as k-mer rank. Each subset is then independently aligned in parallel using any sequential approach. 
Further fine tuning of the local alignments is achieved using constraints derived from a global ancestor of the entire set. The proposed sample-align-D algorithm has been … diff --git a/papers/_posts/2009-01-01-a-domain-decomposition-strategy-for-alignment-of-multiple-biological-sequences-on-multiprocessor-platforms.md b/papers/_posts/2009-01-01-a-domain-decomposition-strategy-for-alignment-of-multiple-biological-sequences-on-multiprocessor-platforms.md new file mode 100644 index 00000000..2d64f236 --- /dev/null +++ b/papers/_posts/2009-01-01-a-domain-decomposition-strategy-for-alignment-of-multiple-biological-sequences-on-multiprocessor-platforms.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A domain decomposition strategy for alignment of multiple biological sequences on multiprocessor platforms" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Khokhar, Ashfaq; " +year: "2009" +journal: Academic Press +volume: 69 +issue: +pages: 666-677 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1016/j.jpdc.2009.03.006" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Multiple Sequences Alignment (MSA) of biological sequences is a fundamental problem in computational biology due to its critical significance in wide ranging applications including haplotype reconstruction, sequence homology, phylogenetic analysis, and prediction of evolutionary origins. The MSA problem is considered NP-hard and known heuristics for the problem do not scale well with increasing numbers of sequences. On the other hand, with the advent of a new breed of fast sequencing techniques it is now possible to generate thousands of sequences very quickly. 
For rapid sequence analysis, it is therefore desirable to develop fast MSA algorithms that scale well with an increase in the dataset size. In this paper, we present a novel domain decomposition based technique to solve the MSA problem on multiprocessing platforms. The domain decomposition based technique, in addition to yielding better … diff --git a/papers/_posts/2009-01-01-an-overview-of-multiple-sequence-alignment-systems.md b/papers/_posts/2009-01-01-an-overview-of-multiple-sequence-alignment-systems.md new file mode 100644 index 00000000..f556e4cd --- /dev/null +++ b/papers/_posts/2009-01-01-an-overview-of-multiple-sequence-alignment-systems.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "An Overview of Multiple Sequence Alignment Systems" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Khokhar, Ashfaq; " +year: "2009" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.48550/arXiv.0901.2747" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +An overview of current multiple alignment systems is presented. The useful algorithms, the procedures adopted and their limitations are described. We also present the quality of the alignments obtained and the cases (kind of alignments, kind of sequences, etc.) in which the particular systems are useful. 
diff --git a/papers/_posts/2009-01-01-multiple-sequence-alignment-system-for-pyrosequencing-reads.md b/papers/_posts/2009-01-01-multiple-sequence-alignment-system-for-pyrosequencing-reads.md new file mode 100644 index 00000000..08326169 --- /dev/null +++ b/papers/_posts/2009-01-01-multiple-sequence-alignment-system-for-pyrosequencing-reads.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Multiple sequence alignment system for pyrosequencing reads" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Khokhar, Ashfaq; Zagordi, Osvaldo; Beerenwinkel, Niko; " +year: "2009" +journal: Springer Berlin Heidelberg +volume: +issue: +pages: 362-375 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-642-00727-9_34" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Pyrosequencing is among the emerging sequencing techniques, capable of generating up to 100,000 overlapping reads in a single run. This technique is much faster and cheaper than existing state-of-the-art sequencing techniques such as Sanger. However, the reads generated by pyrosequencing are short in size and contain numerous errors. In order to use these reads for any subsequent analysis, the reads must be aligned. Existing multiple sequence alignment methods cannot be used as they do not take into account the specific positions of the sequences with respect to the genome, and are highly inefficient for a large number of sequences. Therefore, the common practice has been to use either simple pairwise alignment despite its poor accuracy for error-prone pyroreads, or computationally expensive techniques based on sequential gap propagation. 
In this paper, we develop a … diff --git a/papers/_posts/2009-01-01-pyro-align--sample-align-based-multiple-alignment-system-for-pyrosequencing-reads-of-large-number.md b/papers/_posts/2009-01-01-pyro-align--sample-align-based-multiple-alignment-system-for-pyrosequencing-reads-of-large-number.md new file mode 100644 index 00000000..a9b298f5 --- /dev/null +++ b/papers/_posts/2009-01-01-pyro-align--sample-align-based-multiple-alignment-system-for-pyrosequencing-reads-of-large-number.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Pyro-align: Sample-align based multiple alignment system for pyrosequencing reads of large number" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; " +year: "2009" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.48550/arXiv.0901.2751" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Pyro-Align is a multiple alignment program specifically designed for very large numbers of pyrosequencing reads. Multiple sequence alignment is shown to be NP-hard and heuristics are designed for approximate solutions. Multiple sequence alignment of pyrosequencing reads is complex mainly because of two factors. The first is the huge number of reads, which makes traditional heuristics, which scale very poorly to large numbers of sequences, unsuitable. The second is that the alignment cannot be performed arbitrarily, because the position of the reads with respect to the original genome is important and has to be taken into account. In this report we present a short description of the multiple alignment system for pyrosequencing reads. 
diff --git a/papers/_posts/2010-01-01-a-graph-theoretic-framework-for-efficient-computation-of-hmm-based-motif-finder.md b/papers/_posts/2010-01-01-a-graph-theoretic-framework-for-efficient-computation-of-hmm-based-motif-finder.md new file mode 100644 index 00000000..af475ea3 --- /dev/null +++ b/papers/_posts/2010-01-01-a-graph-theoretic-framework-for-efficient-computation-of-hmm-based-motif-finder.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A graph-theoretic framework for efficient computation of HMM based motif finder" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Burger, Lukas; Khokhar, Ashfaq; Zavolan, Mihaela; " +year: "2010" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Understanding the mechanisms that regulate gene expression is a major challenge in computational biology. An important part of solving this problem is to identify the binding sites in DNA for transcription factors, known as motifs. Discovery of motifs in unaligned sequences is a fundamental problem in computational biology. Motif search using HMMs and Gibbs sampling has been shown to be very effective in finding regulatory motifs. We recently proposed a novel motif finding algorithm [1] that models, within a general framework, binding elements in terms of a variable number of motifs that are separated by spacers of varying lengths. The model is very effective for motif finding, but is computationally very expensive. In this paper, we propose a graph-theoretic framework for efficient computation of our HMM model for motif finding. 
The proposed graph model is very flexible, easy to use and is shown … diff --git a/papers/_posts/2010-01-01-high-performance-computational-biology-algorithms.md b/papers/_posts/2010-01-01-high-performance-computational-biology-algorithms.md new file mode 100644 index 00000000..175cd401 --- /dev/null +++ b/papers/_posts/2010-01-01-high-performance-computational-biology-algorithms.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "High performance computational biology algorithms" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; " +year: "2010" +journal: University of Illinois at Chicago +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Multiple Sequence Alignment (MSA) of biological sequences is a fundamental problem in computational biology due to its critical significance in wide ranging applications including haplotype reconstruction, sequence homology, phylogenetic analysis, and prediction of evolutionary origins. The MSA problem is considered NP-hard and known heuristics for the problem do not scale well with increasing numbers of sequences. On the other hand, with the advent of a new breed of fast sequencing techniques it is now possible to generate thousands of sequences very quickly. For rapid sequence analysis, it is therefore desirable to develop fast MSA algorithms that scale well with an increase in the dataset size. In this dissertation, we propose a novel domain decomposition based technique to solve the multiple sequence alignment problem on multiprocessing platforms. 
The domain decomposition based technique, in … diff --git a/papers/_posts/2010-01-01-parallel-algorithm-for-center-star-sequence-and-alignments-with-applications-to-short-reads.md b/papers/_posts/2010-01-01-parallel-algorithm-for-center-star-sequence-and-alignments-with-applications-to-short-reads.md new file mode 100644 index 00000000..50fffe16 --- /dev/null +++ b/papers/_posts/2010-01-01-parallel-algorithm-for-center-star-sequence-and-alignments-with-applications-to-short-reads.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Parallel Algorithm for Center Star Sequence and Alignments with Applications to Short Reads" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Khokhar, Ashfaq; " +year: "2010" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +With the advent of fast sequencing techniques, such as 454 and Solexa, the number of sequences that need to be aligned and processed are reaching up to one billion sequences [1]. The alignment of these sequences with the reference genome is one of the basic steps in mapping, alignments, and sequence analysis related problems. Aligning of such a large number of sequences using traditional multiple sequence alignment algorithms is computationally infeasible. In this paper we present a highly scalable parallel algorithm to find the center star sequence and perform approximate alignments of such a large number of sequences. The proposed algorithm has been implemented on a cluster of workstations using MPI library, and experimental results to find the Center Sequence for up to 6.4 million sequences are presented. 
These results include detailed computation and communication complexity analysis as … diff --git "a/papers/_posts/2011-01-01-large\342\200\220scale-itraq\342\200\220based-quantification-of-phosphorylation-changes-during-vasopressin-signaling.md" "b/papers/_posts/2011-01-01-large\342\200\220scale-itraq\342\200\220based-quantification-of-phosphorylation-changes-during-vasopressin-signaling.md" new file mode 100644 index 00000000..5c78e7ec --- /dev/null +++ "b/papers/_posts/2011-01-01-large\342\200\220scale-itraq\342\200\220based-quantification-of-phosphorylation-changes-during-vasopressin-signaling.md" @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Large‐scale iTRAQ‐based quantification of phosphorylation changes during vasopressin signaling" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Hoffert, Jason; Pisitkun, T; Saeed, F; Song, J; Knepper, M; " +year: "2011" +journal: Federation of American Societies for Experimental Biology +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1096/fasebj.25.1_supplement.1039.38" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Protein phosphorylation plays a critical role in the signaling pathways regulating water transport in kidney collecting duct. A central mediator in this process is the hormone arginine vasopressin (AVP), which regulates phosphorylation of the water channel aquaporin‐2 (AQP2), although the exact mechanisms are not fully understood. Here we utilized a multiplexed, isotopic label‐based quantitative phosphoproteomic approach in order to explore the temporal dynamics of phosphorylation events triggered by vasopressin across multiple timepoints. 
Briefly, rat inner medullary collecting duct (IMCD) samples were incubated in the presence or absence of 1nM AVP for 0.5, 2, 5, and 15 min (n=3). Each sample was labeled with a different iTRAQ reagent, mixed and processed for shotgun phosphoproteomic analysis. Of the 12,533 phosphopeptides identified, 3,298 were found in at least 2 out of 3 time courses and had … diff --git "a/papers/_posts/2011-01-01-mapping\342\200\220based-temporal-pattern-mining-algorithm-mtpma-identifies-unique-clusters-of-phosphopeptides-regulated-by-vasopressin-in-collecting-duct.md" "b/papers/_posts/2011-01-01-mapping\342\200\220based-temporal-pattern-mining-algorithm-mtpma-identifies-unique-clusters-of-phosphopeptides-regulated-by-vasopressin-in-collecting-duct.md" new file mode 100644 index 00000000..89f26c28 --- /dev/null +++ "b/papers/_posts/2011-01-01-mapping\342\200\220based-temporal-pattern-mining-algorithm-mtpma-identifies-unique-clusters-of-phosphopeptides-regulated-by-vasopressin-in-collecting-duct.md" @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Mapping‐based temporal pattern mining algorithm (MTPMA) identifies unique clusters of phosphopeptides regulated by vasopressin in collecting duct" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Hoffert, Jason; Pisitkun, Trairak; Knepper, Mark; " +year: "2011" +journal: Federation of American Societies for Experimental Biology +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1096/fasebj.25.1_supplement.921.4" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +We developed a new algorithm, MTPMA, for clustering time‐course patterns following step‐inputs in biological systems. 
We tested the algorithm with data from large‐scale quantitative phosphoproteomics experiments done as follows: Inner medullary collecting duct (IMCD) samples were incubated in the presence or absence of 1nM dDAVP (vasopressin) for 0.5, 2, 5, and 15 minutes (N=3 pairs) followed by LC‐MS/MS‐based phosphoproteomic analysis. Quantification used 8‐plex iTRAQ and commercially available software. Of the 12,533 phosphopeptides identified, 3,298 were found in at least 2 out of 3 time courses and had quantifiable iTRAQ ratios. These phosphopeptides were analyzed with MTPMA in order to identify groups that changed in abundance with similar temporal responses after vasopressin addition. The algorithm maps the data from a Cartesian plane to a discrete binary plane and uses an … diff --git a/papers/_posts/2011-01-01-mining-temporal-patterns-from-itraq-mass-spectrometry-lc-ms-ms-data.md b/papers/_posts/2011-01-01-mining-temporal-patterns-from-itraq-mass-spectrometry-lc-ms-ms-data.md new file mode 100644 index 00000000..5fa18759 --- /dev/null +++ b/papers/_posts/2011-01-01-mining-temporal-patterns-from-itraq-mass-spectrometry-lc-ms-ms-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Mining temporal patterns from iTRAQ mass spectrometry (LC-MS/MS) data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Pisitkun, Trairak; Knepper, Mark A; Hoffert, Jason D; " +year: "2011" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.48550/arXiv.1104.5510" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Large-scale proteomic analysis is emerging as a powerful technique in biology and relies heavily on data acquired by state-of-the-art mass spectrometers. 
As with any other field in Systems Biology, computational tools are required to deal with this ocean of data. iTRAQ (isobaric Tags for Relative and Absolute Quantification) is a technique that allows simultaneous quantification of proteins from multiple samples. Although iTRAQ data gives useful insights to the biologist, its multiplexed design makes it more complex to analyze and to draw biological conclusions from. One such problem is to find proteins that behave in a similar way (i.e. change in abundance) among various time points, since the temporal variations in the proteomics data reveal important biological information. Distance-based methods such as Euclidean distance or the Pearson coefficient, and clustering techniques such as k-means, are not able to take into account the temporal information of the series. In this paper, we present a linear-time algorithm for clustering similar patterns among various iTRAQ time course data irrespective of their absolute values. The algorithm, referred to as Temporal Pattern Mining (TPM), maps the data from a Cartesian plane to a discrete binary plane. After the mapping, a dynamic programming technique allows mining of similar data elements that are temporally closer to each other. The proposed algorithm clusters iTRAQ data that are temporally closer to each other with more than 99% accuracy. 
Experimental results for different problem sizes are analyzed in terms of quality of clusters, execution time and scalability for large data sets … diff --git a/papers/_posts/2012-01-01-a-high-performance-multiple-sequence-alignment-system-for-pyrosequencing-reads-from-multiple-reference-genomes.md b/papers/_posts/2012-01-01-a-high-performance-multiple-sequence-alignment-system-for-pyrosequencing-reads-from-multiple-reference-genomes.md new file mode 100644 index 00000000..89e0117a --- /dev/null +++ b/papers/_posts/2012-01-01-a-high-performance-multiple-sequence-alignment-system-for-pyrosequencing-reads-from-multiple-reference-genomes.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A high performance multiple sequence alignment system for pyrosequencing reads from multiple reference genomes" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Perez-Rathke, Alan; Gwarnicki, Jaroslaw; Berger-Wolf, Tanya; Khokhar, Ashfaq; " +year: "2012" +journal: Academic Press +volume: 72 +issue: +pages: 83-93 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1016/j.jpdc.2011.08.001" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Genome resequencing with short reads generated from pyrosequencing generally relies on mapping the short reads against a single reference genome. However, mapping of reads from multiple reference genomes is not possible using a pairwise mapping algorithm. In order to align the reads w.r.t each other and the reference genomes, existing multiple sequence alignment(MSA) methods cannot be used because they do not take into account the position of these short reads with respect to the genome, and are highly inefficient for a large number of sequences. 
In this paper, we develop a highly scalable parallel algorithm based on domain decomposition, referred to as P-Pyro-Align, to align such a large number of reads from single or multiple reference genomes. The proposed alignment algorithm accurately aligns the erroneous reads, and has been implemented on a cluster of workstations using MPI library … diff --git a/papers/_posts/2012-01-01-an-efficient-algorithm-for-clustering-of-large-scale-mass-spectrometry-data.md b/papers/_posts/2012-01-01-an-efficient-algorithm-for-clustering-of-large-scale-mass-spectrometry-data.md new file mode 100644 index 00000000..be89ca2b --- /dev/null +++ b/papers/_posts/2012-01-01-an-efficient-algorithm-for-clustering-of-large-scale-mass-spectrometry-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "An efficient algorithm for clustering of large-scale mass spectrometry data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Pisitkun, Trairak; Knepper, Mark A; Hoffert, Jason D; " +year: "2012" +journal: IEEE +volume: +issue: +pages: 1-4 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BIBM.2012.6392738" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +High-throughput spectrometers are capable of producing data sets containing thousands of spectra for a single biological sample. These data sets contain a substantial amount of redundancy from peptides that may get selected multiple times in a LC-MS/MS experiment. In this paper, we present an efficient algorithm, CAMS (Clustering Algorithm for Mass Spectra) for clustering mass spectrometry data which increases both the sensitivity and confidence of spectral assignment. CAMS utilizes a novel metric, called F-set, that allows accurate identification of the spectra that are similar. 
A graph theoretic framework is defined that allows the use of F-set metric efficiently for accurate cluster identifications. The accuracy of the algorithm is tested on real HCD and CID data sets with varying amounts of peptides. Our experiments show that the proposed algorithm is able to cluster spectra with very high accuracy in a reasonable … diff --git a/papers/_posts/2012-01-01-an-efficient-dynamic-programming-algorithm-for-phosphorylation-site-assignment-of-large-scale-mass-spectrometry-data.md b/papers/_posts/2012-01-01-an-efficient-dynamic-programming-algorithm-for-phosphorylation-site-assignment-of-large-scale-mass-spectrometry-data.md new file mode 100644 index 00000000..f1c50f79 --- /dev/null +++ b/papers/_posts/2012-01-01-an-efficient-dynamic-programming-algorithm-for-phosphorylation-site-assignment-of-large-scale-mass-spectrometry-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "An efficient dynamic programming algorithm for phosphorylation site assignment of large-scale mass spectrometry data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Pisitkun, Trairak; Hoffert, Jason D; Wang, Guanghui; Gucek, Marjan; Knepper, Mark A; " +year: "2012" +journal: IEEE +volume: +issue: +pages: 618-625 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BIBMW.2012.6470210" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Phosphorylation site assignment of large-scale data from high throughput tandem mass spectrometry (LC-MS/MS) data is an important aspect of phosphoproteomics. Correct assignment of phosphorylated residue(s) is important for functional interpretation of the data within a biological context. Common search algorithms (Sequest etc.) 
for mass spectrometry data are not designed for accurate site assignment; thus, additional algorithms are needed. In this paper, we propose a linear-time and linear-space dynamic programming strategy for phosphorylation site assignment. The algorithm, referred to as PhosSA, optimizes the objective function defined as the summation of peak intensities that are associated with theoretical phosphopeptide fragmentation ions. Quality control is achieved through the use of a post-processing criterion whose value is indicative of the signal-to-noise (S/N) properties and redundancy of the … diff --git a/papers/_posts/2012-01-01-cp-hos--a-program-to-calculate-and-visualize-evolutionarily-conserved-functional-phosphorylation-sites.md b/papers/_posts/2012-01-01-cp-hos--a-program-to-calculate-and-visualize-evolutionarily-conserved-functional-phosphorylation-sites.md new file mode 100644 index 00000000..9f3f5fd3 --- /dev/null +++ b/papers/_posts/2012-01-01-cp-hos--a-program-to-calculate-and-visualize-evolutionarily-conserved-functional-phosphorylation-sites.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "CPhos: a program to calculate and visualize evolutionarily conserved functional phosphorylation sites" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Zhao, Boyang; Pisitkun, Trairak; Hoffert, Jason D; Knepper, Mark A; Saeed, Fahad; " +year: "2012" +journal: +volume: 12 +issue: +pages: 3299-3303 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1002/pmic.201200189" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Profiling using high‐throughput MS has discovered an overwhelming number of novel protein phosphorylation sites (“phosphosites”). However, the functional relevance of these sites is not always clear. 
In light of recent studies on the evolutionary mechanism of phosphorylation, we have developed CPhos, a Java program that can assess the conservation of phosphosites among species using an information theory‐based approach. The degree of conservation established using CPhos can be used to assess the functional significance of phosphosites. CPhos has a user friendly graphical user interface and is available both as a web service and as a standalone Java application to assist phosphoproteomic researchers in analyzing and prioritizing lists of phosphosites for further experimental validation. CPhos can be accessed or downloaded at http://helixweb.nih.gov/CPhos/. diff --git a/papers/_posts/2012-01-01-dynamics-of-the-g-protein-coupled-vasopressin-v2-receptor-signaling-network-revealed-by-quantitative-phosphoproteomics.md b/papers/_posts/2012-01-01-dynamics-of-the-g-protein-coupled-vasopressin-v2-receptor-signaling-network-revealed-by-quantitative-phosphoproteomics.md new file mode 100644 index 00000000..0942a1c7 --- /dev/null +++ b/papers/_posts/2012-01-01-dynamics-of-the-g-protein-coupled-vasopressin-v2-receptor-signaling-network-revealed-by-quantitative-phosphoproteomics.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Dynamics of the G protein-coupled vasopressin V2 receptor signaling network revealed by quantitative phosphoproteomics" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Hoffert, Jason D; Pisitkun, Trairak; Saeed, Fahad; Song, Jae H; Chou, Chung-Lin; Knepper, Mark A; " +year: "2012" +journal: Elsevier +volume: 11 +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1074/mcp.M111.014613" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +G protein-coupled receptors (GPCRs) regulate diverse physiological processes, 
and many human diseases are due to defects in GPCR signaling. To identify the dynamic response of a signaling network downstream from a prototypical Gs-coupled GPCR, the vasopressin V2 receptor, we have carried out multireplicate, quantitative phosphoproteomics with iTRAQ labeling at four time points following vasopressin exposure at a physiological concentration in cells isolated from rat kidney. A total of 12,167 phosphopeptides were identified from 2,783 proteins, with 273 changing significantly in abundance with vasopressin. Two-dimensional clustering of phosphopeptide time courses and Gene Ontology terms revealed that ligand binding to the V2 receptor affects more than simply the canonical cyclic adenosine monophosphate-protein kinase A and arrestin pathways under physiological conditions. The regulated … diff --git a/papers/_posts/2012-01-01-high-performance-phosphorylation-site-assignment-algorithm-for-mass-spectrometry-data-using-multicore-systems.md b/papers/_posts/2012-01-01-high-performance-phosphorylation-site-assignment-algorithm-for-mass-spectrometry-data-using-multicore-systems.md new file mode 100644 index 00000000..4b01138b --- /dev/null +++ b/papers/_posts/2012-01-01-high-performance-phosphorylation-site-assignment-algorithm-for-mass-spectrometry-data-using-multicore-systems.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "High performance phosphorylation site assignment algorithm for mass spectrometry data using multicore systems" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Hoffert, Jason; Pisitkun, Trairak; Knepper, Mark; " +year: "2012" +journal: +volume: +issue: +pages: 667-672 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1145/2382936.2383056" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + 
+Phosphorylation site assignment of high throughput tandem mass spectrometry (LC-MS/MS) data is one of the most common and critical aspects of phosphoproteomics. Although a number of tools have been proposed for automated site assignments, most of them have been limited in scalability or quality. In this paper, we propose a parallelized version of the PhosSA algorithm (called P-PhosSA) that we recently introduced for site assignment. A domain decomposition strategy is introduced for site assignment and the decomposed data is executed on multiple cores. The algorithm has been parallelized using Java Threads executing on multicore systems. The parallelized algorithm is tested using experimentally generated data sets of peptides with known phosphorylation sites while varying the fragmentation strategy (CID, HCD) and molarities of the peptides. The algorithm is also compatible with various peptide … diff --git a/papers/_posts/2012-01-01-identifying-protein-kinase-target-preferences-using-mass-spectrometry.md b/papers/_posts/2012-01-01-identifying-protein-kinase-target-preferences-using-mass-spectrometry.md new file mode 100644 index 00000000..cee0e565 --- /dev/null +++ b/papers/_posts/2012-01-01-identifying-protein-kinase-target-preferences-using-mass-spectrometry.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Identifying protein kinase target preferences using mass spectrometry" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Douglass, Jacqueline; Gunaratne, Ruwan; Bradford, Davis; Saeed, Fahad; Hoffert, Jason D; Steinbach, Peter J; Knepper, Mark A; Pisitkun, Trairak; " +year: "2012" +journal: American Physiological Society Bethesda, MD +volume: 303 +issue: +pages: C715-C727 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1152/ajpcell.00166.2012" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: 
+figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +A general question in molecular physiology is how to identify candidate protein kinases corresponding to a known or hypothetical phosphorylation site in a protein of interest. It is generally recognized that the amino acid sequence surrounding the phosphorylation site provides information that is relevant to identification of the cognate protein kinase. Here, we present a mass spectrometry-based method for profiling the target specificity of a given protein kinase as well as a computational tool for the calculation and visualization of the target preferences. The mass spectrometry-based method identifies sites phosphorylated in response to in vitro incubation of protein mixtures with active recombinant protein kinases followed by standard phosphoproteomic methodologies. The computational tool, called “PhosphoLogo,” uses an information-theoretic algorithm to calculate position-specific amino acid preferences and anti … diff --git a/papers/_posts/2012-01-01-nhlbi-abdesigner--an-online-tool-for-design-of-peptide-directed-antibodies.md b/papers/_posts/2012-01-01-nhlbi-abdesigner--an-online-tool-for-design-of-peptide-directed-antibodies.md new file mode 100644 index 00000000..5391ef2a --- /dev/null +++ b/papers/_posts/2012-01-01-nhlbi-abdesigner--an-online-tool-for-design-of-peptide-directed-antibodies.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "NHLBI-AbDesigner: an online tool for design of peptide-directed antibodies" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Pisitkun, Trairak; Hoffert, Jason D; Saeed, Fahad; Knepper, Mark A; " +year: "2012" +journal: American Physiological Society Bethesda, MD +volume: 302 +issue: +pages: C154-C164 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1152/ajpcell.00325.2011" +pmid: + +# Data and code +github: [""] +neurovault: 
+openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Investigation of physiological mechanisms at a cellular level often requires production of high-quality antibodies, frequently using synthetic peptides as immunogens. Here we describe a new, web-based software tool called NHLBI-AbDesigner that allows the user to visualize the information needed to choose optimal peptide sequences for peptide-directed antibody production (http://helixweb.nih.gov/AbDesigner/). The choice of an immunizing peptide is generally based on a need to optimize immunogenicity, antibody specificity, multispecies conservation, and robustness in the face of posttranslational modifications (PTMs). AbDesigner displays information relevant to these criteria as follows: 1) “Immunogenicity Score,” based on hydropathy and secondary structure prediction; 2) “Uniqueness Score,” a predictor of specificity of an antibody against all proteins expressed in the same species; 3) “Conservation Score … diff --git a/papers/_posts/2012-01-01-proteomic-and-metabolomic-approaches-to-cell-physiology-and-pathophysiology--quantitative-phosphoproteomics-in-nuclei-of-vasopressin-sensitive-renal-collecting-duct-cells.md b/papers/_posts/2012-01-01-proteomic-and-metabolomic-approaches-to-cell-physiology-and-pathophysiology--quantitative-phosphoproteomics-in-nuclei-of-vasopressin-sensitive-renal-collecting-duct-cells.md new file mode 100644 index 00000000..25ae54ea --- /dev/null +++ b/papers/_posts/2012-01-01-proteomic-and-metabolomic-approaches-to-cell-physiology-and-pathophysiology--quantitative-phosphoproteomics-in-nuclei-of-vasopressin-sensitive-renal-collecting-duct-cells.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Proteomic and Metabolomic Approaches to Cell Physiology and Pathophysiology: Quantitative phosphoproteomics in nuclei of vasopressin-sensitive renal collecting duct cells" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Bolger, Steven J; Hurtado, 
Patricia A Gonzales; Hoffert, Jason D; Saeed, Fahad; Pisitkun, Trairak; Knepper, Mark A; " +year: "2012" +journal: American Physiological Society +volume: 303 +issue: +pages: C1006 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1152/ajpcell.00260.2012" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Vasopressin regulates transport across the collecting duct epithelium in part via effects on gene transcription. Transcriptional regulation occurs partially via changes in phosphorylation of transcription factors, transcriptional coactivators, and protein kinases in the nucleus. To test whether vasopressin alters the nuclear phosphoproteome of vasopressin-sensitive cultured mouse mpkCCD cells, we used stable isotope labeling and mass spectrometry to quantify thousands of phosphorylation sites in nuclear extracts and nuclear pellet fractions. Measurements were made in the presence and absence of the vasopressin analog dDAVP. Of the 1,251 sites quantified, 39 changed significantly in response to dDAVP. Network analysis of the regulated proteins revealed two major clusters (“cell-cell adhesion” and “transcriptional regulation”) that were connected to known elements of the vasopressin signaling pathway. 
The hub … diff --git a/papers/_posts/2012-01-01-quantitative-phosphoproteomics-in-nuclei-of-vasopressin-sensitive-renal-collecting-duct-cells.md b/papers/_posts/2012-01-01-quantitative-phosphoproteomics-in-nuclei-of-vasopressin-sensitive-renal-collecting-duct-cells.md new file mode 100644 index 00000000..23a19912 --- /dev/null +++ b/papers/_posts/2012-01-01-quantitative-phosphoproteomics-in-nuclei-of-vasopressin-sensitive-renal-collecting-duct-cells.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Quantitative phosphoproteomics in nuclei of vasopressin-sensitive renal collecting duct cells" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Bolger, Steven J; Hurtado, Patricia A Gonzales; Hoffert, Jason D; Saeed, Fahad; Pisitkun, Trairak; Knepper, Mark A; " +year: "2012" +journal: American Physiological Society Bethesda, MD +volume: 303 +issue: +pages: C1006-C1020 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1152/ajpcell.00260.2012" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Vasopressin regulates transport across the collecting duct epithelium in part via effects on gene transcription. Transcriptional regulation occurs partially via changes in phosphorylation of transcription factors, transcriptional coactivators, and protein kinases in the nucleus. To test whether vasopressin alters the nuclear phosphoproteome of vasopressin-sensitive cultured mouse mpkCCD cells, we used stable isotope labeling and mass spectrometry to quantify thousands of phosphorylation sites in nuclear extracts and nuclear pellet fractions. Measurements were made in the presence and absence of the vasopressin analog dDAVP. Of the 1,251 sites quantified, 39 changed significantly in response to dDAVP. 
Network analysis of the regulated proteins revealed two major clusters (“cell-cell adhesion” and “transcriptional regulation”) that were connected to known elements of the vasopressin signaling pathway. The hub … diff --git a/papers/_posts/2013-01-01-a-graphical-user-interface-gui-for-phosphorylation-site-assignment-of-protein-mass-spectrometry-data.md b/papers/_posts/2013-01-01-a-graphical-user-interface-gui-for-phosphorylation-site-assignment-of-protein-mass-spectrometry-data.md new file mode 100644 index 00000000..fab38f7f --- /dev/null +++ b/papers/_posts/2013-01-01-a-graphical-user-interface-gui-for-phosphorylation-site-assignment-of-protein-mass-spectrometry-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A Graphical User Interface (GUI) for Phosphorylation Site Assignment of Protein Mass Spectrometry Data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; " +year: "2013" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Correct phosphorylation site assignment for high throughput tandem mass spectrometry (LC-MS/MS) data is one of the most common and critical aspects of phosphoproteomics. In this report, we present a graphical user interface (GUI) implemented in JAVA for phosphorylation site assignment. The GUI implements the PhosSA algorithm and is tested on a variety of operating systems and computing platforms. The GUI is divided into two parts: The first part takes input from multiple search engines (ie Sequest and Mascot) and converts it into PhosSA compatible format. The second part of the GUI runs the site assignment algorithm using varying thresholds for HCD and CID fragmentation methodologies. 
The software is freely accessible at http://helixweb.nih.gov/ESBL/PhosSA/ for all non-commercial purposes. diff --git a/papers/_posts/2013-01-01-a-high-performance-algorithm-for-clustering-of-large-scale-protein-mass-spectrometry-data-using-multi-core-architectures.md b/papers/_posts/2013-01-01-a-high-performance-algorithm-for-clustering-of-large-scale-protein-mass-spectrometry-data-using-multi-core-architectures.md new file mode 100644 index 00000000..441af38a --- /dev/null +++ b/papers/_posts/2013-01-01-a-high-performance-algorithm-for-clustering-of-large-scale-protein-mass-spectrometry-data-using-multi-core-architectures.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A high performance algorithm for clustering of large-scale protein mass spectrometry data using multi-core architectures" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Hoffert, Jason D; Knepper, Mark A; " +year: "2013" +journal: +volume: +issue: +pages: 923-930 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1145/2492517.2500245" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +High-throughput mass spectrometers can produce thousands of peptide spectra from a single complex protein sample in a short amount of time. These data sets contain a substantial amount of redundancy (i.e. the same peptide is selected and identified multiple times in a single experiment) from peptides that may get selected multiple times in the liquid chromatography mass spectrometry (LC-MS/MS) experiment. The data from these mass spectrometers contain a substantial number of spectra that have low signal to noise (S/N) ratio and may not get interpreted due to poor quality. 
Recently, we presented a graph theoretic algorithm, CAMS (Clustering Algorithm for Mass Spectra) for clustering mass spectrometry data. CAMS utilized a novel metric, called an F-set, that allows accurate identification of the spectra that are similar with much higher accuracy and sensitivity than if single peak comparisons were performed … diff --git a/papers/_posts/2013-01-01-foreword-to-the-special-issue-on-selected-papers-from-the-5th-international-conference-on-bioinformatics-and-computational-biology-bicob-2013.md b/papers/_posts/2013-01-01-foreword-to-the-special-issue-on-selected-papers-from-the-5th-international-conference-on-bioinformatics-and-computational-biology-bicob-2013.md new file mode 100644 index 00000000..98506388 --- /dev/null +++ b/papers/_posts/2013-01-01-foreword-to-the-special-issue-on-selected-papers-from-the-5th-international-conference-on-bioinformatics-and-computational-biology-bicob-2013.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Foreword to the special issue on selected papers from the 5th International Conference on Bioinformatics and Computational Biology (BICoB 2013)" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Dasgupta, Bhaskar; Al-Mubaid, Hisham; Saeed, Fahad; " +year: "2013" +journal: Imperial College Press +volume: 11 +issue: +pages: 1302002 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1142/S0219720013020022" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +With the rapid increase in the volume of biological data generated using next generation technologies, designing algorithms to process these data in an efficient manner and gaining biological insight is becoming a significantly growing challenge.
This special issue is a follow-up to the 5th International Conference on Bioinformatics and Computational Biology (BICoB 2013) that took place in Honolulu, Hawaii during March 4–6, 2013. The guest editors selected a few top-quality papers from the BICoB proceedings and invited the authors of these papers to submit extended versions of their proceedings papers. After a rigorous review process the guest editors have selected the following five papers described below for publication in this special issue. The first paper, titled "Studying the role of APOE in Alzheimer's disease (AD) pathogenesis using a systems biology model" [1], presents one of the first attempts to model AD from a systems approach to study physiologically relevant parameters that may prove useful in the future. diff --git a/papers/_posts/2013-01-01-phossa--fast-and-accurate-phosphorylation-site-assignment-algorithm-for-mass-spectrometry-data.md b/papers/_posts/2013-01-01-phossa--fast-and-accurate-phosphorylation-site-assignment-algorithm-for-mass-spectrometry-data.md new file mode 100644 index 00000000..35ba2667 --- /dev/null +++ b/papers/_posts/2013-01-01-phossa--fast-and-accurate-phosphorylation-site-assignment-algorithm-for-mass-spectrometry-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "PhosSA: Fast and accurate phosphorylation site assignment algorithm for mass spectrometry data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Pisitkun, Trairak; Hoffert, Jason D; Rashidian, Sara; Wang, Guanghui; Gucek, Marjan; Knepper, Mark A; " +year: "2013" +journal: BioMed Central +volume: 11 +issue: +pages: 1-15 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1186/1477-5956-11-S1-S14" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Phosphorylation site assignment 
of high throughput tandem mass spectrometry (LC-MS/MS) data is one of the most common and critical aspects of phosphoproteomics. Correctly assigning phosphorylated residues helps us understand their biological significance. The design of common search algorithms (such as Sequest, Mascot, etc.) does not incorporate site assignment; therefore, additional algorithms are essential to assign phosphorylation sites for mass spectrometry data. The main contribution of this study is the design and implementation of a linear time and space dynamic programming strategy for phosphorylation site assignment referred to as PhosSA. The proposed algorithm uses summation of peak intensities associated with theoretical spectra as an objective function. Quality control of the assigned sites is achieved using a post-processing redundancy criterion that indicates the signal-to-noise ratio … diff --git a/papers/_posts/2013-01-01-proteome-wide-measurement-of-protein-half-lives-and-translation-rates-in-vasopressin-sensitive-collecting-duct-cells.md b/papers/_posts/2013-01-01-proteome-wide-measurement-of-protein-half-lives-and-translation-rates-in-vasopressin-sensitive-collecting-duct-cells.md new file mode 100644 index 00000000..4031d922 --- /dev/null +++ b/papers/_posts/2013-01-01-proteome-wide-measurement-of-protein-half-lives-and-translation-rates-in-vasopressin-sensitive-collecting-duct-cells.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Proteome-wide measurement of protein half-lives and translation rates in vasopressin-sensitive collecting duct cells" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Sandoval, Pablo C; Slentz, Dane H; Pisitkun, Trairak; Saeed, Fahad; Hoffert, Jason D; Knepper, Mark A; " +year: "2013" +journal: LWW +volume: 24 +issue: +pages: 1793-1805 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1681/ASN.2013030279" +pmid: + +# Data and 
code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Vasopressin regulates water excretion, in part, by controlling the abundances of the water channel aquaporin-2 (AQP2) protein and regulatory proteins in the renal collecting duct. To determine whether vasopressin-induced alterations in protein abundance result from modulation of protein production, protein degradation, or both, we used protein mass spectrometry with dynamic stable isotope labeling in cell culture to achieve a proteome-wide determination of protein half-lives and relative translation rates in mpkCCD cells. Measurements were made at steady state in the absence or presence of the vasopressin analog, desmopressin (dDAVP). Desmopressin altered the translation rate rather than the stability of most responding proteins, but it significantly increased both the translation rate and the half-life of AQP2. In addition, proteins associated with vasopressin action, including Mal2, Akap12, gelsolin, myosin … diff --git "a/papers/_posts/2013-01-01-quantitative-phosphoproteomics-implicates-clusters-of-proteins-involved-in-cell\342\200\220cell-adhesion-and-transcriptional-regulation-in-the-vasopressin-signaling-network.md" "b/papers/_posts/2013-01-01-quantitative-phosphoproteomics-implicates-clusters-of-proteins-involved-in-cell\342\200\220cell-adhesion-and-transcriptional-regulation-in-the-vasopressin-signaling-network.md" new file mode 100644 index 00000000..9ea99ba5 --- /dev/null +++ "b/papers/_posts/2013-01-01-quantitative-phosphoproteomics-implicates-clusters-of-proteins-involved-in-cell\342\200\220cell-adhesion-and-transcriptional-regulation-in-the-vasopressin-signaling-network.md" @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Quantitative phosphoproteomics implicates clusters of proteins involved in cell‐cell adhesion and transcriptional regulation in the vasopressin signaling network" +nickname: 2024-04-16-bottenhorn-salo-diva 
+authors: "Bolger, Steven J; Hurtado, Patricia A Gonzales; Hoffert, Jason D; Saeed, Fahad; Pisitkun, Trairak; Knepper, Mark A; " +year: "2013" +journal: The Federation of American Societies for Experimental Biology +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1096/fasebj.27.1_supplement.597.1" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Vasopressin regulates transport in the renal collecting duct in part by modifying the phosphorylation of transcriptional regulators present in the nucleus. To assess changes in the nuclear phosphoproteome of vasopressin‐sensitive mpkCCD cells in response to dDAVP, a vasopressin V2 receptor analog, we employed stable isotope labeling and mass spectrometry to quantify changes in the abundance of phosphorylated peptides in nuclear fractions. Of the 1,251 phosphorylation sites identified, 39 sites changed significantly in response to dDAVP. Network analysis showed that the regulated proteins fell into two major clusters: cell‐cell adhesion and transcriptional regulation. The hubs of these two clusters were the transcriptional coactivator beta‐catenin and the transcription factor c‐Jun respectively. 
Phosphorylation of beta‐catenin at Ser552, a known target of protein kinase A or Akt in the collecting duct, was … diff --git a/papers/_posts/2014-01-01-6th-international-conference-on-bioinformatics-and-computational-biology-bicob-2014.md b/papers/_posts/2014-01-01-6th-international-conference-on-bioinformatics-and-computational-biology-bicob-2014.md new file mode 100644 index 00000000..13a9949e --- /dev/null +++ b/papers/_posts/2014-01-01-6th-international-conference-on-bioinformatics-and-computational-biology-bicob-2014.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "6th International Conference on Bioinformatics and Computational Biology (BICoB 2014)" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; DasGupta, B; " +year: "2014" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Outer membrane proteins (OMPs) play key roles in many cell functions and computational methods for discriminating OMPs from non-OMPs in genomic sequences are very meaningful. In this study, amino acid composition and auto covariance derived from PSI-BLAST profile, which could extract evolutionary information and sequence order effects effectively, were suggested to represent protein sequences for discriminating OMPs using support vector machine. To evaluate the performance of the proposed method, five-fold cross-validation tests were performed on three widely used benchmark datasets. Comparison with other previously developed methods showed that the proposed method was very competitive and may at least play an important complementary role to the existing methods. 
diff --git a/papers/_posts/2014-01-01-a-knowledge-base-of-vasopressin-actions-in-the-kidney.md b/papers/_posts/2014-01-01-a-knowledge-base-of-vasopressin-actions-in-the-kidney.md new file mode 100644 index 00000000..9fb08007 --- /dev/null +++ b/papers/_posts/2014-01-01-a-knowledge-base-of-vasopressin-actions-in-the-kidney.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A knowledge base of vasopressin actions in the kidney" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Sanghi, Akshay; Zaringhalam, Matthew; Corcoran, Callan C; Saeed, Fahad; Hoffert, Jason D; Sandoval, Pablo; Pisitkun, Trairak; Knepper, Mark A; " +year: "2014" +journal: American Physiological Society Bethesda, MD +volume: 307 +issue: +pages: F747-F755 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1152/ajprenal.00012.2014" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Biological information is growing at a rapid pace, making it difficult for individual investigators to be familiar with all information that is relevant to their own research. Computers are beginning to be used to extract and curate biological information; however, the complexity of human language used in research papers continues to be a critical barrier to full automation of knowledge extraction. Here, we report a manually curated knowledge base of vasopressin actions in renal epithelial cells that is designed to be readable either by humans or by computer programs using natural language processing algorithms. The knowledge base consists of three related databases accessible at https://helixweb.nih.gov/ESBL/TinyUrls/Vaso_portal.html. 
One of the component databases reports vasopressin actions on individual proteins expressed in renal epithelia, including effects on phosphorylation, protein abundances, protein … diff --git a/papers/_posts/2014-01-01-cams-rs--clustering-algorithm-for-large-scale-mass-spectrometry-data-using-restricted-search-space-and-intelligent-random-sampling.md b/papers/_posts/2014-01-01-cams-rs--clustering-algorithm-for-large-scale-mass-spectrometry-data-using-restricted-search-space-and-intelligent-random-sampling.md new file mode 100644 index 00000000..2aaaf6ec --- /dev/null +++ b/papers/_posts/2014-01-01-cams-rs--clustering-algorithm-for-large-scale-mass-spectrometry-data-using-restricted-search-space-and-intelligent-random-sampling.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Cams-rs: clustering algorithm for large-scale mass spectrometry data using restricted search space and intelligent random sampling" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Hoffert, Jason D; Knepper, Mark A; " +year: "2014" +journal: IEEE Computer Society Press +volume: 11 +issue: +pages: 128-141 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/TCBB.2013.152" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +High-throughput mass spectrometers can produce massive amounts of redundant data at an astonishing rate with many of them having poor signal-to-noise (S/N) ratio. These low S/N ratio spectra may not get interpreted using conventional spectra-to-database matching techniques. In this paper, we present an efficient algorithm, CAMS-RS (Clustering Algorithm for Mass Spectra using Restricted Space and Sampling) for clustering of raw mass spectrometry data. 
CAMS-RS utilizes a novel metric (called F-set) that exploits the temporal and spatial patterns to accurately assess similarity between two given spectra. The F-set similarity metric is independent of the retention time and allows clustering of mass spectrometry data from independent LC-MS/MS runs. A novel restricted search space strategy is devised to limit the comparisons of the number of spectra. An intelligent sampling method is executed on individual … diff --git a/papers/_posts/2014-01-01-exploiting-thread-level-and-instruction-level-parallelism-to-cluster-mass-spectrometry-data-using-multicore-architectures.md b/papers/_posts/2014-01-01-exploiting-thread-level-and-instruction-level-parallelism-to-cluster-mass-spectrometry-data-using-multicore-architectures.md new file mode 100644 index 00000000..530c1955 --- /dev/null +++ b/papers/_posts/2014-01-01-exploiting-thread-level-and-instruction-level-parallelism-to-cluster-mass-spectrometry-data-using-multicore-architectures.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Exploiting thread-level and instruction-level parallelism to cluster mass spectrometry data using multicore architectures" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Hoffert, Jason D; Pisitkun, Trairak; Knepper, Mark A; " +year: "2014" +journal: Springer Vienna +volume: 3 +issue: +pages: 1-19 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/s13721-014-0054-1" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Modern mass spectrometers can produce large numbers of peptide spectra from complex biological samples in a short time. 
A substantial amount of redundancy is observed in these data sets from peptides that may get selected multiple times in liquid chromatography tandem mass spectrometry experiments. A large number of spectra do not get mapped to specific peptide sequences due to low signal-to-noise ratio of the spectra from these machines. Clustering is one way to mitigate the problems of these complex mass spectrometry data sets. Recently, we presented a graph theoretic framework, known as CAMS, for clustering of large-scale mass spectrometry data. CAMS utilized a novel metric to exploit the spatial patterns in the mass spectrometry peaks which allowed highly accurate clustering results. However, comparison of each spectrum with every other spectrum makes the clustering problem … diff --git a/papers/_posts/2014-01-01-foreword-to-the-special-issue-on-selected-papers-from-the-6th-international-conference-on-bioinformatics-and-computational-biology-bicob-2014..md b/papers/_posts/2014-01-01-foreword-to-the-special-issue-on-selected-papers-from-the-6th-international-conference-on-bioinformatics-and-computational-biology-bicob-2014..md new file mode 100644 index 00000000..16f0f4ed --- /dev/null +++ b/papers/_posts/2014-01-01-foreword-to-the-special-issue-on-selected-papers-from-the-6th-international-conference-on-bioinformatics-and-computational-biology-bicob-2014..md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Foreword to the special issue on selected papers from the 6th International Conference on Bioinformatics and Computational Biology (BICoB 2014)." 
+nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Al-Mubaid, Hisham; Dasgupta, Bhaskar; " +year: "2014" +journal: +volume: 12 +issue: +pages: 1402001-1402001 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1142/s0219720014020016" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Foreword to the special issue on selected papers from the 6th International Conference on Bioinformatics and Computational Biology (BICoB 2014).
diff --git a/papers/_posts/2014-01-01-global-analysis-of-the-effects-of-the-v2-receptor-antagonist-satavaptan-on-protein-phosphorylation-in-collecting-duct.md b/papers/_posts/2014-01-01-global-analysis-of-the-effects-of-the-v2-receptor-antagonist-satavaptan-on-protein-phosphorylation-in-collecting-duct.md new file mode 100644 index 00000000..0d521a91 --- /dev/null +++ b/papers/_posts/2014-01-01-global-analysis-of-the-effects-of-the-v2-receptor-antagonist-satavaptan-on-protein-phosphorylation-in-collecting-duct.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Global analysis of the effects of the V2 receptor antagonist satavaptan on protein phosphorylation in collecting duct" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Hoffert, Jason D; Pisitkun, Trairak; Saeed, Fahad; Wilson, Justin L; Knepper, Mark A; " +year: "2014" +journal: American Physiological Society Bethesda, MD +volume: 306 +issue: +pages: 410-421 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1152/ajprenal.00497.2013" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Satavaptan (SR121463) is a vasopressin V2 receptor antagonist that has been shown to improve hyponatremia in patients with cirrhosis, congestive heart failure, and syndrome of inappropriate antidiuresis. While known to inhibit adenylyl cyclase-mediated accumulation of intracellular cyclic AMP and potentially recruit β-arrestin in kidney cell lines, very little is known regarding the signaling pathways that are affected by this drug. To this end, we carried out a global quantitative phosphoproteomic analysis of native rat inner medullary collecting duct cells pretreated with satavaptan or vehicle control followed by the V2 receptor agonist desmopressin (dDAVP) for 0.5, 2, 5, or 15 min. 
A total of 2,449 unique phosphopeptides from 1,160 proteins were identified. Phosphopeptides significantly changed by satavaptan included many of the same kinases [protein kinase A, phosphoinositide 3-kinase, mitogen-activated … diff --git a/papers/_posts/2015-01-01-a-high-performance-architecture-for-an-exact-match-short-read-aligner-using-burrows-wheeler-aligner-on-fpgas.md b/papers/_posts/2015-01-01-a-high-performance-architecture-for-an-exact-match-short-read-aligner-using-burrows-wheeler-aligner-on-fpgas.md new file mode 100644 index 00000000..4e2a73ec --- /dev/null +++ b/papers/_posts/2015-01-01-a-high-performance-architecture-for-an-exact-match-short-read-aligner-using-burrows-wheeler-aligner-on-fpgas.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A High Performance Architecture for an Exact Match Short-Read Aligner Using Burrows-Wheeler Aligner on FPGAs" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Qader, Dana Abdul; " +year: "2015" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Due to modern DNA sequencing technologies vast amount of short DNA sequences known as short-reads is generated. Biologists need to be able to align the short-reads to a reference genome to be able to make scientific use of the data. Fast and accurate short-read aligner programs are needed to keep up with the pace at which this data is generated. Field Programmable Gate Arrays have been widely used to accelerate many data-intensive bioinformatics applications. 
diff --git a/papers/_posts/2015-01-01-a-parallel-algorithm-for-compression-of-big-next-generation-sequencing-datasets.md b/papers/_posts/2015-01-01-a-parallel-algorithm-for-compression-of-big-next-generation-sequencing-datasets.md new file mode 100644 index 00000000..25440884 --- /dev/null +++ b/papers/_posts/2015-01-01-a-parallel-algorithm-for-compression-of-big-next-generation-sequencing-datasets.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A parallel algorithm for compression of big next-generation sequencing datasets" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Pérez, Sandino Vargas; Saeed, Fahad; " +year: "2015" +journal: IEEE +volume: 3 +issue: +pages: 196-201 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/Trustcom.2015.632" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +The amount of big data from high-throughput Next-Generation Sequencing (NGS) techniques represents various challenges such as storage, analysis and transmission of massive datasets. One solution to storage and transmission of data is compression using specialized compression algorithms. The existing specialized algorithms suffer from poor scalability with increasing size of the datasets and best available solutions can take hours to compress gigabytes of data. Compression and decompression using these techniques for peta-scale data sets is prohibitively expensive in terms of time and energy. In this paper we introduce paraDSRC, a parallel implementation of the DNA Sequence Reads Compression (DSRC) application using a message passing model that presents reduction of the compression time complexity by a factor of O(1/p) (where p is the number of processing units). 
Our experimental results … diff --git a/papers/_posts/2015-01-01-autophagic-degradation-of-aquaporin-2-is-an-early-event-in-hypokalemia-induced-nephrogenic-diabetes-insipidus.md b/papers/_posts/2015-01-01-autophagic-degradation-of-aquaporin-2-is-an-early-event-in-hypokalemia-induced-nephrogenic-diabetes-insipidus.md new file mode 100644 index 00000000..ca54a7ef --- /dev/null +++ b/papers/_posts/2015-01-01-autophagic-degradation-of-aquaporin-2-is-an-early-event-in-hypokalemia-induced-nephrogenic-diabetes-insipidus.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Autophagic degradation of aquaporin-2 is an early event in hypokalemia-induced nephrogenic diabetes insipidus" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Khositseth, Sookkasem; Uawithya, Panapat; Somparn, Poorichaya; Charngkaew, Komgrid; Thippamom, Nattakan; Hoffert, Jason D; Saeed, Fahad; Michael Payne, D; Chen, Shu-Hui; Fenton, Robert A; " +year: "2015" +journal: Nature Publishing Group UK London +volume: 5 +issue: +pages: 18311 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1038/srep18311" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Hypokalemia (low serum potassium level) is a common electrolyte imbalance that can cause a defect in urinary concentrating ability, i.e., nephrogenic diabetes insipidus (NDI), but the molecular mechanism is unknown. We employed proteomic analysis of inner medullary collecting ducts (IMCD) from rats fed with a potassium-free diet for 1 day. IMCD protein quantification was performed by mass spectrometry using a label-free methodology. A total of 131 proteins, including the water channel AQP2, exhibited significant changes in abundance, most of which were decreased. 
Bioinformatic analysis revealed that many of the down-regulated proteins were associated with the biological processes of generation of precursor metabolites and energy, actin cytoskeleton organization and cell-cell adhesion. Targeted LC-MS/MS and immunoblotting studies further confirmed the down regulation of 18 selected proteins … diff --git a/papers/_posts/2015-01-01-big-data-proteogenomics-and-high-performance-computing--challenges-and-opportunities.md b/papers/_posts/2015-01-01-big-data-proteogenomics-and-high-performance-computing--challenges-and-opportunities.md new file mode 100644 index 00000000..71b47001 --- /dev/null +++ b/papers/_posts/2015-01-01-big-data-proteogenomics-and-high-performance-computing--challenges-and-opportunities.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Big data proteogenomics and high performance computing: Challenges and opportunities" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; " +year: "2015" +journal: IEEE +volume: +issue: +pages: 141-145 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/GlobalSIP.2015.7418173" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Proteogenomics is an emerging field of systems biology research at the intersection of proteomics and genomics. Two high-throughput technologies, Mass Spectrometry (MS) for proteomics and Next Generation Sequencing (NGS) machines for genomics are required to conduct proteogenomics studies. Independently both MS and NGS technologies are inflicted with data deluge which creates problems of storage, transfer, analysis and visualization. Integrating these big data sets (NGS+MS) for proteogenomics studies compounds all of the associated computational problems. 
Existing sequential algorithms for analysis of these proteogenomics datasets are inadequate for big data, and high performance computing (HPC) solutions are almost non-existent. The purpose of this paper is to introduce the big data problem of proteogenomics and the associated challenges in analyzing, storing and transferring these data sets … diff --git a/papers/_posts/2015-01-01-design-and-implementation-of-network-transfer-protocol-for-big-genomic-data.md b/papers/_posts/2015-01-01-design-and-implementation-of-network-transfer-protocol-for-big-genomic-data.md new file mode 100644 index 00000000..00cb2c44 --- /dev/null +++ b/papers/_posts/2015-01-01-design-and-implementation-of-network-transfer-protocol-for-big-genomic-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Design and implementation of network transfer protocol for big genomic data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Aledhari, Mohammed; Saeed, Fahad; " +year: "2015" +journal: IEEE +volume: +issue: +pages: 281-288 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BigDataCongress.2015.47" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Genomic data is growing exponentially due to next generation sequencing technologies (NGS) and their ability to produce massive amounts of data in a short time. NGS technologies generate big genomic data that needs to be exchanged between different locations efficiently and reliably. The current network transfer protocols rely on Transmission Control Protocol (TCP) or User Datagram Protocol (UDP), ignoring data size and type. Universal application layer protocols such as HTTP are designed for a wide variety of data types and are not particularly efficient for genomic data. 
Therefore, we present a new data-aware transfer protocol for genomic-data that increases network throughput and reduces latency, called Genomic Text Transfer Protocol (GTTP). In this paper, we design and implement a new network transfer protocol for big genomic DNA dataset that relies on the Hypertext Transfer Protocol (HTTP … diff --git a/papers/_posts/2015-01-01-on-the-sampling-of-big-mass-spectrometry-data.md b/papers/_posts/2015-01-01-on-the-sampling-of-big-mass-spectrometry-data.md new file mode 100644 index 00000000..df2707e3 --- /dev/null +++ b/papers/_posts/2015-01-01-on-the-sampling-of-big-mass-spectrometry-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "On the sampling of big mass spectrometry data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Awan, Muaaz Gul; Saeed, Fahad; " +year: "2015" +journal: +volume: +issue: +pages: 143-148 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + + diff --git a/papers/_posts/2016-01-01-a-parallel-peptide-indexer-and-decoy-generator-for-crux-tide-using-openmp.md b/papers/_posts/2016-01-01-a-parallel-peptide-indexer-and-decoy-generator-for-crux-tide-using-openmp.md new file mode 100644 index 00000000..9668fa5e --- /dev/null +++ b/papers/_posts/2016-01-01-a-parallel-peptide-indexer-and-decoy-generator-for-crux-tide-using-openmp.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A Parallel Peptide Indexer and Decoy Generator for Crux Tide using OpenMP" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Maabreh, Majdi; Gupta, Ajay; Saeed, Fahad; " +year: "2016" +journal: IEEE +volume: +issue: +pages: 411-418 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: 
+preprint: +supplement: + +# Links +doi: "10.1109/HPCSim.2016.7568364" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Target-Decoy database is a common dependable strategy used in Peptide-Spectrum-Matching (PSM) for quality assessment. In this, a set of decoy labeled peptides are injected to the database and indexed along with real peptides. Crux Tide is a fast search engine that supports indexing peptides' database and decoys generation. In Tide, indexing FASTA files and generating decoys is a computationally expensive process. In this paper we first analyze the serial Tide indexer and decoy generator algorithm and, then describe a parallel shared memory solution. Our proposed technique utilizes a clever hashing technique to localize the process, and breaks up the processing dependency among threads. The developed parallel versions are able to reduce the computational complexity by approximately 50% and 25% of the sequential time using 4 and 8 threads, respectively. 
Moreover, our proposed solution could … diff --git a/papers/_posts/2016-01-01-a-variable-length-network-encoding-protocol-for-big-genomic-data.md b/papers/_posts/2016-01-01-a-variable-length-network-encoding-protocol-for-big-genomic-data.md new file mode 100644 index 00000000..c945b708 --- /dev/null +++ b/papers/_posts/2016-01-01-a-variable-length-network-encoding-protocol-for-big-genomic-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A variable-length network encoding protocol for big genomic data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Aledhari, Mohammed; Hefeida, Mohamed S; Saeed, Fahad; " +year: "2016" +journal: Springer International Publishing +volume: +issue: +pages: 212-224 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-319-33936-8_17" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Modern genomic studies utilize high-throughput instruments which can produce data at an astonishing rate. These big genomic datasets produced using next generation sequencing (NGS) machines can easily reach peta-scale level creating storage, analytic and transmission problems for large-scale system biology studies. Traditional networking protocols are oblivious to the data that is being transmitted and are designed for general purpose data transfer. In this paper we present a novel data-aware network transfer protocol to efficiently transfer big genomic data. Our protocol exploits the limited alphabet of DNA nucleotide and is developed over the hypertext transfer protocol (HTTP) framework. Our results show that proposed technique improves transmission up to 84 times when compared to normal HTTP encoding schemes. 
We also show that the performance of the resultant protocol (called VTTP … diff --git a/papers/_posts/2016-01-01-data-aware-communication-for-energy-harvesting-sensor-networks.md b/papers/_posts/2016-01-01-data-aware-communication-for-energy-harvesting-sensor-networks.md new file mode 100644 index 00000000..157fb12d --- /dev/null +++ b/papers/_posts/2016-01-01-data-aware-communication-for-energy-harvesting-sensor-networks.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Data Aware Communication for Energy Harvesting Sensor Networks" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Hefeida, Mohamed S; Saeed, Fahad; " +year: "2016" +journal: Springer International Publishing +volume: +issue: +pages: 121-132 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-319-33936-8_10" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +We propose a Data Aware Communication Technique (DACT) that reduces energy consumption in Energy Harvesting Wireless Sensor Networks (EH-WSN). DACT takes advantage of the data correlation present in household EH-WSN applications to reduce communication overhead. It adapts its functionality according to correlations in data communicated over the EH-WSN and operates independently from spatial and temporal correlations without requiring location information. Our results show that DACT improves communication efficiency of sensor nodes and can help reduce idle energy consumption in an average-size home by up to 90 % as compared to spatial/temporal correlation-based communication techniques. 
diff --git a/papers/_posts/2016-01-01-gpu-arraysort--a-parallel,-in-place-algorithm-for-sorting-large-number-of-arrays.md b/papers/_posts/2016-01-01-gpu-arraysort--a-parallel,-in-place-algorithm-for-sorting-large-number-of-arrays.md new file mode 100644 index 00000000..52c43d97 --- /dev/null +++ b/papers/_posts/2016-01-01-gpu-arraysort--a-parallel,-in-place-algorithm-for-sorting-large-number-of-arrays.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "GPU-ArraySort: A parallel, in-place algorithm for sorting large number of arrays" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Awan, Muaaz; Saeed, Fahad; " +year: "2016" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/ICPPW.2016.27" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Modern day analytics deals with big datasets from diverse fields. For many applications the data is in the form of an array which consists of a large number of smaller arrays. Existing techniques focus on sorting a single large array and cannot be used for sorting a large number of smaller arrays in an efficient manner. Currently no algorithm is available which can sort such a large number of arrays utilizing the massively parallel architecture of GPU devices. In this paper we present a highly scalable parallel algorithm, called GPU-ArraySort, for sorting a large number of arrays using a GPU. Our algorithm performs in-place operations and makes minimum use of any temporary run-time memory. Our results indicate that we can sort up to 2 million arrays having 1000 elements each, within a few seconds. We compare our results with the unorthodox tagged array sorting technique based on NVIDIA's Thrust library. 
GPU-ArraySort … diff --git a/papers/_posts/2016-01-01-introduction-to-the-selected-papers-from-the-7th-international-conference-on-bioinformatics-and-computational-biology-bicob-2015.md b/papers/_posts/2016-01-01-introduction-to-the-selected-papers-from-the-7th-international-conference-on-bioinformatics-and-computational-biology-bicob-2015.md new file mode 100644 index 00000000..673408a6 --- /dev/null +++ b/papers/_posts/2016-01-01-introduction-to-the-selected-papers-from-the-7th-international-conference-on-bioinformatics-and-computational-biology-bicob-2015.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Introduction to the selected papers from the 7th International Conference on Bioinformatics and Computational Biology (BICoB 2015)" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haspel, Nurit; Al-Mubaid, Hisham; " +year: "2016" +journal: World Scientific Publishing Company +volume: 14 +issue: +pages: 1602002 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1142/S0219720016020029" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +In the past decade, the scientific community has seen unprecedented increase in the volume and complexity of the biological data. Our data-generation capability will likely reach exponential bounds with the introduction of next-generation sequencing (NGS) technologies and high-throughput mass spectrometers and continued decreasing cost of producing such data sets. In order to exploit the scientific understanding of this vast ocean of biological data, the underlying information must be integrated, analyzed, displayed, and modeled in an efficient manner. However, the big data generated from these high-throughput technologies is becoming a hurdle in the system-wide analysis of biological entities. 
This special issue is a follow up to the 7th International Conference on Bioinformatics and Computational Biology (BICoB-2015), which took place in Honolulu, Hawaii, USA during 9–11 March 2015. The guest editors … diff --git a/papers/_posts/2016-01-01-ms-reduce--an-ultrafast-technique-for-reduction-of-big-mass-spectrometry-data-for-high-throughput-processing.md b/papers/_posts/2016-01-01-ms-reduce--an-ultrafast-technique-for-reduction-of-big-mass-spectrometry-data-for-high-throughput-processing.md new file mode 100644 index 00000000..8b76dd01 --- /dev/null +++ b/papers/_posts/2016-01-01-ms-reduce--an-ultrafast-technique-for-reduction-of-big-mass-spectrometry-data-for-high-throughput-processing.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "MS-REDUCE: an ultrafast technique for reduction of big mass spectrometry data for high-throughput processing" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Awan, Muaaz Gul; Saeed, Fahad; " +year: "2016" +journal: Oxford University Press +volume: 32 +issue: +pages: 1518-1526 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1093/bioinformatics/btw023" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Motivation: Modern proteomics studies utilize high-throughput mass spectrometers which can produce data at an astonishing rate. These big mass spectrometry (MS) datasets can easily reach peta-scale level creating storage and analytic problems for large-scale systems biology studies. Each spectrum consists of thousands of peaks which have to be processed to deduce the peptide. However, only a small percentage of peaks in a spectrum are useful for peptide deduction as most of the peaks are either noise or not useful for a given spectrum. 
This redundant processing of non-useful peaks is a bottleneck for streaming high-throughput processing of big MS data. One way to reduce the amount of computation required in a high-throughput environment is to eliminate non-useful peaks. Existing noise removing algorithms are limited in their data-reduction capability and are compute intensive making them … diff --git a/papers/_posts/2016-01-01-reductive-analytics-on-big-ms-data-leads-to-tremendous-reduction-in-time-for-peptide-deduction.md b/papers/_posts/2016-01-01-reductive-analytics-on-big-ms-data-leads-to-tremendous-reduction-in-time-for-peptide-deduction.md new file mode 100644 index 00000000..3317321e --- /dev/null +++ b/papers/_posts/2016-01-01-reductive-analytics-on-big-ms-data-leads-to-tremendous-reduction-in-time-for-peptide-deduction.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Reductive Analytics on Big MS Data leads to tremendous reduction in time for peptide deduction" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Awan, Muaaz Gul; Saeed, Fahad; " +year: "2016" +journal: Cold Spring Harbor Laboratory +volume: +issue: +pages: 73064 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1101/073064" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +In this paper we present a feasibility of using a data-reductive strategy for analyzing big MS data. The proposed method utilizes our reduction algorithm MS-REDUCE and peptide deduction is accomplished using Tide with hiXcorr. Using this approach we were able to process 1 million spectra in under 3 hours. Our results showed that running peptide deduction with smaller amount of selected peaks made the computations much faster and scalable with increasing resolution of MS data. 
Quality assessment experiments performed on experimentally generated datasets showed good quality peptide matches can be made using the reduced datasets. We anticipate that the proteomics and systems biology community will widely adopt our reductive strategy due to its efficacy and reduced time for analysis. diff --git a/papers/_posts/2016-01-01-selected-papers-from-bicob2015.md b/papers/_posts/2016-01-01-selected-papers-from-bicob2015.md new file mode 100644 index 00000000..b025bcfd --- /dev/null +++ b/papers/_posts/2016-01-01-selected-papers-from-bicob2015.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Selected Papers from BICoB2015" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "BICoB; " +year: "2016" +journal: World Scientific +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + + diff --git a/papers/_posts/2016-01-01-systems-level-analysis-reveals-selective-regulation-of-aqp2-gene-expression-by-vasopressin.md b/papers/_posts/2016-01-01-systems-level-analysis-reveals-selective-regulation-of-aqp2-gene-expression-by-vasopressin.md new file mode 100644 index 00000000..28bd4343 --- /dev/null +++ b/papers/_posts/2016-01-01-systems-level-analysis-reveals-selective-regulation-of-aqp2-gene-expression-by-vasopressin.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Systems-level analysis reveals selective regulation of Aqp2 gene expression by vasopressin" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Sandoval, Pablo C; Claxton, J’Neka S; Lee, Jae Wook; Saeed, Fahad; Hoffert, Jason D; Knepper, Mark A; " +year: "2016" +journal: Nature Publishing Group UK London +volume: 6 +issue: +pages: 34863 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: 
[] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1038/srep34863" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Vasopressin-mediated regulation of renal water excretion is defective in a variety of water balance disorders in humans. It occurs in part through long-term mechanisms that regulate the abundance of the aquaporin-2 water channel in renal collecting duct cells. Here, we use deep DNA sequencing in mouse collecting duct cells to ask whether vasopressin signaling selectively increases Aqp2 gene transcription or whether it triggers a broadly targeted transcriptional network. ChIP-Seq quantification of binding sites for RNA polymerase II was combined with RNA-Seq quantification of transcript abundances to identify genes whose transcription is regulated by vasopressin. (View curated dataset at https://helixweb.nih.gov/ESBL/Database/Vasopressin/). The analysis revealed only 35 vasopressin-regulated genes (of 3659) including Aqp2. 
Increases in RNA polymerase II binding and mRNA abundances for Aqp2 far … diff --git a/papers/_posts/2017-01-01-a-hybrid-mpi-openmp-strategy-to-speedup-the-compression-of-big-next-generation-sequencing-datasets.md b/papers/_posts/2017-01-01-a-hybrid-mpi-openmp-strategy-to-speedup-the-compression-of-big-next-generation-sequencing-datasets.md new file mode 100644 index 00000000..5d089d55 --- /dev/null +++ b/papers/_posts/2017-01-01-a-hybrid-mpi-openmp-strategy-to-speedup-the-compression-of-big-next-generation-sequencing-datasets.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A Hybrid MPI-OpenMP Strategy to Speedup the Compression of Big Next-Generation Sequencing Datasets" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Vargas-Perez, Sandino; Saeed, Fahad; " +year: "2017" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/TPDS.2017.2692782" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +DNA sequencing has moved into the realm of Big Data due to the rapid development of high-throughput, low cost Next-Generation Sequencing (NGS) technologies. Sequential data compression solutions that once were sufficient to efficiently store and distribute this information are now falling behind. In this paper we introduce phyNGSC, a hybrid MPI-OpenMP strategy to speedup the compression of big NGS data by combining the features of both distributed and shared memory architectures. Our algorithm balances work-load among processes and threads, alleviates memory latency by exploiting locality, and accelerates I/O by reducing excessive read/write operations and inter-node message exchange. 
To make the algorithm scalable, we introduce a novel timestamp-based file structure that allows us to write the compressed data in a distributed and non-deterministic fashion while retaining the capability of … diff --git a/papers/_posts/2017-01-01-a-new-cryptography-algorithm-to-protect-cloud-based-healthcare-services.md b/papers/_posts/2017-01-01-a-new-cryptography-algorithm-to-protect-cloud-based-healthcare-services.md new file mode 100644 index 00000000..4d0ba9e4 --- /dev/null +++ b/papers/_posts/2017-01-01-a-new-cryptography-algorithm-to-protect-cloud-based-healthcare-services.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A new cryptography algorithm to protect cloud-based healthcare services" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Aledhari, Mohammed; Marhoon, Ali; Hamad, Ali; Saeed, Fahad; " +year: "2017" +journal: IEEE +volume: +issue: +pages: 37-43 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/CHASE.2017.57" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +The revolution of smart devices has a significant and positive impact on the lives of many people, especially in regard to elements of healthcare. In part, this revolution is attributed to technological advances that enable individuals to wear and use medical devices to monitor their health activities, but remotely. Also, these smart, wearable medical devices assist health care providers in monitoring their patients remotely, thereby enabling physicians to respond quickly in the event of emergencies. An ancillary advantage is that health care costs will be reduced, another benefit that, when paired with prompt medical treatment, indicates significant advances in the contemporary management of health care. 
However, the competition among manufacturers of these medical devices creates a complexity of small and smart wearable devices such as ECG and EMG. This complexity results in other issues such as patient security … diff --git a/papers/_posts/2017-01-01-an-out-of-core-gpu-based-dimensionality-reduction-algorithm-for-big-mass-spectrometry-data-and-its-application-in-bottom-up-proteomics.md b/papers/_posts/2017-01-01-an-out-of-core-gpu-based-dimensionality-reduction-algorithm-for-big-mass-spectrometry-data-and-its-application-in-bottom-up-proteomics.md new file mode 100644 index 00000000..c2d94438 --- /dev/null +++ b/papers/_posts/2017-01-01-an-out-of-core-gpu-based-dimensionality-reduction-algorithm-for-big-mass-spectrometry-data-and-its-application-in-bottom-up-proteomics.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "An out-of-core gpu based dimensionality reduction algorithm for big mass spectrometry data and its application in bottom-up proteomics" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Awan, Muaaz Gul; Saeed, Fahad; " +year: "2017" +journal: +volume: +issue: +pages: 550-555 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1145/3107411.3107466" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Modern high resolution Mass Spectrometry instruments can generate millions of spectra in a single systems biology experiment. Each spectrum consists of thousands of peaks but only a small number of peaks actively contribute to deduction of peptides. Therefore, pre-processing of MS data to detect noisy and non-useful peaks are an active area of research. Most of the sequential noise reducing algorithms are impractical to use as a pre-processing step due to high time-complexity. 
In this paper, we present a GPU based dimensionality-reduction algorithm, called G-MSR, for MS2 spectra. Our proposed algorithm uses novel data structures which optimize the memory and computational operations inside GPU. These novel data structures include Binary Spectra and Quantized Indexed Spectra (QIS). The former helps in communicating essential information between CPU and GPU using minimum amount of data … diff --git a/papers/_posts/2017-01-01-gpu-pcc--a-gpu-based-technique-to-compute-pairwise-pearson's-correlation-coefficients-for-big-fmri-data.md b/papers/_posts/2017-01-01-gpu-pcc--a-gpu-based-technique-to-compute-pairwise-pearson's-correlation-coefficients-for-big-fmri-data.md new file mode 100644 index 00000000..babbf3a3 --- /dev/null +++ b/papers/_posts/2017-01-01-gpu-pcc--a-gpu-based-technique-to-compute-pairwise-pearson's-correlation-coefficients-for-big-fmri-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "GPU-PCC: A GPU Based Technique to Compute Pairwise Pearson's Correlation Coefficients for Big fMRI Data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Eslami, Taban; Awan, Muaaz Gul; Saeed, Fahad; " +year: "2017" +journal: +volume: +issue: +pages: 723-728 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1145/3107411.3108173" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Functional Magnetic Resonance Imaging (fMRI) is a non-invasive brain imaging technique for studying the brain's functional activities. Pearson's Correlation Coefficient is an important measure for capturing dynamic behaviors and functional connectivity between brain components. One bottleneck in computing Correlation Coefficients is the time it takes to process big fMRI data. 
In this paper, we propose GPU-PCC, a GPU based algorithm based on vector dot product, which is able to compute pairwise Pearson's Correlation Coefficients while performing computation once for each pair. Our method is able to compute Correlation Coefficients in an ordered fashion without the need to do post-processing reordering of coefficients. We evaluated GPU-PCC using synthetic and real fMRI data and compared it with sequential version of computing Correlation Coefficient on CPU and existing state-of-the-art GPU method. We … diff --git a/papers/_posts/2017-01-01-power-efficient-and-highly-scalable-parallel-graph-sampling-using-fpgas.md b/papers/_posts/2017-01-01-power-efficient-and-highly-scalable-parallel-graph-sampling-using-fpgas.md new file mode 100644 index 00000000..a4cd4d62 --- /dev/null +++ b/papers/_posts/2017-01-01-power-efficient-and-highly-scalable-parallel-graph-sampling-using-fpgas.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Power-Efficient and Highly Scalable Parallel Graph Sampling using FPGAs" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Tariq, Usman; Cheema, Umer; Saeed, Fahad; " +year: "2017" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/RECONFIG.2017.8279806" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Energy efficiency is a crucial problem in data centers where big data is generally represented by directed or undirected graphs. Analysis of this big data graph is challenging due to volume and velocity of the data as well as irregular memory access patterns. Graph sampling is one of the most effective ways to reduce the size of graph while maintaining crucial characteristics. 
In this paper we present design and implementation of an FPGA based graph sampling method which is both time- and energy-efficient. This is in contrast to existing parallel approaches which include memory-distributed clusters, multicore and GPUs. Our strategy utilizes a novel graph data structure, that we call COPRA which allows time- and memory-efficient representation of graphs suitable for reconfigurable hardware such as FPGAs. Our experiments show that our proposed techniques are 2x faster and 3x more energy efficient as compared … diff --git a/papers/_posts/2017-01-01-scalable-data-structure-to-compress-next-generation-sequencing-files-and-its-application-to-compressive-genomics.md b/papers/_posts/2017-01-01-scalable-data-structure-to-compress-next-generation-sequencing-files-and-its-application-to-compressive-genomics.md new file mode 100644 index 00000000..0dbfbac3 --- /dev/null +++ b/papers/_posts/2017-01-01-scalable-data-structure-to-compress-next-generation-sequencing-files-and-its-application-to-compressive-genomics.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Scalable data structure to compress next-generation sequencing files and its application to compressive genomics" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Pérez, Sandino Vargas; Saeed, Fahad; " +year: "2017" +journal: IEEE +volume: +issue: +pages: 1923-1928 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BIBM.2017.8217953" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +It is now possible to compress and decompress large-scale Next-Generation Sequencing files taking advantage of high-performance computing techniques. 
To this end, we have recently introduced a scalable hybrid parallel algorithm, called phyNGSC, which allows fast compression as well as decompression of big FASTQ datasets using distributed and shared memory programming models via MPI and OpenMP. In this paper we present the design and implementation of a novel parallel data structure which lessens the dependency on decompression and facilitates the handling of DNA sequences in their compressed state using fine-grained decompression in a technique that is identified as in compresso data processing. Using our data structure compression and decompression throughputs of up to 8.71 GB/s and 10.12 GB/s were observed. Our proposed structure and methodology brings us one step closer to … diff --git a/papers/_posts/2018-01-01-a-deep-learning-based-data-minimization-algorithm-for-fast-and-secure-transfer-of-big-genomic-datasets.md b/papers/_posts/2018-01-01-a-deep-learning-based-data-minimization-algorithm-for-fast-and-secure-transfer-of-big-genomic-datasets.md new file mode 100644 index 00000000..f86b1b07 --- /dev/null +++ b/papers/_posts/2018-01-01-a-deep-learning-based-data-minimization-algorithm-for-fast-and-secure-transfer-of-big-genomic-datasets.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A deep learning-based data minimization algorithm for fast and secure transfer of big genomic datasets" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Aledhari, Mohammed; Di Pierro, Marianne; Hefeida, Mohamed; Saeed, Fahad; " +year: "2018" +journal: IEEE +volume: 7 +issue: +pages: 271-284 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/TBDATA.2018.2805687" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +In the age of Big Genomics Data, institutions such as the National Human Genome Research 
Institute (NHGRI) are challenged in their efforts to share volumes of data between researchers, a process that has been plagued by unreliable transfers and slow speeds. These occur due to throughput bottlenecks of traditional transfer technologies. Two factors that affect the efficiency of data transmission are the channel bandwidth and the amount of data. Increasing the bandwidth is one way to transmit data efficiently, but might not always be possible due to resource limitations. Another way to maximize channel utilization is by decreasing the bits needed for transmission of a dataset. Traditionally, transmission of big genomic data between two geographical locations is done using general-purpose protocols, such as hypertext transfer protocol (HTTP) and file transfer protocol (FTP) secure. In this paper, we present a novel … diff --git a/papers/_posts/2018-01-01-a-fourier-based-data-minimization-algorithm-for-fast-and-secure-transfer-of-big-genomic-datasets.md b/papers/_posts/2018-01-01-a-fourier-based-data-minimization-algorithm-for-fast-and-secure-transfer-of-big-genomic-datasets.md new file mode 100644 index 00000000..c380f4e2 --- /dev/null +++ b/papers/_posts/2018-01-01-a-fourier-based-data-minimization-algorithm-for-fast-and-secure-transfer-of-big-genomic-datasets.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A Fourier-Based Data Minimization Algorithm for Fast and Secure Transfer of Big Genomic Datasets" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Aledhari, Mohammed; Pierro, Marianne Di; Saeed, Fahad; " +year: "2018" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BigDataCongress.2018.00024" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +DNA sequencing plays an important role in 
the bioinformatics research community. DNA sequencing is important to all organisms, especially to humans and from multiple perspectives. These include understanding the correlation of specific mutations that plays a significant role in increasing or decreasing the risks of developing a disease or condition, or finding the implications and connections between the genotype and the phenotype. Advancements in the high-throughput sequencing techniques, tools, and equipment, have helped to generate big genomic datasets due to the tremendous decrease in the DNA sequence costs. However, the advancements have posed great challenges to genomic data storage, analysis, and transfer. Accessing, manipulating, and sharing the generated big genomic datasets present major challenges in terms of time and size, as well as privacy. Data size plays an important role in … diff --git "a/papers/_posts/2018-01-01-fast-gpu-pcc--a-gpu-based-technique-to-compute-pairwise-pearson\342\200\231s-correlation-coefficients-for-time-series-data---an-fmri-study.md" "b/papers/_posts/2018-01-01-fast-gpu-pcc--a-gpu-based-technique-to-compute-pairwise-pearson\342\200\231s-correlation-coefficients-for-time-series-data---an-fmri-study.md" new file mode 100644 index 00000000..d10a9ef5 --- /dev/null +++ "b/papers/_posts/2018-01-01-fast-gpu-pcc--a-gpu-based-technique-to-compute-pairwise-pearson\342\200\231s-correlation-coefficients-for-time-series-data---an-fmri-study.md" @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Fast-GPU-PCC: A GPU-Based Technique to Compute Pairwise Pearson’s Correlation Coefficients for Time Series Data - An fMRI Study" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Eslami, Taban; Saeed, Fahad; " +year: "2018" +journal: MDPI +volume: +issue: +pages: +is_published: False +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: 
+openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + + diff --git a/papers/_posts/2018-01-01-gpu-daemon--gpu-algorithm-design,-data-management-&-optimization-template-for-array-based-big-omics-data.md b/papers/_posts/2018-01-01-gpu-daemon--gpu-algorithm-design,-data-management-&-optimization-template-for-array-based-big-omics-data.md new file mode 100644 index 00000000..8d2894ea --- /dev/null +++ b/papers/_posts/2018-01-01-gpu-daemon--gpu-algorithm-design,-data-management-&-optimization-template-for-array-based-big-omics-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "GPU-DAEMON: GPU algorithm design, data management & optimization template for array based big omics data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Awan, Muaaz Gul; Eslami, Taban; Saeed, Fahad; " +year: "2018" +journal: Pergamon +volume: 101 +issue: +pages: 163-173 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1016/j.compbiomed.2018.08.015" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +In the age of ever increasing data, faster and more efficient data processing algorithms are needed. Graphics Processing Units (GPU) are emerging as a cost-effective alternative architecture for high-end computing. The optimal design of GPU algorithms is a challenging task which requires thorough understanding of the high performance computing architecture as well as the algorithmic design. The steep learning curve needed for effective GPU-centric algorithm design and implementation requires considerable expertise, time, and resources. In this paper, we present GPU-DAEMON, a GPU Data Management, Algorithm Design and Optimization technique suitable for processing array based big omics data. 
Our proposed GPU algorithm design template outlines and provides generic methods to tackle critical bottlenecks which can be followed to implement high performance, scalable GPU algorithms for given big … diff --git "a/papers/_posts/2018-01-01-mass\342\200\220simulator--a-highly-configurable-simulator-for-generating-ms-ms-datasets-for-benchmarking-of-proteomics-algorithms.md" "b/papers/_posts/2018-01-01-mass\342\200\220simulator--a-highly-configurable-simulator-for-generating-ms-ms-datasets-for-benchmarking-of-proteomics-algorithms.md" new file mode 100644 index 00000000..36ad35b9 --- /dev/null +++ "b/papers/_posts/2018-01-01-mass\342\200\220simulator--a-highly-configurable-simulator-for-generating-ms-ms-datasets-for-benchmarking-of-proteomics-algorithms.md" @@ -0,0 +1,40 @@ +--- +layout: paper +title: "MaSS‐Simulator: A Highly Configurable Simulator for Generating MS/MS Datasets for Benchmarking of Proteomics Algorithms" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Awan, Muaaz Gul; Saeed, Fahad; " +year: "2018" +journal: Wiley +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1002/pmic.201800206" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Mass Spectrometry (MS)‐based proteomics has become an essential tool in the study of proteins. With the advent of modern MS machines huge amounts of data is being generated, which can only be processed by novel algorithmic tools. However, in the absence of data benchmarks and ground truth datasets algorithmic integrity testing and reproducibility is a challenging problem. To this end, MaSS‐Simulator has been presented, which is an easy to use simulator and can be configured to simulate MS/MS datasets for a wide variety of conditions with known ground truths. 
MaSS‐Simulator offers many configuration options to allow the user a great degree of control over the test datasets, which can enable rigorous and large‐ scale testing of any proteomics algorithm. MaSS‐Simulator is assessed by comparing its performance against experimentally generated spectra and spectra obtained from NIST collections of … diff --git a/papers/_posts/2018-01-01-parallel-sampling-pipeline-for-indefinite-stream-of-heterogeneous-graphs-using-opencl-for-fpgas.md b/papers/_posts/2018-01-01-parallel-sampling-pipeline-for-indefinite-stream-of-heterogeneous-graphs-using-opencl-for-fpgas.md new file mode 100644 index 00000000..4a797146 --- /dev/null +++ b/papers/_posts/2018-01-01-parallel-sampling-pipeline-for-indefinite-stream-of-heterogeneous-graphs-using-opencl-for-fpgas.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Parallel sampling-pipeline for indefinite stream of heterogeneous graphs using OpenCL for FPGAs" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Tariq, Muhammad Usman; Saeed, Fahad; " +year: "2018" +journal: IEEE +volume: +issue: +pages: 4752-4761 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BigData.2018.8621979" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +In the field of data science, a huge amount of data, generally represented as graphs, needs to be processed and analyzed. It is of utmost importance that this data be processed swiftly and efficiently to save time and energy. The volume and velocity of data, along with irregular access patterns in graph data structures, pose challenges in terms of analysis and processing. Further, a big chunk of time and energy is spent on analyzing these graphs on large compute clusters and/or data-centers. 
Filtering and refining of data using graph sampling techniques are one of the most effective ways to speed up the analysis. Efficient accelerators, such as FPGAs, have proven to significantly lower the energy cost of running an algorithm. To this end, we present the design and implementation of a parallel graph sampling technique, for a large number of input graphs streaming into a FPGA. A parallel approach using OpenCL for … diff --git a/papers/_posts/2018-01-01-similarity-based-classification-of-adhd-using-singular-value-decomposition.md b/papers/_posts/2018-01-01-similarity-based-classification-of-adhd-using-singular-value-decomposition.md new file mode 100644 index 00000000..21decb9a --- /dev/null +++ b/papers/_posts/2018-01-01-similarity-based-classification-of-adhd-using-singular-value-decomposition.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Similarity based classification of ADHD using Singular Value Decomposition" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Eslami, Taban; Saeed, Fahad; " +year: "2018" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1145/3203217.3203239" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Attention deficit hyperactivity disorder (ADHD) is one of the most common brain disorders among children. This disorder is considered as a big threat for public health and causes attention, focus and organizing difficulties for children and even adults. Since the cause of ADHD is not known yet, data mining algorithms are being used to help discover patterns which discriminate healthy from ADHD subjects. 
Numerous efforts are underway with the goal of developing classification tools for ADHD diagnosis based on functional and structural magnetic resonance imaging data of the brain. In this paper, we used Eros, which is a technique for computing similarity between two multivariate time series along with k-Nearest-Neighbor classifier, to classify healthy vs ADHD children. We designed a model selection scheme called J-Eros which is able to pick the optimum value of k for k-Nearest-Neighbor from the training data … diff --git a/papers/_posts/2018-01-01-towards-quantifying-psychiatric-diagnosis-using-machine-learning-algorithms-and-big-fmri-data.md b/papers/_posts/2018-01-01-towards-quantifying-psychiatric-diagnosis-using-machine-learning-algorithms-and-big-fmri-data.md new file mode 100644 index 00000000..32255f57 --- /dev/null +++ b/papers/_posts/2018-01-01-towards-quantifying-psychiatric-diagnosis-using-machine-learning-algorithms-and-big-fmri-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Towards quantifying psychiatric diagnosis using machine learning algorithms and big fMRI data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; " +year: "2018" +journal: BMC +volume: 3 +issue: +pages: 7 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1186/s41044-018-0033-0" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Attention Deficit Hyperactivity Disorder (ADHD) is one of the most common brain disorders among children and is very difficult to diagnose using current methods. Similarly other mental disorders are subject to the same systematic errors with sufficient evidence of diagnostic errors as well as over-prescribing of drugs due to misdiagnosis. 
For most mental health disorders there is no quantitative method that will inform the presence or absence of a given mental disorder. We argue that definitive and quantitative diagnostic tests are necessary for ADHD and other mental disorders. To this end, big data Functional Magnetic Resonance Imaging (fMRI) and machine learning algorithms can be instrumental in changing the way psychiatric disorders are diagnosed and treated. We briefly discuss our recent research efforts and future directions for a quantitative gold standard tests for psychiatric diagnosis. diff --git a/papers/_posts/2019-01-01-2019-ieee-international-conference-on-bioinformatics-and-biomedicine-bibm.md b/papers/_posts/2019-01-01-2019-ieee-international-conference-on-bioinformatics-and-biomedicine-bibm.md new file mode 100644 index 00000000..1c9e5050 --- /dev/null +++ b/papers/_posts/2019-01-01-2019-ieee-international-conference-on-bioinformatics-and-biomedicine-bibm.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Haseeb, Muhammad; Saeed, Fahad; " +year: "2019" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + + diff --git a/papers/_posts/2019-01-01-auto-asd-network--a-technique-based-on-deep-learning-and-support-vector-machines-for-diagnosing-autism-spectrum-disorder-using-fmri-data.md b/papers/_posts/2019-01-01-auto-asd-network--a-technique-based-on-deep-learning-and-support-vector-machines-for-diagnosing-autism-spectrum-disorder-using-fmri-data.md new file mode 100644 index 00000000..d2ea833c --- /dev/null +++ 
b/papers/_posts/2019-01-01-auto-asd-network--a-technique-based-on-deep-learning-and-support-vector-machines-for-diagnosing-autism-spectrum-disorder-using-fmri-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Auto-ASD-Network: A technique based on Deep Learning and Support Vector Machines for diagnosing Autism Spectrum Disorder using fMRI data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Eslami, Taban; Saeed, Fahad; " +year: "2019" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1145/3307339.3343482" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Quantitative analysis of brain disorders such as Autism Spectrum Disorder (ASD) is an ongoing field of research. Machine learning and deep learning techniques have been playing an important role in automating the diagnosis of brain disorders by extracting discriminative features from the brain data. In this study, we propose a model called Auto-ASD-Network in order to classify subjects with Autism disorder from healthy subjects using only fMRI data. Our model consists of a multilayer perceptron (MLP) with two hidden layers. We use an algorithm called SMOTE for performing data augmentation in order to generate artificial data and avoid overfitting, which helps increase the classification accuracy. We further investigate the discriminative power of features extracted using MLP by feeding them to an SVM classifier. 
In order to optimize the hyperparameters of SVM, we use a technique called Auto Tune Models (ATM … diff --git a/papers/_posts/2019-01-01-efficient-shared-peak-counting-in-database-peptide-search-using-compact-data-structure-for-fragment-ion-index.md b/papers/_posts/2019-01-01-efficient-shared-peak-counting-in-database-peptide-search-using-compact-data-structure-for-fragment-ion-index.md new file mode 100644 index 00000000..be6834b3 --- /dev/null +++ b/papers/_posts/2019-01-01-efficient-shared-peak-counting-in-database-peptide-search-using-compact-data-structure-for-fragment-ion-index.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Efficient shared peak counting in database peptide search using compact data structure for fragment-ion index" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Haseeb, Muhammad; Saeed, Fahad; " +year: "2019" +journal: IEEE +volume: +issue: +pages: 275-278 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BIBM47256.2019.8983152" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Database search is the most commonly employed method for identification of peptides from MS/MS spectra data. The search involves comparing experimentally obtained MS/MS spectra against a set of theoretical spectra predicted from a protein sequence database. One of the most commonly employed similarity metrics for spectral comparison is the shared-peak count between a pair of MS/MS spectra. Most modern methods index all generated fragment-ion data from theoretical spectra to speed up the shared peak count computations between a given experimental spectrum and all theoretical spectra. However, the bottleneck for this method is the gigantic memory footprint of fragment-ion index that leads to non-scalable solutions. 
In this paper, we present a novel data structure, called Compact Fragment-Ion Index Representation (CFIR), that efficiently compresses highly redundant ion-mass information in the data … diff --git a/papers/_posts/2019-01-01-gpu-dfc--a-gpu-based-parallel-algorithm-for-computing-dynamic-functional-connectivity-of-big-fmri-data.md b/papers/_posts/2019-01-01-gpu-dfc--a-gpu-based-parallel-algorithm-for-computing-dynamic-functional-connectivity-of-big-fmri-data.md new file mode 100644 index 00000000..bc409f36 --- /dev/null +++ b/papers/_posts/2019-01-01-gpu-dfc--a-gpu-based-parallel-algorithm-for-computing-dynamic-functional-connectivity-of-big-fmri-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "GPU-DFC: A GPU-based parallel algorithm for computing dynamic-functional connectivity of big fMRI data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Eslami, Taban; Saeed, Fahad; " +year: "2019" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BigDataService.2019.00022" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Studying dynamic-functional connectivity (DFC) using fMRI data of the brain gives much richer information to neuroscientists than studying the brain as a static entity. Mining of dynamic connectivity graphs from these brain studies can be used to classify diseased versus healthy brains. However, constructing and mining dynamic-functional connectivity graphs of the brain can be time consuming due to size of fMRI data. In this paper, we propose a highly scalable GPU-based parallel algorithm called GPU-DFC for computing dynamic-functional connectivity of fMRI data both at region and voxel level. Our algorithm exploits sparsification of correlation matrix and stores them in CSR format. 
Further reduction in the correlation matrix is achieved by parallel decomposition techniques. Our GPU-DFC algorithm achieves 2 times speed-up for computing dynamic correlations compared to state-of-the-art GPU-based techniques … diff --git a/papers/_posts/2019-01-01-gpu-sfft--a-gpu-based-parallel-algorithm-for-computing-the-sparse-fast-fourier-transform-sfft-of-k-sparse-signals.md b/papers/_posts/2019-01-01-gpu-sfft--a-gpu-based-parallel-algorithm-for-computing-the-sparse-fast-fourier-transform-sfft-of-k-sparse-signals.md new file mode 100644 index 00000000..f0b5f569 --- /dev/null +++ b/papers/_posts/2019-01-01-gpu-sfft--a-gpu-based-parallel-algorithm-for-computing-the-sparse-fast-fourier-transform-sfft-of-k-sparse-signals.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "GPU-SFFT: A GPU based parallel algorithm for computing the Sparse Fast Fourier Transform (SFFT) of k-sparse signals" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Artiles, Oswaldo; Saeed, Fahad; " +year: "2019" +journal: IEEE +volume: +issue: +pages: 3303-3311 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BigData47090.2019.9006579" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +The Sparse Fast Fourier Transform (MIT-SFFT) is an algorithm to compute the discrete Fourier transform of a signal with a sublinear time complexity, i.e. algorithms with runtime complexity proportional to the sparsity level k, where k is the number of non-zero coefficients of the signal in the frequency domain. In this paper, we propose a highly scalable GPU-based parallel algorithm called GPU-SFFT for computing the SFFT of k-sparse signals. Our implementation of GPU-SFFT is based on parallel optimizations that leads to enormous speedups. 
These include carefully crafting parallel regions in the sequential MIT-SFFT code to exploit parallelism, and minimizing data movement between the CPU and the GPU. This allows us to exploit extreme parallelism for the CPU-GPU architectures and to maximize the number of concurrent threads executing instructions. Our experiments show that our designed CPU-GPU … diff --git a/papers/_posts/2019-01-01-high-performance-reductive-strategies-for-big-data-from-lc-ms-ms-proteomics.md b/papers/_posts/2019-01-01-high-performance-reductive-strategies-for-big-data-from-lc-ms-ms-proteomics.md new file mode 100644 index 00000000..3481831e --- /dev/null +++ b/papers/_posts/2019-01-01-high-performance-reductive-strategies-for-big-data-from-lc-ms-ms-proteomics.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "High-Performance Reductive Strategies for Big Data from LC-MS/MS Proteomics" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Awan, Muaaz Gul; " +year: "2019" +journal: Western Michigan University +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Mass Spectrometry (MS)-based proteomics utilizes high performance liquid chromatography in tandem with high-throughput mass spectrometers. These experiments can produce MS data sets with astonishing speed and volume that can easily reach peta-scale level, creating storage and computational problems for large-scale systems biology studies. Each spectrum output by a mass spectrometer may consist of thousands of peaks, which must all be processed to deduce the corresponding peptide. However, only a small percentage of peaks in a spectrum are useful for further processing, as most of the peaks are either noise or are not useful. 
Our experiments have shown that 90 to 95% of the peaks are not required for reliable results. This leads to a lot of redundant processing and causes a hindrance to high-throughput processing of big MS data. The existing pre-processing algorithms for noise-removal or … diff --git a/papers/_posts/2019-01-01-lbe--a-computational-load-balancing-algorithm-for-speeding-up-parallel-peptide-search-in-mass-spectrometry-based-proteomics.md b/papers/_posts/2019-01-01-lbe--a-computational-load-balancing-algorithm-for-speeding-up-parallel-peptide-search-in-mass-spectrometry-based-proteomics.md new file mode 100644 index 00000000..b1d2d06f --- /dev/null +++ b/papers/_posts/2019-01-01-lbe--a-computational-load-balancing-algorithm-for-speeding-up-parallel-peptide-search-in-mass-spectrometry-based-proteomics.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "LBE: A Computational Load Balancing Algorithm for Speeding up Parallel Peptide Search in Mass-Spectrometry based Proteomics" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Haseeb, Muhammad; Afzali, Fatima; Saeed, Fahad; " +year: "2019" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/IPDPSW.2019.00040" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +The most commonly employed method for peptide identification in mass-spectrometry based proteomics involves comparing experimentally obtained tandem MS/MS spectra against a set of theoretical MS/MS spectra. The theoretical MS/MS spectra data are predicted using protein sequence database. Most state-of-the-art peptide search algorithms index theoretical spectra data to quickly filter-in the relevant (similar) indexed spectra when searching an experimental MS/MS spectrum. 
Data filtration substantially reduces the required number of computationally expensive spectrum-to-spectrum comparison operations. However, the number of predicted (and indexed) theoretical spectra grows exponentially with increase in post-translational modifications creating a memory and I/O bottleneck. In this paper, we present a parallel algorithm, called LBE, for efficient partitioning of theoretical spectra data on a distributed … diff --git "a/papers/_posts/2019-01-01-ngs\342\200\220integrator--a-tool-for-combining-information-from-multiple-genome\342\200\220wide-ngs-data-tracks-using-minimum-bayes-factors.md" "b/papers/_posts/2019-01-01-ngs\342\200\220integrator--a-tool-for-combining-information-from-multiple-genome\342\200\220wide-ngs-data-tracks-using-minimum-bayes-factors.md" new file mode 100644 index 00000000..579fee30 --- /dev/null +++ "b/papers/_posts/2019-01-01-ngs\342\200\220integrator--a-tool-for-combining-information-from-multiple-genome\342\200\220wide-ngs-data-tracks-using-minimum-bayes-factors.md" @@ -0,0 +1,40 @@ +--- +layout: paper +title: "NGS‐Integrator: A Tool for Combining Information from Multiple Genome‐Wide NGS Data Tracks Using Minimum Bayes Factors" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Jung, Hyun Jun; Wen, Bronte; Chen, Lihe; Saeed, Fahad; Knepper, Mark A; " +year: "2019" +journal: The Federation of American Societies for Experimental Biology +volume: 33 +issue: +pages: 637.2-637.2 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1096/fasebj.2019.33.1_supplement.637.2" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Genome‐wide studies that generate multiple high‐throughput next‐generation sequencing (NGS) datasets to identify genomic DNA elements for gene transcription require data 
integration methods to minimize complexity and false‐positive findings. This can involve integration of multiple genome‐wide data generated from same type or different types of high‐throughput NGS techniques. Since several strategies to integrate multiple genome‐wide NGS datasets based on peak calling tools have been developed, these conventional methods are typically applied to individual replicates and not the aggregate data from multiple data. NGS‐integrator, a Java‐based tool, integrates genome‐wide NGS datasets into a single data track for a genome browser based on minimum Bayes Factor (MBF) calculated from the signal‐to‐noise ratio. NGS‐integrator consists of two elements, “Calculator” and “Integrator”. The … diff --git a/papers/_posts/2019-01-01-optimized-cnn-based-diagnosis-system-to-detect-the-pneumonia-from-chest-radiographs.md b/papers/_posts/2019-01-01-optimized-cnn-based-diagnosis-system-to-detect-the-pneumonia-from-chest-radiographs.md new file mode 100644 index 00000000..b3e786f7 --- /dev/null +++ b/papers/_posts/2019-01-01-optimized-cnn-based-diagnosis-system-to-detect-the-pneumonia-from-chest-radiographs.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Optimized CNN-based diagnosis system to detect the pneumonia from chest radiographs" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Aledhari, Mohammed; Joji, Shelby; Hefeida, Mohamed; Saeed, Fahad; " +year: "2019" +journal: IEEE +volume: +issue: +pages: 2405-2412 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BIBM47256.2019.8983114" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Pneumonia is a high mortality disease that kills 50,000 people in the United States each year. 
Children under the age of 5 and older population over the age of 65 are susceptible to serious cases of pneumonia. The United States spend billions of dollars fighting pneumonia-related infections every year. Early detection and intervention are crucial in treating pneumonia related infections. Since chest x-ray is one of the simplest and cheapest methods to diagnose pneumonia, we propose a deep learning algorithm based on convolutional neural networks to identify and classify pneumonia cases from these images. For all three models implemented, we obtained varying classification results and accuracy. Based on the results, we obtained better prediction with average accuracy of (68%) and average specificity of (69%) in contrast to the current state-of-the-art accuracy that is (51%) using the Visual Geometry Group … diff --git a/papers/_posts/2019-01-01-slm-transform--a-method-for-memory-efficient-indexing-of-spectra-for-database-search-in-lc-ms-ms-proteomics.md b/papers/_posts/2019-01-01-slm-transform--a-method-for-memory-efficient-indexing-of-spectra-for-database-search-in-lc-ms-ms-proteomics.md new file mode 100644 index 00000000..afa42163 --- /dev/null +++ b/papers/_posts/2019-01-01-slm-transform--a-method-for-memory-efficient-indexing-of-spectra-for-database-search-in-lc-ms-ms-proteomics.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Slm-transform: A method for memory-efficient indexing of spectra for database search in lc-ms/ms proteomics" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Haseeb, Muhammad; Awan, Muaaz G; Cadigan, Alexander S; Saeed, Fahad; " +year: "2019" +journal: Cold Spring Harbor Laboratory +volume: +issue: +pages: 531681 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1101/531681" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup 
%} + +# Abstract + +The most commonly used strategy for peptide identification in shotgun LC-MS/MS proteomics involves searching of MS/MS data against an in-silico digested protein sequence database. Typically, the digested peptide sequences are indexed into the memory to allow faster search times. However, subjecting a database to post-translational modifications (PTMs) during digestion results in an exponential increase in the number of peptides and therefore memory consumption. This limits the usage of existing fragment-ion based open-search algorithms for databases with several PTMs. In this paper, we propose a novel fragment-ion indexing technique which is analogous to suffix array transformation and allows constant time querying of indexed ions. We extend our transformation method, called SLM-Transform, by constructing ion buckets that allow querying of all indexed ions by mass by only storing information on distribution of ion-frequencies within buckets. The stored information is used with a regression technique to locate the position of ions in constant time. Moreover, the number of theoretical b- and y-ions generated and indexed for each theoretical spectrum are limited. Our results show that SLM-Transform allows indexing of up to 4x peptides than other leading fragment-ion based database search tools within the same memory constraints. We show that SLM-Transform based index allows indexing of over 83 million peptides within 26GB RAM as compared to 80GB required by MSFragger. 
Finally, we show the constant ion retrieval time for SLM-Transform based index allowing ultrafast peptide search speeds. Source code will be made … diff --git a/papers/_posts/2020-01-01-federated-learning--a-survey-on-enabling-technologies,-protocols,-and-applications.md b/papers/_posts/2020-01-01-federated-learning--a-survey-on-enabling-technologies,-protocols,-and-applications.md new file mode 100644 index 00000000..11925713 --- /dev/null +++ b/papers/_posts/2020-01-01-federated-learning--a-survey-on-enabling-technologies,-protocols,-and-applications.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Federated learning: A survey on enabling technologies, protocols, and applications" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Aledhari, Mohammed; Razzak, Rehma; Parizi, Reza M; Saeed, Fahad; " +year: "2020" +journal: IEEE +volume: 8 +issue: +pages: 140699-140725 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/ACCESS.2020.3013541" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +This paper provides a comprehensive study of Federated Learning (FL) with an emphasis on enabling software and hardware platforms, protocols, real-life applications and use-cases. FL can be applicable to multiple domains but applying it to different industries has its own set of obstacles. FL is known as collaborative learning, where algorithm(s) get trained across multiple devices or servers with decentralized data samples without having to exchange the actual data. This approach is radically different from other more established techniques such as getting the data samples uploaded to servers or having data in some form of distributed infrastructure. 
FL on the other hand generates more robust models without sharing data, leading to privacy-preserved solutions with higher security and access privileges to data. This paper starts by providing an overview of FL. Then, it gives an overview of technical details that … diff --git a/papers/_posts/2020-01-01-high-performance-and-machine-learning-algorithms-for-brain-fmri-data.md b/papers/_posts/2020-01-01-high-performance-and-machine-learning-algorithms-for-brain-fmri-data.md new file mode 100644 index 00000000..5418d9df --- /dev/null +++ b/papers/_posts/2020-01-01-high-performance-and-machine-learning-algorithms-for-brain-fmri-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "High Performance and Machine Learning Algorithms for Brain fMRI Data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Eslami, Taban; " +year: "2020" +journal: Western Michigan University +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Brain disorders are very difficult to diagnose for reasons such as overlapping nature of symptoms, individual differences in brain structure, lack of medical tests and unknown causes of some disorders. The current psychiatric diagnostic process is based on behavioral observation and may be prone to misdiagnosis. 
diff --git a/papers/_posts/2020-01-01-methods-and-systems-for-compressing-data.md b/papers/_posts/2020-01-01-methods-and-systems-for-compressing-data.md new file mode 100644 index 00000000..94c08a64 --- /dev/null +++ b/papers/_posts/2020-01-01-methods-and-systems-for-compressing-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Methods and systems for compressing data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2020" +journal: US Patent 10,810,180 +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Methods and systems for compressing data, such as ion-mass information data in mass spectrometry spectra, to reduce index size are provided. Data in an index, such as fragment-ion data in a fragment-ion index, can be transformed for reduction of entropy and then encoded using a running counter technique to compress repetitive and redundant information in the index. 
diff --git "a/papers/_posts/2020-01-01-ngs-integrator--an-efficient-tool-for-combining-multiple-ngs-data-tracks-using-minimum-bayes\342\200\231-factors.md" "b/papers/_posts/2020-01-01-ngs-integrator--an-efficient-tool-for-combining-multiple-ngs-data-tracks-using-minimum-bayes\342\200\231-factors.md" new file mode 100644 index 00000000..7f61ecaa --- /dev/null +++ "b/papers/_posts/2020-01-01-ngs-integrator--an-efficient-tool-for-combining-multiple-ngs-data-tracks-using-minimum-bayes\342\200\231-factors.md" @@ -0,0 +1,40 @@ +--- +layout: paper +title: "NGS-Integrator: An efficient tool for combining multiple NGS data tracks using minimum Bayes’ factors" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Wen, Bronte; Jung, Hyun Jun; Chen, Lihe; Saeed, Fahad; Knepper, Mark A; " +year: "2020" +journal: BioMed Central +volume: 21 +issue: +pages: 1-7 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1186/s12864-020-07220-7" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Background Next-generation sequencing (NGS) is widely used for genome-wide identification and quantification of DNA elements involved in the regulation of gene transcription. Studies that generate multiple high-throughput NGS datasets require data integration methods for two general tasks: 1) generation of genome-wide data tracks representing an aggregate of multiple replicates of the same experiment; and 2) combination of tracks from different experimental types that provide complementary information regarding the location of genomic features such as enhancers. Results NGS-Integrator is a Java-based command line application, facilitating efficient integration of multiple genome-wide NGS datasets. 
NGS-Integrator first transforms all input data tracks using the complement of the minimum Bayes’ factor so that all values are expressed in the range [0,1] representing the probability of a true signal given the … diff --git a/papers/_posts/2021-01-01-a-multi-factorial-assessment-of-functional-human-autistic-spectrum-brain-network-analysis.md b/papers/_posts/2021-01-01-a-multi-factorial-assessment-of-functional-human-autistic-spectrum-brain-network-analysis.md new file mode 100644 index 00000000..5afd357e --- /dev/null +++ b/papers/_posts/2021-01-01-a-multi-factorial-assessment-of-functional-human-autistic-spectrum-brain-network-analysis.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A Multi-Factorial Assessment of Functional Human Autistic Spectrum Brain Network Analysis" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Artiles, Oswaldo; Saeed, Fahad; " +year: "2021" +journal: IEEE +volume: +issue: +pages: 3526-3531 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BIBM52615.2021.9669679" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +The variability of the results obtained by the statistical analysis of functional human brain networks depend on multiple factors such as: the source of the fMRI data, the brain parcellations, the graph theory measures, and the threshold values applied to the functional connectivity matrices to obtain adjacency matrices of sparse graphs. Therefore, the brain network used for down-stream analysis is heavily dependent on the methods that are applied to the fMRI data to obtain and analyze such networks. In this paper we present the preliminary results of a multi-factorial assessment of the statistical analysis of functional human brain networks. 
The assessment was performed in the functional human brain networks obtained from the resting state fMRI data of ten imaging sites provided by the Autism Brain Imaging Data Exchange (ABIDE) preprocessed functional magnetic resonance database, with six different functional … diff --git a/papers/_posts/2021-01-01-asd-saenet--a-sparse-autoencoder,-and-deep-neural-network-model-for-detecting-autism-spectrum-disorder-asd-using-fmri-data.md b/papers/_posts/2021-01-01-asd-saenet--a-sparse-autoencoder,-and-deep-neural-network-model-for-detecting-autism-spectrum-disorder-asd-using-fmri-data.md new file mode 100644 index 00000000..ca9d23ea --- /dev/null +++ b/papers/_posts/2021-01-01-asd-saenet--a-sparse-autoencoder,-and-deep-neural-network-model-for-detecting-autism-spectrum-disorder-asd-using-fmri-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "ASD-SAENet: a sparse autoencoder, and deep-neural network model for detecting autism spectrum disorder (ASD) using fMRI data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Almuqhim, Fahad; Saeed, Fahad; " +year: "2021" +journal: Frontiers Media SA +volume: 15 +issue: +pages: 654315 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.3389/fncom.2021.654315" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Autism spectrum disorder (ASD) is a heterogenous neurodevelopmental disorder which is characterized by impaired communication, and limited social interactions. 
The shortcomings of current clinical approaches which are based exclusively on behavioral observation of symptomology, and poor understanding of the neurological mechanisms underlying ASD necessitates the identification of new biomarkers that can aid in study of brain development, and functioning, and can lead to accurate and early detection of ASD. In this paper, we developed a deep-learning model called ASD-SAENet for classifying patients with ASD from typical control subjects using fMRI data. We designed and implemented a sparse autoencoder (SAE) which results in optimized extraction of features that can be used for classification. These features are then fed into a deep neural network (DNN) which results in superior classification of fMRI brain scans more prone to ASD. Our proposed model is trained to optimize the classifier while improving extracted features based on both reconstructed data error and the classifier error. We evaluated our proposed deep-learning model using publicly available Autism Brain Imaging Data Exchange (ABIDE) dataset collected from 17 different research centers, and include more than 1,035 subjects. Our extensive experimentation demonstrate that ASD-SAENet exhibits comparable accuracy (70.8%), and superior specificity (79.1%) for the whole dataset as compared to other methods. 
Further, our experiments demonstrate superior results as compared to other state-of-the-art methods on 12 out of the 17 imaging centers exhibiting … diff --git a/papers/_posts/2021-01-01-benchmarking-mass-spectrometry-based-proteomics-algorithms-using-a-simulated-database.md b/papers/_posts/2021-01-01-benchmarking-mass-spectrometry-based-proteomics-algorithms-using-a-simulated-database.md new file mode 100644 index 00000000..6db544e5 --- /dev/null +++ b/papers/_posts/2021-01-01-benchmarking-mass-spectrometry-based-proteomics-algorithms-using-a-simulated-database.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Benchmarking mass spectrometry based proteomics algorithms using a simulated database" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Awan, Muaaz Gul; Awan, Abdullah Gul; Saeed, Fahad; " +year: "2021" +journal: Springer Vienna +volume: 10 +issue: +pages: 23 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/s13721-021-00298-3" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Protein sequencing algorithms process data from a variety of instruments that has been generated under diverse experimental conditions. Currently there is no way to predict the accuracy of an algorithm for a given data set. Most of the published algorithms and associated software has been evaluated on limited number of experimental data sets. However, these performance evaluations do not cover the complete search space the algorithm and the software might encounter in real-world. To this end, we present a database of simulated spectra that can be used to benchmark any spectra to peptide search engine. We demonstrate the usability of this database by bench marking two popular peptide sequencing engines. 
We show wide variation in the accuracy of peptide deductions and a complete quality profile of a given algorithm can be useful for practitioners and algorithm developers. All benchmarking … diff --git a/papers/_posts/2021-01-01-communication-avoiding-micro-architecture-to-compute-xcorr-scores-for-peptide-identification.md b/papers/_posts/2021-01-01-communication-avoiding-micro-architecture-to-compute-xcorr-scores-for-peptide-identification.md new file mode 100644 index 00000000..f98a4a61 --- /dev/null +++ b/papers/_posts/2021-01-01-communication-avoiding-micro-architecture-to-compute-xcorr-scores-for-peptide-identification.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Communication-avoiding micro-architecture to compute Xcorr scores for peptide identification" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Kumar, Sumesh; Saeed, Fahad; " +year: "2021" +journal: IEEE +volume: +issue: +pages: 99-103 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/FPL53798.2021.00024" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Database algorithms play a crucial part in systems biology studies by identifying proteins from mass spectrometry data. Many of these database search algorithms incur huge computational costs by computing similarity scores for each pair of sparse experimental spectrum and candidate theoretical spectrum vectors. Modern MS instrumentation techniques which are capable of generating high-resolution spectrometry data require comparison against an enormous search space, further emphasizing the need of efficient accelerators. Recent research has shown that the overall cost of scoring, and deducing peptides is dominated by the communication costs between different hierarchies of memory and processing units. 
However, these communication costs are seldom considered in accelerator-based architectures leading to inefficient DRAM accesses, and poor data-utilization due to irregular memory access patterns … diff --git a/papers/_posts/2021-01-01-deepcovidnet--deep-convolutional-neural-network-for-covid-19-detection-from-chest-radiographic-images.md b/papers/_posts/2021-01-01-deepcovidnet--deep-convolutional-neural-network-for-covid-19-detection-from-chest-radiographic-images.md new file mode 100644 index 00000000..34f95748 --- /dev/null +++ b/papers/_posts/2021-01-01-deepcovidnet--deep-convolutional-neural-network-for-covid-19-detection-from-chest-radiographic-images.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "DeepCOVIDNet: Deep Convolutional Neural Network for COVID-19 Detection from Chest Radiographic Images" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Ahmed, Khandaker Mamun; Eslami, Taban; Saeed, Fahad; Amini, M Hadi; " +year: "2021" +journal: IEEE +volume: +issue: +pages: 1703-1710 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BIBM52615.2021.9669767" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +The novel Coronavirus Disease 2019 (COVID-19) is a global pandemic that has infected millions of people causing millions of deaths around the world. Reverse Transcription Polymerase Chain Reaction (RT-PCR) is the standard screening method for COVID-19 detection but it requires specific molecular-biology training. Moreover, the general workflow is difficult e.g. sample collection, processing time, and analysis expertise, etc. Chest radiographic image analysis can be a good alternative screening method that is faster, more efficient, and requires minimal clinical or molecular biology trained laboratory personnel. 
Early studies have shown that abnormalities on the chest radiographic images are likely to be the consequence of COVID-19 infection. In this study, we propose DeepCOVIDNet, a deep learning based COVID-19 detection model. Our proposed deep-learning model is a multiclass classifier that can … diff --git a/papers/_posts/2021-01-01-explainable-and-scalable-machine-learning-algorithms-for-detection-of-autism-spectrum-disorder-using-fmri-data.md b/papers/_posts/2021-01-01-explainable-and-scalable-machine-learning-algorithms-for-detection-of-autism-spectrum-disorder-using-fmri-data.md new file mode 100644 index 00000000..5f8f6c8c --- /dev/null +++ b/papers/_posts/2021-01-01-explainable-and-scalable-machine-learning-algorithms-for-detection-of-autism-spectrum-disorder-using-fmri-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Explainable and scalable machine learning algorithms for detection of autism spectrum disorder using fMRI data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Eslami, Taban; Raiker, Joseph S; Saeed, Fahad; " +year: "2021" +journal: Academic Press +volume: +issue: +pages: 39-54 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1016/B978-0-12-822822-7.00004-1" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Diagnosing autism spectrum disorder (ASD) is a challenging problem and is based purely on behavioral descriptions of symptomology (Diagnostic and Statistical Manual—5th Edition/ICD-10). Numerous limitations (e.g., informant discrepancies, lack of adherence to assessment guidelines) to current diagnostic practices have the potential to result in misdiagnosis of the disorder. 
Prior research provides evidence that structural and functional magnetic resonance imaging (MRI) data collected from individuals with ASD exhibit distinguishing characteristics that differ in neural patterns of the brain. Our proposed deep learning model ASD-DiagNet exhibits consistently high accuracy for classification of ASD brain scans from neurotypical scans. We have integrated traditional machine learning and deep learning techniques that allow us to isolate ASD biomarkers from MRI data sets. Our method, called Auto-ASD-Network … diff --git a/papers/_posts/2021-01-01-graph-theoretic-approach-for-the-analysis-of-comprehensive-mass-spectrometry-ms-ms-data-of-dissolved-organic-matter.md b/papers/_posts/2021-01-01-graph-theoretic-approach-for-the-analysis-of-comprehensive-mass-spectrometry-ms-ms-data-of-dissolved-organic-matter.md new file mode 100644 index 00000000..0d5a290b --- /dev/null +++ b/papers/_posts/2021-01-01-graph-theoretic-approach-for-the-analysis-of-comprehensive-mass-spectrometry-ms-ms-data-of-dissolved-organic-matter.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Graph Theoretic Approach for the Analysis of Comprehensive Mass-Spectrometry (MS/MS) Data of Dissolved Organic Matter" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Tariq, Muhammad Usman; Leyvay, Dennys; Limaz, Francisco Alberto Fernandez; Saeed, Fahad; " +year: "2021" +journal: IEEE +volume: +issue: +pages: 3742-3746 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BIBM52615.2021.9669289" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Dissolved organic matter (DOM) is a highly complex mixture of organic substances found in aquatic ecosystems. 
This mixture results from the degradation of primary producers within the ecosystem, groundwater, and the surrounding terrestrial sources. Understanding the chemical structure of DOM is crucial to assessing its impact on aquatic ecosystems. Although multiple studies have addressed the complexity of DOM, the molecular structure of this set of compounds remains unclear. In this work, we present a novel computational framework “Graph-DOM,” to assess the comprehensive fragmentation data obtained from the analysis of DOM using the Data Independent Fragmentation strategy with ESI-FT-ICR MS/MS enabling better understanding of the structural complexity of DOM. Graph-DOM uses graph algorithms to dissect a compiled output file obtained from processing hundreds of ultra-high-resolution … diff --git a/papers/_posts/2021-01-01-hicops--high-performance-computing-framework-for-tera-scale-database-search-of-mass-spectrometry-based-omics-data.md b/papers/_posts/2021-01-01-hicops--high-performance-computing-framework-for-tera-scale-database-search-of-mass-spectrometry-based-omics-data.md new file mode 100644 index 00000000..f9b60a89 --- /dev/null +++ b/papers/_posts/2021-01-01-hicops--high-performance-computing-framework-for-tera-scale-database-search-of-mass-spectrometry-based-omics-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "HiCOPS: High Performance Computing Framework for Tera-Scale Database Search of Mass Spectrometry based Omics Data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Haseeb, Muhammad; Saeed, Fahad; " +year: "2021" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.48550/arXiv.2102.02286" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Database-search algorithms, that deduce peptides from 
Mass Spectrometry (MS) data, have tried to improve computational efficiency to enable larger and more complex systems biology studies. Existing serial and high-performance computing (HPC) search engines, otherwise highly successful, are known to exhibit poor scalability with the increasing size of the theoretical search space needed for the increased complexity of modern non-model, multi-species MS-based omics analysis. Consequently, the bottleneck for computational techniques is the communication cost of moving data across the memory hierarchy or between processing units, and not the arithmetic operations. This post-Moore change in architecture and the demands of modern systems biology experiments have dampened the overall effectiveness of existing HPC workflows. We present a novel, efficient parallel computational method, and its implementation on distributed-memory architectures as a peptide identification tool called HiCOPS, that enables more than 100-fold improvement in speed over most existing HPC proteome database search tools. HiCOPS empowers the supercomputing database search concept for comprehensive identification of peptides and all their modified forms within a reasonable time frame. We demonstrate this by searching gigabytes of experimental MS data against terabytes of databases, where HiCOPS completes peptide identification in a few minutes using 72 parallel nodes (1728 cores) compared to several weeks required by existing state-of-the-art tools using 1 node (24 cores); 100 minutes vs 5 weeks; 500x speedup. 
Finally, we formulate a … diff --git a/papers/_posts/2021-01-01-high-performance-computing-framework-for-tera-scale-database-search-of-mass-spectrometry-data.md b/papers/_posts/2021-01-01-high-performance-computing-framework-for-tera-scale-database-search-of-mass-spectrometry-data.md new file mode 100644 index 00000000..9ee76c18 --- /dev/null +++ b/papers/_posts/2021-01-01-high-performance-computing-framework-for-tera-scale-database-search-of-mass-spectrometry-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "High performance computing framework for tera-scale database search of mass spectrometry data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Haseeb, Muhammad; Saeed, Fahad; " +year: "2021" +journal: Nature Publishing Group +volume: 1 +issue: +pages: 550-561 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1038/s43588-021-00113-z" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Database peptide search algorithms deduce peptides from mass spectrometry data. There has been substantial effort in improving their computational efficiency to achieve larger and more complex systems biology studies. However, modern serial and high-performance computing (HPC) algorithms exhibit suboptimal performance mainly due to their ineffective parallel designs (low resource utilization) and high overhead costs. We present an HPC framework, called HiCOPS, for efficient acceleration of the database peptide search algorithms on distributed-memory supercomputers. HiCOPS provides, on average, more than tenfold improvement in speed and superior parallel performance over several existing HPC database search software. 
We also formulate a mathematical model for performance analysis and optimization, and report near-optimal results for several key metrics including strong-scale efficiency … diff --git a/papers/_posts/2021-01-01-machine-learning-methods-for-diagnosing-autism-spectrum-disorder-and-attention-deficit-hyperactivity-disorder-using-functional-and-structural-mri--a-survey.md b/papers/_posts/2021-01-01-machine-learning-methods-for-diagnosing-autism-spectrum-disorder-and-attention-deficit-hyperactivity-disorder-using-functional-and-structural-mri--a-survey.md new file mode 100644 index 00000000..d646cbc4 --- /dev/null +++ b/papers/_posts/2021-01-01-machine-learning-methods-for-diagnosing-autism-spectrum-disorder-and-attention-deficit-hyperactivity-disorder-using-functional-and-structural-mri--a-survey.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Machine Learning methods for diagnosing Autism Spectrum Disorder and Attention-deficit/Hyperactivity Disorder using functional and structural MRI: A Survey" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Eslami, Taban; Almuqhim, Fahad; Raiker, Joseph S; Saeed, Fahad; " +year: "2021" +journal: Frontiers +volume: 14 +issue: +pages: 62 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.3389/fninf.2020.575999/full" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Here we summarize recent progress in machine learning model for diagnosis of Autism Spectrum Disorder (ASD) and Attention-deficit/Hyperactivity Disorder (ADHD). We outline and describe the machine-learning, especially deep-learning, techniques that are suitable for addressing research questions in this domain, pitfalls of the available methods, as well as future directions for the field. 
We envision a future where the diagnosis of ASD, ADHD, and other mental disorders is accomplished, and quantified using imaging techniques, such as MRI, and machine-learning models. diff --git a/papers/_posts/2021-01-01-methods-for-proteogenomics-data-analysis,-challenges,-and-scalability-bottlenecks--a-survey.md b/papers/_posts/2021-01-01-methods-for-proteogenomics-data-analysis,-challenges,-and-scalability-bottlenecks--a-survey.md new file mode 100644 index 00000000..391e3e01 --- /dev/null +++ b/papers/_posts/2021-01-01-methods-for-proteogenomics-data-analysis,-challenges,-and-scalability-bottlenecks--a-survey.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Methods for Proteogenomics Data Analysis, Challenges, and Scalability Bottlenecks: A Survey" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Tariq, Muhammad Usman; Haseeb, Muhammad; Aledhari, Mohammed; Razzak, Rehma; Parizi, Reza M; Saeed, Fahad; " +year: "2021" +journal: IEEE +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/ACCESS.2020.3047588" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Big Data Proteogenomics lies at the intersection of high-throughput Mass Spectrometry (MS) based proteomics and Next Generation Sequencing based genomics. The combined and integrated analysis of these two high-throughput technologies can help discover novel proteins using genomic, and transcriptomic data. Due to the biological significance of integrated analysis, the recent past has seen an influx of proteogenomic tools that perform various tasks, including mapping proteins to the genomic data, searching experimental MS spectra against a six-frame translation genome database, and automating the process of annotating genome sequences. 
To date, most of such tools have not focused on scalability issues that are inherent in proteogenomic data analysis where the size of the database is much larger than a typical protein database. These state-of-the-art tools can take more than half a month to process … diff --git a/papers/_posts/2021-01-01-neural-engineering-techniques-for-autism-spectrum-disorder--volume-1--imaging-and-signal-analysis.md b/papers/_posts/2021-01-01-neural-engineering-techniques-for-autism-spectrum-disorder--volume-1--imaging-and-signal-analysis.md new file mode 100644 index 00000000..456cf4e1 --- /dev/null +++ b/papers/_posts/2021-01-01-neural-engineering-techniques-for-autism-spectrum-disorder--volume-1--imaging-and-signal-analysis.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Neural Engineering Techniques for Autism Spectrum Disorder: Volume 1: Imaging and Signal Analysis" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "El-Baz, Ayman S; Suri, Jasjit S; " +year: "2021" +journal: Academic Press +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Neural Engineering for Autism Spectrum Disorder, Volume One: Imaging and Signal Analysis Techniques presents the latest advances in neural engineering and biomedical engineering as applied to the clinical diagnosis and treatment of Autism Spectrum Disorder (ASD). Advances in the role of neuroimaging, infrared spectroscopy, sMRI, fMRI, DTI, social behaviors and suitable data analytics useful for clinical diagnosis and research applications for Autism Spectrum Disorder are covered, including relevant case studies. 
The application of brain signal evaluation, EEG analytics, feature selection, and analysis of blood oxygen level-dependent (BOLD) signals are presented for detection and estimation of the degree of ASD. Presents applications of Neural Engineering and other Machine Learning techniques for the diagnosis of Autism Spectrum Disorder (ASD) Includes in-depth technical coverage of imaging and signal analysis techniques, including coverage of functional MRI, neuroimaging, infrared spectroscopy, sMRI, fMRI, DTI, and neuroanatomy of autism Covers Signal Analysis for the detection and estimation of Autism Spectrum Disorder (ASD), including brain signal analysis, EEG analytics, feature selection, and analysis of blood oxygen level-dependent (BOLD) signals for ASD Written to help engineers, computer scientists, researchers and clinicians understand the technology and applications of Neural Engineering for the detection and diagnosis of Autism Spectrum Disorder (ASD) diff --git a/papers/_posts/2021-01-01-real-time-peptide-identification-from-high-throughput-mass-spectrometry-data.md b/papers/_posts/2021-01-01-real-time-peptide-identification-from-high-throughput-mass-spectrometry-data.md new file mode 100644 index 00000000..4284d3f3 --- /dev/null +++ b/papers/_posts/2021-01-01-real-time-peptide-identification-from-high-throughput-mass-spectrometry-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Real-time peptide identification from high-throughput mass-spectrometry data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Kumar, Sumesh; Saeed, Fahad; " +year: "2021" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1145/3459930.3470856" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Peptide deduction remains one 
of the most challenging research problems in the large-scale study of proteomes using high-throughput mass spectrometers. The identification of a large number of proteins from complex biological samples can be carried out in two steps: 1) tryptic digestion of the protein sample to isolate constituent peptides, followed by generation of MS/MS data using high-throughput mass spectrometers; 2) once the data is generated, various methods such as database-search tools are used to compare the mass-spectrometry data against a repository of known peptides. Advances in MS instrumentation now allow generation of high-resolution data in massive volume and velocity, making traditional MS-based algorithms a bottleneck in the overall workflows [4]. A new generation of state-of-the-art database search tools is now capable of producing high-quality matches with impressively low FDR; however, the search usually takes somewhere between a few weeks and a few months depending on the size of the database and the search parameters. To accelerate overall search times, several studies have been proposed which target this computational bottleneck by exploiting specialized hardware architectures including HPC compute clusters and GPUs [2],[1]. Even with these accelerated pipelines, the dream of true real-time processing and deduction of peptides from MS data is far from realization. 
One bottleneck preventing the design of true real-time processing of MS based data is the cost of communication of the data required for the existing workflows [3] ie moving the data from storage to computational nodes and across … diff --git a/papers/_posts/2021-01-01-search-feasibility-in-distributed-ms-proteomics-big-data.md b/papers/_posts/2021-01-01-search-feasibility-in-distributed-ms-proteomics-big-data.md new file mode 100644 index 00000000..a56d796b --- /dev/null +++ b/papers/_posts/2021-01-01-search-feasibility-in-distributed-ms-proteomics-big-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Search feasibility in distributed MS-proteomics big data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Mohammad, Umair; Saeed, Fahad; " +year: "2021" +journal: +volume: +issue: +pages: 1-1 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1145/3459930.3470855" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Making large-scale Mass Spectrometry (MS) data FAIR (Findable, Accessible, Interoperable, Reusable) and democratizing access for the omics research community requires advance access and reuse mechanisms. In this work, we proposed a novel distributed data access infrastructure and developed a simulation test-bed to show the feasibility of this solution. In contrast to existing centralized approaches, participating nodes are relied upon to execute the search algorithm and search based on the comparison of raw spectra is supported as opposed to simple meta-data based searches. Simulation results using networking, stochastic modelling, and queuing theory, illustrated that search times were reduced by up-to 600 times for up-to a total of fifty billion spectra. 
Proteomics is vital because of the importance proteins to life and their role in state-of-the-art medicine such as custom drug delivery and cancer treatment … diff --git a/papers/_posts/2021-01-01-simulation-testbed-for-evaluating-distributed-querying-and-searching-of-mass-spectrometry-big-data-in-a-network-based-infrastructure.md b/papers/_posts/2021-01-01-simulation-testbed-for-evaluating-distributed-querying-and-searching-of-mass-spectrometry-big-data-in-a-network-based-infrastructure.md new file mode 100644 index 00000000..c40d7724 --- /dev/null +++ b/papers/_posts/2021-01-01-simulation-testbed-for-evaluating-distributed-querying-and-searching-of-mass-spectrometry-big-data-in-a-network-based-infrastructure.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Simulation Testbed for Evaluating Distributed Querying and Searching of Mass Spectrometry Big Data in a Network-based Infrastructure" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Mohammad, Umair; Saeed, Fahad; " +year: "2021" +journal: IEEE +volume: +issue: +pages: 137-142 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BigDataService52369.2021.00022" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Advance access and reuse mechanisms for large-scale Mass Spectrometry (MS) data are essential for democratizing data for the omics research community and making it adhere to FAIR (Findable, Accessible, Interoperable, Reusable) principles. Although a number of centralized data repositories have been established, they have been limited to search mechanisms that depend on the meta-data associated with these MS datasets. Furthermore, they require constant influx of resources for maintenance. 
In this paper, we proposed an alternative novel distributed infrastructure for direct MS/MS spectral search. We designed and developed a simulation testbed using concepts from computer networks, queuing theory, and stochastic simulation methods. Results show that a distributed MS search based on raw MS/MS spectra can scale gracefully for up-to 2000 participating nodes, while simultaneously processing … diff --git a/papers/_posts/2021-01-01-source-data--high-performance-computing-framework-for-tera-scale-database-search-of-mass-spectrometry-data.md b/papers/_posts/2021-01-01-source-data--high-performance-computing-framework-for-tera-scale-database-search-of-mass-spectrometry-data.md new file mode 100644 index 00000000..f585e58a --- /dev/null +++ b/papers/_posts/2021-01-01-source-data--high-performance-computing-framework-for-tera-scale-database-search-of-mass-spectrometry-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Source data: high performance computing framework for tera-scale database search of mass spectrometry data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Haseeb, Muhammad; Saeed, Fahad; " +year: "2021" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + + diff --git a/papers/_posts/2021-01-01-specollate--deep-cross-modal-similarity-network-for-mass-spectrometry-data-based-peptide-deductions.md b/papers/_posts/2021-01-01-specollate--deep-cross-modal-similarity-network-for-mass-spectrometry-data-based-peptide-deductions.md new file mode 100644 index 00000000..672d3c89 --- /dev/null +++ b/papers/_posts/2021-01-01-specollate--deep-cross-modal-similarity-network-for-mass-spectrometry-data-based-peptide-deductions.md @@ -0,0 +1,40 @@ +--- 
+layout: paper +title: "SpeCollate: Deep cross-modal similarity network for mass spectrometry data based peptide deductions" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Tariq, Muhammad Usman; Saeed, Fahad; " +year: "2021" +journal: Public Library of Science San Francisco, CA USA +volume: 16 +issue: +pages: e0259349 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1371/journal.pone.0259349" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Historically, the database search algorithms have been the de facto standard for inferring peptides from mass spectrometry (MS) data. Database search algorithms deduce peptides by transforming theoretical peptides into theoretical spectra and matching them to the experimental spectra. Heuristic similarity-scoring functions are used to match an experimental spectrum to a theoretical spectrum. However, the heuristic nature of the scoring functions and the simple transformation of the peptides into theoretical spectra, along with noisy mass spectra for the less abundant peptides, can introduce a cascade of inaccuracies. In this paper, we design and implement a Deep Cross-Modal Similarity Network called SpeCollate, which overcomes these inaccuracies by learning the similarity function between experimental spectra and peptides directly from the labeled MS data. SpeCollate transforms spectra and peptides into a shared Euclidean subspace by learning fixed size embeddings for both. Our proposed deep-learning network trains on sextuplets of positive and negative examples coupled with our custom-designed SNAP-loss function. Online hardest negative mining is used to select the appropriate negative examples for optimal training performance. 
We use 4.8 million sextuplets obtained from the NIST and MassIVE peptide libraries to train the network and demonstrate that for closed search, SpeCollate is able to perform better than Crux and MSFragger in terms of the number of peptide-spectrum matches (PSMs) and unique peptides identified under 1% FDR for real-world data. SpeCollate also identifies a large number of peptides not reported by … diff --git a/papers/_posts/2021-01-01-turbobc--a-memory-efficient-and-scalable-gpu-based-betweenness-centrality-algorithm-in-the-language-of-linear-algebra.md b/papers/_posts/2021-01-01-turbobc--a-memory-efficient-and-scalable-gpu-based-betweenness-centrality-algorithm-in-the-language-of-linear-algebra.md new file mode 100644 index 00000000..594805bb --- /dev/null +++ b/papers/_posts/2021-01-01-turbobc--a-memory-efficient-and-scalable-gpu-based-betweenness-centrality-algorithm-in-the-language-of-linear-algebra.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "TurboBC: A Memory Efficient and Scalable GPU Based Betweenness Centrality Algorithm in the Language of Linear Algebra" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Artiles, Oswaldo; Saeed, Fahad; " +year: "2021" +journal: +volume: +issue: +pages: 1-10 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1145/3458744.3474047" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Betweenness centrality (BC) is a shortest path centrality metric used to measure the influence of individual vertices or edges on huge graphs that are used for modeling and analysis of human brain, omics data, or social networks. The application of the BC algorithm to modern graphs must deal with the size of the graphs, as well with highly irregular data-access patterns. 
These challenges are particularly important when the BC algorithm is implemented on Graphics Processing Units (GPU), due to the limited global memory of these processors, as well as the decrease in performance due to the load unbalance resulting from processing irregular data structures. In this paper, we present the first GPU based linear-algebraic formulation and implementation of BC, called TurboBC, a set of memory efficient BC algorithms that exhibits good performance and high scalability on unweighted, undirected or directed sparse … diff --git a/papers/_posts/2021-01-01-turbobfs--gpu-based-breadth-first-search-bfs-algorithms-in-the-language-of-linear-algebra.md b/papers/_posts/2021-01-01-turbobfs--gpu-based-breadth-first-search-bfs-algorithms-in-the-language-of-linear-algebra.md new file mode 100644 index 00000000..28adb258 --- /dev/null +++ b/papers/_posts/2021-01-01-turbobfs--gpu-based-breadth-first-search-bfs-algorithms-in-the-language-of-linear-algebra.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "TurboBFS: GPU Based Breadth-First Search (BFS) Algorithms in the Language of Linear Algebra" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Artiles, Oswaldo; Saeed, Fahad; " +year: "2021" +journal: IEEE +volume: +issue: +pages: 520-528 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/IPDPSW52791.2021.00084" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Graphs that are used for modeling of human brain, omics data, or social networks are huge, and manual inspection of these graph is impossible. A popular, and fundamental, method used for making sense of these large graphs is the well-known Breadth-First Search (BFS) algorithm. However, BFS suffers from large computational cost especially for big graphs of interest. 
More recently, the use of Graphics processing units (GPU) has been promising, but challenging because of limited global memory of GPU’s, and irregular structures of real-world graphs. In this paper, we present a GPU based linear-algebraic formulation and implementation of BFS, called TurboBFS, that exhibits excellent scalability on unweighted, undirected or directed sparse graphs of arbitrary structure. We demonstrate that our algorithms obtain up to 40 GTEPs, and are on average 15.7x, 5.8x, and 1.8x faster than the other state-of-the-art … diff --git a/papers/_posts/2022-01-01-a-easy-to-use-generalized-template-to-support-development-of-gpu-algorithms.md b/papers/_posts/2022-01-01-a-easy-to-use-generalized-template-to-support-development-of-gpu-algorithms.md new file mode 100644 index 00000000..a42bf589 --- /dev/null +++ b/papers/_posts/2022-01-01-a-easy-to-use-generalized-template-to-support-development-of-gpu-algorithms.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "A Easy to Use Generalized Template to Support Development of GPU Algorithms" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2022" +journal: Springer +volume: +issue: +pages: 77-87 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-01960-9_6" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Computational techniques have taken a new meaning for scientific inquiry in biology especially after the introduction of high-throughput experimental techniques. These instruments can produce massive amounts of data that needs to be processed in a scalable fashion to ensure that we can make sense of these data sets from various sources [, ]. 
As expected, Mass Spectrometry (MS) based omics is essential for precision medicine, cancer research, and drug discovery, but the scale at which these data sets need to be processed is massive (tera- to peta-byte levels). We have also shown that proteomics and meta-proteomics searches can take impractically long times, which can become a major technical hurdle in these systems biology studies. The existing serial algorithms scale very poorly with increasing size of the data sets, and HPC methods are also shown to be much less than optimal. diff --git a/papers/_posts/2022-01-01-biomedical-iot--enabling-technologies,-architectural-elements,-challenges,-and-future-directions.md b/papers/_posts/2022-01-01-biomedical-iot--enabling-technologies,-architectural-elements,-challenges,-and-future-directions.md new file mode 100644 index 00000000..600ce10e --- /dev/null +++ b/papers/_posts/2022-01-01-biomedical-iot--enabling-technologies,-architectural-elements,-challenges,-and-future-directions.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Biomedical IoT: Enabling Technologies, Architectural Elements, Challenges, and Future Directions" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Aledhari, Mohammed; Razzak, Rehma; Qolomany, Basheer; Al-Fuqaha, Ala; Saeed, Fahad; " +year: "2022" +journal: IEEE +volume: 10 +issue: +pages: 31306-31339 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/ACCESS.2022.3159235" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +This paper provides a comprehensive literature review of various technologies and protocols used for medical Internet of Things (IoT) with a thorough examination of current enabling technologies, use cases, applications, and challenges. 
Despite recent advances, medical IoT is still not considered a routine practice. Due to regulatory, ethical, and technological challenges of biomedical hardware, the growth of medical IoT is inhibited. Medical IoT continues to advance in terms of biomedical hardware and the monitoring of figures like vital signs, temperature, electrical signals, oxygen levels, cancer indicators, glucose levels, and other bodily levels. In the upcoming years, medical IoT is expected to replace old healthcare systems. In comparison to other survey papers on this topic, our paper provides a thorough summary of the most relevant protocols and technologies specifically for medical IoT as well as the challenges. Our … diff --git a/papers/_posts/2022-01-01-classification-of-autism-spectrum-disorder-using-rs-fmri-data-and-graph-convolutional-networks.md b/papers/_posts/2022-01-01-classification-of-autism-spectrum-disorder-using-rs-fmri-data-and-graph-convolutional-networks.md new file mode 100644 index 00000000..bfa8fc77 --- /dev/null +++ b/papers/_posts/2022-01-01-classification-of-autism-spectrum-disorder-using-rs-fmri-data-and-graph-convolutional-networks.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Classification of Autism Spectrum Disorder Using rs-fMRI data and Graph Convolutional Networks" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Yang, Tianren; Al-Duailij, Mai A; Bozdag, Serdar; Saeed, Fahad; " +year: "2022" +journal: IEEE +volume: +issue: +pages: 3131-3138 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BigData55660.2022.10021070" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Autism spectrum disorder (ASD) affects a large number of children and adults in the US and worldwide. 
Early and quick diagnosis of ASD can improve the quality of life significantly both for patients and their families. Prior research provides strong evidence that structural and functional magnetic resonance imaging (MRI) data collected from individuals with ASD exhibit distinguishing characteristics that differ in local and global, spatial and temporal neural patterns of the brain – and therefore can be used for diagnostic purposes for various mental disorders. However, the data from MRI are high-dimensional and advanced methods are needed to make sense of these datasets. In this paper, we present a novel model based on graph convolutional network (GCN) that can utilize resting state fMRI (rs-fMRI) data to classify ASD subjects from healthy controls (HC). In addition to using the graph from traditional correlation … diff --git a/papers/_posts/2022-01-01-communication-lower-bounds-for-distributed-memory-computations-for-mass-spectrometry-based-omics-data.md b/papers/_posts/2022-01-01-communication-lower-bounds-for-distributed-memory-computations-for-mass-spectrometry-based-omics-data.md new file mode 100644 index 00000000..334895af --- /dev/null +++ b/papers/_posts/2022-01-01-communication-lower-bounds-for-distributed-memory-computations-for-mass-spectrometry-based-omics-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Communication lower-bounds for distributed-memory computations for mass spectrometry based omics data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; Iyengar, SS; " +year: "2022" +journal: Academic Press +volume: 161 +issue: +pages: 37-47 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1016/j.jpdc.2021.11.001" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Mass spectrometry (MS) 
based omics data analysis requires significant time and resources. To date, few parallel algorithms have been proposed for deducing peptides from mass spectrometry-based data. However, these parallel algorithms were designed and developed when the amount of data that needed to be processed was smaller in scale. In this paper, we prove that the communication bound that is reached by the existing parallel algorithms is Ω(mn + 2rq/p), where m and n are the dimensions of the theoretical database matrix, q and r are dimensions of spectra, and p is the number of processors. We further prove that a communication-optimal strategy with fast-memory M = mn + 2qr/p can achieve Ω(2mnq/p) but is not achieved by any existing parallel proteomics algorithm to date. To validate our claim, we performed a meta-analysis of published parallel algorithms and their performance results. We … diff --git a/papers/_posts/2022-01-01-computational-cpu-gpu-template-for-pre-processing-of-floating-point-ms-data.md b/papers/_posts/2022-01-01-computational-cpu-gpu-template-for-pre-processing-of-floating-point-ms-data.md new file mode 100644 index 00000000..01fd088d --- /dev/null +++ b/papers/_posts/2022-01-01-computational-cpu-gpu-template-for-pre-processing-of-floating-point-ms-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Computational CPU-GPU Template for Pre-processing of Floating-Point MS Data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2022" +journal: Springer +volume: +issue: +pages: 89-97 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-01960-9_7" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +The data from MS spectra is usually stored as a short list of numbers. 
To process these spectra, more often than not, one has to “see” inside the data to make data pre- and post-processing decisions. Sorting and searching an array of numbers is one of the oldest problems in computer science. There has been significant effort in developing algorithms that can sort very large arrays. diff --git a/papers/_posts/2022-01-01-existing-hpc-methods-and-the-communication-lower-bounds-for-distributed-memory-computations-for-mass-spectrometry-based-omics-data.md b/papers/_posts/2022-01-01-existing-hpc-methods-and-the-communication-lower-bounds-for-distributed-memory-computations-for-mass-spectrometry-based-omics-data.md new file mode 100644 index 00000000..0761bfd5 --- /dev/null +++ b/papers/_posts/2022-01-01-existing-hpc-methods-and-the-communication-lower-bounds-for-distributed-memory-computations-for-mass-spectrometry-based-omics-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Existing HPC Methods and the Communication Lower Bounds for Distributed-Memory Computations for Mass Spectrometry-Based Omics Data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2022" +journal: Springer +volume: +issue: +pages: 21-35 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-01960-9_3" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Mass spectrometry (MS)-based omics data analysis requires substantial time and resources, which has created the need for high-performance computing (HPC) methods. Few parallel algorithms have been proposed, designed, and developed when the amount of data that needed to be processed was smaller in scale, i.e. 
only a few PTMs were of interest and a shorter theoretical database sufficed for computations. diff --git a/papers/_posts/2022-01-01-fast-spectral-pre-processing-for-big-ms-data.md b/papers/_posts/2022-01-01-fast-spectral-pre-processing-for-big-ms-data.md new file mode 100644 index 00000000..260b5b41 --- /dev/null +++ b/papers/_posts/2022-01-01-fast-spectral-pre-processing-for-big-ms-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Fast Spectral Pre-processing for Big MS Data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2022" +journal: Springer +volume: +issue: +pages: 57-75 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-01960-9_5" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +In this chapter, we discuss and introduce a data pre-processing algorithm for dimensionality reduction of big MS data. We will start by discussing a few spectral pre-processing methods followed by the introduction of MS-REDUCE, which is a highly efficient method for processing MS data. 
diff --git a/papers/_posts/2022-01-01-g-msr--a-gpu-based-dimensionality-reduction-algorithm.md b/papers/_posts/2022-01-01-g-msr--a-gpu-based-dimensionality-reduction-algorithm.md new file mode 100644 index 00000000..9968ba06 --- /dev/null +++ b/papers/_posts/2022-01-01-g-msr--a-gpu-based-dimensionality-reduction-algorithm.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "G-MSR: A GPU-Based Dimensionality Reduction Algorithm" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2022" +journal: Springer +volume: +issue: +pages: 99-110 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-01960-9_8" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +In our previous chapters, we introduced a generalized strategy that can be used for processing MS-based omics data sets on a CPU-GPU architecture. 
diff --git a/papers/_posts/2022-01-01-high-performance-algorithms-for-mass-spectrometry-based-omics.md b/papers/_posts/2022-01-01-high-performance-algorithms-for-mass-spectrometry-based-omics.md new file mode 100644 index 00000000..cfa8f134 --- /dev/null +++ b/papers/_posts/2022-01-01-high-performance-algorithms-for-mass-spectrometry-based-omics.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "High-Performance Algorithms for Mass Spectrometry-Based Omics" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2022" +journal: Springer +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-01960-9" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +To date, the processing of high-throughput Mass Spectrometry (MS) data is primarily accomplished using serial algorithms. Developing new methods to process MS data is an active area of research [1], but there is no single strategy that focuses on scalability of MS-based methods [2]. MS is a diverse and versatile technology for high-throughput functional characterization of proteins, small molecules, and metabolites in complex biological mixtures. In recent years, the technology has rapidly evolved and is now capable of generating increasingly large (multiple terabytes per experiment) [1] and complex (multiple species/microbiome/high-dimensional) data sets [3]. These rapid advances in MS instrumentation must be matched by an equally rapid evolution of scalable methods developed for the analysis of these complex data sets. 
Ideally, the new methods should leverage the rich heterogeneous computational … diff --git a/papers/_posts/2022-01-01-high-performance-computing-strategy-using-distributed-memory-supercomputers.md b/papers/_posts/2022-01-01-high-performance-computing-strategy-using-distributed-memory-supercomputers.md new file mode 100644 index 00000000..7261e9f3 --- /dev/null +++ b/papers/_posts/2022-01-01-high-performance-computing-strategy-using-distributed-memory-supercomputers.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "High-Performance Computing Strategy Using Distributed-Memory Supercomputers" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2022" +journal: Springer +volume: +issue: +pages: 37-56 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-01960-9_4" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Database peptide search is the most commonly employed computational technique to deduce peptides from the experimentally obtained mass spectrometry data. In this technique, the experimental spectral data are compared against a protein sequence database through various search algorithms in order to assign the correct peptide sequence to each experimental spectrum. Since the experimental spectra data (histogram-like data) and the peptide sequence data (text data) are not one-to-one comparable, database search simulates the mass spectrometry process in silico to generate theoretical spectra from the peptide sequences in the database. 
diff --git a/papers/_posts/2022-01-01-introduction-to-mass-spectrometry-data.md b/papers/_posts/2022-01-01-introduction-to-mass-spectrometry-data.md new file mode 100644 index 00000000..d1d3261e --- /dev/null +++ b/papers/_posts/2022-01-01-introduction-to-mass-spectrometry-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Introduction to Mass Spectrometry Data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2022" +journal: Springer +volume: +issue: +pages: 7-19 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-01960-9_2" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Mass spectrometry (MS) is used to elucidate the chemical structures of peptide molecules and has numerous systems biology applications. Mass spectrometry is also used in metabolomics, glycomics, lipidomics and clinical applications. 
diff --git a/papers/_posts/2022-01-01-machine-learning-and-the-future-of-hpc-for-ms-based-omics.md b/papers/_posts/2022-01-01-machine-learning-and-the-future-of-hpc-for-ms-based-omics.md new file mode 100644 index 00000000..0a4152f8 --- /dev/null +++ b/papers/_posts/2022-01-01-machine-learning-and-the-future-of-hpc-for-ms-based-omics.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Machine-Learning and the Future of HPC for MS-Based Omics" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2022" +journal: Springer +volume: +issue: +pages: 125-129 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-01960-9_10" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +To date, MS proteomics data is identified using database search algorithms based purely on numerical techniques or some de novo techniques that allow peptide identification without using databases. Currently, there is no single strategy from database search or de novo techniques that can claim to be the most accurate strategy. 
diff --git a/papers/_posts/2022-01-01-molecular-level-characterization-of-dom-along-a-freshwater-to-estuarine-coastal-gradient-in-the-florida-everglades.md b/papers/_posts/2022-01-01-molecular-level-characterization-of-dom-along-a-freshwater-to-estuarine-coastal-gradient-in-the-florida-everglades.md new file mode 100644 index 00000000..3d3a35b8 --- /dev/null +++ b/papers/_posts/2022-01-01-molecular-level-characterization-of-dom-along-a-freshwater-to-estuarine-coastal-gradient-in-the-florida-everglades.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Molecular level characterization of DOM along a freshwater-to-estuarine coastal gradient in the Florida Everglades" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Leyva, Dennys; Jaffé, Rudolf; Courson, Jessica; Kominoski, John S; Tariq, Muhammad Usman; Saeed, Fahad; Fernandez-Lima, Francisco; " +year: "2022" +journal: Springer International Publishing Cham +volume: 84 +issue: +pages: 63 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/s00027-022-00887-y" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Understanding dissolved organic matter (DOM) export to the ocean is needed to assess the impact of climate change on the global carbon cycle. The molecular-level characterization of DOM compositional variability and complexity in aquatic ecosystems has been analytically challenging. Advanced analytical studies based on ultra-high resolution mass spectrometry (FT ICR MS) have proven highly successful to better understand the dynamics of DOM in coastal ecosystems. 
In this work, the molecular signature of DOM along a freshwater-to-estuarine gradient in the Harney River, Florida Everglades was analyzed for the first time using a novel approach based on tandem high resolution ion mobility and ultra-high resolution mass spectrometry (ESI-TIMS-FT ICR MS). This method enhances traditional DOM molecular characterization by including the molecular isomeric complexity. An average of six and up to 12 … diff --git a/papers/_posts/2022-01-01-need-for-high-performance-computing-for-ms-based-omics-data-analysis.md b/papers/_posts/2022-01-01-need-for-high-performance-computing-for-ms-based-omics-data-analysis.md new file mode 100644 index 00000000..818e52ce --- /dev/null +++ b/papers/_posts/2022-01-01-need-for-high-performance-computing-for-ms-based-omics-data-analysis.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Need for High-Performance Computing for MS-Based Omics Data Analysis" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2022" +journal: Springer +volume: +issue: +pages: 1-5 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-01960-9_1" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +For the past 30 years, significant efforts were invested in designing and implementing more efficient scoring functions, which included highly successful search engines. Similar to other domains, numerical algorithms were developed for Mass Spectrometry (MS)-based peptide deduction and are designed and implemented by assuming the number of arithmetic operations as the sole metric for efficiency. 
diff --git a/papers/_posts/2022-01-01-re-configurable-hardware-for-computational-proteomics.md b/papers/_posts/2022-01-01-re-configurable-hardware-for-computational-proteomics.md new file mode 100644 index 00000000..a0b55c48 --- /dev/null +++ b/papers/_posts/2022-01-01-re-configurable-hardware-for-computational-proteomics.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Re-configurable Hardware for Computational Proteomics" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; Kumar, Sumesh; " +year: "2022" +journal: Springer +volume: +issue: +pages: 111-124 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-01960-9_9" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +In 1984, when the world was introduced to the first-ever reconfigurable hardware (FPGAs) device, it offered to solve a critical problem faced during the implementation of application specific integrated chips (ASIC). Even though FPGAs ran at a clock speed much slower than that of an ASIC, they provided an attractive solution to emulate the design logic and verify the functional and timing performance at the early stages of the design process. 
diff --git a/papers/_posts/2022-01-01-spertl--epileptic-seizure-prediction-using-eeg-with-resnets-and-transfer-learning.md b/papers/_posts/2022-01-01-spertl--epileptic-seizure-prediction-using-eeg-with-resnets-and-transfer-learning.md new file mode 100644 index 00000000..9d9ae8f2 --- /dev/null +++ b/papers/_posts/2022-01-01-spertl--epileptic-seizure-prediction-using-eeg-with-resnets-and-transfer-learning.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "SPERTL: Epileptic Seizure Prediction using EEG with ResNets and Transfer Learning" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Mohammad, Umair; Saeed, Fahad; " +year: "2022" +journal: IEEE +volume: +issue: +pages: 1-5 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BHI56158.2022.9926767" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Epilepsy is a chronic condition that causes repeat unprovoked seizures and many epileptics either develop resistance to medications and/or are not suitable candidates for surgical solutions. Hence, these recurring unpredictable seizures can have a severely negative impact on quality of life including an elevated risk of injury, social stigmatization, inability to take part in essential activities such as driving and possibly reduced access to healthcare. A predictive system that informs patients and caregivers about a potential upcoming seizure ahead of time is not only desirable but an urgent necessity. In this paper, we contribute by designing and developing patient-specific epileptic seizure (ES) prediction models using only electroencephalography (EEG) data with residual neural networks (ResNets) and transfer learning (TL) - (SPERTL). 
We train our proposed model on EEG data from 20 patients with a seizure … diff --git a/papers/_posts/2022-01-01-systems-and-methods-for-diagnosing-autism-spectrum-disorder-using-fmri-data.md b/papers/_posts/2022-01-01-systems-and-methods-for-diagnosing-autism-spectrum-disorder-using-fmri-data.md new file mode 100644 index 00000000..25a33341 --- /dev/null +++ b/papers/_posts/2022-01-01-systems-and-methods-for-diagnosing-autism-spectrum-disorder-using-fmri-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Systems And Methods For Diagnosing Autism Spectrum Disorder Using fMRI Data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Almuqhim, Fahad; " +year: "2022" +journal: US Patent 11,379,981 +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Systems and methods for diagnosing autism spectrum disorder (ASD) using only functional magnetic resonance imaging (fMRI) data are provided. Machine learning infrastructure can be used to identify reliable biomarkers of ASD in order to classify patients with ASD from among a group of typical control subjects using only fMRI. A sparse autoencoder (SAE) can be used, resulting in optimized extraction of features that can be used for classification. These features can then be fed into a deep neural network (DNN), which results in classification of fMRI brain scans more prone to ASD. The model can be trained to optimize the classifier while improving extracted features based on both reconstructed data error and the classifier error. 
diff --git a/papers/_posts/2022-01-01-systems-and-methods-for-measuring-similarity-between-mass-spectra-and-peptides.md b/papers/_posts/2022-01-01-systems-and-methods-for-measuring-similarity-between-mass-spectra-and-peptides.md new file mode 100644 index 00000000..acfa6c25 --- /dev/null +++ b/papers/_posts/2022-01-01-systems-and-methods-for-measuring-similarity-between-mass-spectra-and-peptides.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Systems and methods for measuring similarity between mass spectra and peptides" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Tariq, Muhammad Usman; " +year: "2022" +journal: US Patent 11,251,031 +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Systems and methods for measuring cross-modal similarity between mass spectra and peptides are provided. A deep learning network can be used and, by training on a variety of labeled spectra, the network can embed both spectra and peptides onto a Euclidean subspace where the similarity is measured by the L2 distance between different points. The network can be trained on a novel loss function, which can calculate the gradients from sextuplets of data points. 
diff --git a/papers/_posts/2022-01-01-systems-and-methods-for-peptide-identification.md b/papers/_posts/2022-01-01-systems-and-methods-for-peptide-identification.md new file mode 100644 index 00000000..443b3bf4 --- /dev/null +++ b/papers/_posts/2022-01-01-systems-and-methods-for-peptide-identification.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Systems and methods for peptide identification" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Saeed, Fahad; Haseeb, Muhammad; " +year: "2022" +journal: US Patent 11,309,061 +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Provided are parallel computational methods and their implementation on memory-distributed architectures for a peptide identification tool, called HiCOPS, that enables more than 100-fold improvement in speed over existing HPC proteome database search tools. HiCOPS empowers the supercomputing database search for comprehensive identification of peptides and all their modified forms within a reasonable timeframe. Searching Gigabytes of experimental mass spectrometry data against Terabytes of databases demonstrates peptide identification in minutes compared to days or weeks, providing multiple orders of magnitude improvements in processing times. Also provided is a theoretical framework for a novel overhead-avoiding strategy, resulting in superior performance evaluation results for key metrics including execution time, CPU utilization, and I/O efficiency. 
a diff --git a/papers/_posts/2022-01-01-unsupervised-structural-classification-of-dissolved-organic-matter-based-on-fragmentation-pathways.md b/papers/_posts/2022-01-01-unsupervised-structural-classification-of-dissolved-organic-matter-based-on-fragmentation-pathways.md new file mode 100644 index 00000000..e679bef0 --- /dev/null +++ b/papers/_posts/2022-01-01-unsupervised-structural-classification-of-dissolved-organic-matter-based-on-fragmentation-pathways.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Unsupervised structural classification of dissolved organic matter based on fragmentation pathways" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Leyva, Dennys; Tariq, Muhammad Usman; Jaffé, Rudolf; Saeed, Fahad; Lima, Francisco Fernandez; " +year: "2022" +journal: American Chemical Society +volume: 56 +issue: +pages: 1458-1468 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1021/acs.est.1c04726" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Dissolved organic matter (DOM) is considered an essential component of the Earth’s ecological and biogeochemical processes. Structural information of DOM components at the molecular level remains one of the most extraordinary analytical challenges. Advances in determination of chemical formulas from the molecular studies of DOM have provided limited indications on structural signatures and potential reaction pathways. In this work, we extend the structural characterization of a wetland DOM sample using precursor and fragment molecular ions obtained by a sequential electrospray ionization–Fourier transform–ion cyclotron resonance tandem mass spectrometry (ESI-FT-ICR CASI-CID MS/MS) approach. 
The DOM chemical complexity resulted in nearly 900 precursors (P) and 24 000 fragment (F) molecular ions over a small m/z 261–477 range. The DOM structural content was dissected into families of … diff --git a/papers/_posts/2023-01-01-22nd-ieee-international-workshop-on-high-performance-computational-biology-hicomb-2023.md b/papers/_posts/2023-01-01-22nd-ieee-international-workshop-on-high-performance-computational-biology-hicomb-2023.md new file mode 100644 index 00000000..83176b06 --- /dev/null +++ b/papers/_posts/2023-01-01-22nd-ieee-international-workshop-on-high-performance-computational-biology-hicomb-2023.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "22nd IEEE International Workshop on High Performance Computational Biology (HiCOMB 2023)" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "de Melo, Alba Cristina MA; Kalyanaraman, Ananth; Saeed, Fahad; Bozdag, Serdar; Ahmed, Zeeshan; Alser, Mohammed; Awan, Muaaz Gul; Baur, Brittany; Bhowmick, Sanjukta; Bose, Banabithi; " +year: "2023" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +The size and complexity of genomic and biomedical big data continue to grow at an exponential pace, and the analysis of these complex, noisy, data sets demands efficient algorithms and high performance computing architectures. Hence, high-performance computing (HPC) has become an integral part of research and development in bioinformatics, computational biology, and medical and health informatics. The goal of the HiCOMB workshop is to showcase novel HPC research and technologies to solve data- and compute-intensive problems arising from all areas of computational life sciences. 
The workshop will feature contributed papers as well as invited talks from reputed researchers in the field. This year's program will feature a keynote talk by Valerie Scheider, from the Intramural research program at National Library of Medicine, National Institutes of Health, and one invited talk by Daniel Jacobson, from Oak … diff --git a/papers/_posts/2023-01-01-asd-grestm--deep-learning-framework-for-asd-classification-using-gramian-angular-field.md b/papers/_posts/2023-01-01-asd-grestm--deep-learning-framework-for-asd-classification-using-gramian-angular-field.md new file mode 100644 index 00000000..66617044 --- /dev/null +++ b/papers/_posts/2023-01-01-asd-grestm--deep-learning-framework-for-asd-classification-using-gramian-angular-field.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "ASD-GResTM: Deep Learning Framework for ASD classification using Gramian Angular Field" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Almuqhim, Fahad; Saeed, Fahad; " +year: "2023" +journal: IEEE +volume: +issue: +pages: 2837-2843 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BIBM58861.2023.10385743" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Autism Spectrum Disorder (ASD) is a heterogeneous disorder in children, and the current clinical diagnosis is accomplished using behavioral, cognitive, developmental, and language metrics. These clinical metrics can be imperfect measures as they are subject to high test-retest variability, and are influenced by assessment factors such as environment, social structure, or comorbid disorders. Advances in neuroimaging coupled with machine-learning provide an opportunity to develop methods that are more quantifiable and reliable than existing clinical techniques. 
In this paper, we design and develop a deep-learning model that operates on functional magnetic resonance imaging (fMRI) data, and can classify between ASD and neurotypical brains. We introduce a novel strategy to transform time-series data extracted from fMRI signals into Gramian Angular Field (GAF) while locking in the temporal and spatial … diff --git a/papers/_posts/2023-01-01-confounding-effects-on-the-performance-of-machine-learning-analysis-of-static-functional-connectivity-computed-from-rs-fmri-multi-site-data.md b/papers/_posts/2023-01-01-confounding-effects-on-the-performance-of-machine-learning-analysis-of-static-functional-connectivity-computed-from-rs-fmri-multi-site-data.md new file mode 100644 index 00000000..9b67eabe --- /dev/null +++ b/papers/_posts/2023-01-01-confounding-effects-on-the-performance-of-machine-learning-analysis-of-static-functional-connectivity-computed-from-rs-fmri-multi-site-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Confounding Effects on the Performance of Machine Learning Analysis of Static Functional Connectivity Computed from rs-fMRI Multi-site Data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Artiles, Oswaldo; Al Masry, Zeina; Saeed, Fahad; " +year: "2023" +journal: Springer US New York +volume: 21 +issue: +pages: 651-668 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/s12021-023-09639-1" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Resting-state functional magnetic resonance imaging (rs-fMRI) is a non-invasive imaging technique widely used in neuroscience to understand the functional connectivity of the human brain. While rs-fMRI multi-site data can help to understand the inner working of the brain, the data acquisition and processing of this data has many challenges. 
One of the challenges is the variability of the data associated with different acquisition sites and different MRI machine vendors. Other factors such as population heterogeneity among different sites, with variables such as age and gender of the subjects, must also be considered. Given that most of the machine-learning models are developed using these rs-fMRI multi-site data sets, the intrinsic confounding effects can adversely affect the generalizability and reliability of these computational methods, as well as impose upper limits on the classification scores. This … diff --git a/papers/_posts/2023-01-01-description-of-dissolved-organic-matter-transformational-networks-at-the-molecular-level.md b/papers/_posts/2023-01-01-description-of-dissolved-organic-matter-transformational-networks-at-the-molecular-level.md new file mode 100644 index 00000000..60522a28 --- /dev/null +++ b/papers/_posts/2023-01-01-description-of-dissolved-organic-matter-transformational-networks-at-the-molecular-level.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Description of Dissolved Organic Matter Transformational Networks at the Molecular Level" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Leyva, Dennys; Usman Tariq, Muhammad; Jaffé, Rudolf; Saeed, Fahad; Fernandez-Lima, Francisco; " +year: "2023" +journal: American Chemical Society +volume: 57 +issue: +pages: 2672-2681 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1021/acs.est.2c04715" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Dissolved Organic Matter (DOM) is an important component of the global carbon cycle. Unscrambling the structural footprint of DOM is key to understanding its biogeochemical transformations at the mechanistic level. 
Although numerous studies have improved our knowledge of DOM chemical makeup, its three-dimensional picture remains largely unrevealed. In this work, we compare four solid phase extracted (SPE) DOM samples from three different freshwater ecosystems using high resolution mobility and ultrahigh-resolution Fourier transform ion cyclotron resonance tandem mass spectrometry (FT-ICR MS/MS). Structural families were identified based on neutral losses at the level of nominal mass using continuous accumulation of selected ions-collision induced dissociation (CASI-CID) FT-ICR MS/MS. Comparison of the structural families indicated dissimilarities in the structural footprint of this sample set. The … diff --git a/papers/_posts/2023-01-01-energy-efficient-ai-ml-based-continuous-monitoring-at-the-edge--ecg-and-eeg-case-study.md b/papers/_posts/2023-01-01-energy-efficient-ai-ml-based-continuous-monitoring-at-the-edge--ecg-and-eeg-case-study.md new file mode 100644 index 00000000..f2b92412 --- /dev/null +++ b/papers/_posts/2023-01-01-energy-efficient-ai-ml-based-continuous-monitoring-at-the-edge--ecg-and-eeg-case-study.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Energy Efficient AI/ML based Continuous Monitoring at the Edge: ECG and EEG Case Study" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Mohammad, Umair; Saeed, Fahad; " +year: "2023" +journal: IEEE +volume: +issue: +pages: 3313-3320 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/BIBM58861.2023.10385620" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +In this paper, we propose an energy-efficient approach for machine-learning based continuous testing and monitoring of long-term patients at the wireless edge. 
The approach is applicable to any wearable sensors that generate time-series data. Our scheme simultaneously performs sensor-server clustering while ensuring the delay requirements of every user are met. In contrast to previous works on task offloading for generic edge computing/machine learning, our proposed model considers application specific parameters including the sampling rate, measurement duration and number of input channels/leads. We formulate the problem as a mixed integer nonlinear program (MINLP) and propose a heuristic solution. Two applications, cardiac event prediction from wearable electrocardiograms (ECG), and epileptic seizure prediction from wearable scalp electroencephalography (EEG) are used to demonstrate the … diff --git a/papers/_posts/2023-01-01-gpu-acceleration-of-the-distributed-memory-database-peptide-search-of-mass-spectrometry-data.md b/papers/_posts/2023-01-01-gpu-acceleration-of-the-distributed-memory-database-peptide-search-of-mass-spectrometry-data.md new file mode 100644 index 00000000..f2bb4587 --- /dev/null +++ b/papers/_posts/2023-01-01-gpu-acceleration-of-the-distributed-memory-database-peptide-search-of-mass-spectrometry-data.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "GPU-acceleration of the distributed-memory database peptide search of mass spectrometry data" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Haseeb, Muhammad; Saeed, Fahad; " +year: "2023" +journal: Nature Publishing Group UK London +volume: 13 +issue: +pages: 18713 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1038/s41598-023-43033-w" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Database peptide search is the primary computational technique for identifying peptides from the mass spectrometry (MS) data. 
Graphical Processing Units (GPU) computing is now ubiquitous in the current generation of high-performance computing (HPC) systems, yet its application in the database peptide search domain remains limited. Part of the reason is the use of sub-optimal algorithms in the existing GPU-accelerated methods resulting in significantly inefficient hardware utilization. In this paper, we design and implement a new-age CPU-GPU HPC framework, called GiCOPS, for efficient and complete GPU-acceleration of the modern database peptide search algorithms on supercomputers. Our experimentation shows that GiCOPS exhibits a 1.2× to 5× speed improvement over its CPU-only predecessor, HiCOPS, and over 10× improvement over several existing GPU-based database search … diff --git a/papers/_posts/2023-01-01-high-performance-computing-algorithms-for-accelerating-peptide-identification-from-mass-spectrometry-data-using-heterogeneous-supercomputers.md b/papers/_posts/2023-01-01-high-performance-computing-algorithms-for-accelerating-peptide-identification-from-mass-spectrometry-data-using-heterogeneous-supercomputers.md new file mode 100644 index 00000000..f16722fc --- /dev/null +++ b/papers/_posts/2023-01-01-high-performance-computing-algorithms-for-accelerating-peptide-identification-from-mass-spectrometry-data-using-heterogeneous-supercomputers.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "High Performance Computing Algorithms for Accelerating Peptide Identification from Mass-Spectrometry Data Using Heterogeneous Supercomputers" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Haseeb, Muhammad; Saeed, Fahad; " +year: "2023" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + 
+Fast and accurate identification of peptides and proteins from the mass spectrometry (MS) data is a critical problem in modern systems biology. Database peptide search is the most commonly used computational method to identify peptide sequences from the MS data. In this method, giga-bytes of experimentally generated MS data are compared against tera-byte sized databases of theoretically simulated MS data resulting in a compute- and data-intensive problem requiring days or weeks of computational times on desktop machines. Existing serial and high performance computing (HPC) algorithms strive to accelerate and improve the computational efficiency of the search, but exhibit sub-optimal performances due to their inefficient parallelization models, low resource utilization and high overhead costs. diff --git "a/papers/_posts/2023-01-01-ppad--a-deep-learning-architecture-to-predict-progression-of-alzheimer\342\200\231s-disease.md" "b/papers/_posts/2023-01-01-ppad--a-deep-learning-architecture-to-predict-progression-of-alzheimer\342\200\231s-disease.md" new file mode 100644 index 00000000..b4c07bc8 --- /dev/null +++ "b/papers/_posts/2023-01-01-ppad--a-deep-learning-architecture-to-predict-progression-of-alzheimer\342\200\231s-disease.md" @@ -0,0 +1,40 @@ +--- +layout: paper +title: "PPAD: a deep learning architecture to predict progression of Alzheimer’s disease" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Al Olaimat, Mohammad; Martinez, Jared; Saeed, Fahad; Bozdag, Serdar; Alzheimer’s Disease Neuroimaging Initiative; " +year: "2023" +journal: Oxford University Press +volume: 39 +issue: +pages: i149-i157 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1093/bioinformatics/btad249" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + 
+Motivation Alzheimer’s disease (AD) is a neurodegenerative disease that affects millions of people worldwide. Mild cognitive impairment (MCI) is an intermediary stage between cognitively normal state and AD. Not all people who have MCI convert to AD. The diagnosis of AD is made after significant symptoms of dementia such as short-term memory loss are already present. Since AD is currently an irreversible disease, diagnosis at the onset of the disease brings a huge burden on patients, their caregivers, and the healthcare sector. Thus, there is a crucial need to develop methods for the early prediction of AD for patients who have MCI. Recurrent neural networks (RNN) have been successfully used to handle electronic health records (EHR) for predicting conversion from MCI to AD. However, RNN ignores irregular time intervals between successive events, which occur commonly in electronic health … diff --git "a/papers/_posts/2023-01-01-pvtad--alzheimer\342\200\231s-disease-diagnosis-using-pyramid-vision-transformer-applied-to-white-matter-of-t1-weighted-structural-mri-data.md" "b/papers/_posts/2023-01-01-pvtad--alzheimer\342\200\231s-disease-diagnosis-using-pyramid-vision-transformer-applied-to-white-matter-of-t1-weighted-structural-mri-data.md" new file mode 100644 index 00000000..fe33b1e4 --- /dev/null +++ "b/papers/_posts/2023-01-01-pvtad--alzheimer\342\200\231s-disease-diagnosis-using-pyramid-vision-transformer-applied-to-white-matter-of-t1-weighted-structural-mri-data.md" @@ -0,0 +1,40 @@ +--- +layout: paper +title: "PVTAD: ALZHEIMER’S DISEASE DIAGNOSIS USING PYRAMID VISION TRANSFORMER APPLIED TO WHITE MATTER OF T1-WEIGHTED STRUCTURAL MRI DATA" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Aghdam, Maryam Akhavan; Bozdag, Serdar; Saeed, Fahad; Alzheimer’s Disease Neuroimaging Initiative; " +year: "2023" +journal: Cold Spring Harbor Laboratory Preprints +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# 
Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1101/2023.11.17.567617" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Alzheimer's disease (AD) is a neurodegenerative disorder, and timely diagnosis is crucial for early interventions. AD is known to have disruptive local and global brain neural connections that may be instrumental in understanding and extracting specific biomarkers. Previous machine-learning approaches are mostly based on convolutional neural network (CNN) and standard vision transformer (ViT) models which may not sufficiently capture the multidimensional local and global patterns that may be indicative of AD. Therefore, in this paper, we propose a novel approach called PVTAD to classify AD and cognitively normal (CN) cases using pretrained pyramid vision transformer (PVT) and white matter (WM) of T1-weighted structural MRI (sMRI) data. Our approach combines the advantages of CNN and standard ViT to extract both local and global features indicative of AD from the WM coronal middle slices. 
We … diff --git a/papers/_posts/2023-01-01-q-casa-invited-speakers-quantum-centric-supercomputing-strategies-for-neuroscience-problems--challenges-and-progress.md b/papers/_posts/2023-01-01-q-casa-invited-speakers-quantum-centric-supercomputing-strategies-for-neuroscience-problems--challenges-and-progress.md new file mode 100644 index 00000000..f4ef76b5 --- /dev/null +++ b/papers/_posts/2023-01-01-q-casa-invited-speakers-quantum-centric-supercomputing-strategies-for-neuroscience-problems--challenges-and-progress.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Q-CASA Invited Speakers Quantum-Centric Supercomputing Strategies for Neuroscience problems: Challenges and Progress" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Loredo, Robert; Saeed, Fahad; " +year: "2023" +journal: IEEE +volume: +issue: +pages: 499-499 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/IPDPSW59300.2023.00087" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +For decades high-performance computing systems have allowed researchers, scientists and engineers to solve complex data- and compute-intensive problems. These parallel and distributed computing systems enabled researchers to develop state of the art solutions in areas such as machine learning, health, and life science. While successful for a large number of problems, these systems face limitations when handling intractable problems which require either an enormous number of resources (i.e. number of bits, or power) or time to solve a problem (i.e. factoring integers). In this work we describe how quantum computers can enhance these systems by incorporating quantum computational principles to create a quantum-centric supercomputer. 
Recent developments include quantum middleware, error mitigation, and error suppression techniques to curtail these issues in near-term quantum devices and enable … diff --git a/papers/_posts/2023-01-01-statistical-and-machine-learning-analysis-of-the-human-brain-functional-network-in-a-multi-site-resting-state-functional-mri-database-framework.md b/papers/_posts/2023-01-01-statistical-and-machine-learning-analysis-of-the-human-brain-functional-network-in-a-multi-site-resting-state-functional-mri-database-framework.md new file mode 100644 index 00000000..41164aaf --- /dev/null +++ b/papers/_posts/2023-01-01-statistical-and-machine-learning-analysis-of-the-human-brain-functional-network-in-a-multi-site-resting-state-functional-mri-database-framework.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Statistical and Machine Learning Analysis of the Human Brain Functional Network in a Multi-Site Resting-State Functional MRI Database Framework" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Artiles, Oswaldo; Saeed, Fahad; " +year: "2023" +journal: +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +The human brain has a complex network structure that is non-random and multiscale. It consists of subsystems coupled by a nonlinear dynamic, enabling it to produce complex responses to various external inputs and self-organize. To understand the physical structure and specific brain functions, it is essential to comprehend the connectivity of the hundreds of billions of neurons in the human brain. Functional connectivity (FC) in modern neuroscience is the statistical temporal dependencies between neuronal activation events occurring in spatially separated brain regions. 
Resting-state functional magnetic resonance imaging (rs-fMRI) is a non-invasive imaging technique widely used in neuroscience to understand the functional connectivity of the human brain. The studies presented in this dissertation were based on the models and methods from network neuroscience, which is an active area of research developed in the last three decades. These methods were used to model and analyze the functional human brain networks in a multi-site rs-fMRI data framework. diff --git a/papers/_posts/2023-01-01-systems-and-methods-for-matching-mass-spectrometry-data-with-a-peptide-database.md b/papers/_posts/2023-01-01-systems-and-methods-for-matching-mass-spectrometry-data-with-a-peptide-database.md new file mode 100644 index 00000000..f6b73b7f --- /dev/null +++ b/papers/_posts/2023-01-01-systems-and-methods-for-matching-mass-spectrometry-data-with-a-peptide-database.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Systems and methods for matching mass spectrometry data with a peptide database" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Kumar, Sumesh; Saeed, Fahad; " +year: "2023" +journal: US Patent 11,842,799 +volume: +issue: +pages: +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +Systems, architectures, devices, and methods for matching experimentally acquired mass spectrometry data with a peptide database are provided. The system architecture can include a host central processing unit (CPU) system, a bridge connecting the CPU system with a core control register (or registers), a plurality of processing elements (PEs), and a bus arbiter. The PEs can execute the computations in a parallel and asynchronous manner. 
The bus arbiter can be a first-come first-serve (FCFS)-based bus arbiter (i.e., can utilize an FCFS-based arbitration scheme). diff --git a/papers/_posts/2024-01-01-communication-evaluation-of-a-wireless-4-channel-wearable-eeg-for-brain-computer-interface-bci-and-healthcare-applications.md b/papers/_posts/2024-01-01-communication-evaluation-of-a-wireless-4-channel-wearable-eeg-for-brain-computer-interface-bci-and-healthcare-applications.md new file mode 100644 index 00000000..c956452b --- /dev/null +++ b/papers/_posts/2024-01-01-communication-evaluation-of-a-wireless-4-channel-wearable-eeg-for-brain-computer-interface-bci-and-healthcare-applications.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Communication Evaluation of a Wireless 4-Channel Wearable EEG for Brain-Computer Interface (BCI) and Healthcare Applications" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Bhattarai, Abhishek; Mohammad, Umair; Saeed, Fahad; " +year: "2024" +journal: IEEE +volume: +issue: +pages: 902-903 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1109/SoutheastCon52093.2024.10500137" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +This paper evaluates the communication performance of a wearable electroencephalography (EEG) headband for brain-computer interface (BCI) and healthcare applications. Our study is motivated by the application of EEG sensors to epileptic seizure prediction using short duration segments. Using packet delivery ratio (PDR) as the metric, we show that shorter segments suffer from lower PDR, which can have catastrophic implications for health/BCI applications based on predictive modeling. 
diff --git a/papers/_posts/2024-01-01-heterogeneity-aware-distributed-machine-learning-at-the-wireless-edge-for-health-iot-applications--an-eeg-data-case-study.md b/papers/_posts/2024-01-01-heterogeneity-aware-distributed-machine-learning-at-the-wireless-edge-for-health-iot-applications--an-eeg-data-case-study.md new file mode 100644 index 00000000..c09de752 --- /dev/null +++ b/papers/_posts/2024-01-01-heterogeneity-aware-distributed-machine-learning-at-the-wireless-edge-for-health-iot-applications--an-eeg-data-case-study.md @@ -0,0 +1,40 @@ +--- +layout: paper +title: "Heterogeneity Aware Distributed Machine Learning at the Wireless Edge for Health IoT Applications: An EEG Data Case Study" +nickname: 2024-04-16-bottenhorn-salo-diva +authors: "Mohammad, Umair; Saeed, Fahad; " +year: "2024" +journal: Springer +volume: +issue: +pages: 33-70 +is_published: True +image: /assets/images/papers/biorxiv.png +projects: [] +tags: [] + +# Text +fulltext: +pdf: +pdflink: +pmcid: +preprint: +supplement: + +# Links +doi: "10.1007/978-3-031-57567-9_3" +pmid: + +# Data and code +github: [""] +neurovault: +openneuro: [""] +figshare: +figshare_names: +osf: +--- +{% include JB/setup %} + +# Abstract + +In this book chapter, we design and develop a mobile edge learning (MEL) framework that enables multiple end user devices or “learners” to cooperatively train a machine learning (ML) model in a wireless edge environment. We will focus on designing and developing the heterogeneity aware synchronous (HA-Sync) approach with time constraints and extend the framework to consider dual-time and energy constraints. The proposed MEL framework will include the commonly known federated learning (FL) as well as parallelized learning (PL). 
After discussing the system model and a brief convergence proof for both FL and PL, we will formulate the problem as a quadratically constrained integer linear program (QCILP), relax it to a QCLP, and propose analytical solutions based on Lagrangian analysis, Karush-Kuhn-Tucker (KKT) conditions, and partial fraction expansion. For the problem with dual-time and energy …