Case Study: APIs & Custom Dev for a Genomics Data Warehouse
The Marine Genomics project, as detailed in the BMC Genomics article, aimed to create a centralized platform for curating and analyzing Expressed Sequence Tags (ESTs) and microarray data specific to marine organisms. Because marine genomic data was scattered across many sources, the project sought to give researchers a unified, accessible repository for comprehensive studies of marine species’ transcriptomic responses to environmental stresses.
The Challenge
Dispersed and Inconsistent Genomic Data: Marine researchers faced fragmented access to EST and microarray data across different institutions, formats, and platforms—slowing progress and collaboration.
Lack of Centralized Infrastructure: Without a unified repository, researchers couldn’t efficiently store, query, or analyze genomic data, resulting in duplicated efforts and reduced research accuracy.
No Standardized Submission or QC Tools: The absence of formal tools for data submission, trimming, and quality control made it difficult to maintain data integrity or share findings with confidence.
Barriers to Global Collaboration: Scientists around the world had limited ways to contribute to or benefit from shared datasets, stalling momentum in a rapidly growing field.
Our Solution
Custom Web Platform with Unified Database: Developed a PHP-based web portal backed by PostgreSQL, enabling researchers to store, access, and manage marine genomics data in one centralized location.
Integrated Bioinformatics Tools: Added features like sequence trimming, BLAST search, and quality control to automate and standardize core genomic workflows.
Built-in GenBank Submission Pipeline: Enabled direct submissions to GenBank from the platform, reducing redundant steps and increasing public data availability.
Global Data Contribution Support: Designed the platform to accept data from 19 species and accommodate submissions from researchers worldwide—strengthening global collaboration.
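To make the trimming and quality-control step more concrete, here is a minimal Python sketch of how an EST might be cleaned and screened before it is stored or submitted. It is an illustration only, not the platform's actual code (the portal itself was PHP-based). The function names, the `QcResult` type, and all three thresholds are hypothetical and would need to be tuned to real data.

```python
import re
from dataclasses import dataclass

# All thresholds below are illustrative assumptions, not the project's settings.
MIN_LENGTH = 100        # hypothetical minimum usable EST length after trimming
MAX_N_FRACTION = 0.05   # hypothetical ceiling on the share of ambiguous (N) bases
POLY_A_MIN = 10         # hypothetical run length treated as a poly-A tail


@dataclass
class QcResult:
    sequence: str   # the trimmed sequence
    passed: bool    # True if the sequence cleared every check
    reason: str     # "ok" or a short description of the failed check


def trim_est(raw: str) -> str:
    """Uppercase the read, strip ambiguous ends, and drop a trailing poly-A tail."""
    seq = raw.strip().upper()
    seq = seq.strip("N")                              # leading/trailing N bases
    seq = re.sub(r"A{%d,}$" % POLY_A_MIN, "", seq)    # 3' poly-A tail, if present
    return seq.strip("N")                             # N bases exposed by the trim


def quality_check(raw: str) -> QcResult:
    """Trim a raw read, then apply simple character, length, and N-content checks."""
    seq = trim_est(raw)
    if re.search(r"[^ACGTN]", seq):
        return QcResult(seq, False, "invalid characters")
    if len(seq) < MIN_LENGTH:
        return QcResult(seq, False, "too short after trimming")
    if seq.count("N") / len(seq) > MAX_N_FRACTION:
        return QcResult(seq, False, "too many ambiguous bases")
    return QcResult(seq, True, "ok")
```

Calling `quality_check(raw_read)` returns the trimmed sequence along with a pass/fail flag and a reason, so a submission form could reject a read with a specific message instead of silently storing it. A production pipeline would typically also screen for vector contamination and use per-base quality scores, which this sketch leaves out.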
Tech Stack: PHP, PostgreSQL
Results
Centralized Genomic Research Hub: Marine researchers now rely on a single platform to store and analyze genomic data, saving time and eliminating redundancy.
Faster, More Accurate Data Processing: With built-in tools for trimming, QC, and BLAST, users can prepare data for publication and submission with minimal friction.
Widespread Researcher Adoption: The platform has been adopted by users across multiple continents, demonstrating its usability and relevance in marine genomics.
Accelerated Contributions to Public Databases: GenBank submissions increased thanks to the automated pipeline, contributing to broader advancements in environmental and genomic research.