Supporting Bioinformatics in a Regulated Environment
Next-generation sequencing is enabling a new paradigm of microbiology assays within the clinical and public health laboratory. These assays enable identification of microorganisms from culture or directly from samples, study of antimicrobial resistance, and investigation of relatedness between isolates. While many of these assays are used for surveillance, global changes to regulatory frameworks are driving a growing need for assays that can be used in a regulated environment. BugSeq is invested in enabling reporting from our results to improve human health. Below, we detail some of the unique capabilities of BugSeq that enable integration into mission-critical assays.
Attributes of a Regulated Analysis
Validated and Accurate Methods
BugSeq has published on many of our methods, including isolate identification, metagenomic analysis and strain typing; we take pride in producing results with leading accuracy. BugSeq also relies on many published tools and references their methods where applicable. A non-exhaustive list of publications detailing BugSeq’s methods, along with external validations, is available on a dedicated page. Several key method papers from our team are highlighted below:
- Chorlton SD. Ten common issues with reference sequence databases and how to mitigate them. Frontiers in Bioinformatics (2024).
- Khdhiri et al. refMLST: reference-based multilocus sequence typing enables universal bacterial typing. BMC Bioinformatics (2024).
- Chandrakumar et al. BugSplit enables genome-resolved metagenomics through highly accurate taxonomic binning of metagenomic assemblies. Communications Biology (2022).
- Fan, Huang & Chorlton. BugSeq: a highly accurate cloud platform for long-read metagenomic analyses. BMC Bioinformatics (2021).
Versioning to Enable Reproducibility
Producing reproducible and reliable results is of paramount importance for bioinformatic analysis used in regulated settings. BugSeq version-controls both its analysis code and the accompanying databases to ensure reproducibility. We have previously documented our methods for versioning in a separate blog post.
Typically, a BugSeq customer will validate analysis on a specific version for use in a regulated setting and lock this version down. All subsequent analyses will be conducted using this version, guaranteeing the same code and databases are used for each subsequent analysis.
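The version lock described above can be pictured as a simple equality check between the validated versions and the versions a run is about to use. The following is a minimal sketch only, not BugSeq's implementation: the `AnalysisVersion` class, the `check_locked_version` function and the version strings are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisVersion:
    """Hypothetical record of the code and database versions used for a run."""
    pipeline: str
    database: str


def check_locked_version(validated: AnalysisVersion, requested: AnalysisVersion) -> None:
    """Raise if a run would use anything other than the validated versions."""
    if requested != validated:
        raise ValueError(
            f"Run requests {requested}, but the validated version is {validated}"
        )


# Version strings below are made up for illustration.
validated = AnalysisVersion(pipeline="1.0.0", database="2024-01")

# A run on the locked version passes silently.
check_locked_version(validated, AnalysisVersion(pipeline="1.0.0", database="2024-01"))
```

Treating the code version and the database version as one locked pair reflects the point made above: a result is only reproducible if both are unchanged.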
Notably, new analysis methods and databases may significantly improve speed, usability and accuracy of bioinformatic analysis. All changes to analysis are documented in our change log. When users are ready to upgrade to a new analysis version, they may do so at the time of their choosing.
Curated and Updated Databases to Enable Accurate Results
Databases are guaranteed to be updated at least every three months, and are frequently updated more often in response to the emergence of novel infections, new mechanisms of resistance and curation advances. However, databases are also tied to analysis versions and can be locked down, which ensures reproducible analysis.
BugSeq maintains over 15 databases to enable highly accurate bioinformatic analysis. Most of these databases are curated, including reference sequence, taxonomy, plasmid and antimicrobial resistance (AMR) prediction databases. Our team dedicates significant resources to curating these databases using both automated and manual methods, and has published on our curation methods (Chorlton, 2024). This approach to curation, and our thinking around the need for high-quality databases, is consistent with the view of the FDA (Sichtig et al., 2019).
For some applications, such as prediction of antimicrobial resistance in Mycobacterium tuberculosis, we use established databases like the World Health Organization database to ensure standardization. We also supplement these databases with additional variants conferring antimicrobial resistance to yield improved performance; where we do so, these variants are clearly labeled with their source.
Building an NGS Assay
Next-generation sequencing assays in diagnostic environments are time sensitive. Patients rely on the results of validated assays and analysis, and delays in results may negatively impact patient diagnosis and treatment. Furthermore, new infections are constantly emerging, and laboratories must be ready to identify and characterize these in a rapid manner.
Uptime
By leveraging the scalability and redundancy of AWS services such as S3 and EC2, BugSeq customers realize >99.99% uptime on the BugSeq platform.
BugSeq’s analysis pipelines are cloud-deployed on Amazon Web Services (AWS) servers to enable parallel processing and ensure high availability and quick turnaround times. BugSeq’s web application leverages multiple servers with redundancy to guarantee uptime. The offline data-analysis component of BugSeq’s architecture is built to scalably and efficiently process large volumes of clinical data by fanning out tasks on AWS Batch and EC2, scaling up as needed. BugSeq also operates in multiple regions to mitigate regional outages.
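To put the uptime figure above in concrete terms, an availability percentage can be converted into a downtime budget. The short sketch below does that arithmetic; the function name `max_downtime_minutes` is an illustrative assumption, and the calculation is generic rather than a description of how BugSeq measures uptime. At exactly 99.99%, the budget works out to roughly 52.6 minutes per 365-day year, so uptime above 99.99% implies less downtime than that.

```python
def max_downtime_minutes(uptime_percent: float, days: float = 365.0) -> float:
    """Maximum downtime, in minutes, allowed over `days` at a given uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


# Downtime budget at exactly 99.99% uptime.
per_year = max_downtime_minutes(99.99)            # over a 365-day year
per_month = max_downtime_minutes(99.99, days=30)  # over a 30-day month
print(f"{per_year:.1f} min/year, {per_month:.1f} min/month")
```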
Quality Assurance and Risk Management
BugSeq maintains stringent quality assurance policies during software development and deployment to our users. Through extensive engagement with regulatory agencies and external consultants, we have worked towards compliance with ISO 13485:2021. Furthermore, we have aligned our risk management documents with ISO 14971:2019 and have incorporated a risk-focused approach into all development decisions. Briefly, all software updates undergo an extensive array of tests and validations prior to release to our production pipelines. Software updates and bug fixes are subject to code review by BugSeq’s software development team prior to release. This software development life cycle process has been scrutinized by regulatory consultants and is aligned with IEC 62304:2006+AMD1:2015. The goal of these processes is to reduce the risk of user-facing errors for our clinical and public health users.
Support When You Need It
Support for BugSeq’s services is provided 24 hours a day, 7 days a week for our clinical and public health customers. BugSeq maintains an on-call roster to ensure rapid response to analysis issues, including technical difficulties and result interpretation. Support is available via email, phone and chat, and is provided by a multidisciplinary team including bioinformaticians, medical microbiologists, software engineers and molecular scientists.
Privacy & Security
BugSeq treats data privacy and security with the utmost importance. Our privacy policy is publicly available on our website. All data is encrypted both at rest and in transit. Every request for data must be authenticated and authorized before access is granted, and all access is logged and archived securely.
BugSeq practices the principle of least privilege for customer access, internal access, and server-to-server access.
Data is segmented by user and organization. User access is restricted to BugSeq-produced results files from our analysis services. When organizational sharing is enabled, users can also access results data from other users within their organization.
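The segmentation rule above amounts to a deny-by-default access check. The sketch below is an illustrative assumption rather than BugSeq's actual authorization code: the `User` and `Result` classes, their fields and the `can_access` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    org_id: str


@dataclass(frozen=True)
class Result:
    owner_id: str  # user who ran the analysis
    org_id: str    # organization the owner belongs to


def can_access(user: User, result: Result, org_sharing_enabled: bool) -> bool:
    """Deny by default; allow owners, and same-organization users when sharing is on."""
    if result.owner_id == user.user_id:
        return True
    if org_sharing_enabled and result.org_id == user.org_id:
        return True
    return False


# Hypothetical users and a result owned by alice.
alice = User(user_id="alice", org_id="lab-a")
bob = User(user_id="bob", org_id="lab-a")
result = Result(owner_id="alice", org_id="lab-a")

print(can_access(bob, result, org_sharing_enabled=False))  # colleague, sharing off
print(can_access(bob, result, org_sharing_enabled=True))   # colleague, sharing on
```

Starting from denial and listing the allowed cases explicitly keeps the check consistent with the principle of least privilege: any case not named in the rule is refused.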
All traffic into BugSeq passes through a firewall with rigorous rules to block malicious traffic. BugSeq routinely upgrades and patches packages to keep software up to date. We also maintain a rigorous set of standard operating procedures covering data encryption, access authentication, data storage, data segmentation and restriction, host security, and cybersecurity incident response.
Conclusion
Integrating bioinformatic analysis into a regulated environment is often more complex than the laboratory originally anticipates. Drawing on extensive experience across clinical and public health microbiology, BugSeq aims to make this process as smooth as possible. BugSeq delivers reproducible, accurate analysis while mitigating the risks of developing, maintaining and executing bioinformatic pipelines within resource-constrained laboratory environments.