The registration desk and conference room are located on the 1st floor of the SiMR building.

Click here to download a PDF of the meeting booklet
Click here to download a summary of the schedule

Click here to view the schedule in Google calendars.

Click here to download an iCal format calendar of the schedule.

Click here to download the Student Education day schedule

Join the conversation on Slack

Monday August 7th

Arrival, Education Day, Board Meeting and Welcome Reception

Student Education Day (talks aimed at students / new to GSC)
GSC Board meeting (board members only)
Welcome to GSC23 Drinks Reception (all GSC23 registered participants)
Tuesday August 8th

(1) Genomic Standards for Precision Medicine & (2) Publishing and Database Perspectives



Welcome to Mahidol University, Thailand
Prof. Banchong Mahaisavariya, M.D. (President of Mahidol University)

President’s Welcome
Mahidol University has a long history of over 129 years, dating back to the establishment of “Siriraj Hospital” in 1888, which developed into a medical school the following year. The name was bestowed by His Late Majesty King Bhumibol Adulyadej after His Royal Highness Prince Mahidol of Songkla.
Mahidol University has continued to progress in all aspects to stay relevant and conform to the rapidly changing world and society. The significant role of the university is to produce quality graduates, who are knowledgeable in their chosen fields and mindful of morality, to serve as human capital for the country’s current development. Moreover, the university has committed to academic development, research support for innovations, improvements to the university’s physical systems and environment as an eco-university, as well as internationalization. All strategies are aligned with Mahidol University’s aspiration to become a world-class university. The university’s current milestone is being ranked as the top university in the region and No.1 in Thailand.
Mahidol University is recognized as a large higher education institution comprising academicians and professionals in every field, both in arts and sciences. Therefore, the current university administration has set our target to maintain our status as the leading university in the country and the region. We aspire to be the source of knowledge for the benefit of society and country, in keeping with Mahidol University’s slogan 'Wisdom of the Land'.

Welcome and Logistics
Manop Pithukpakorn, Sakda Khoomrung (Local Hosts)

All keynotes are 25 minutes plus 5 minutes for Q&A. All other presentations are 12 minutes plus 3 minutes for Q&A.

GSC President’s Welcome
Lynn Schriml (GSC President, University of Maryland School of Medicine)

Welcome & Introduce afternoon working group session organisation

Keynote: Genotype First – in Research and In Clinical Care
Leslie Biesecker (NIH/NHGRI, USA)

The rapidly falling cost of genome sequencing necessitates rethinking how we care for patients and perform research. The clinical research paradigm has been dominated by a phenotype-first approach. In this paradigm it is the patient’s manifestations or disease that initiate the process of care, through the mechanism of the differential diagnosis. It is essential to recognize that a key function of the differential diagnostic paradigm was to narrow the range of potential diagnoses to match the limits of the then available genetic tests. Now that large panel tests can be had for $250, the upstream narrowing of the diagnoses is less critical to success in achieving the diagnosis. In fact, it can be argued that broad testing is superior to a narrow differential, as the test interrogates genes for disorders that the clinician may not have thought to consider in her differential diagnosis. Thinking more broadly than panels, clinical genome sequencing is becoming affordable and has the potential to provide a life-long health care resource for a patient.
In the research realm, a similar paradigm has held – that of the assembly of a patient cohort or a case-control cohort for the purpose of elucidating the cause of a disorder affecting the people of that cohort. This is an arduous and expensive process and sometimes has limited long term utility, after the primary questions have been addressed. Instead, because a research genome can cost as little as $250, one can begin to consider the establishment of genomically characterized cohorts, which can subsequently be interrogated by a genomic attribute. In this mode of research, the clinical investigator searches the genomic database to identify individuals harboring genetic variants of interest. These individuals can be recalled to the research center on the basis of their genetic change and the researcher then tests a hypothesis of that variant being associated with a phenotype. A large advantage of this approach is the reduction in bias as a part of the recruitment and eligibility review process and the reduced cost.
These examples and principles will be reviewed and discussed with pertinent examples.

Session chair: Manop Pithukpakorn (Faculty of Medicine Siriraj Hospital, Mahidol University)
Session 1: Genomic Standards for Precision Medicine

Curating Clinical Genomes: Real World Practice of Human Genome Variant Interpretation

Manop Pithukpakorn (Faculty of Medicine Siriraj Hospital, Mahidol University)

With the rapid advances in genomic sequencing technology and increased knowledge of genomics in medicine, the integration of genomic data into clinical practice is becoming increasingly important. The ACMG 2015 Standards and Guidelines for the Interpretation of Sequence Variants provide a valuable framework for variant interpretation, but they also have limitations. Real-world data on the clinical significance of many variants is limited, and there are knowledge gaps in our understanding of how variants affect the disease pathogenic mechanisms and clinical phenotypes.
The ClinVar database is a valuable resource for variant interpretation, but it is not without its limitations. The database is not comprehensive, and the quality of the data varies. ClinGen gene-specific guidelines provide additional information on the interpretation of variants in specific genes but add more complexity to the interpretation process.
Interpretation of copy number variants (CNV) and splicing variants is particularly challenging. Changes in the number of copies or splice sites of coding sequences do not always alter the gene function. There are no clear guidelines for the interpretation of these variants, and they are often classified as 'uncertain significance.'
This presentation will discuss the challenges of curating clinical genomes, and the resources available to help with variant interpretation. The presentation will also discuss the future of clinical genomics, and how we can use genomic data to improve patient care.

Human Cell Atlas: Putting together reference transcriptomes of all human cells

Varodom Charoensawan (Faculty of Science, Mahidol University)

Regulation of gene expression is an important biological process that gives rise to phenotypic diversity from the same genetic information. In addition to reference genomes and variations between individuals, reference transcriptomes among tissues and cell types are crucial to understanding fundamental units of life and serve as a platform for developing targeted therapy in the precision medicine era.

The Human Cell Atlas (HCA) consortium aims to 'create comprehensive reference maps of all human cells – as a basis for both understanding human health and diagnosing, monitoring, and treating diseases'. As part of the HCA, we are establishing the reference immune and gene expression repertoires of the Asian population under the 'Asian Immune Diversity Atlas (AIDA)' project, together with collaborators from eight countries. Asian genomic data are heavily underrepresented compared to genomic data from people of European ancestry, which account for 80 percent of the current data worldwide. In the AIDA project, we will rectify this imbalance by characterizing the nature and extent of variation in immune cell types from diverse Asian populations.

Within Thailand, we will investigate the conserved and unique patterns in the immune diversity of the Thai populations in different parts of the country, which are historically, culturally, and genetically unique. These data will serve as a healthy baseline for characterizing cell state changes in various immune-related diseases. This project is a collaborative effort involving researchers from Mahidol University, Chiang Mai University, Khon Kaen University, and Prince of Songkla University in Thailand.

The FAIR Principles: past, present and future

Susanna-Assunta Sansone (University of Oxford, UK)

The FAIR Principles have propelled the global debate in all disciplines on the importance of Findable, Accessible, Interoperable, and Reusable data, by humans and machines, and the need for better research data management and for transparent and reproducible data worldwide. FAIR has united stakeholders world-wide behind a common concept: good data management under common standards. It is no longer optional. However, the FAIR Principles are aspirational, and putting FAIR into practice is work in progress; it 'takes a village'! Starting with a brief history of the Principles, Susanna will paint the landscape of key initiatives and community activities for FAIR data, with a focus on the Life Sciences, including resources like FAIRsharing and the FAIR Cookbook.

Platinum Sponsor talk 1: Oxford Nanopore (ONT); Game-changer in Epigenetics

Thidathip Wongsurawat (Representing Oxford Nanopore Technologies)

The application of DNA methylation patterns holds considerable potential in cancer care, notably improving the processes of classification, diagnosis, prognosis, and treatment response prediction. Oxford Nanopore Technologies (ONT) stands uniquely positioned in this regard, capable of detecting DNA methylation directly from the raw data without additional chemical treatments and offering 'real-time methylation analysis'. This represents an advantage over most next-generation sequencing technologies, which cannot directly distinguish between methylated and unmethylated nucleotides in native DNA. At GSC23, I will present the application of nanopore technology, focusing on methylation detection in a key biomarker in cancer samples.


Tea, Coffee and Networking

Session 2: Standards and Perspectives from Publishing and Databases
Session chair: Chris Hunter (GigaScience Press)

Journal perspective: GigaScience Press

Chris Hunter (GigaScience Press)

GigaScience Press has the goal of achieving true open science by embracing the UNESCO Open Science Recommendation as the primary goal for its publications and activities. A major part of those efforts is the creation and use of GigaDB, the open access repository of datasets directly associated with all GigaScience journal articles. GigaDB datasets are created by the authors with expert guidance from highly experienced data curation staff, making use of relevant standards and ensuring deposition of all relevant data in public repositories. Our curation staff keep themselves abreast of many on-going standards efforts including the GSC MIxS as well as relevant ontologies, and guide authors on their appropriate use in dataset metadata. Here I will present GigaDB and highlight some of the ways we utilise standards in our curation work.

ENA: Improving Experimental metadata standards

Peter Woollard (ENA, EMBL-EBI, UK)

Peter Woollard, Josie Burgin, Guy Cochrane

Core and diverse sample metadata has been explicitly captured with checklist templates for a number of years by the European Nucleotide Archive (ENA) and other INSDC partners. There is now a broader and more complex spread of sequencing experiment related metadata that could usefully be collected too, due to the increasing use of sequencing technologies to study the general biological world, particularly for human health and the environment. Capturing experiment metadata more accurately and consistently will increase the usefulness of the data, by making it more FAIR.

We are exploring experimental checklists conceptually similar to existing sample level checklists to tailor metadata provided for different ‘types’ of sequencing experiments. We have integrated learnings from sample checklists, including the need to have checklist versioning and dependency validation. To do the initial validation for the experiment checklists, we are using: JSON, JSON schema and ELIXIR bio validation technologies. These can rapidly catch most validation issues and provide immediate feedback to users. Deeper automated validation will still be performed to ensure INSDC standards.

Currently, we have a dozen 'experiment type' checklists ranging from metabarcoding to spatial transcriptomics. These experiment type checklist JSON and accompanying JSON schema files are all driven from a single JSON configuration file. It will be straightforward and sustainable to add further experiment types.
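As a rough illustration of the single-configuration idea described above, the sketch below derives one JSON Schema document per experiment type from a single configuration dictionary. The field names (`target_gene`, `pcr_primers`, `tissue_section_thickness`), the configuration layout, and the helper functions are assumptions made for this example and are not taken from the ENA code base. The small `check` function only understands the few keywords used here and stands in for a real JSON Schema validator.

```python
import json

# Hypothetical single configuration: every name below is illustrative only.
CONFIG = {
    "version": "0.1.0",
    "common_fields": {
        "library_strategy": {"type": "string"},
        "instrument_platform": {"type": "string"},
    },
    "experiment_types": {
        "metabarcoding": {
            "fields": {
                "target_gene": {"type": "string"},
                "pcr_primers": {"type": "string"},
            },
            "required": ["library_strategy", "target_gene"],
            # If target_gene is given, pcr_primers must be given too.
            "dependent_required": {"target_gene": ["pcr_primers"]},
        },
        "spatial_transcriptomics": {
            "fields": {"tissue_section_thickness": {"type": "number"}},
            "required": ["library_strategy"],
            "dependent_required": {},
        },
    },
}


def build_schemas(config):
    """Return {experiment_type: JSON Schema dict} derived from one config."""
    schemas = {}
    for name, spec in config["experiment_types"].items():
        properties = dict(config["common_fields"])
        properties.update(spec["fields"])
        schemas[name] = {
            "$schema": "https://json-schema.org/draft/2020-12/schema",
            "$comment": f"checklist version {config['version']}",
            "title": f"{name} experiment checklist",
            "type": "object",
            "properties": properties,
            "required": list(spec["required"]),
            "dependentRequired": dict(spec["dependent_required"]),
        }
    return schemas


_PY_TYPES = {"string": str, "number": (int, float)}


def check(record, schema):
    """Minimal stand-in check: required, dependentRequired and simple types."""
    errors = []
    for field in schema["required"]:
        if field not in record:
            errors.append(f"missing required field: {field}")
    for trigger, needed in schema["dependentRequired"].items():
        if trigger in record:
            for field in needed:
                if field not in record:
                    errors.append(f"{trigger} requires {field}")
    for field, value in record.items():
        expected = schema["properties"].get(field, {}).get("type")
        if expected and not isinstance(value, _PY_TYPES[expected]):
            errors.append(f"{field} should be of type {expected}")
    return errors


SCHEMAS = build_schemas(CONFIG)

if __name__ == "__main__":
    # Show one generated schema and the feedback for an incomplete record.
    print(json.dumps(SCHEMAS["metabarcoding"], indent=2))
    print(check({"library_strategy": "AMPLICON", "target_gene": "16S rRNA"},
                SCHEMAS["metabarcoding"]))
```

In this sketch, adding a further experiment type only requires a new entry under `experiment_types`; the generation code itself does not change.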

A pilot use and submission of experiment type checklists is planned for later this year. All code and documentation are publicly accessible.

In this talk, we will outline what we are doing and illustrate how it will improve the standardisation of sequence experimental metadata.

The DDBJ resources based on standards

Kyung-Bum Lee (DDBJ, Japan)

The DDBJ (DNA Data Bank of Japan) Center is a global biological database serving as a comprehensive repository for diverse biological information. We collect and handle a variety of data types, including raw, assembled, and annotated nucleotide sequence data, as well as functional genomics data (GEA: Genomic Expression Archive), metabolomics data (MetaboBank), and human genetic and phenotypic data (JGA: Japanese Genotype-Phenotype Archive).
The DDBJ seeks to broaden its collaboration with other national-class data providers like KOBIC (Korea Bioinformation Center) and BRIN (National Research and Innovation Agency, Indonesia) as a founding member of the International Nucleotide Sequence Database Collaboration (INSDC) with NCBI and EBI. Key components of science are inclusion and FAIR-ness. I will outline the DDBJ databases based on such principles.

National Genomics Data Center (China) perspective on standards

Yiming Bao (National Genomics Data Center, China) [virtual]

The National Genomics Data Center (NGDC) at Beijing Institute of Genomics, Chinese Academy of Sciences was established by the Ministry of Science and Technology and Ministry of Finance of China in 2019. NGDC is a national platform for archiving, managing and processing a wide range of genomics related data. These include the BioProject and BioSample databases, Genome Sequence Archive (GSA=SRA) family, Genome Warehouse (GWH=WGS), GenBase (=GenBank), Gene Expression Nebulas (GEN=GEO), Genome Variation Map (GVM=dbSNP), and many others. Following largely the data structure and standards of the corresponding databases in INSDC, NGDC is ready to exchange its data smoothly with those of INSDC and has started to do so, and is therefore working hard towards becoming a partner of INSDC. Additionally, NGDC has developed several procedures and tools to facilitate the implementation of various standards.

Platinum Sponsor talk 2: PacBio - Shifting paradigms with PacBio HiFi sequencing

Zuwei Qian (PacBio)

The combination of high accuracy, long read length, kinetic information and evenness of coverage makes PacBio HiFi long read sequencing a unique sequencing technology platform, providing best-in-class performance in de novo assembly, calling of all variant types, long-range allelic phasing, 5mC methylation calling, full-length transcriptomes (including single cell level), and many other applications. This technology is poised to set the standard for genomics research where comprehensive detection of all variant types is a must. This presentation will highlight application examples and opportunities that leverage these features for human, plant and animal genomic research, along with associated computational solutions.




Lunch and Networking

Afternoon Breakout Sessions - NO VIRTUAL ATTENDANCE

Afternoon Breakout Session Options for in-person participants. Summary and/or minutes to be posted for asynchronous input/consultation of virtual participants.
Breakout groups (in-person only); findings posted online for input/consultation
● Topic 1. GSC Compliance and Interoperability Working Group (Chair Chris Hunter)
● Topic 2. Standards in Personalized Medicine

Participants are invited to choose 1 of the sessions to attend.

Tea, Coffee and Networking

Wednesday August 9th

(3) Metabolomics & (4) New Sequencing Technologies

Welcome and Logistics
Lynn Schriml + local host

All keynotes are 25 minutes plus 5 minutes for Q&A. All other presentations are 12 minutes plus 3 minutes for Q&A.

Session chair: Lynn Schriml
Keynote: Standards in Metabolomics
Claire O’Donovan (EMBL-EBI, UK) [virtual]

The aim of this talk is to present the current status of standards in Metabolomics with its diverse communities including academia and industry. It will highlight some of the collaborative efforts and challenges that the community faces and how we are evolving resources to respond to its needs. It will also include how metabolomics is learning from the other omics communities' experiences and how all the omics need to interact together going forward.

Session chair: Sakda Khoomrung (Siriraj Hospital, Mahidol University, Thailand)
Session 3: Genomic Standards for Metabolomics

Mass spectrometry-based metabolomics standards in clinical research

Sakda Khoomrung

Siriraj Center of Research Excellence in Metabolomics and Systems Biology (SiCORE-MSB), Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University
Center of Excellence for Innovation in Chemistry (PERCH-CIC), Faculty of Science, Mahidol University, Bangkok 10400 Thailand

Owing to growing interest in personalized medicine, the combination of clinical data, multi-omics data, and systems analysis has been increasingly studied. Mass spectrometry (MS)-based metabolomics is an ideal technology for identifying and quantifying metabolites in various biological samples. Experimental MS workflow, along with advanced bioinformatics, is considered one of the fundamental tools in systems biology for phenotype characterization and the development of precision medicine. In fact, the standardized methods/protocols for metabolomics, including sample preparation, metabolite identification, data processing and analysis, and quantification, represent crucial steps for the translation of research outcomes into clinical practice. Quantitative metabolomics allows the accumulation and comparison of metabolomics data across different studies, which eventually can lead to the establishment of a critical resource for biomarker research, precision medicine, and bridging scientific outcomes with clinical applications. In this talk, I will discuss several fact-based challenges, including metabolite identification, data processing and analysis, and the measurement precision of absolute quantification in clinical research.

Advancements in NMR Metabolomics: Enhancing Sample Handling and Metabolite Identification

Jutarop Phetcharaburanin (Khon Kaen University)

Metabolomics, a powerful tool in systems biology, enables the characterization of phenotypes. Nuclear magnetic resonance (NMR) spectroscopy plays a pivotal role in capturing dynamic changes in metabolites as a response to gene-environment interactions, diets, diseases, and stimuli. However, the NMR metabolomics pipeline is susceptible to confounding factors stemming from improper handling of biospecimens, including inadequate containers, storage conditions, and durations. These inconsistencies can introduce consequential effects that undermine accurate analysis. By utilizing NMR metabolomics, an optimized fecal sample handling strategy has been developed, leading to improved fecal metabolic phenotyping-based diagnoses. Additionally, the identification of metabolites presents another critical step in the NMR metabolomics pipeline. Misassignments of metabolites, caused by factors like human errors and the complexity of spectral data, can result in data misinterpretation and misunderstanding. To mitigate these challenges, a computational approach, specifically artificial intelligence-guided metabolite identification, is being developed. This approach aims to enhance the accuracy and precision of NMR metabolite identification, enabling more reliable data interpretation.

Metabolomics for agriculture applications

Umaporn Uawisetwathana (National Center for Genetic Engineering and Biotechnology, BIOTEC)

Metabolomics is increasingly employed to comprehensively study all the metabolites of an organism. It has proven instrumental in unravelling important metabolites as well as the molecular pathways governing the genotypes and phenotypes of an organism. Several metabolomics applications in agricultural research focus on the identification of biomarkers and the discovery of metabolic mechanisms underlying particular traits, in order to rationally develop agricultural practices as well as new varieties with better yields, resistance to biotic/abiotic stresses and desirable nutritional contents. Given the challenges of untargeted metabolomics in plant research, standardization at each step of the metabolomics pipeline should be considered. Examples of metabolomics standards for agricultural research will be presented. In summary, metabolomics has proved successful as a high-throughput screening method in functional genomics for generating more fundamental knowledge of the biochemical processes underlying crucial traits.

Metadata Informed Metabolomics

Yuri Corilo (PNNL, USA)

This presentation will discuss the importance of metabolomics standards, data models, and metadata. The metabolome is a complex molecular system; multiple sample preparation methods and analytical techniques are required to identify its vast molecular compositional space. We will provide an overview of how we aim to capture and describe a complete and unified experimental design for metabolomics, including metadata on the samples, sample processing, instruments used, and metabolites. This comprehensive metadata is crucial for establishing standardized workflows for molecular annotation and promoting data reuse. Additionally, we will share our vision for utilizing metadata to enhance the accuracy of metabolite annotation and facilitate new molecular discoveries in metabolomics.

Platinum Sponsor talk 3: Waters; Our solutions for Omics Analysis

YanTing Lim (Field Marketing Specialist, Waters Pacific Pte Ltd.)

Omics as a scientific discipline identifies, describes, and quantifies the biomolecules and molecular pathways that contribute to the form and function of cells and tissues. This can be done in a targeted or untargeted way or alternatively using imaging techniques to map these molecules to their distribution in tissue. This presentation provides an overview of Waters’ portfolio for Omics Analysis and explores how the latest innovations are being used to help scientists better characterize their samples.


Tea, Coffee and Networking

Session 4: New Sequencing Technologies and Genomic Standards
Session chair: Scott Jackson

Standardization of next-generation sequencing method to study gut microbial diversity in shrimp

Wanilada Rungrassamee (National Center for Genetic Engineering and Biotechnology, Thailand)

The gut microbiota plays an essential role in animal health and production, and its study has become increasingly important in the aquaculture sector. However, standard protocols and best practices for measuring shrimp microbiota are not well established, leading to uncertainty in the accuracy of results and complicating cross-study analyses.
In this study, we investigated the influence of key methodological variables on the gut microbiota profiles of black tiger shrimp, an important product of the aquaculture industry in the Asia-Pacific region. We used pooled gut samples and synthetic DNA spike-in standards to evaluate four commercial kits for DNA extraction with primer sets for the V1-V2, V3-V4, and V6-V8 hypervariable regions of the 16S rRNA gene. We evaluated the performance of the kits and PCR primers based on the diversity of observed microbiota profiles, taxon-specific biases, and accuracy of quantification of spike-in standards. We also evaluated different bioinformatics pipelines for data analysis, focusing on the advantages of using denoising algorithms compared to traditional clustering methods to resolve sequence diversity in dominant species.
Our results showed that the variables assessed had a significant impact on shrimp gut microbiota profiles. This highlights the need for standard protocols and best practice guidelines for measuring shrimp microbiota. Our study provides several recommendations to improve the accuracy and reproducibility of shrimp microbiota measurements and should be of great benefit to aquaculture microbiome research.

Obligate insect symbionts: insights from new genome sequences and possible standards for these tiny genomes

Tanja Woyke (JGI)

Insects, the most species-rich animal group, comprise over one million described species. Insect-bacteria partnerships are widespread and significantly impact the evolutionary success and diversification of this animal group. Bacterial endosymbionts supplement essential nutrients absent in insects' diets, aid in the degradation of tough food sources, defend against natural enemies, and enhance resistance to stress and insecticides. Additionally, certain 'selfish' insect endosymbionts manipulate insect reproduction to facilitate their own spread. This talk will explore newly discovered obligate insect symbiont genomes and potential criteria for standards for these small genomes.

Intergalactic Microbes: Uncovering the Invisible Co-Pilots of Space Exploration

Kasthuri Venkateswaran (JPL, NASA)

In our quest to comprehend space, understanding the invisible travelers accompanying us is crucial. These microscopic entities are key to astrobiology, astronaut health, and the maintenance of space habitats. Our research has portrayed a vibrant microbial ecosystem on the International Space Station (ISS) and in the NASA cleanrooms where spacecraft are assembled.

Through advanced molecular techniques and traditional culture methods, our Microbial Tracking initiatives have extensively catalogued the dynamic microbial populations adapting to extraterrestrial conditions. These adaptable communities can be both allies and threats to human health.

Our investigations delved into the microbial universe, uncovering complexities through amplicon sequencing, metagenomics, and resistomes. We've identified about 3,000 bacterial and fungal strains, some new species, along with their potential virulence traits and beneficial metabolites.

Yet, our pioneering Environmental 'Omics' project isn't simply an academic exercise - it's a practical guide for space exploration. The main objective of our microbial tracking research is to leverage these insights to enhance human health within the ISS and other similar closed systems. We're dedicated to transforming our fundamental research into tangible benefits, such as pathogen detection and the development of health countermeasures.

In keeping with our commitment to scientific transparency and innovation, we've made our 'omics' datasets accessible on NASA's GeneLab bioinformatics platform. This platform houses a robust database and advanced computational tools. We warmly invite the wider scientific community to make use of these resources, contributing to the ongoing exploration and understanding of life - seen and unseen - throughout the cosmos.

Platinum Sponsor talk 4: A Practical Guide to Microbiome Reference Standards

Kris Locken (Zymo Research)

Microbiome data is being generated at an unprecedented pace. In many cases, a lack of proper controls or comparison to microbiome reference materials means that important and high-impact conclusions cannot be reproduced or reliably compared to similar data sets. Microbiome standards are imperative for microbial community profiling and analysis. Whereas the microbial compositions of experimental samples are variable and often unknown, microbiome standards provide an accurate and consistent measurement as a basis for comparison. Different types of microbial reference materials, including both whole cell and DNA mock communities, spike-in controls and microbiome reference materials, are useful for assessing bias at different steps of the microbiome analysis workflow.


Lunch and Networking

Afternoon Breakout Sessions - NO VIRTUAL ATTENDANCE

Afternoon Breakout Session Options for in-person participants. Summary and/or minutes to be posted for asynchronous input/consultation of virtual participants.
Breakout groups (in-person only); findings posted online for input/consultation
● Topic 3. National Microbiome Data Collaborative workshop - Using GSC portal and Standards (Chair Ramona Walls, Critical Path Institute)
● Topic 4. Metabolomics Standards
● Topic 5. Standardized Protocols

Session chair: TBC

Tea, Coffee and Networking

Thursday August 10th

(5) GSC Current and Evolving Standards & (6) Comparative Genomics

Welcome and Logistics
Lynn Schriml + Sakda

All keynotes are 25 minutes plus 5 minutes for Q&A. All other presentations are 12 minutes plus 3 minutes for Q&A.

Session chair: Lynn Schriml
2023 Recipient of 'The Dawn Field Award for Outstanding Contributions to Genomic Standards' - Sponsored by ISME
Montana Smith (PNNL, USA)

Accomplishing FAIR: For data generators and users.
FAIR data can be hard to accomplish and implementing standards can be a challenge. User feedback and usability testing provide valuable information that enables us to improve standards and increase community adoption. In my talk, I’ll share some of the feedback we’ve received and the changes that have been suggested to improve adoption and understanding, as well as the new MISIP (Minimal information for Stable Isotope Probing) checklist currently in development. Providing FAIR data will improve and expand scientific discovery, and providing an easily adopted and comprehensive standard will make FAIR data obtainable.

Session chair: Lynn Schriml
Session 5: GSC Current & Evolving Standards
Session chair: Ramona Walls (Critical Path Institute, USA)

GSC Compliance and Interoperability Working Group

Ramona Walls (Critical Path Institute / CIG co-chair) / Chris Hunter (GigaScience Press / CIG co-chair)

The GSC Compliance and Interoperability Working Group (CIG) is responsible for defining (with input from the GSC Board) and implementing GSC policies around standards, with a primary focus on the Minimum Information about any (x) Sequence (MIxS) standards, and for maintaining the GSC website. The last year has been very busy for the CIG. We are reviewing three new MIxS checklists or extensions (Urobiome, MISIP, and MiNAS) for the upcoming MIxS v.7 release. After converting MIxS hosting to GitHub and LinkML (see the following talk) with the 2022 release of MIxS v.6, we have spun off a Technical Working Group (TWG) that oversees all technical aspects of MIxS, including GitHub issue and project tracking, LinkML coding, and creating releases. The TWG is currently working to stabilize our LinkML implementation. The CIG and TWG are working together to update GSC policies to make MIxS more sustainable and serve MIxS in a form that is machine validatable. The main GSC website has undergone a major overhaul, with new content being added regularly, including links to the formal MIxS documentation generated automatically by LinkML. Both CIG and TWG are open and active working groups, and we are seeking new members from the GSC community who are willing to provide a few hours of work per month.

A Social and Technical Implementation of MIxS in LinkML by GSC and NMDC.

Mark Miller (LBNL, USA)

The Genomic Standards Consortium published the first version of the Minimum Information about any (X) Sequence standard in May 2011 (DOI:10.1038/nbt.1823). Since then, it has been adopted, in part, into the INSDC’s requirements for describing Biosample and sequence submissions. That means that all of those records have a minimum set of required attributes and that submitters are provided with a standardized vocabulary of other terms they can use to describe their submissions.
The National Microbiome Data Collaborative was launched by the US Department of Energy in ~2020 with the intent of aggregating standardized metadata and results from numerous environmental omics projects (DOI:10.1038/s41579-020-0377-0). NMDC is distinctive in that multiple omics modalities are supported (metagenomics, metaproteomics, metabolomics, etc.) and that the metadata conform to a formal schema with semantic web and linked data compatibility.
NMDC adopted many MIxS terms to describe biosamples but recognized the potential for higher levels of enforceable formality in the MIxS standard. This led to a 2020 effort to express MIxS in RDF, which has since evolved into a campaign to express MIxS in the LinkML language, which can be converted into RDF. Multiple stakeholders from NMDC and the wider GSC community have contributed to this new implementation of MIxS, which stands as a formal specification independent of any organization that may wish to implement portions of it. This was only possible because the contributors had a combination of skills and principles that included sustainable software development, schema testing with valid and invalid examples, the ability to gather user input, and an appreciation for predictable release processes.

Urobiome new MIxS extension

Lisa Karstens (Oregon Health & Science University, USA) [virtual]

Over the past decade, complementary sequence-based and culture-based approaches have provided clear, reproducible evidence that the human urinary bladder has a microbial community (the urobiome) that includes bacteria, fungi and viruses. The urobiome appears to be associated with several urological disorders in the absence of clinically identifiable infections, including kidney disease, overactive bladder, and bladder cancers. However, urobiome research has had inconsistent reporting of sampling conditions and participant-related factors, which will substantially limit reuse and secondary analyses of data collected from urobiome studies. To enable urobiome research to be Findable, Accessible, Interoperable and Reusable (FAIR), consensus amongst researchers on the information collected and on metadata standards is needed. Towards this goal, the urobiome research community has generated a consensus statement (published in 2021) and has worked with the GSC to develop an extension of the Minimum Information about any (X) Sequence (MIxS) standards.

Measurement Assurance for Innovation in Microbiome Science

Scott Jackson (NIST, USA)

Appreciation for the role of microbes in our lives has been growing rapidly, but the measurement science needed to understand and fully exploit microbial systems has developed at a much slower pace than the industries dependent on them demand. In all applications involving complex microbial communities, research is hampered by the lack of standards, protocols, and technical infrastructure to allow confidence in the data and comparability. At NIST, we are developing tools to enable measurement assurance of complex microbial systems for applications in the clinic, agriculture, and the environment.

Platinum Sponsor talk 5: The Impact of Rapid Next-Generation-Sequencing on Precision Oncology.

Svetta Kwan (ThermoFisher - GenePlus)

Over the last two decades, with the evolution of targeted therapies in the oncology and haematology-oncology landscape, more cancer patients can live longer, as demonstrated by clinical trial data on improvements in overall survival. The evolution of targeted therapies has shifted the paradigm in clinical molecular testing, leading to more sequential tests, longer waiting times and the need for more tissue. Tissue scarcity, longer turnaround times, and the rebiopsy of patients are now problems for many clinicians and pathologists around the globe, impacting the overall survival of real-world cancer patients in their treatment journey. In addition, in most developing countries, cancers are usually discovered at Stage III or IV onwards, which makes treatment a real battle against time. There is a need to democratize in-house next-generation sequencing (NGS) to improve personalized medicine in practice and to allow the development of local expertise in biomarker testing to support the future of precision medicine. Because every patient deserves informed therapy and treatment, and every patient deserves a chance to fight for their life.


Tea, Coffee and Networking

Session 6: Genomic Standards for Comparative Genomics
Session chair: Nikos Kyrpides (JGI)

The Global Biodata Coalition: towards a sustainable biodata infrastructure

Chuck Cook (Global Biodata Coalition, UK)

Collectively, life science data resources around the world form a vast, distributed, and interconnected infrastructure that is critical for life science and biomedical research. Unlike other scientific infrastructures, biodata resources are globally distributed and lack any kind of central coordination. The distributed nature of the infrastructure supports innovation, but lends itself poorly to the long-term sustainability of individual biodata resources and the infrastructure as a whole. The Global Biodata Coalition (GBC) brings together life science research funding organisations that recognise these challenges and acknowledge the threat that the lack of sustainability poses. They agree to work together to find ways to improve sustainability.

In the presentation I will provide an overview of the global biodata resource infrastructure, focusing in particular on the challenges of providing sustained long-term funding to the resources that comprise the infrastructure. Covering some of the work that GBC has carried out to understand and classify biodata resources and the entire biodata resource infrastructure, I will outline the Global Core Biodata Resource programme and Inventory project and also introduce the stakeholder consultation processes around approaches to sustainability and open data. Finally, I will lay out the path GBC is taking to engage researchers, informaticians, funding organisations and other stakeholders in moving towards greater sustainability for these critical resources.

Gut microbiota alteration associated with SLE (systemic lupus erythematosus) development

Naraporn Somboonna (Chulalongkorn University, Thailand)

Gut microbiota play an important role in nutritional, metabolic and immunological systems. The balance of microbiological ecology in the intestine (homeostasis) is crucial for health maintenance. Dysbiosis of gut microbiota and diversity can lead to immunological disorders and is associated with many diseases, including systemic lupus erythematosus (SLE), a chronic autoimmune disease affecting at least five million people worldwide.
To study gut microbiota alteration associated with SLE, fecal samples were collected from lupus-prone mice in which SLE development was induced chemically (pristane) or genetically (FcRIIb knockout). We found that the gut microbiota of lupus-prone mice began to change at 4 months of age, and some clinical symptoms were found to be significantly different compared with healthy mice beginning at 6 months after microbiota changes. Because differences in physicochemical conditions in each section of the GI tract and in feces can affect the composition of gut microbiota, we investigated gut microbiota composition along the mouse GI tract among different groups of mice at various ages.
Here I present our preliminary data, which suggest that gut microbiota composition is associated with SLE through increases or decreases in some bacteria. Hence, modification of the gut microbiota using diet, prebiotics and probiotics might re-balance the composition of gut microbiota and, hopefully, relieve SLE progression in a safer manner than immunosuppressive drugs. Our ongoing experiments perform fecal transplants of the fecal fraction that was statistically identified as important in non-SLE mice, as preventive and therapeutic strategies, and determine gut microbiota composition along clinical SLE progression in pristane and FcRIIb knockout mice.

Microbial Comparative Genomics (MGE Standards: viruses and plasmids)

Nikos Kyrpides (JGI)

Mobile genetic elements (MGEs), which comprise viruses and plasmids, are highly abundant across all life forms, showcasing a remarkable array of genetic diversity. Their exceptional capability to mobilize is pivotal in facilitating horizontal gene transfer, a mechanism that enables the acquisition of genetic information through means other than vertical inheritance. This process fosters the exchange of genetic material among distantly related lineages, profoundly influencing evolution, ecological innovation, and the dynamics of biological communities and biogeochemical cycles. Despite their crucial role, the functional repertoire of MGEs remains largely unexplored, primarily due to a significant portion of their genes lacking known functions. Additionally, the dynamics of gene gain, loss, and exchange have not been systematically studied across large datasets.
I will summarize recent large-scale efforts to characterize the diversity of MGEs, and discuss some ideas for the development of new genomic standards for these genetic elements. By setting up new genomic standards for MGEs, researchers can enhance our understanding of their functional significance and unravel their potential impact on various aspects of life, including evolution, ecological dynamics, and biogeochemical cycles. The development of such standards promises to open up exciting avenues for future research, making strides towards unlocking the full potential and implications of MGEs in the world of genetics and beyond.

Towards producing representative genome catalogues for microbial communities

Rob Finn (MGnify, EMBL-EBI, UK)

An increasingly common output arising from the analysis of metagenomic datasets is the generation of metagenome-assembled genomes (MAGs). However, the discovery of MAGs is problematic because there are no requirements to archive them in an INSDC database, and comparison of these MAG collections is hampered by the lack of uniformity in their generation, annotation and storage. To address this, we have developed MGnify Genomes, a growing collection of biome-specific microbial genome catalogues generated using MAGs and publicly available isolate genomes from a common biome. Strategies for improving these catalogues and subsequent expansion will be discussed, as well as an outline of how these can be used to contextualise new datasets.

How do you achieve standards in taxonomy when taxonomic freedom is paramount?

Phil Hugenholtz (The University of Queensland, Australia)

Naming of microorganisms (nomenclature) is highly standardized through nomenclatural codes such as the International Code of Nomenclature of Prokaryotes (ICNP), which have rules governing how names are formed and used. By contrast, classification of microorganisms (taxonomy) is completely unregulated to ensure freedom of taxonomic opinion. This is due in part to the recognition that taxonomy could become methodologically outdated if fixed in time. However, we have entered the age of genome-based taxonomy, and genomes are the most fundamental blueprints of life, making it unlikely that a widely accepted alternative methodology resulting in a radically different and improved taxonomy will be developed. I suggest that now is the time to adopt a standardized taxonomic framework based on comparative analysis of genome sequences, or at least to establish a primary taxonomic reference ('Tassonomia Franca') that will facilitate unambiguous scientific communication of microbial diversity.

Close meeting and Handoff to GSC24
Lynn Schriml (GSC president)


Lunch and Networking

Afternoon Breakout Sessions - NO VIRTUAL ATTENDANCE

Afternoon Breakout Session Options for in-person participants. Summary and/or minutes to be posted for asynchronous input/consultation of virtual participants.
Breakout groups (in-person only); findings posted online for input/consultation
● Topic 6. Standardising Open Microbiome Data analysis and workflow sharing to further federated resource development

Session chair: Rob Finn
Friday August 11th

Industry workshops

MGnify: The EMBL-EBI metagenomics analysis portal. Rob Finn (EMBL-EBI, UK)
Rob Finn, Lorna Richardson, Tanya Gurbich

Registration for this workshop is required, please see

Oxford Nanopore Sequencing Training (whole day). Various trainers.
Scott Tighe, Tip Wongsurawant and Nanopore Staff

Registration for this workshop is required, please see


If you have any problems or questions, please contact us via e-mail at:

© Genomics Standards Consortium 2022.