SweGen whole-genome sequencing from the Northern Sweden Population Health Study

Swedish population
genetic variation
reference dataset
genomics
Dataset
Publisher: Uppsala University

Published: 4 April 2025

The dataset contains whole-genome sequencing (WGS) data, provided as aligned read files in CRAM format (lossless compression), for a total of 58 DNA samples originating from the Northern Sweden Population Health Study (NSPHS). For each of the 58 individuals, DNA was extracted from a blood sample and subjected to WGS. Sequencing was performed using 2x150 bp paired-end chemistry on Illumina HiSeq X Ten instruments at the SciLifeLab National Genomics Infrastructure (NGI) in Stockholm and Uppsala. The resulting FASTQ files were analyzed with the nf-core pipeline Sarek, which includes pre-processing, alignment to the human GRCh38 reference genome, and germline variant calling.

The NSPHS was approved by the local ethics committee at Uppsala University (Regionala Etikprövningsnämnden, Uppsala, 2005:325 and 2016-03-09). All participants gave written informed consent to the study, including the examination of environmental and genetic causes of disease, in compliance with the Declaration of Helsinki.
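Because the aligned reads are distributed as CRAM files, a quick sanity check after download can be useful. The following is a minimal sketch, not part of the dataset's documentation: it reads the 26-byte file definition that the CRAM specification places at the start of every file (the magic bytes `CRAM`, a major and a minor version byte, and a 20-byte file identifier). The function name `read_cram_file_definition` and any file path passed to it are hypothetical, chosen only for illustration.

```python
CRAM_MAGIC = b"CRAM"          # first four bytes of every CRAM file
FILE_DEFINITION_LENGTH = 26   # 4 magic + 1 major + 1 minor + 20 file id


def read_cram_file_definition(path):
    """Return (major, minor, file_id) from the start of a CRAM file.

    Hypothetical helper for illustration; raises ValueError if the
    file is too short or does not start with the CRAM magic bytes.
    """
    with open(path, "rb") as handle:
        raw = handle.read(FILE_DEFINITION_LENGTH)

    if len(raw) < FILE_DEFINITION_LENGTH or raw[:4] != CRAM_MAGIC:
        raise ValueError(f"{path} does not look like a CRAM file")

    major, minor = raw[4], raw[5]
    # The file id is padded with NUL bytes up to 20 bytes.
    file_id = raw[6:26].rstrip(b"\x00").decode("ascii", errors="replace")
    return major, minor, file_id
```

This check only confirms that a file starts with a valid CRAM file definition. Decoding the reads themselves generally requires a CRAM-aware tool and access to the same reference genome used for alignment, here GRCh38.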

This dataset is one of four datasets included in the study “SweGen: a whole-genome data resource of genetic variability in a cross-section of the Swedish population” (http://identifiers.org/ega.study:EGAS50000000906).

Official landing page: http://identifiers.org/ega.dataset:EGAD50000001325