SweGen whole-genome sequencing from the Northern Sweden Population Health Study

Swedish population
genetic variation
reference dataset
genomics
Dataset
Author

Uppsala University

Published

April 4, 2025

The dataset contains whole-genome sequencing (WGS) data as aligned read files in CRAM format (lossless compression) for 58 DNA samples from the Northern Sweden Population Health Study (NSPHS). For each of the 58 individuals, DNA was extracted from a blood sample and subjected to WGS. Sequencing was performed using 2x150 bp paired-end chemistry on Illumina HiSeq X Ten instruments at the SciLifeLab National Genomics Infrastructure (NGI) in Stockholm and Uppsala. The resulting FASTQ files were analyzed with the nf-core pipeline Sarek, which includes pre-processing, alignment to the human GRCh38 reference genome, and germline variant calling. The NSPHS study was approved by the local ethics committee at the University of Uppsala (Regionala Etikprövningsnämnden, Uppsala, 2005:325 and 2016-03-09). All participants gave written informed consent to the study, including the examination of environmental and genetic causes of disease, in compliance with the Declaration of Helsinki.
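Since the aligned reads are distributed as CRAM, a quick sanity check on a downloaded file can be useful before running heavier tools. The sketch below is an illustration only and not part of the dataset's tooling. It assumes the standard CRAM file definition (the 4-byte magic `CRAM`, one byte each for major and minor version, then a 20-byte file ID). The names `read_cram_file_definition` and `CramFileDefinition` are hypothetical, and the in-memory bytes stand in for a real file. Actually decoding the reads would typically also require the GRCh38 reference used for alignment.

```python
import io
from typing import BinaryIO, NamedTuple


class CramFileDefinition(NamedTuple):
    """Fields of the 26-byte CRAM file definition (hypothetical helper type)."""
    major: int
    minor: int
    file_id: bytes


def read_cram_file_definition(stream: BinaryIO) -> CramFileDefinition:
    """Read and validate the first 26 bytes of a binary stream as a CRAM header.

    Raises ValueError if the stream is too short or lacks the 'CRAM' magic.
    """
    header = stream.read(26)
    if len(header) < 26 or header[:4] != b"CRAM":
        raise ValueError("not a CRAM file: missing 'CRAM' magic bytes")
    # Byte 4 is the major version, byte 5 the minor version,
    # bytes 6..25 are the 20-byte file identifier.
    return CramFileDefinition(header[4], header[5], header[6:26])


# In-memory stand-in for a real file, so the example runs without any data.
# With a real download one would instead use: open(path, "rb")
fake_cram = io.BytesIO(b"CRAM" + bytes([3, 0]) + b"sample".ljust(20, b"\x00"))
definition = read_cram_file_definition(fake_cram)
print(f"CRAM version {definition.major}.{definition.minor}")
```

This only confirms that a file starts with a well-formed CRAM file definition. It does not validate the containers that follow or check the reference sequence checksums.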

This dataset is one of four included in the study "SweGen: a whole-genome data resource of genetic variability in a cross-section of the Swedish population" (http://identifiers.org/ega.study:EGAS50000000906).

Official landing page: http://identifiers.org/ega.dataset:EGAD50000001325