Nicole Belles · Genetics 101
How is DNA data stored?
From cheek swab to encrypted file—here's how your child's DNA gets translated into digital data, what kinds of files it produces, and how it's kept secure.
Think of your genome (i.e., your full DNA) as a very long instruction manual written in chemical building blocks called bases or nucleotides. Each one is represented by one of four letters: A, C, G, and T, which stand for adenine, cytosine, guanine, and thymine. Genetic testing turns that biological code into a digital file, similar to how a photo becomes a JPEG or a document becomes a PDF.
First, your DNA is read by a machine. You start by giving a sample (e.g., a cheek swab). In a lab, machines called DNA sequencers read your DNA. They don’t store your DNA physically; they translate it into letters (A, C, G, T) for data storage. Think of this process as similar to scanning a book and converting it into editable text.
After sequencing, your DNA is stored as a very large digital file, basically long strings of letters. Your full genome is about 3 billion letters long, which is why raw data files are very large (gigabytes in size).
A small section might look like:
ACGTGGTCAAGT...
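To get a rough sense of scale, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: it assumes one byte per letter for plain text and two bits per letter for a tightly packed encoding (four possible letters fit in two bits), and the function name is made up for illustration. Real sequencing files typically carry extra information beyond the letters themselves, so they tend to be larger.

```python
def estimated_size_gb(num_letters: int, bits_per_letter: int) -> float:
    """Rough storage needed for a DNA sequence, in gigabytes (1 GB = 10**9 bytes)."""
    total_bytes = num_letters * bits_per_letter / 8
    return total_bytes / 10**9


GENOME_LETTERS = 3_000_000_000  # roughly 3 billion letters in a full genome

print(estimated_size_gb(GENOME_LETTERS, 8))  # plain text: one byte per letter
print(estimated_size_gb(GENOME_LETTERS, 2))  # packed: two bits per letter
```

Under these assumptions the letters alone come to about 3 GB as plain text, or about 0.75 GB when packed.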
These raw files usually come in formats such as:
- FASTQ, the rawest version
- BAM/CRAM, a compressed, organized version
As an analogy, FASTQ is like raw video footage, while BAM/CRAM is like an edited, compressed movie file.
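For a concrete picture of the rawest format: a FASTQ file stores each read as four lines, namely an identifier starting with @, the letters themselves, a + separator, and one quality character per letter. The sketch below parses a single made-up record. The read name and quality values are illustrative, the function name is hypothetical, and it assumes the common Phred+33 quality encoding.

```python
def parse_fastq_record(text: str) -> dict:
    """Parse one four-line FASTQ record into its parts."""
    header, sequence, separator, quality = text.strip().split("\n")
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    return {
        "id": header[1:],
        "sequence": sequence,
        # Phred+33 (assumed): quality score = character code minus 33
        "quality": [ord(ch) - 33 for ch in quality],
    }


# Made-up example record, not real data
raw = "@read1\nACGTGGTCAAGT\n+\n" + "I" * 12 + "\n"
record = parse_fastq_record(raw)
print(record["id"], record["sequence"], record["quality"][0])
```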
Next, software organizes and interprets the data by comparing your DNA to a reference human genome.
The software looks for differences (called variants), missing or extra pieces, and known markers linked to traits or health risks. This process creates a smaller, more useful file in VCF (Variant Call Format), which is a list of your unique genetic differences. Instead of storing the whole genome, this file stores only the places where your DNA differs from the reference.
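The idea of keeping only the differences can be illustrated with a toy comparison. This sketch assumes the sample and the reference are already lined up and the same length, and it only finds single-letter substitutions. Real variant-calling software also has to handle alignment, insertions, deletions, and sequencing errors. The sequences and the function name are invented for illustration.

```python
def find_substitutions(reference: str, sample: str) -> list:
    """List (position, reference_letter, sample_letter) wherever the two differ.

    Positions are 1-based. Toy example only: both strings must be the
    same length and already aligned.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt
    ]


reference = "ACGTGGTCAAGT"  # made-up reference snippet
sample = "ACGTGATCAAGT"     # same snippet with one letter changed
print(find_substitutions(reference, sample))
```

Here the twelve-letter sample reduces to a single entry, which is why a file of differences is so much smaller than the full sequence.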
Your genetic results are digitally stored in an encrypted database in GeneSprout’s secure system. This is similar to systems used by banks. The data is often split into raw data (very large files) and interpreted results (smaller summary files).
Lastly, you receive a family- and pediatrician-friendly report. The report is a summarized interpretation of your genetic data, not your full genome. It may indicate that no significant findings were detected for the screened conditions, or it may indicate variants linked to an increased risk for a screened condition. GeneSprout only screens for a specific set of actionable childhood conditions.
Key Takeaways
- Your DNA is converted into letters (A, C, G, T)
- Those letters are stored as large digital files
- Software finds important differences and stores them in smaller summary files
- GeneSprout keeps this data in a secure system
- You receive an easy-to-understand results report that can be shared with your pediatrician
How is your genetic data secured during storage?
Protected by U.S. healthcare privacy laws
Your genetic data is covered by HIPAA, which governs how medical data is stored and shared. It is also protected by the Genetic Information Nondiscrimination Act (GINA), which prevents discrimination by health insurers and employers based on genetics.
It’s encrypted
Your genetic data is treated like highly sensitive medical information. It is encrypted when stored and when transmitted. Encryption turns your data into unreadable code unless someone has the key. Without the key, the data looks like gibberish.
Strict access controls are applied
Only authorized people or systems can access your data. Access is protected by login systems with strong authentication, and detailed logs track who accesses it. Only you have access to your data unless you explicitly consent to let others (such as your doctor or a researcher) access it.
Data is de-identified if used for research purposes
Data is used for research purposes only if you explicitly consent. Data used in research is de-identified, which means your name and other personally identifiable information are removed. Additional safeguards are also applied to keep the data secure.
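As a simplified picture of what removing identifiers means, the sketch below drops directly identifying fields from a record and tags it with a random ID instead. The field names and the record are hypothetical, not GeneSprout's actual data model, and real de-identification involves more than deleting a few fields.

```python
import uuid

# Hypothetical field names, chosen only for illustration
IDENTIFYING_FIELDS = {"name", "date_of_birth", "email", "address"}


def deidentify(record: dict) -> dict:
    """Return a copy of the record without identifying fields, tagged with a random ID."""
    cleaned = {key: value for key, value in record.items() if key not in IDENTIFYING_FIELDS}
    cleaned["study_id"] = uuid.uuid4().hex  # random, not derived from the person
    return cleaned


# Made-up example record, not real data
record = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "variants": ["chr1:12345 G>A"],
}
print(deidentify(record))
```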
This article is for informational purposes only and is not a substitute for professional medical advice. If you have concerns about your child’s health, speak with your pediatrician.