Raw Data

8 Key Points about Raw Data Files

If you are going to download a raw data file from an at-home DNA test like 23andMe or AncestryDNA, here are eight key points to be aware of:

•       The responsibility for the security and privacy of the data is in your hands once you download the file

•       Raw data have not been validated (thus files can, and often do, contain errors)

•       Raw data generated by one company will typically differ from the next

•       Raw data generally include markers from only a small fraction of the entire genome

•       There are consistent issues recognized for certain markers in raw data files (i.e. false positives, also called miscalls)

•       There are additional problematic markers that are currently unknown, unconfirmed, and/or unreported

•       A raw data file without a separate tool to analyze it is generally not useful

•       A finding in the raw data can be a “hint” in the right direction but is never the final answer


One of my areas of specialty work through Watershed DNA is helping people with their raw data.

Raw data can be very useful for many purposes, but there are limitations. This doesn't mean using raw data is of no benefit. Rather, it's better to know there are both benefits and limitations, and be aware of them as you move forward.

I’ve taken my own raw data to many different third-party tools, some for genealogical purposes (investigating how my DNA matches other people's, for example) and some for health (figuring out if any well-established health risk markers could be found). My "insider’s" view has given me a better understanding of the raw data generated by consumer genetic testing companies and how to make the best use of it.

I have worked with a number of clients in the past interested in understanding how to “do more”. Some have wanted basic guidance in what direction to take with a raw data file, and some have wanted me to do more of the leg work. I’m happy to meet my clients where they are and help them on to the next steps.

Have a raw data file and interested in knowing what to do with it?

Reach out through the "Contact Me" button in the upper right corner. I’d be happy to work with you.


Raw Data: What is it?

You know that phrase "No moss grows on a rolling stone"? I think of the world of consumer genomics as a rolling stone that never comes to a stop.

Much has happened in the consumer genomics world in the past 8 months since I published a video on YouTube to explain "raw data" and its uses, benefits, and limitations.

It could use some updating, but the basic messages are unchanged: 

1) You can get more than you bargained for when you hunt through your raw data.

2) You might go through a period of confusion before you have a sense of clarity again.

3) You can contribute your information to research and help future generations.

4) No two people will have the same experiences or emotional reactions to downloading, uploading, and uncovering information from a raw data file.  

5) I am here as a resource.

Before you take your raw data out of your ancestry testing account, please consider stopping and watching this video: "DNA Raw Data: What is it?"

Reach out for a one-time consultation with me, either before you make the download or after you've used a tool to sort through your raw data and have gotten back a report. I don't mind chasing a rolling stone with you! It makes for an interesting and enlightening journey, for sure.

"High ROH" in your DNA - what is it and what can you do next?

Sometimes in the course of testing DNA, we get a surprise with the results. One surprise that turns up more often the more we test is a high level of ROH, or "runs of homozygosity": long stretches of DNA where the copy inherited from each parent is identical. This feature often reveals a recent close genetic relationship between the parents of the person tested.

If you have used a tool like "Are Your Parents Related?" on GEDmatch (or David Pike's tools) to analyze your DNA, and the results show a high probability that your parents are closely related, the "High ROH" information sheet is for you. Click the link below to see it.

Sometimes only a small region of DNA (or only one chromosome) shows a high ROH result. Different biological reasons explain these findings. Reach out to Watershed DNA if you need more support.
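For readers who like to see the idea in code, here is a minimal sketch of what "looking for runs of homozygosity" can mean. It is only an illustration, not how GEDmatch or David Pike's tools work. The function names, the marker format (chromosome, position, two-letter genotype), and the tiny threshold are all assumptions made up for this example. Real tools work with hundreds of thousands of markers and may use distance-based thresholds and allow for the occasional miscall.

```python
from itertools import groupby


def is_homozygous(genotype):
    """True when both alleles of a two-letter call match, e.g. 'AA'.

    No-calls such as '--' count as not homozygous in this sketch.
    """
    return (
        len(genotype) == 2
        and genotype[0] == genotype[1]
        and genotype[0] in "ACGT"
    )


def find_runs(markers, min_markers=3):
    """Return (chromosome, start, end, marker_count) for each run.

    markers: iterable of (chromosome, position, genotype) tuples,
    sorted by chromosome and then by position (hypothetical format).
    min_markers: illustrative threshold, far smaller than a real one.
    """
    runs = []
    for chrom, group in groupby(markers, key=lambda m: m[0]):
        current = []  # positions in the run being built
        for _, pos, genotype in group:
            if is_homozygous(genotype):
                current.append(pos)
                continue
            # A heterozygous call or no-call ends the current run.
            if len(current) >= min_markers:
                runs.append((chrom, current[0], current[-1], len(current)))
            current = []
        # Close out a run that reaches the end of the chromosome.
        if len(current) >= min_markers:
            runs.append((chrom, current[0], current[-1], len(current)))
    return runs


# Made-up example data, not taken from any real file.
markers = [
    ("1", 100, "AA"), ("1", 200, "GG"), ("1", 300, "TT"),
    ("1", 400, "AG"), ("1", 500, "CC"),
    ("2", 150, "CC"), ("2", 250, "CT"),
]
print(find_runs(markers))  # [('1', 100, 300, 3)]
```

In this toy data only chromosome 1 has a run long enough to report, which mirrors the situation above where a single region or chromosome stands out. A finding like this is a hint to follow up on, not a final answer.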

If you could use support in the form of a story from someone else who discovered ROH, you can follow this link to John’s story about his ROH discovery.

**This post was updated 12/29/18 to include the link to the updated version of John’s story, which is shared with his permission.