Ball Python Reference Genome

I was reading papers by Dr. Seidel and students. A lot of it was over my head, but one thing I got to wondering about is how much having to use the “scaffold-level assembly” of the Burmese python genome, with its “discontinuities”, is holding back identification of additional ball python mutations. I gathered that Burmese are something like >97% similar to balls, but it sounds like there may be some problems with how the Burmese genome was done, presumably some years back.

How much time/money/work would it take to produce a ball python reference genome with today’s equipment and best practices? How much would that help in the pursuit of new ball python mutation tests if someone was willing to fund it? Maybe we could get a Department of Agriculture or Department of Commerce grant? Or has technology made it GoFundMe level now?

There is an unofficial ball reference genome out there, it just is not in the databases

Is it available, and is it more helpful (better quality?) than the 10-year-old Burmese reference genome for finding ball python mutations?

I was reading about the billions spent on the first human reference genome and was wondering what it would cost to do a new species now.

I know of a couple snake research folks that have access to it (I am not one of them though)

Cost to sequence a whole genome is maybe a couple grand, plus or minus… Cost to analyze and annotate, that is a fair bit more and also reflective of the quality of the sequence data

Still, if the cost is down from $3,000,000,000 to maybe less than $30,000 in 20 years, that is pretty impressive. Maybe $3,000 for everything soon.

Oh yeah, the cost reduction (and size reduction) in sequencing technology has been fantastic. Really the largest issue hampering things now is computation

The raw computer power or the expert interpretation etc? Any hope of AI or just better computer programming helping on that side?

Little from column A, little from column B

It takes a fair amount of computer power to take all the sequence data, sort it, line it up, and then stitch it together. Add in weird genome architectures that require some pretty sophisticated programs to interpret what the sequence really is versus what may get spit out by mistake (think of sequences that repeat dozens or hundreds of times: how do you know you have the right number of repeats?)
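To make the repeat problem concrete, here is a toy sketch in Python. It is not a real assembler, and everything in it is made up for illustration: the flanking sequences, the "CAG" repeat unit, the copy numbers, the read lengths, and the helper names make_genome and read_set. It builds two tiny "genomes" that differ only in how many copies of the repeat they carry, then compares the distinct error-free reads you could get from each.

```python
def make_genome(unit, copies, left="ACGTTGCA", right="TGACCGTA"):
    """Toy 'genome': unique flank + tandem repeat array + unique flank."""
    return left + unit * copies + right


def read_set(genome, read_len):
    """Every distinct error-free read of length read_len from the genome."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}


# Two hypothetical alleles that differ only in repeat copy number.
five_copies = make_genome("CAG", 5)   # repeat array is 15 bases long
eight_copies = make_genome("CAG", 8)  # repeat array is 24 bases long

for read_len in (10, 17):
    same = read_set(five_copies, read_len) == read_set(eight_copies, read_len)
    verdict = "indistinguishable" if same else "distinguishable"
    print(f"read length {read_len}: {verdict}")
```

With 10-base reads, both versions produce exactly the same set of reads, so nothing in the read content tells you whether there are 5 copies or 8. At 17 bases a single read can span the whole 5-copy array plus one base of unique sequence on each side, and the two versions become distinguishable. This toy ignores read depth and sequencing errors, both of which matter in real data, but it shows why reads long enough to span a repeat help so much.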

Once you have all that in place, then comes the ‘making sense’ part: what are genes and what are not genes? Where are the regulatory elements? What is actually just junk? Which genes, if any, can you accurately identify and tie to a specific function/protein/pathway?

Bioinformatics is making great strides in this area, but they are also playing a bit of catch-up because the data generation has become insane

As for AI… My personal opinion is that the AI of today is still mostly smoke and mirrors so I do not hold out a lot of hope that it will be able to actually solve anything


I’m assuming the Aiden Lab will have a ball python assembly posted at some point through their DNA Zoo project. They’ve been working with the Houston Zoo to sequence and assemble a lot of different genomes using Hi-C.


Hi, so I’m completely new to this, but I am in a master’s program in environmental genomics and I am actually extremely interested in the topic. I may even get some work done in whole genome assembly at some point (it’s a topic I’m rather fascinated by).
I’ll let you all know if I can get any of my internships to line up with these topics. I’m mostly into metagenomics lately but may actually find this to be a great topic for a thesis!


I admittedly know nothing about genomics but I do know that having one more person interested in furthering research, knowledge and advancement as it pertains to reptiles is a blessing. Good luck with your thesis and with finding an internship that ticks all the boxes for you!

Welcome to the forum, and please keep us posted!