Machine learning and Morph Identification - It actually kinda works

john · October 29, 2020, 4:21pm

@erie-herps In this approach, you don’t usually “teach” deep learning models anything directly. You just show them many examples and they learn for themselves. That’s a big part of the beauty of deep learning. It moves away from the manual creation of features and heuristics, as we did in the 2000s.

@chesterhf may I ask what type of model you trained? I assume you’re using some kind of transfer learning, perhaps fine-tuning over a model that was pretrained on ImageNet?

This project is on my longer list to play with as well. I knew that impressive results could probably be obtained with just a few hours of playing around. Transfer learning does quite well with a small amount of data, which is good, since it might be difficult to gather a large amount of data given that sites like MorphMarket usually have TOS that prohibit scraping. I am also curious how it performs on the harder cases versus experts.

Nicely done.

chesterhf · October 29, 2020, 5:44pm

Unfortunately this isn’t feasible as this system looks at images as a whole and forms algorithms based on patterns and similarities. Short of providing accurately labeled images for it to train on, I can’t really teach it anything.

Currently I don’t have hets as their own categories, because even though they can influence colors and patterning, that would create more problems than it would solve. Given that I need 50+ pictures of each morph to train on, including hets as their own category would take away from the images I need to train the basic morphs, and I wouldn’t be able to use possible hets at all. Under the current system, a single gene Mojave, a Mojave het pied and a Mojave 66% pos het lavender would all just be labeled as “Mojave”.
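A minimal sketch of that labeling rule, assuming the label starts out as free text from an ad title. The helper name `training_label` and the regular expression are hypothetical, written only to illustrate the idea; the labeling described above was done by hand.

```python
import re

# Hypothetical pattern: matches a trailing het annotation such as
# " het pied" or " 66% pos het lavender" (assumed ad-title wording).
_HET_PATTERN = re.compile(
    r"\s+(?:\d+%\s+)?(?:pos(?:sible)?\s+)?het\b.*$",
    flags=re.IGNORECASE,
)

def training_label(ad_title: str) -> str:
    """Return the visual-morph label, dropping any trailing het annotation."""
    return _HET_PATTERN.sub("", ad_title.strip())

for title in ["Mojave", "Mojave het pied", "Mojave 66% pos het lavender"]:
    print(title, "->", training_label(title))
```

All three example titles end up in the same “Mojave” class, so every image counts toward the 50+ needed for that morph.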

Honestly something like this would be ideal if we wanted to forge ahead with creating a morph ID program. I think “crowdsourcing” some of it will be essential in order to get enough images of all the morphs and because of the vast amount of time this is going to consume.

I’m using Microsoft Lobe, which was just released on Oct. 26th and is an image classification machine learning software. It’s essentially a blank slate that chooses the best algorithm/architecture for your model once you start training. Currently I’m using ~50 images per morph to train on and then 10-20 more for practice. Most of the images are from MorphMarket ads and I’ve tried to be vigilant about only using pictures that I was confident were correctly labeled and appeared to be a valid example of what the morph generally looks like.
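For anyone who wants to reproduce that split outside Lobe, here is a rough sketch. It assumes the images are already grouped by morph; the function name `split_per_morph`, the file names, and the fixed seed are all made up for illustration, and this is not a description of how Lobe handles data internally.

```python
import random

def split_per_morph(images_by_morph, n_train=50, seed=0):
    """Shuffle each morph's images, then split into train and held-out lists."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, held_out = {}, {}
    for morph, paths in images_by_morph.items():
        shuffled = list(paths)  # copy so the caller's list is untouched
        rng.shuffle(shuffled)
        train[morph] = shuffled[:n_train]
        held_out[morph] = shuffled[n_train:]
    return train, held_out

# Placeholder file names, not real data.
images_by_morph = {
    "Mojave": [f"mojave_{i:03d}.jpg" for i in range(65)],
    "Pinstripe": [f"pinstripe_{i:03d}.jpg" for i in range(60)],
}

train, held_out = split_per_morph(images_by_morph)
for morph in images_by_morph:
    print(morph, len(train[morph]), len(held_out[morph]))
```

Keeping the held-out images completely separate from the training images is what makes the “practice” accuracy meaningful.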

Sorry… I can stop doing that and try to find images elsewhere.

erie-herps · October 29, 2020, 6:04pm

What do you mean by scraping in this context? Usually scraping is done with bots that extract the HTML and data from the main database along with the pictures. That pulls in users’ personal information and would allow the scraper to copy the website under a similar domain, make it look like the original, and edit the whole site for data harvesting. If that is what you meant, then I don’t see what harm @chesterhf could do by using lots of images for the software. If you meant scraping in some other sense, then I don’t know whether that would be a violation, but under this definition I think it would be okay.

chesterhf · October 29, 2020, 6:10pm

I’m not doing all of that, just literally going ad by ad, downloading images and labeling them by morph, so no information is downloaded or attached to them. Since it didn’t fit the definition of scraping, I figured it would be fine to use the images for training purposes.

erie-herps · October 29, 2020, 6:13pm

So if that definition matches the intent of the rule, then you should be fine, since you’re extracting only pictures, not information, and definitely with no ill intent.

bz_exoticz · October 29, 2020, 7:46pm

This is a really cool project. I have zero experience with machine learning but have often wondered if it could be applied.

Random thoughts from someone that knows nothing.

It would be interesting if you could tier the morphs in a sense for the program. If there was a way to designate W/T Normal as the base, then incomplete dominant morphs, both heterozygous and homozygous, then hets and visual recessives, that would allow the computer to determine over time how they affect the look of a snake. In other words, building a Punnett square and genetic theory into the program. Over time, it could be combined with a genetic calculator type model and predict changes.
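As a rough illustration of the Punnett square part, here is a sketch for a single gene. The allele symbols (“M” for Mojave, “n” for normal) and the function name `punnett` are assumptions made up for this example; nothing like this exists in the project so far.

```python
from collections import Counter
from itertools import product

def punnett(parent_a, parent_b):
    """Return offspring genotype probabilities for one gene.

    Each parent is a 2-tuple of alleles, e.g. ("M", "n").
    """
    # Every pairing of one allele from each parent is equally likely.
    counts = Counter(
        tuple(sorted(pair)) for pair in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}

# Hypothetical symbols: "M" = Mojave allele, "n" = normal allele.
mojave = ("M", "n")
normal = ("n", "n")

print(punnett(mojave, normal))
print(punnett(mojave, mojave))
```

A genetic calculator does essentially this for each gene and multiplies the results together; tying those odds to what the image model sees would be the harder step.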

eaglereptiles · November 24, 2020, 7:21pm

A month later…

How’s this going @chesterhf ?

chesterhf · November 24, 2020, 8:18pm

I haven’t done too much with it recently. I think the last morphs I added were Pinstripe and Piebald, and then I ran into the immediate problem that even though a “high white”, “medium white” and “low white” piebald all have the same morph label, they look pretty different. It really confused the program and threw things off a bit, and I wasn’t sure how to proceed with morphs that have super variable phenotypes (like Pied) or morphs that change color/pattern as they age (like Banana). Also, as it got more complex I wasn’t sure how much computing power would be required to continue, because my laptop only has 16GB of RAM and I really don’t want to bog it down too much because I need it for work.
Then I realized I had spent entirely too much time focusing on it and really needed to finish my thesis so I can graduate.
So it has a lot of potential and is surprisingly accurate, but I’ve put it to the side for now. If anyone wants to continue with it or collaborate on something, I’m happy to export the project, because it would make a really cool app or addition to MorphMarket.