Hey there! 👋
I've been contemplating building a solution leveraging the stable-diffusion model. I've come across several intriguing serverless GPU providers like replicate.com, banana.dev, and the like.
A friend pointed out, though, that these might run a bit slow. Plus, I made an interesting discovery while comparing costs. If you take Replicate's pricing for an Nvidia T4 GPU ($0.033 per minute), AWS actually comes out almost 4x cheaper for an on-demand Nvidia T4 GPU instance ($0.0087 per minute), and even 10x cheaper when you factor in EC2 spot instances.
So I'm thinking, wouldn't it make more sense to just use my own AWS GPU?
Do you know if there's a tool or something that might make this process easier?
Any tips or advice you can share? 🤔
06/11/2023, 11:43 AM
You're missing the serverless part: you only pay while the GPU is in use, so it's really cheap on standby. If you need it for training, then spot instances on EC2 might make sense, but in production with infrequent usage it's better to start with serverless GPU until it gains some steam.
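To put rough numbers on that, here's a minimal sketch using only the two per-minute prices quoted above. It assumes a 30-day month and an on-demand instance left running 24/7, and it ignores cold starts, storage, and data transfer. The function names are just for illustration.

```python
# Prices quoted in this thread, in USD per minute of T4 GPU time.
SERVERLESS_PER_MIN = 0.033   # Replicate T4 (billed only while running)
ON_DEMAND_PER_MIN = 0.0087   # AWS on-demand T4 instance (billed while up)

MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: 30-day month


def monthly_cost_serverless(busy_minutes: float) -> float:
    """Cost when you pay only for minutes the GPU is actually busy."""
    return busy_minutes * SERVERLESS_PER_MIN


def monthly_cost_always_on() -> float:
    """Cost of keeping one on-demand instance up for the whole month."""
    return MINUTES_PER_MONTH * ON_DEMAND_PER_MIN


def break_even_utilization() -> float:
    """Fraction of the month the GPU must be busy before always-on wins."""
    return ON_DEMAND_PER_MIN / SERVERLESS_PER_MIN


if __name__ == "__main__":
    util = break_even_utilization()
    print(f"always-on per month: ${monthly_cost_always_on():.2f}")
    print(f"break-even utilization: {util:.1%}")
    print(f"break-even busy hours: {util * MINUTES_PER_MONTH / 60:.0f}")
```

With these prices the always-on instance costs about $376 a month, and serverless stays cheaper until the GPU is busy roughly 26% of the time (about 190 hours a month).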
06/11/2023, 11:47 AM
Yes, @Amal David, I agree. For production with serverless GPU, which solution would you choose? Platforms like replicate.com or banana.dev work if you want to use external resources, but what if you want to use your AWS credits, for example? Do you know of a tool that provides serverless GPU in-house?
06/11/2023, 11:52 AM
AFAIK, AWS doesn't have serverless GPU. If you have credits to burn within a short window then go ahead with AWS itself.
There were 2 reasons why these platforms saw quick adoption:
1. Serverless GPU means lower cost during the dev cycle and in the early stages
2. Getting compute quota in time from AWS/GCP: sometimes it takes days to get even 4 extra vCPUs in AWS, which might be the need of the hour
Also check out baseten, it's a little cheaper there as well
06/11/2023, 12:02 PM
Good point about quota. I was missing it.
From what I've checked, running preemptible GPU instances in your own infra seems to always be cheaper: https://fullstackdeeplearning.com/cloud-gpus/
So I was thinking about building something like https://github.com/ebhy/budgetml (GCP-based) for AWS using EC2 spot instances.
But maybe it's premature optimization and I should go with a hosted option (replicate, banana, baseten)... Still, I wonder why these companies don't offer the ability to deploy their service in the customer's infra. The service can apparently be slow sometimes (according to a friend of mine who uses it), and it would let people use their own cloud credits. I personally have $1000 in AWS credits from the AWS Activate program.
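For what it's worth, the EC2 spot idea might start from something like this rough sketch. It assumes boto3 and configured AWS credentials; `spot_launch_params` and `launch` are hypothetical helper names, the AMI ID is a placeholder you'd have to supply, and the region default is arbitrary. `g4dn.xlarge` is the EC2 instance type that carries a single T4.

```python
def spot_launch_params(ami_id, instance_type="g4dn.xlarge", max_price_per_hour=None):
    """Build keyword arguments for EC2 run_instances that request a spot instance."""
    spot_options = {
        "SpotInstanceType": "one-time",                # do not re-request after interruption
        "InstanceInterruptionBehavior": "terminate",   # instance is terminated when reclaimed
    }
    if max_price_per_hour is not None:
        # MaxPrice is a string in USD per hour; if omitted, AWS caps at the on-demand price.
        spot_options["MaxPrice"] = max_price_per_hour
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "InstanceMarketOptions": {
            "MarketType": "spot",
            "SpotOptions": spot_options,
        },
    }


def launch(params, region="us-east-1"):
    """Actually launch the instance. Needs boto3, credentials, and GPU quota."""
    import boto3  # imported lazily so the params helper works without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**params)
    return response["Instances"][0]["InstanceId"]


if __name__ == "__main__":
    # "ami-PLACEHOLDER" is not a real image; replace it before calling launch().
    print(spot_launch_params("ami-PLACEHOLDER", max_price_per_hour="0.20"))
```

This only covers requesting the instance. Handling spot interruptions, shutting down when idle, and serving the model are the parts a budgetml-style tool would still have to solve.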
06/11/2023, 4:12 PM
Spot (AWS) and preemptible (GCP) instances are very similar.
It's highly unlikely that any of these providers would offer deployment in your own infra, as that takes away their USP/monetization. They basically do multi-cloud provisioning with committed usage and load balancing to optimize cost and availability across multiple vendors, which helps them negotiate better deals with the cloud providers. Think economies of scale helping them make more $$.
But yes, anything open source might help startups save a few $$ in the short run