RunPod Blog (Page 3)

Run DeepSeek R1 On Just 480GB of VRAM

Even with the new closed-model successes of Grok and Sonnet 3.7, DeepSeek R1 is still considered a heavyweight in the LLM arena as a whole, and remains the uncontested open-source LLM champion (at least until DeepSeek R2 launches, anyway). We've written before about the concerns of

GitHub Integration Now In GA - Build Images from GitHub Repos Even Faster

RunPod is pleased to announce that our GitHub integration is officially out of beta and ready for production use! This feature enables you to iterate on your work more quickly by building packages to deploy on RunPod serverless directly from a GitHub repo, removing all of the friction involved in creating

Introduction to Websocket Streaming with RunPod Serverless

In this follow-up to our 'Hello World' tutorial, we'll create a serverless endpoint that processes base64-encoded files and streams back the results. This will demonstrate how you can work with file input/output in our serverless environment by encoding the file as data within a JSON

Founder Series #1: Origin Story

Let's get more personal and establish a baseline. For everyone that's been enjoying RunPod, thank you for spreading the word. I am Pardeep Singh (aka flash-singh), CTO at RunPod and one of the co-founders along with Zhen Lu. What triggered me to share? Of all things

How To Run a "Hello World" in RunPod Serverless

If you're new to serverless computing and Docker, this guide will walk you through creating your first RunPod serverless endpoint from scratch. We'll build a simple "Hello World" application that demonstrates the basic concepts of serverless deployment on RunPod's platform. You'

Mistral Small 3 Eschews Synthetic Data - What Does This Mean?

Mistral AI has released Mistral Small 3 and says it does not use synthetic data in its training pipeline. Weighing in at 24B parameters with 32k context, this is a lightweight model that can be run at full weights on an A40, and on nearly any GPU spec that we offer when quantized.

The Complete Guide to GPU Requirements for LLM Fine-tuning

When deciding on a GPU spec to train or fine-tune a model, you're likely going to need to hold onto the pod for hours or even days for your training run. Even a difference of a few cents per hour easily adds up, especially if you have a

DeepSeek R1 - What's the Hype?

DeepSeek R1 is a recently released model that has been topping benchmarks in several key areas. Here are some of the leaderboards it has shot to the top of:

* LiveBench: Second place, with only GPT4 o1-2024-12-17 surpassing it as of this writing.
* Aider: Ditto.
* Artificial Analysis: Fifth place,

5090s Are Almost Here: How Do They Shape Up Against the 4090?

Another year has come, and another new card generation from NVIDIA is on the way. 5090s are due to become widely available this January, and RunPod is eager to support them once they do. Along with the new Blackwell architecture, the 5090 is set to launch

How Do I Transfer Data Into My Pod?

We've gotten a number of questions recently about how to transfer data into pods, and while we do have some transfer methods in our docs, we'd like to go into a bit more detail on how to upload files into your pod. In general, for large

What's New for Serverless LLM Usage in RunPod in 2025?

Of all the use cases for our serverless architecture, LLMs are one of the best fits. Because so much of LLM use depends on the human user taking time to process, digest, and type a response, you save so much on GPU spend by ensuring

H200 Tensor Core GPUs Now Available on RunPod

We're pleased to announce that the H200 is now available on RunPod at a price point of $3.99/hr in Secure Cloud. This GPU spec boosts the available VRAM for NVIDIA-based applications up to 141GB in a single unit, along with increased memory bandwidth. Here is how the

RunPod Sponsors CivitAI's Project Odyssey 2024 Competition

RunPod is proud to sponsor Season 2 of Project Odyssey 2024 from CivitAI, the world's largest AI filmmaking competition. We've written in the past about prominent open source packages like LTX, Mochi, and Hunyuan Video – here's your chance to show off your skills and

Train Your Own Video LoRAs with diffusion-pipe

You can now train your own LoRAs for Flux, Hunyuan Video, and LTX Video with tdrussell's diffusion-pipe, a training script for video diffusion models. Let's run through an example of how this is done with Hunyuan Video.

Start Up a Pod

First, start up a pod with

Serverless for Artificial Intelligence and Machine Learning Workloads

The need to scale up, reduce operational overhead, and control costs is why serverless computing is revolutionizing AI/ML workloads. With traditional infrastructure, scaling often brings costly capacity management and hardware maintenance that become unsustainable. RunPod dynamically allocates resources in these instances to work seamlessly with modern AI workflows. This

A Leap into the Unknown: Why I Joined RunPod

This entry has been contributed by Jean-Michael Desrosiers, Head of Enterprise at RunPod. I take shots—sometimes far too many, and in wildly different directions. I always have, and it’s been a part of my DNA for as long as I can remember. Picture an overly enthusiastic explorer darting

Deploy Repos Straight to RunPod with GitHub Integration

RunPod is pleased to announce its latest feature aimed at making the lives of developers easier: GitHub integration! Previously, Docker images were the primary method of deploying endpoints, and while this is still functional and useful, it requires a number of intermediary steps. Now, with GitHub integration, you can deploy directly

Lightricks LTXVideo: Sleeper Hit Open Source Video Generation

With new packages like Mochi and Hunyuan Video now out, some other video packages have slipped under the radar and definitely deserve more love. LTXVideo by Lightricks appears to be slept on despite coming out with an out of the

Building an OCR System Using RunPod Serverless

Learn how to build an Optical Character Recognition (OCR) system using RunPod Serverless and pre-trained models from Hugging Face to automate the processing of receipts and invoices.

Introduction

Processing receipts and invoices manually is both time-consuming and prone to errors. Optical Character Recognition (OCR) systems can automate this task by

Community Spotlight: How AnonAI Scales Its Chatbot Agents Through RunPod

RunPod is pleased to share the story of one of our valued clients, Autonomous. We at RunPod believe very strongly in the power of free speech and privacy - our pods are run in secure environments with optional encryption and we stand by our promise that we do not inspect

Announcing Global Networking For Cross-Data Center Communication

RunPod is pleased to announce the launch of our Global Networking feature, which allows for cross-data center communication between pods. When a pod with the feature is deployed, your pods can communicate with each other over a virtual internal network facilitated by RunPod. This means that you can have pods

How Much Can a GPU Cloud Save You, Really?

Machine learning, AI, and data science workloads rely on powerful GPUs to run effectively, so organizations must decide whether to invest in on-prem GPU clusters or use cloud-based GPU solutions like RunPod. This article walks through infrastructure requirements and compares cost and performance to help you choose

Scoped API Keys Now Available on RunPod

We've released an expansion to our handling of API keys on RunPod. Previously, you were able to create API keys with read-only or read-and-write permissions, but now you can scope keys by endpoint and have more fine-grained control over what your keys allow access to. Here