RunPod Blog (Page 7)

How To Install SillyTavern in a RunPod Instance

Text Generation

While some might prefer the simple text-based entry of an oobabooga interface, others might want something a little more robust. SillyTavern offers a number of additional features above and beyond most methods of interfacing with an LLM, and there's been some demand for getting it set up and

16k Context LLM Models Now Available On RunPod

Text Generation

Hot on the heels of the 8192-token context SuperHOT model line, Panchovix has now released another set of models with an even higher context window, matching the 16384-token context possible in the latest version of text-generation-webui (Oobabooga). Such a large context window is going to vastly improve performance in

RunPod Partners With Defined.ai To Democratize and Accelerate AI Development

Runpod Platform

RunPod is excited to announce its partnership with Defined.ai, the world's largest marketplace of ethically sourced training datasets for AI models. This collaboration seeks to provide AI developers working with text-to-speech, speech-to-text models, and those fine-tuning LLMs the opportunity to access enterprise-grade conversational speech and text datasets.

A Deep Dive Into Creating an Effective TavernAI Character

Roleplay has become a surprisingly popular use of AI over the past year, with entire services popping up devoted specifically to interacting with characters, fictional or not. It's begun to breach the realm of academic writing (such as this paper on ArXiv on LLM RP from EleutherAI from

How To Use Very Large Language Models with RunPod - 65b (and higher) models

Text Generation

Many LLMs (such as the classic Pygmalion 6b) are small enough that they can fit easily in almost any RunPod GPU offering. Others, such as Guanaco 65B GPTQ, are quantized (a compression method that reduces memory usage), meaning that you will be able to fit the model into

SuperHot 8k Token Context Models Are Here For Text Generation

Text Generation

Esteemed contributor TheBloke has done it again, and textgen enjoyers everywhere now have another avenue to further increase their AI storytelling partner's retention of what is occurring during a scene. Available on his GitHub are quantizations of several well known models, including but not limited to the following:

Worker | Local API Server Introduced with runpod-python 0.10.0

Up to this point, developing a serverless worker has required test inputs to be passed in through a test_input.json file or, alternatively, passed in with the --test_input argument. While this method works fine, it doesn't fully replicate the interactive nature of an API server. Today,

VS Code Server | Local-Quality Development Experience

Experience a new level of development with Visual Studio Code (VS Code) and RunPod. This guide will walk you through using the VS Code Server template on RunPod, enabling you to leverage GPU instances for your development needs. By the end of this tutorial, you will be able to interact

Savings Plans Are Here For Secure Cloud Pods- How To Purchase a Monthly Plan And Save Big

Runpod Platform

If you are a frequent RunPod customer, you may be interested in our latest feature, Savings Plans. These plans allow you to make an up-front payment on a specific pod type, saving you more money the more you use your pod. For individuals with high workloads that anticipate

Deploying Python Machine Learning models on RunPod, without any docker-stress

What if I told you that you can now deploy pure Python machine learning models with zero stress on RunPod? Please excuse that this is a bit of a hacky workflow at the moment. We'll be providing better abstractions in the future! Prerequisites and Notes * The tutorial only works for containers

Runpod Platform

RunPod is Proud to Sponsor the StockDory Chess Engine

Chess engines are a powerful tool for players of all levels. They provide an accurate and detailed analysis of the board position, allowing players to rapidly analyze specific positions or entire games. Furthermore, chess engines can be used to test various strategies on different openings or endgames, helping players to

Introducing FlashBoot: 1-Second Serverless Cold-Start

RunPod's serverless journey started just a few months ago, yet we've come a long way. In pursuit of lower costs, greater efficiency, and better performance, we are finally making FlashBoot available for all endpoints at no additional cost! 🎉 What is FlashBoot? We have been tinkering

A1111 Serverless API, Step-by-Step Video Tutorial

This blog post features a video tutorial from generativelabs.co that provides step-by-step instructions on how to use the Stable Diffusion A1111 API with RunPod Serverless. Whether you're a beginner or an experienced user, the RunPod & Stable Diffusion Serverless video tutorial offers useful information. Beginners will find

KoboldAI - The Other Roleplay Front End, And Why You May Want to Use It

Text Generation

As many blog entries in the past have been written on Oobabooga/text-generation-webui, we would be remiss if we failed to mention that there is another much-loved frontend available for use on RunPod that may be of significant value to anyone interested in writing or roleplaying with an AI. KoboldAI comes

Breaking Out Of The 2048 Token Context Limit in Oobabooga

Text Generation

Since its inception, Oobabooga has had a hard upper context limit of 2048 tokens. Since this buffer includes everything in the Chat Settings panel, including context, greeting, and any additional recent entries in the log, this can very quickly fill up to the

Groundbreaking H100 NVidia GPUs Now Available On RunPod

The demand for generative AI models continues to explode, as does the need for hardware capable of meeting their ever-escalating performance requirements. While consumer-grade GPUs typically used for gaming are great for learning, tinkering, or hobbyist pursuits, as noted in our previous blog post, the computational demands can quickly outstrip the

Introducing our Faster-Whisper Serverless Endpoint

You read the title! Whisper just got faster with RunPod's new Faster-Whisper serverless endpoint. What is Whisper? For those who haven't used it before, Whisper is an AI speech recognition model trained on hundreds of thousands of hours of multilingual human speech. It's great

How To Remix Your Artwork with ControlNet And Stable Diffusion

While you're undoubtedly familiar with the functions in Stable Diffusion that let you derive images using other images as input (img2img), you may find there's a certain level of being at the mercy of the model and simply constantly reiterating until you get something

Creating a Vlad Diffusion Template for RunPod

The default Pod templates and models are pretty cool (if we say so ourselves), but play with them for too long and you'll start to get used to them. If you're looking for something new and exciting again, it might be time to create a new

How to Work With Long Term Memory In Oobabooga and Text Generation

Text Generation

As fun as text generation is, there is regrettably a major limitation in that currently Oobabooga can only comprehend 2048 tokens' worth of context, due to the rapidly growing amount of compute required for each additional token considered. This context consists of everything provided on the Character tab, along with as

How to Create Convincing Human Voices With Bark AI

The Bark AI model is an innovative technology that can be used to generate realistic human voices. This technology takes advantage of the latest advancements in natural language processing, deep learning, and voice synthesis. By combining these technologies, it is able to accurately replicate the nuances and inflections of real

Run Hugging Face spaces on RunPod!

Hugging Face Spaces are interactive demos that showcase AI models directly on the Hugging Face platform. They're great for experimenting with AI capabilities, but what if you want more computing power or need to run these models in your own environment? Or you want to use them as

Reduce Your Serverless Automatic1111 Start Time

I've found that many users are using the Automatic1111 stable diffusion repo not only as a GUI interface, but as an API layer. If you're trying to scale a service on top of A1111, shaving off a few seconds from your start time can be really

Pygmalion-7b from PygmalionAI has been released, and it's amazing

Text Generation

Last month, the latest iteration of the Pygmalion model was released. Although at 7b it is not much larger than the commonly used 6b version, what it does with that parameter space has improved by leaps and bounds, especially with

Kohya LoRA on RunPod

Our good friend SECourses has made some amazing videos showcasing how to run various generative art projects on RunPod. His latest video, titled "Kohya LoRA on RunPod", is a great introduction to the powerful technique of LoRA (Low-Rank Adaptation). Here's