LLaVA 13B

Visual instruction tuning towards large language and vision models with GPT-4 level capabilities

Inputs
Prompt
Prompt to use for text generation
Image
Input image
Max Tokens
Maximum number of tokens to generate. A word may be split into more than one token, so the token count is usually higher than the word count
Temperature
Adjusts the randomness of outputs: values greater than 1 are more random, and 0 is deterministic
Top P
When decoding text, samples only from the most likely tokens whose cumulative probability reaches p; lower this value to ignore less likely tokens
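
To make the Temperature and Top P inputs more concrete, here is a minimal sketch of how these two settings are commonly applied when picking the next token. It is an illustration only, not the model's actual decoding code. The function names (`apply_temperature`, `top_p_filter`, `sample_token`) and the toy scores are hypothetical.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Convert raw token scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        # Assumption: 0 means greedy decoding, so all probability
        # goes to the single highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the kept probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=1.0, top_p=1.0, rng=random):
    """Sample one token index using temperature and top-p filtering."""
    probs = apply_temperature(logits, temperature)
    candidates = top_p_filter(probs, top_p)
    ids = list(candidates)
    weights = [candidates[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]
```

With a temperature of 0 the sketch always returns the highest-scoring token, and lowering `top_p` shrinks the set of tokens that can be sampled at all.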
Outputs
Collection
The output strings
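
Since the output is a collection of strings rather than a single string, a caller will usually want to combine them. The sketch below assumes the strings are consecutive pieces of the generated text, returned in order. The page does not state this, and the sample chunks are made up for illustration.

```python
from typing import Iterable


def join_output(chunks: Iterable[str]) -> str:
    """Concatenate output strings into one piece of text.

    Assumption: the chunks arrive in generation order and already
    contain any spaces that belong between them.
    """
    return "".join(chunks)


# Hypothetical output collection, for illustration only.
example_chunks = ["The image ", "shows a ", "cat."]
text = join_output(example_chunks)
```

Because `join_output` accepts any iterable, the same call works for a list or for an iterator that yields strings as they are produced.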