Replicate
STABLE

Gemma 2B Instruct

The 2B-parameter, instruction-tuned version of Google’s Gemma model.

Inputs
Max New Tokens
Maximum number of tokens to generate. Tokens are word pieces; English text typically averages a little more than one token per word.
Min New Tokens
Minimum number of tokens to generate. To disable, set to -1. Tokens are word pieces; English text typically averages a little more than one token per word.
Output Index
If the output is a list, the index of the element to use.
Prompt
Prompt to send to the model.
Repetition Penalty
Controls how repetitive the generated text can be by penalizing tokens that have already appeared. Lower values allow more repetition, higher values allow less. Set to 1.0 to disable.
Temperature
Adjusts the randomness of outputs. Values greater than 1 are increasingly random, and 0 is deterministic. 0.75 is a good starting value.
Top K
When decoding text, samples from the top k most likely tokens; lower to ignore less likely tokens
Top P
When decoding text, samples from the smallest set of most likely tokens whose cumulative probability reaches p; lower to ignore less likely tokens.
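To make the sampling inputs above more concrete, here is a minimal Python sketch of how Repetition Penalty, Temperature, Top K, and Top P are commonly applied to a model's next-token scores (logits). It is an illustration on a toy list of numbers, not this model's actual implementation. The function names, the default values for top_k and top_p, and the order of the steps are assumptions; real decoders differ in ordering and in how they renormalize between steps.

```python
import math
import random


def apply_repetition_penalty(logits, generated_ids, penalty=1.0):
    """Make already-generated tokens less likely (penalty > 1.0) or leave them alone (1.0)."""
    out = list(logits)
    for i in set(generated_ids):
        # Common convention: shrink positive scores, push negative scores further down.
        out[i] = out[i] / penalty if out[i] > 0 else out[i] * penalty
    return out


def filter_probs(logits, temperature=0.75, top_k=50, top_p=0.9):
    """Return {token_id: probability} after temperature, top-k, and top-p filtering."""
    if temperature <= 0:
        # Temperature 0 is treated as greedy decoding: all mass on the best token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return {best: 1.0}

    # Temperature: divide the logits before the softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Rank token ids from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top K: keep only the k most likely tokens (0 or less disables it in this sketch).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top P: keep the smallest prefix whose cumulative probability reaches p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample(dist, rng=random):
    """Draw one token id from the filtered distribution."""
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids])[0]
```

For example, `sample(filter_probs([2.0, 1.0, 0.0, -1.0], temperature=1.0, top_k=2, top_p=1.0))` can only return token 0 or token 1, because Top K discards the two least likely tokens before sampling.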
Outputs
Value
The output string
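As a rough sketch of how the inputs and output described here might be handled in client code, the Python below assembles an input dictionary and picks one element from a list-shaped output. The snake_case key names, the default values for max_new_tokens, top_k, and top_p, and both helper functions are hypothetical; check the hosting platform's schema for the real field names before relying on them.

```python
def build_inputs(prompt, **overrides):
    """Assemble an input dictionary; key names are assumed, not taken from an official schema."""
    inputs = {
        "prompt": prompt,
        "max_new_tokens": 128,       # illustrative value
        "min_new_tokens": -1,        # -1 disables the minimum
        "temperature": 0.75,         # suggested starting value
        "top_k": 50,                 # illustrative value
        "top_p": 0.9,                # illustrative value
        "repetition_penalty": 1.0,   # 1.0 disables the penalty
    }
    unknown = set(overrides) - set(inputs)
    if unknown:
        raise ValueError(f"unknown input(s): {sorted(unknown)}")
    inputs.update(overrides)
    return inputs


def pick_output(output, output_index=0):
    """If the output is a list, return the element at output_index; otherwise return it as-is."""
    if isinstance(output, list):
        return output[output_index]
    return output
```

With these helpers, `build_inputs("Write a haiku", temperature=0)` requests deterministic decoding, and `pick_output(["first", "second"], 1)` returns `"second"`.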