OpenAI intros two open-weight language models that can run on consumer GPUs, optimized for devices with just 16 GB of memory


OpenAI has released two new open-weight language models optimized for consumer GPUs. In a blog post, OpenAI announced GPT-OSS-120B and GPT-OSS-20B: the first is designed to run on a single 80 GB GPU, while the second is optimized to run on edge devices with only 16 GB of memory.
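
For readers who want to experiment, below is a minimal sketch of loading the smaller model with the Hugging Face Transformers library. The "openai/gpt-oss-20b" identifier and the memory-related settings are assumptions on our part, not details from the announcement.

    # A rough sketch, assuming the checkpoint is published on Hugging Face
    # under an id such as "openai/gpt-oss-20b" (our assumption) and that
    # roughly 16 GB of GPU memory is available.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "openai/gpt-oss-20b"  # assumed identifier
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # keep the checkpoint's native precision
        device_map="auto",    # place layers on GPU/CPU as memory allows
    )

    prompt = "Summarize what a mixture-of-experts model is."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(output[0], skip_special_tokens=True))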

Both are transformer models that use a mixture-of-experts (MoE) architecture, an approach popularized by DeepSeek R1. Despite the focus of the GPT-OSS-120B and 20B designs on consumer GPUs, both support context lengths of up to 131,072 tokens, the longest available for local inference. GPT-OSS-120B activates 5.1 billion parameters per token, while GPT-OSS-20B activates 3.6 billion parameters per token. Both models use alternating dense and locally banded sparse attention patterns and grouped multi-query attention with a group size of 8.
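
To illustrate what grouped multi-query attention with a group size of 8 means in practice, here is a minimal PyTorch sketch: every group of 8 query heads shares a single key/value head, which shrinks the key/value cache during inference. The head counts and dimensions below are purely illustrative and are not taken from the released models.

    # Minimal grouped-query attention sketch: 8 query heads per key/value head.
    # Dimensions are illustrative only, not the GPT-OSS configuration.
    import torch
    import torch.nn.functional as F

    def grouped_query_attention(x, wq, wk, wv, n_q_heads, group_size=8):
        batch, seq, d_model = x.shape
        head_dim = d_model // n_q_heads
        n_kv_heads = n_q_heads // group_size  # far fewer K/V heads than Q heads

        q = (x @ wq).view(batch, seq, n_q_heads, head_dim).transpose(1, 2)
        k = (x @ wk).view(batch, seq, n_kv_heads, head_dim).transpose(1, 2)
        v = (x @ wv).view(batch, seq, n_kv_heads, head_dim).transpose(1, 2)

        # Each key/value head serves `group_size` query heads.
        k = k.repeat_interleave(group_size, dim=1)
        v = v.repeat_interleave(group_size, dim=1)

        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return out.transpose(1, 2).reshape(batch, seq, d_model)

    # Toy usage: 64 query heads share 8 key/value heads.
    d_model, n_q = 1024, 64
    x = torch.randn(2, 16, d_model)
    wq = torch.randn(d_model, d_model) * 0.02
    wk = torch.randn(d_model, d_model // 8) * 0.02
    wv = torch.randn(d_model, d_model // 8) * 0.02
    y = grouped_query_attention(x, wq, wk, wv, n_q_heads=n_q)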
