flash-attention-with-sink implements an attention variant used in GPT-OSS 20B that integrates a "sink" step into FlashAttention. This repo focuses on the forward path and provides an experimental ...
This server operates in READ-ONLY mode for safety. It can read and analyze memory but cannot modify it. All operations are logged for security auditing.
Some results have been hidden because they may be inaccessible to you
Show inaccessible results