We include an inefficient reference PyTorch implementation in gpt_oss/torch/model.py. This code uses basic PyTorch operators to show the exact model architecture, with one small addition: tensor parallelism in the MoE layers, so that the larger model can run with this code.
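
To illustrate the idea of tensor parallelism in an MoE layer, here is a minimal sketch. It is not the code in gpt_oss/torch/model.py. The function names, the tensor shapes, the plain SiLU activation, and the absence of biases are all assumptions made for illustration. Each simulated rank holds a slice of every expert's intermediate dimension, computes a partial output, and the partial outputs are summed. In a real multi-GPU run that sum would be a `torch.distributed.all_reduce`; here it is simulated in a single process.

```python
import torch


def moe_mlp_reference(x, w1, w2, expert_idx, expert_weight):
    """Unsharded top-k MoE MLP written with basic operators (illustrative only).

    x:             (tokens, hidden)
    w1:            (experts, hidden, inter)
    w2:            (experts, inter, hidden)
    expert_idx:    (tokens, k) integer indices of the selected experts
    expert_weight: (tokens, k) routing weights for those experts
    """
    # Gather the selected experts' weights per token, then project up.
    h = torch.einsum("th,tkhi->tki", x, w1[expert_idx])
    # Assumption: a plain elementwise SiLU stands in for the real activation.
    h = torch.nn.functional.silu(h)
    # Project back down to the hidden size.
    y = torch.einsum("tki,tkih->tkh", h, w2[expert_idx])
    # Combine the k expert outputs with the routing weights.
    return torch.einsum("tkh,tk->th", y, expert_weight)


def moe_mlp_tensor_parallel(x, w1, w2, expert_idx, expert_weight, world_size):
    """Hypothetical helper: shard the intermediate dimension across ranks."""
    inter = w1.shape[-1]
    assert inter % world_size == 0, "intermediate size must divide evenly"
    shard = inter // world_size
    partials = []
    for rank in range(world_size):
        cols = slice(rank * shard, (rank + 1) * shard)
        # Each rank sees only its columns of w1 and the matching rows of w2.
        partials.append(
            moe_mlp_reference(x, w1[:, :, cols], w2[:, cols, :],
                              expert_idx, expert_weight)
        )
    # Summing the partial outputs stands in for an all-reduce across ranks.
    return torch.stack(partials).sum(dim=0)


torch.manual_seed(0)
tokens, hidden, inter, experts, k = 5, 8, 16, 4, 2
x = torch.randn(tokens, hidden, dtype=torch.float64)
w1 = torch.randn(experts, hidden, inter, dtype=torch.float64)
w2 = torch.randn(experts, inter, hidden, dtype=torch.float64)

# Toy router: pick the top-k experts per token and normalize their scores.
top = torch.topk(torch.randn(tokens, experts, dtype=torch.float64), k, dim=-1)
expert_weight = torch.softmax(top.values, dim=-1)

full = moe_mlp_reference(x, w1, w2, top.indices, expert_weight)
sharded = moe_mlp_tensor_parallel(x, w1, w2, top.indices, expert_weight,
                                  world_size=4)
print(torch.allclose(full, sharded))
```

Because the activation is elementwise, splitting the intermediate dimension leaves the result unchanged up to floating-point rounding, while each rank stores only a fraction of every expert's weights. That is what lets a model too large for one device run across several.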