The dtype contract
A Neat model route has two contracts:
- the public contract your app sees through Tensor, Sample, InputOptions, model specs, and graph endpoints;
- the model route contract Neat resolves from the compiled model archive and the selected preprocess/postprocess path.
Do not assume every public boundary is FP32. Some boundaries carry images, encoded media, packed detection payloads, INT8 tensors, BF16 tensors, or application-defined tensor semantics. Inspect the specs first; the specs are the contract.
Inside the route, Neat inserts quantization, tessellation, cast, detessellation, dequantization, and postprocess stages when the compiled model contract requires them.
The four MLA input cases
A model archive tells Neat two important things about the first MLA stage:
- the MLA input dtype, usually BF16 or INT8;
- whether MLA-side tessellation is already part of the compiled kernel.
That gives four preprocess graph families:
| MLA dtype | MLA tess | Preprocess graph family | What Neat inserts before the MLA |
|---|---|---|---|
| BF16 | yes | Preproc | Resize, color convert, normalize. The MLA stage tessellates internally. |
| BF16 | no | Tess | Resize, color convert, normalize, tessellate. |
| INT8 | yes | Quant | Resize, color convert, normalize, quantize. The MLA stage tessellates internally. |
| INT8 | no | QuantTess | Resize, color convert, normalize, quantize, tessellate. |
Inspect ResolvedPreprocessPlan to see what the planner chose.
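The four-case table can be read as a small lookup. The sketch below is illustrative only and is not the Neat planner: `choose_family`, `PreprocessChoice`, and the stage strings are hypothetical names, and only the two dtypes listed in the table are handled.

```python
from typing import NamedTuple


class PreprocessChoice(NamedTuple):
    """Hypothetical result type: family name plus the stages inserted before the MLA."""
    family: str
    stages: tuple


# Stages every family shares, per the table.
_COMMON = ("resize", "color_convert", "normalize")

# (MLA dtype, MLA tessellates internally) -> family name, copied from the table.
_FAMILIES = {
    ("BF16", True): "Preproc",
    ("BF16", False): "Tess",
    ("INT8", True): "Quant",
    ("INT8", False): "QuantTess",
}


def choose_family(mla_dtype: str, mla_tessellates: bool) -> PreprocessChoice:
    """Map the two archive facts to a preprocess family and its stage list."""
    key = (mla_dtype, mla_tessellates)
    if key not in _FAMILIES:
        # This sketch only covers the dtypes named in the table.
        raise ValueError(f"unsupported MLA input dtype: {mla_dtype!r}")
    stages = list(_COMMON)
    if mla_dtype == "INT8":
        stages.append("quantize")    # INT8 inputs need a quantize stage
    if not mla_tessellates:
        stages.append("tessellate")  # the kernel will not do it, so preprocess must
    return PreprocessChoice(_FAMILIES[key], tuple(stages))


choice = choose_family("INT8", False)
print(choice.family, choice.stages)
```

In a real app the authoritative answer comes from the resolved plan, not from a table copied into app code; the sketch only makes the two-bit decision explicit.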
What tessellation means
Tessellation arranges tensor bytes into the tile geometry the MLA input scratchpad expects. It is a layout transformation: same logical tensor, different memory order.
The matching detessellation happens after MLA output when the route needs to return a natural tensor layout to the next stage or to the app.
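To make "same logical tensor, different memory order" concrete, here is a toy sketch in plain Python. The 2x2 tile size and the row-major order within a tile are assumptions chosen for illustration; the real MLA tile geometry comes from the compiled model and is not described here. `tessellate` and `detessellate` are hypothetical helper names, not Neat APIs.

```python
def tessellate(rows, tile_h, tile_w):
    """Flatten a 2D grid tile by tile, row-major within each tile."""
    height, width = len(rows), len(rows[0])
    if height % tile_h or width % tile_w:
        raise ValueError("grid must divide evenly into tiles in this sketch")
    out = []
    for ty in range(0, height, tile_h):          # walk tiles top to bottom
        for tx in range(0, width, tile_w):       # then left to right
            for y in range(ty, ty + tile_h):     # copy one tile row at a time
                out.extend(rows[y][tx:tx + tile_w])
    return out


def detessellate(flat, height, width, tile_h, tile_w):
    """Invert tessellate(): rebuild the natural row-major 2D grid."""
    rows = [[None] * width for _ in range(height)]
    i = 0
    for ty in range(0, height, tile_h):
        for tx in range(0, width, tile_w):
            for y in range(ty, ty + tile_h):
                for x in range(tx, tx + tile_w):
                    rows[y][x] = flat[i]
                    i += 1
    return rows


grid = [[0, 1, 2, 3],
        [4, 5, 6, 7]]
tiled = tessellate(grid, tile_h=2, tile_w=2)
print(tiled)  # [0, 1, 4, 5, 2, 3, 6, 7]
restored = detessellate(tiled, height=2, width=4, tile_h=2, tile_w=2)
```

The values never change, only their order, and the round trip restores the original grid. That is why tessellation is a layout concern rather than a dtype concern.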
Boundary upgrades
Neat can add higher-level route stages on top of the four-case dtype decision:
- Generic Preproc: uses PreprocessOptions to apply resize, color, layout, normalize, quantize, tessellate, or explicit transform intent before inference.
- BoxDecode: decodes detection heads for models that need a detection postprocess stage. The app selects the family with BoxDecodeType, such as YoloV8, and filtering fields such as score_threshold, nms_iou_threshold, and top_k.
These upgrades change which kernels run and what output contract the app receives. For example, a raw model output tensor and a decoded detection tensor are not the same public contract.
What this means for app code
- Inspect model.input_specs() and model.output_specs() before allocating inputs or decoding outputs.
- Use ModelOptions.preprocess to state what kind of input you provide: image input, tensor input, resize, color, layout, normalization, quantization, or tessellation intent.
- Use model.resolved_preprocess_plan() / model.preprocess_plan() to see what Neat planned from your options plus the model archive.
- Do not assume output dtype, shape, or layout. Read the output spec and, when needed, the returned tensor metadata.
- Decode boxes, pose, or segmentation only when the output contract is the matching packed format.
- Treat INT8/BF16/tessellation details as route behavior unless a public spec or tensor explicitly exposes them.
No vibes. Read the contract, then move the bytes.
Decode output deliberately
Use the decode helper that matches the output contract.
| Output contract | C++ | Python |
|---|---|---|
| Raw tensor | Use the returned Tensor / TensorList directly | Use the returned tensor directly, or to_numpy(...) / to_torch(...) |
| Packed boxes | simaai::neat::decode_bbox(...) | pyneat.decode_bbox(...) |
| Packed pose | simaai::neat::decode_pose(...) | pyneat.decode_pose(...) |
| Packed segmentation | simaai::neat::decode_segmentation(...) | pyneat.decode_segmentation(...) |
Decoded boxes use a float32 [N, 6] tensor with columns x1, y1, x2, y2, score, and class_id. Pose and segmentation decoders return the boxes plus task-specific tensors for keypoints or masks.
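Once boxes are decoded, the [N, 6] layout can be unpacked by column position. The sketch below assumes the decoded tensor has already been converted to a nested Python list (for example via NumPy's `tolist()` after a NumPy conversion); `Box` and `rows_to_boxes` are hypothetical helpers, not part of pyneat.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """One decoded detection, in the column order x1, y1, x2, y2, score, class_id."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float
    class_id: int


def rows_to_boxes(rows, min_score=0.0):
    """Turn [N, 6] rows into Box records, dropping rows below min_score."""
    boxes = []
    for row in rows:
        if len(row) != 6:
            # Guard against feeding a raw or differently packed tensor by mistake.
            raise ValueError(f"expected 6 columns, got {len(row)}")
        x1, y1, x2, y2, score, class_id = row
        if score >= min_score:
            # class_id arrives as a float because the whole tensor is float32.
            boxes.append(Box(x1, y1, x2, y2, score, int(class_id)))
    return boxes


decoded = [
    [10.0, 20.0, 50.0, 80.0, 0.9, 2.0],
    [0.0, 0.0, 5.0, 5.0, 0.1, 7.0],
]
kept = rows_to_boxes(decoded, min_score=0.5)
```

The column-count check is the useful part: it fails loudly when the output contract is not the decoded-box format the code expects.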
Preserve coordinate metadata
Detection coordinates often need preprocessing metadata to map from model space back to source-frame space. Preserve metadata through the graph when you use letterbox, resize, ROI lists, render, or detection decode.
Relevant metadata can include target size, scaled size, padding, color conversion, axis permutation, normalization, quantization, tessellation, ROI windows, and per-ROI affine transforms.
If decoded boxes land in the wrong place, check metadata propagation before you blame NMS. See Data formats and Preproc ROI Lists.
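As an illustration of why that metadata matters, the sketch below undoes a letterbox by hand. It assumes a uniform scale and symmetric padding, which is a common letterbox convention but not necessarily what a given Neat route applies; `LetterboxMeta`, `letterbox_meta`, and `to_source` are hypothetical names, and in practice the values should come from the metadata carried through the graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LetterboxMeta:
    """Hypothetical record of the preprocessing needed to invert a letterbox."""
    scale: float  # source pixels * scale = model pixels
    pad_x: float  # left padding, in model pixels
    pad_y: float  # top padding, in model pixels
    src_w: int
    src_h: int


def letterbox_meta(src_w, src_h, dst_w, dst_h):
    """Compute scale and symmetric padding for fitting src into dst."""
    scale = min(dst_w / src_w, dst_h / src_h)
    pad_x = (dst_w - src_w * scale) / 2
    pad_y = (dst_h - src_h * scale) / 2
    return LetterboxMeta(scale, pad_x, pad_y, src_w, src_h)


def to_source(box, meta):
    """Map an (x1, y1, x2, y2) box from model space to source-frame pixels."""
    x1, y1, x2, y2 = box

    def unmap(v, pad, limit):
        # Undo padding, undo scaling, then clamp into the source frame.
        return max(0.0, min(float(limit), (v - pad) / meta.scale))

    return (
        unmap(x1, meta.pad_x, meta.src_w),
        unmap(y1, meta.pad_y, meta.src_h),
        unmap(x2, meta.pad_x, meta.src_w),
        unmap(y2, meta.pad_y, meta.src_h),
    )


meta = letterbox_meta(src_w=1280, src_h=720, dst_w=640, dst_h=640)
print(to_source((100, 200, 300, 400), meta))  # (200.0, 120.0, 600.0, 520.0)
```

If the padding or scale is lost along the way, every box is shifted or stretched by a consistent amount, which is the symptom to look for before tuning NMS thresholds.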
Related types
- PreprocessOptions: application preprocess intent.
- ResolvedPreprocessPlan: what the planner compiled.
- PreprocessGraphFamily: which preprocess family was chosen.
- Tensor: public tensor payload and metadata.
- Sample: payload plus runtime metadata.