Schema Inference uses annotated Confidence Tiers rather than aggressive or conservative type inference
filedge inspect annotates each inferred column type with a Confidence Tier (high / low / ambiguous) rather than silently picking the most specific type or defaulting everything to string.
Two alternatives were considered. Aggressive inference (infer the most specific type whenever the majority of sampled values parse) produces better starting configs but silently misleads operators when sparse nulls or format exceptions appear beyond the sample window. An operator who accepts an integer (high) inference on a column that turns out to contain N/A at row 50,000 gets a pipeline that FAILs in production, having never been warned that the inference rested on a sample. Conservative inference (default to string unless 100% of sampled values parse cleanly) avoids false positives but produces configs full of string columns that operators must correct manually, defeating the tool's purpose.
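The difference between the two rejected strategies can be sketched in a few lines. This is an illustrative sketch only, not filedge's implementation: the function names, the integer-only parser, and the "more than half" majority rule are assumptions made for the example.

```python
from typing import List


def parses_as_int(value: str) -> bool:
    """Return True if the raw cell text parses as an integer."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def aggressive_infer(sample: List[str]) -> str:
    """Hypothetical aggressive rule: integer if a majority of the sample parses."""
    if not sample:
        return "string"
    hits = sum(parses_as_int(v) for v in sample)
    return "integer" if hits > len(sample) / 2 else "string"


def conservative_infer(sample: List[str]) -> str:
    """Hypothetical conservative rule: integer only if every sampled value parses."""
    if sample and all(parses_as_int(v) for v in sample):
        return "integer"
    return "string"


# A small sample with one format exception: the two rules disagree.
mostly_clean = ["1", "2", "N/A", "4"]

# A column that is clean for the first 1,000 rows but has N/A at row 50,000.
full_column = [str(i) for i in range(60_000)]
full_column[49_999] = "N/A"
sample = full_column[:1000]

print(aggressive_infer(mostly_clean))    # integer
print(conservative_infer(mostly_clean))  # string
print(aggressive_infer(sample))          # integer, although the full column does not parse
```

In both cases the result is a bare type name. Nothing in it tells the operator how many values were examined or how many of them parsed, which is the gap the annotated approach is meant to close.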
The annotated approach was chosen because the tool samples only the first N rows by default (1,000). That is a deliberate limitation for speed, not a guarantee of completeness. Showing the operator exactly what the tool observed — and how much trust to place in the inference — makes the limitation visible rather than hiding it inside a confident-looking type declaration. Operators can accept high-confidence inferences quickly and scrutinise low and ambiguous ones, rather than having to verify every column from scratch. The required: field follows the same pattern: required: true (no nulls in sample — verify against full dataset) is more honest than a silent required: true that breaks on the first null outside the sample window.
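A minimal sketch of what an annotated inference could look like follows. The tier names (high / low / ambiguous) and the 1,000-row default come from this record; everything else is assumed for illustration, including the ColumnInference structure, the null markers, the 0.9 threshold, and the wording of the note.

```python
from dataclasses import dataclass
from typing import List

SAMPLE_ROWS = 1000                 # default sample size stated in this record
NULL_TOKENS = {"", "N/A", "null"}  # assumed null markers, not taken from filedge


@dataclass
class ColumnInference:
    inferred_type: str
    tier: str        # "high", "low", or "ambiguous"
    required: bool
    note: str        # what was actually observed in the sample


def _is_int(value: str) -> bool:
    try:
        int(value)
        return True
    except ValueError:
        return False


def annotate(values: List[str], sample_rows: int = SAMPLE_ROWS) -> ColumnInference:
    """Infer a type from the first sample_rows values and record how far to trust it."""
    sample = values[:sample_rows]
    non_null = [v for v in sample if v not in NULL_TOKENS]
    required = len(sample) > 0 and len(non_null) == len(sample)
    parsed = sum(_is_int(v) for v in non_null)
    ratio = parsed / len(non_null) if non_null else 0.0

    # Hypothetical thresholds; the real tier boundaries are not given here.
    if not non_null:
        inferred_type, tier = "string", "ambiguous"
    elif ratio == 1.0:
        inferred_type, tier = "integer", "high"
    elif ratio >= 0.9:
        inferred_type, tier = "integer", "low"
    elif ratio > 0.0:
        inferred_type, tier = "string", "ambiguous"
    else:
        inferred_type, tier = "string", "high"

    note = (f"{parsed}/{len(non_null)} non-null values parsed as integer "
            f"in first {len(sample)} rows")
    if required:
        note += "; no nulls in sample, verify against full dataset"
    return ColumnInference(inferred_type, tier, required, note)


clean = annotate([str(i) for i in range(2000)])
mixed = annotate(["1"] * 95 + ["abc"] * 5)
print(clean)
print(mixed)
```

Here the note carries the sample size alongside the inferred type, so a high tier reads as "everything observed in the sample parsed" rather than as a claim about the full dataset.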