Quick Tip: Statistics on Booleans
Due to q
’s type promotion rules, it’s entirely legal to use statistical functions on boolean vectors. avg
tends to be the most useful, but all of them should work as expected.
A typical use case: average null
ness (handy while developing ETL code). We will simulate with a vector of float
s mistakenly parsed as int
s (due to mostly looking like int
s):
q)v:@[string 1000?1000;-10?1000;,[;".123"]] q)t:flip(enlist`v)!(enlist"I";" ")0:v q)t v --- 468 959 221 694 934 865 344 997 314 580 45 745 898 935 64 177 238 361 850 241 .. q)avg null t v| 0.01 q)
So 10% of t.v
is null, which (unless this is expected) should cause us to ask ourselves whether "I"
was really the right parse.
Labels: tip