A subtle design mistake(?) means that ent requires nuanced usage. The program seems to be a mix of randomness test and entropy measurement, but those two goals are incompatible. Use it carefully: either with the -c option for IID min-entropy measurement, or as a compact randomness test focusing on bit/byte distribution only.
“a program, ent, which applies various tests to sequences of bytes stored in files and reports the results of those tests. The program is useful for evaluating pseudorandom number generators for encryption…”
In “evaluating pseudorandom number generators for encryption”, the required entropy rate is 1 bit/bit or 8 bits/byte. Anything substantially less is useless for cryptography. There is no need to measure it, as the only result that concerns us is a pass/fail determination within agreed confidence bounds, à la the other standard randomness test suites such as dieharder. Yet ent sets no bounds and determines no p values for confidence, other than for a bit/byte distribution $ \chi^2 $.
And it can’t be used for general entropy measurement in its default setting, as it reports the wrong type of entropy. Cryptography focuses on the most conservative measure, min-entropy $(H_{\infty})$, not Shannon entropy. ent reports Shannon entropy, which is strictly higher for every sample distribution other than the uniform one. See Note 1. And uniform distributions are uncommon from most entropy sources.
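A toy check of that claim, with an arbitrary (hypothetical) non-uniform distribution; Shannon entropy strictly exceeds min-entropy whenever the probabilities are unequal:-

```python
import math

# Arbitrary non-uniform four-outcome distribution (illustrative only).
p = [0.7, 0.1, 0.1, 0.1]

h_shannon = -sum(q * math.log2(q) for q in p)  # what ent reports
h_min = -math.log2(max(p))                     # what cryptography needs

print(f"Shannon = {h_shannon:.3f} bits, min-entropy = {h_min:.3f} bits")
# Shannon ≈ 1.357 bits, min-entropy ≈ 0.515 bits
assert h_min < h_shannon
```

Only for the uniform distribution (e.g. p = [0.25] * 4) do the two coincide, at $\log_2 4 = 2$ bits.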
As an example, see the following entropy calculations for a synthetic IID Gaussian distribution, such as might come from optimal ADC samples of Zener breakdown noise:-
Synthetic Zener breakdown noise samples.
$ ent /tmp/gauss.bin
Entropy = 6.369663 bits per byte. <====
Optimum compression would reduce the size
of this 1000000 byte file by 20 percent.
Chi square distribution for 1000000 samples is 2607328.04, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bytes is 126.9692 (127.5 = random).
Monte Carlo value for Pi is 3.999639999 (error 27.31 percent).
Serial correlation coefficient is -0.001901 (totally uncorrelated = 0.0).
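For reference, a file like the one analysed above might be generated along these lines. This is only a sketch: the seed is arbitrary, and $\mu \approx 127.5$, $\sigma \approx 20$ are assumptions inferred from ent’s reported mean and entropy, not the parameters actually used:-

```python
import random

random.seed(42)  # arbitrary, for reproducibility of the illustration

N = 1_000_000
MU, SIGMA = 127.5, 20.0  # assumed centre and spread

samples = bytearray()
for _ in range(N):
    # Clamp the continuous Gaussian into the 0..255 ADC-style byte range.
    x = int(round(random.gauss(MU, SIGMA)))
    samples.append(min(255, max(0, x)))

with open("/tmp/gauss.bin", "wb") as f:
    f.write(samples)
```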
The full ent -c report for the above distribution:-
$ ent -c /tmp/gauss.bin
Value Char Occurrences Fraction
29 1 0.000001
32 1 0.000001
36 $ 1 0.000001
37 % 2 0.000002
39 ' 2 0.000002
40 ( 5 0.000005
41 ) 2 0.000002
42 * 3 0.000003
43 + 3 0.000003
44 , 2 0.000002
45 - 3 0.000003
46 . 3 0.000003
47 / 4 0.000004
48 0 5 0.000005
49 1 10 0.000010
50 2 15 0.000015
51 3 11 0.000011
52 4 27 0.000027
53 5 20 0.000020
54 6 34 0.000034
55 7 35 0.000035
56 8 42 0.000042
57 9 53 0.000053
58 : 43 0.000043
59 ; 55 0.000055
60 < 76 0.000076
61 = 84 0.000084
62 > 93 0.000093
63 ? 128 0.000128
64 @ 158 0.000158
65 A 152 0.000152
66 B 195 0.000195
67 C 224 0.000224
68 D 269 0.000269
69 E 296 0.000296
70 F 322 0.000322
71 G 390 0.000390
72 H 453 0.000453
73 I 509 0.000509
74 J 581 0.000581
75 K 710 0.000710
76 L 803 0.000803
77 M 865 0.000865
78 N 981 0.000981
79 O 1128 0.001128
80 P 1323 0.001323
81 Q 1359 0.001359
82 R 1554 0.001554
83 S 1825 0.001825
84 T 1891 0.001891
85 U 2293 0.002293
86 V 2469 0.002469
87 W 2773 0.002773
88 X 2924 0.002924
89 Y 3373 0.003373
90 Z 3679 0.003679
91 [ 3924 0.003924
92 \ 4243 0.004243
93 ] 4684 0.004684
94 ^ 5167 0.005167
95 _ 5541 0.005541
96 ` 6004 0.006004
97 a 6408 0.006408
98 b 7019 0.007019
99 c 7415 0.007415
100 d 8186 0.008186
101 e 8662 0.008662
102 f 9138 0.009138
103 g 9851 0.009851
104 h 10265 0.010265
105 i 10916 0.010916
106 j 11499 0.011499
107 k 12183 0.012183
108 l 12743 0.012743
109 m 13313 0.013313
110 n 14065 0.014065
111 o 14445 0.014445
112 p 15125 0.015125
113 q 15782 0.015782
114 r 16168 0.016168
115 s 16664 0.016664
116 t 17094 0.017094
117 u 17368 0.017368
118 v 18223 0.018223
119 w 18281 0.018281
120 x 18798 0.018798
121 y 18836 0.018836
122 z 19424 0.019424
123 { 19566 0.019566
124 | 19929 0.019929
125 } 19739 0.019739
126 ~ 19682 0.019682
127 20151 0.020151 <==== most common
128 19878 0.019878
129 19778 0.019778
130 19503 0.019503
131 19647 0.019647
132 18958 0.018958
133 19253 0.019253
134 18806 0.018806
135 18488 0.018488
136 17856 0.017856
137 17457 0.017457
138 17007 0.017007
139 16554 0.016554
140 16262 0.016262
141 15537 0.015537
142 14953 0.014953
143 14774 0.014774
144 13832 0.013832
145 13393 0.013393
146 12467 0.012467
147 12017 0.012017
148 11480 0.011480
149 10989 0.010989
150 10487 0.010487
151 9659 0.009659
152 9144 0.009144
153 8597 0.008597
154 8196 0.008196
155 7450 0.007450
156 7029 0.007029
157 6408 0.006408
158 6000 0.006000
159 5499 0.005499
160 5068 0.005068
161 4681 0.004681
162 4249 0.004249
163 3955 0.003955
164 3601 0.003601
165 3260 0.003260
166 3001 0.003001
167 2670 0.002670
168 2364 0.002364
169 2227 0.002227
170 2081 0.002081
171 1747 0.001747
172 1634 0.001634
173 1381 0.001381
174 1197 0.001197
175 1160 0.001160
176 1048 0.001048
177 827 0.000827
178 764 0.000764
179 657 0.000657
180 595 0.000595
181 544 0.000544
182 427 0.000427
183 402 0.000402
184 367 0.000367
185 289 0.000289
186 236 0.000236
187 228 0.000228
188 209 0.000209
189 151 0.000151
190 154 0.000154
191 105 0.000105
192 90 0.000090
193 93 0.000093
194 70 0.000070
195 52 0.000052
196 60 0.000060
197 32 0.000032
198 32 0.000032
199 39 0.000039
200 16 0.000016
201 17 0.000017
202 18 0.000018
203 16 0.000016
204 10 0.000010
205 13 0.000013
206 5 0.000005
207 10 0.000010
208 6 0.000006
209 2 0.000002
210 1 0.000001
211 4 0.000004
212 2 0.000002
213 1 0.000001
214 5 0.000005
218 1 0.000001
219 1 0.000001
220 1 0.000001
223 1 0.000001
227 1 0.000001
237 1 0.000001
Total: 1000000 1.000000
Entropy = 6.369663 bits per byte. <==== Shannon entropy.
Optimum compression would reduce the size
of this 1000000 byte file by 20 percent.
Chi square distribution for 1000000 samples is 2607328.04, and randomly
would exceed this value less than 0.01 percent of the times.
Arithmetic mean value of data bytes is 126.9692 (127.5 = random).
Monte Carlo value for Pi is 3.999639999 (error 27.31 percent).
Serial correlation coefficient is -0.001901 (totally uncorrelated = 0.0).
Which gives $Pr(X=127) = 0.020151$ and hence $H_{\infty} = -\log_2(0.020151) = 5.633004$ bits/byte. That is only 88% of ent’s default measure. A wackier sample distribution might drop that percentage considerably lower still. And wacky distributions are certainly possible as you can see elsewhere on this site.
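That comparison can be reproduced in a few lines. The sketch below uses a synthetic stand-in for /tmp/gauss.bin ($\mu \approx 127.5$, $\sigma \approx 20$ are assumptions inferred from the figures above), so the exact numbers will differ slightly, but the min-entropy always comes out well below the Shannon entropy for a Gaussian sample:-

```python
import math
import random
from collections import Counter

random.seed(1)  # arbitrary seed; stand-in for the real noise file
data = bytes(min(255, max(0, int(round(random.gauss(127.5, 20.0)))))
             for _ in range(1_000_000))

counts = Counter(data)
n = len(data)
probs = [c / n for c in counts.values()]

h_shannon = -sum(p * math.log2(p) for p in probs)  # ent's default figure
h_min = -math.log2(max(probs))                     # H-infinity, from ent -c

print(f"Shannon entropy = {h_shannon:.6f} bits/byte")
print(f"Min-entropy     = {h_min:.6f} bits/byte")
```

ent’s -c table hands you max(probs) directly (the “most common” fraction), so the min-entropy calculation is a one-liner once you have that report.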
Notes:-
1. Also on the ent page (bottom) is this:-
BUGS: Note that the “optimal compression” shown for the file is computed from the byte- or bit-stream entropy and thus reflects compressibility based on a reading frame of the chosen width (8-bit bytes or individual bits if the -b option is specified). Algorithms which use a larger reading frame, such as the Lempel-Ziv [Lempel & Ziv] algorithm, may achieve greater compression if the file contains repeated sequences of multiple bytes.
A consequence of note 1 above is that an 8-bit reading frame presupposes IID data with a relaxation period $\ngtr$ 8 bits. Sadly, it is common to see ent used (incorrectly) against non-IID data sets. In those cases the default reported entropy will be much higher than the true entropy rate.
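The pitfall is easy to demonstrate with a contrived stream: a fully deterministic, strongly correlated sequence can still score a perfect 8 bits/byte under an 8-bit reading frame, because every byte value occurs equally often:-

```python
import math
from collections import Counter

# Bytes 0..255 repeated 4096 times: a 1 MiB stream with a perfectly
# uniform byte histogram, yet 100% predictable (true entropy rate ~ 0).
data = bytes(range(256)) * 4096

counts = Counter(data)
n = len(data)
h_bytes = -sum((c / n) * math.log2(c / n) for c in counts.values())

print(f"Byte-frame Shannon entropy = {h_bytes:.4f} bits/byte")
# Byte-frame Shannon entropy = 8.0000 bits/byte
```

Any measure built on single-byte frequencies alone, ent’s included, is blind to structure wider than its reading frame; only the serial correlation coefficient and compression hints would flag anything amiss here.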