Class: LimitedPrecisionReal

Inheritance:

```
Object
|
+--Magnitude
   |
   +--ArithmeticValue
      |
      +--Number
         |
         +--LimitedPrecisionReal
            |
            +--AbstractIEEEFloat
            |
            +--Float
            |
            +--HalfFloat
            |
            +--LargeFloat
            |
            +--LongFloat
            |
            +--QDouble
            |
            +--RaisedNumber
            |
            +--ShortFloat
```

Package:
stx:libbasic
Category:
Magnitude-Numbers
Version:
rev: 1.238 date: 2024/01/15 08:48:59
user: cg
file: LimitedPrecisionReal.st directory: libbasic
module: stx stc-classLibrary: libbasic

Description:

```Abstract superclass for any-precision floating point numbers (i.e. IEEE floats and doubles).

Short summary for beginners (find details in wikipedia):
========================================================

Floating point numbers are represented with a sign,
a mantissa and an exponent, and the number's magnitude is:
mantissa * (2 raisedTo: exponent)
with (1 > mantissa >= 0) and exponent adjusted as required for the mantissa to be in that range
(so called ''normalized'')

therefore,
13 asFloat mantissa -> 0.8125
13 asFloat exponent ->  4
0.8125 * (2 raisedTo:4) -> 13

and:
104 asFloat mantissa -> 0.8125
104 asFloat exponent -> 7
0.8125 * (2 raisedTo:7) -> 104

and:
0.1 mantissa -> 0.8
0.1 exponent -> -3
0.8 * (2 raisedTo:-3) -> 0.1

however:
(1 / 3.0) mantissa -> 0.666666666666667
(1 / 3.0) exponent -> -1
0.666666666666667 * (2 raisedTo:-1) -> 0.333333333333333

Danger in using Floats:
=======================

Beginners seem to forget (or never learn?) that floating point numbers
are always APPROXIMATIONs of some value.
You should never use them when exact results are needed (e.g. when computing money!)
Take a look at the ScaledDecimal and FixedDecimal classes for that.

The Float/Double confusion in ST/X:
===================================

Due to historic reasons, ST/X's Floats are what Doubles are in VisualWorks.

The reason is that in some Smalltalks, double floats are called Float, and no single float exists (VSE, V'Age),
whereas in others, there are both Float and Double classes (VisualWorks).
In order to allow code from both families to be loaded into ST/X without a missing class error, and without
losing precision, we decided to use IEEE doubles as the internal representation of Float
and make Double an alias to it.
This should work for either family (except for the unexpected additional precision in some cases).

If you really only want single precision floating point numbers, use ShortFloat instances.
But be aware that there is usually no advantage (neither in memory usage, due to memory alignment restrictions,
nor in speed), as these days, the CPUs are just as fast doing double precision operations.
(There might be a noticeable difference when doing bulk operations, and you should consider using FloatArray for those).

Hardware supported precisions
=============================

The only really portable sizes are IEEE-single and IEEE-double floats (i.e. ShortFloat and Float instances).
These are supported on all architectures.
Some CPUs provide an extended precision floating point number,
however, the downside is that CPU-architects did not agree on a common format and precision:
some use 80 bits, others 96 and others even 128.
See the comments in the LongFloat class for more details.
We recommend using Float (i.e. IEEE doubles) unless another format is absolutely required,
and taking care of machine dependencies in the code otherwise.
For higher precision needs, you may also try the new QDouble class, which gives you >200bits (60digits)
of precision on all machines, or the software emulated QuadFloat or OctaFloat classes
(all come at a noticeable performance price, though).
For very high precision (actually: arbitrary), take a look at the LargeFloat class.

Range and Precision of Storage Formats:
=======================================

Format |   Class    |   Array Class   | Bits / Significant  | Smallest Pos Number | Largest Pos Number | Significant Digits
|            |                 |      (Binary)       |                     |                    |     (Decimal)
-------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
half   |     --     | HalfFloatArray  |    16 / 11          |  6.10.... x 10-5    | 6.55...  x 10+4    |      3.3
-------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
single | ShortFloat | FloatArray      |    32 / 24          |  1.175... x 10-38   | 3.402... x 10+38   |      6-9
-------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
double | Float      | DoubleArray     |    64 / 53          |  2.225... x 10-308  | 1.797... x 10+308  |     15-17
-------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
double | LongFloat  |     --          |   128 / 113         |  3.362... x 10-4932 | 1.189... x 10+4932 |     33-36
extend.|            |                 |                     |                     |                    |
(SPARC)|            |                 |                     |                     |                    |
-------+            |                 |---------------------+---------------------+--------------------+--------------------
double |            |                 |    96 / 64          |  3.362... x 10-4932 | 1.189... x 10+4932 |     18-21
extend.|            |                 |                     |                     |                    |
(x86)  |            |                 |                     |                     |                    |
-------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
  --   | QDouble    |     --          |   256 / 212         |  2.225... x 10-308  | 1.797... x 10+308  |     >=60
-------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
  --   | QuadFloat  |     --          |   128 / 113         |  3.362... x 10-4932 | 1.189... x 10+4932 |     33-36
-------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
  --   | OctaFloat  |     --          |   256 / 237         |  3.271... x 10-78913| 1.611... x 10+78913|     >=60
-------+------------+-----------------+---------------------+---------------------+--------------------+--------------------
  --   | LargeFloat |     --          |     arbitrary       |  arbitrarily small  |  arbitrarily large |     arbitrary
-------+------------+-----------------+---------------------+---------------------+--------------------+--------------------

HalfFloats are only supported in fixed array containers.
This was added for OpenGL and other graphic libraries, which allow texture
and vertex data to be passed quickly in that format (see http://www.opengl.org/wiki/Small_Float_Formats).

Long- and LargeFloat are not supported as array containers.
These formats are seldom used for bulk data.

QDoubles are special soft floats; slower in performance, but providing 4 times the precision of regular doubles.

To see the differences in precision:

'%60.58f' printf:{ 1 asShortFloat exp } -> '2.718281828459045*090795598298427648842334747314453125'          (32 bits)
'%60.58f' printf:{ 1 asFloat exp }      -> '2.718281828459045*090795598298427648842334747314453125'          (64 bits)
'%60.58f' printf:{ 1 asLongFloat exp }  -> '2.718281828459045235*4281681079939403389289509505033493041992'   (only 80 valid bits on x86)

'%60.58f' printf:{ 1 asQDouble exp }    -> '2.71828182845904523536028747135266249775724709369995957496698'   (>200 bits)

correct value is:                           2.71828182845904523536028747135266249775724709369995957496696762772407663035354759457138217852516642742746

Bulk Containers:
================
If you have a vector or matrix (and especially: large ones) of floating point numbers, the well known
Array is a very inefficient choice. The reason is that it keeps pointers to each of its elements, and each element
(if it is a float) is itself stored somewhere in the object memory.
Thus, there is both a space overhead (every float object has an object header, for class and other information), and
also a performance overhead (extra indirection, cache misses and alignment inefficiencies).
For this, the bulk numeric containers are provided, which keep the elements unboxed and properly aligned.
Use them for matrices and large numeric vectors. They also provide some optimized bulk operation methods.
Take a look at FloatArray, DoubleArray, HalfFloatArray etc.

Comparing Floats:
=================
Due to rounding errors (usually on the last bit(s)), you should not compare two floating point numbers
using the #= operator. For example, the value 0.1 cannot be represented as a sum of powers-of-two fractions,
and will therefore always be an approximation with an error of up to half a bit in the last bit of the mantissa.
Usually, the print functions take this into consideration and return a (faked) '0.1'.
However, this half bit error may accumulate; for example, when multiplying 0.1 by 0.1 and then by 100,
the error may get large enough to be no longer swept under the rug by the print function,
and you will get something like '0.9999999999999' from it.

Also, comparing against a proper 1.0 (which is representable as an exact power of 2),
you will get a false result.
i.e. (0.1 * 0.1 * 100 ~= 1.0) and (0.1 * 0.1 * 100 - 1.0) ~= 0.0
This often confuses non-computer scientists (and occasionally even some of those).

For this, you should always provide an epsilon value when comparing two non-integer numbers.
The epsilon value is the distance by which you accept two numbers to be apart and still be considered equal.
Effectively, the epsilon asks: 'are those nearer than this epsilon?'.

Now we could ask 'is the delta between two numbers smaller than 0.00001?'
and get a reasonable answer for big numbers. But what if we compare two tiny numbers?
Then a reasonable epsilon must also be much smaller!

Actually, the epsilon should always be computed dynamically depending on the two values compared.
That is what the #isAlmostEqualTo:nEpsilon: method does for you. It does not take an absolute epsilon,
but instead the number of distinct floating point numbers that the two compared floats may be apart.
That is: the number of actually representable numbers between those two.
Effectively, that is the difference between the two mantissas,
when the numbers are scaled to the same exponent, taking the number of mantissa bits into account.

This software is furnished under a license and may be used
only in accordance with the terms of that license and with the
inclusion of the above copyright notice.   This software may not
be provided or otherwise made available to, or used by, any
other person.  No title to or ownership of the software is
hereby transferred.

```
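The comparison advice from the description can be tried in a workspace. The following is a minimal sketch that uses only the `isAlmostEqualTo:nEpsilon:` selector named above plus ordinary arithmetic; the variable names are hypothetical and the tolerance of 4 is an arbitrary value picked for illustration.

```smalltalk
| computed expected |

computed := 0.1 * 0.1 * 100.    "mathematically 1, but every step rounds"
expected := 1.0.

"exact comparison: answers false with IEEE doubles, because the rounding errors add up"
Transcript showCR: (computed = expected) printString.

"the remaining difference is tiny, but it is not zero"
Transcript showCR: (computed - expected) abs printString.

"tolerant comparison: accept values that are only a few representable numbers apart
 (4 is an arbitrary tolerance chosen for this sketch)"
Transcript showCR: (computed isAlmostEqualTo: expected nEpsilon: 4) printString.
```

Because the tolerance is expressed in representable numbers rather than as an absolute delta, the same call should remain meaningful for very large and very small operands.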

Class protocol:

class initialization
initialize
initialize ANSI compliant float globals

Usage example(s):

 `````` self initialize ``````

constants
NaN
return the constant NaN (not a Number) in my representation.
Here, based on the assumption that division of zero by zero generates a NaN
(which is defined as such in the IEEE standard).
If a subclass does not, it has to redefine this method and generate a NaN differently

Usage example(s):

```
ShortFloat NaN
Float NaN
LongFloat NaN
LargeFloat NaN
IEEEFloat NaN
```

negativeInfinity
return an instance of myself which represents negative infinity (for my instances).
Warning: do not compare equal against infinities;
instead, check using isFinite or isInfinite

Usage example(s):

```
ShortFloat negativeInfinity
Float negativeInfinity
LongFloat negativeInfinity
LargeFloat negativeInfinity
QDouble negativeInfinity
IEEEFloat negativeInfinity
```

constants & defaults
computeEpsilon
compute the maximum relative spacing of instances of mySelf
(i.e. the value-delta of the least significant bit at 1.0:
the difference between 1.0 and the next representable number after 1.0).
See https://en.wikipedia.org/wiki/Machine_epsilon

Usage example(s):

```
Float radix
Float precision
ShortFloat computeEpsilon -> 1.192093e-07
Float computeEpsilon      -> 2.22044604925031E-16
LongFloat computeEpsilon  -> 1.084202172485504434E-19
QDouble computeEpsilon    -> 7.77876909732643E-62
QuadFloat computeEpsilon  -> 1.92593e-34
OctaFloat computeEpsilon  -> 9.05568e-72

QuadFloat radix
(QuadFloat coerce:QuadFloat radix) => 2.00000
2 asQuadFloat => 2.00000
```

eBias
that is the offset of the zero exponent when stored.
The computation below assumes standard IEEE format

Usage example(s):

```
Float eBias      -> 1023
ShortFloat eBias -> 127
HalfFloat eBias  -> 15
LongFloat eBias  -> 16383
QuadFloat eBias  -> 16383
OctaFloat eBias  -> 262143
QDouble eBias    -> 1023
LargeFloat eBias -> 0
```

Usage example(s):

```
1.0 numBitsInExponent -> 11
1.0 eBias -> 1023
1.0 emin  -> -1022
1.0 emax  -> 1023
1.0 fmin  -> 2.2250738585072E-308
1.0 fmax  -> 1.79769313486232E+308
```

emax
The largest exponent value allowed by instances of this class.
The computation below assumes standard IEEE format

Usage example(s):

```
Float emax      -> 1023
ShortFloat emax -> 127
LongFloat emax  -> 16383
QuadFloat emax  -> 16383
OctaFloat emax  -> 262143
QDouble emax    -> 1023
```

emin
The smallest exponent value allowed by (normalized) instances of this class.
The computation below assumes standard IEEE format

Usage example(s):

```
Float emin      -> -1022
ShortFloat emin -> -126
LongFloat emin  -> -16382
QuadFloat emin  -> -16382
OctaFloat emin  -> -262142
QDouble emin    -> -1022
```

epsilon
return the maximum relative spacing of instances of mySelf
(i.e. the value-delta of the least significant bit)
according to ISO C standard;
Ada, C, C++ and Python language constants;
Mathematica, MATLAB and Octave; and various textbooks
see https://en.wikipedia.org/wiki/Machine_epsilon

Usage example(s):

```
Float epsilon      -> 2.22044604925031E-16
ShortFloat epsilon -> 1.192093e-07
LongFloat epsilon  -> 1.084202172485504434E-19
QDouble epsilon    -> 7.778769097326426826491248689356e-62
```
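What the epsilon value means in practice can be observed around 1.0. This is a small workspace sketch, assuming Float is an IEEE double with the default round-to-nearest-even mode (as described for this class); the variable name `eps` is only illustrative. Under that assumption each line prints true.

```smalltalk
| eps |

eps := Float epsilon.

"adding epsilon to 1.0 reaches the next larger representable Float"
Transcript showCR: ((1.0 + eps) > 1.0) printString.

"adding only half of it is rounded away again"
Transcript showCR: ((1.0 + (eps / 2)) = 1.0) printString.

"at larger magnitudes the spacing between floats grows, so epsilon itself gets lost"
Transcript showCR: ((1.0e10 + eps) = 1.0e10) printString.
```

The last line is the reason why a fixed absolute epsilon is a poor tolerance for comparisons: the spacing between neighbouring floats scales with their magnitude.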

fmax
The largest value allowed by instances of this class.
Not required to return an instance of the class,
but may return a double (aka Float) with that value (e.g. for HalfFloats)

Usage example(s):

```
Float fmax      -> 1.79769313486232E+308
ShortFloat fmax -> 3.402823e+38
LongFloat fmax  -> 1.189731495357231765E+4932
HalfFloat fmax  -> 65504.0
QuadFloat fmax  -> 1.189731495e4932
OctaFloat fmax  -> 1.61132571748e78913
QDouble fmax    -> error
(IEEEFloat size:16 exponentSize:5) fmax asFloat 65504.0
```

fmaxDenormalized
the largest denormalized value which can be represented
by instances of this class.
Should actually be sent to the instance,
because of IEEEFloat, which has instance-specific representation

fmin
the smallest normalized non-zero value which can be represented
by instances of this class;
should actually be sent to the instance,
because some of my subclasses have an instance-specific representation.
Not required to return an instance of the class,
but may return a double (aka Float) with that value (e.g. for HalfFloats)

Usage example(s):

```
(1.0 asIEEEFloat:8) fmin -> 0.015625
HalfFloat fmin  -> 6.103515625e-05
ShortFloat fmin -> 1.175494e-38
Float fmin      -> 2.2250738585072e-308
LongFloat fmin  -> 3.362103143112093506e-4932
QuadFloat fmin  -> 3.3621031431119363650068581666578087e-4932
OctaFloat fmin
QDouble fmin    -> 2.2250738585072e-308
(IEEEFloat size:16 exponentSize:5) fmin asFloat 6.103515625e-05

Float fmin = (2.0 raisedTo:Float emin) -> true
ShortFloat fmin = (2.0 raisedTo:ShortFloat emin) -> true
QuadFloat fmin = (2.0 asQuadFloat raisedTo:QuadFloat emin) -> true
OctaFloat fmin = (2.0 asOctaFloat raisedTo:OctaFloat emin) -> true
```

fminDenormalized
the smallest non-zero value which can be represented
by instances of this class;
should actually be sent to the instance,
because of IEEEFloat, which has instance-specific representation

** This method must be redefined in concrete classes (subclassResponsibility) **

infinity
return an instance of myself which represents positive infinity (for my instances).
Warning: do not compare equal against infinities;
instead, check using isFinite or isInfinite

Usage example(s):

```
ShortFloat infinity
Float infinity
LongFloat infinity
LargeFloat infinity
IEEEFloat infinity
QuadFloat infinity
OctaFloat infinity
QDouble infinity
```
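The warning about not comparing against infinities suggests asking the number itself. A minimal sketch, using only `infinity`, `fmax`, `isFinite` and `isInfinite` as named on this page; the variable `result` is hypothetical and merely stands in for the outcome of some overflowing computation.

```smalltalk
| result |

result := Float infinity.    "placeholder for a value that overflowed"

"recommended: query the value instead of comparing with ="
Transcript showCR: result isInfinite printString.    "true"
Transcript showCR: result isFinite printString.      "false"

"the largest representable Float is still finite"
Transcript showCR: Float fmax isFinite printString.  "true"
```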

maxSmallInteger
answer the largest possible SmallInteger value as instance of myself.
Notice: if my precision is smaller than the number of bits in a SmallInteger,
you'll lose some precision.

Usage example(s):

```
Float maxSmallInteger.       4.61168601842739e+18
LongFloat maxSmallInteger.   4611686018427387903.0
ShortFloat maxSmallInteger.  4.611686e+18
QDouble maxSmallInteger.     4.61169e+18
QuadFloat maxSmallInteger.   4.61169e+18
```

minSmallInteger
answer the smallest possible SmallInteger value as instance of myself

Usage example(s):

```
Float maxSmallInteger.
LongFloat maxSmallInteger.
ShortFloat maxSmallInteger.
QDouble maxSmallInteger.
LargeFloat maxSmallInteger.

Float minSmallInteger.
LongFloat minSmallInteger.
ShortFloat minSmallInteger.
QDouble minSmallInteger.
LargeFloat minSmallInteger.
```

instance creation
fromBytes: bytes
Float fromBytes:#[0 0 0 0 0 0 8 0]

fromInteger: anInteger
return a float with anInteger's value.
Since floats have a limited precision, you usually lose bits when doing this
with a large integer,
i.e. when the number of digits is above the floating point number's precision
(see Float decimalPrecision, LongFloat decimalPrecision).
Also, a domainError could be raised if the integer cannot be
represented as an instance of the receiver class
(can be caught with trapInfinity:).

Usage example(s):

```
ShortFloat fromInteger:2
12345678901234567890 asShortFloat
1234567890 asFloat
1234567890 asFloat asInteger
-1234567890 asFloat asInteger

12345678901234567890 asFloat storeString
12345678901234567890 asFloat asInteger
-12345678901234567890 asFloat asInteger

12345678901234567890 asLongFloat
12345678901234567890 asLongFloat asInteger
-12345678901234567890 asLongFloat asInteger

123456789012345678901234567890 asLongFloat
123456789012345678901234567890 asLongFloat asInteger
-123456789012345678901234567890 asLongFloat asInteger

1234567890123456789012345678901234567890 asLongFloat
1234567890123456789012345678901234567890 asLongFloat asInteger
-1234567890123456789012345678901234567890 asLongFloat asInteger

'this test is on 65 bits'.
self assert: 16r1FFFFFFFFFFFF0801 asDouble ~= 16r1FFFFFFFFFFFF0800 asDouble.
'this test is on 64 bits'.
self assert: 16r1FFFFFFFFFFFF0802 asDouble ~= 16r1FFFFFFFFFFFF0800 asDouble.
'nearest even is upper'.
self assert: 16r1FFFFFFFFFFF1F800 asDouble = 16r1FFFFFFFFFFF20000 asDouble.
'nearest even is lower'.
self assert: 16r1FFFFFFFFFFFF0800 asDouble = 16r1FFFFFFFFFFFF0000 asDouble.

-- losing bits!
(Float fromInteger:16r1FFFFFFFFFFFF0801) asInteger hexPrintString '1FFFFFFFFFFFF1000'
(Float fromInteger:16r1FFFFFFFFFFFF0880) asInteger hexPrintString '1FFFFFFFFFFFF1000'
(Float fromInteger:16r1FFFFFFFFFFFFFF0801) asInteger hexPrintString '2000000000000000000'
(Float fromInteger:16r1FFFFFFFFFFFFFFFFFFFF0801) asInteger hexPrintString '2000000000000000000000000'

(LongFloat fromInteger:16r1FFFFFFFFFFFF0801) asInteger hexPrintString '1FFFFFFFFFFFF0800'
(LongFloat fromInteger:16r1FFFFFFFFFFFF0880) asInteger hexPrintString '1FFFFFFFFFFFF0880'
(LongFloat fromInteger:16r1FFFFFFFFFFFFFF0880) asInteger hexPrintString '1FFFFFFFFFFFFFF0800'
(LongFloat fromInteger:16r1FFFFFFFFFFFFFFFFFFFF0801) asInteger hexPrintString '2000000000000000000000000'

(QuadFloat fromInteger:16r1FFFFFFFFFFFF0801) asInteger hexPrintString '1FFFFFFFFFFFF0801'
(QuadFloat fromInteger:16r1FFFFFFFFFFFFFF0801) asInteger hexPrintString '1FFFFFFFFFFFFFF0801'

(QDouble fromInteger:16r1FFFFFFFFFFFF0801) asInteger hexPrintString '1FFFFFFFFFFFF0801'
(QDouble fromInteger:16r1FFFFFFFFFFFFFF0801) asInteger hexPrintString '1FFFFFFFFFFFFFF0801'

(OctaFloat fromInteger:16r1FFFFFFFFFFFF0801) asInteger hexPrintString '1FFFFFFFFFFFF0801'
(OctaFloat fromInteger:16r1FFFFFFFFFFFFFF0801) asInteger hexPrintString '1FFFFFFFFFFFFFF0801'
```

fromLimitedPrecisionReal: anLPReal
return a float with anLPReal's value.
You might lose bits when doing this.
Slow fallback.

fromNumerator: numerator denominator: denominator
Create a limited precision real from a Rational.
This version will answer the nearest floating point value,
according to the IEEE 754 round-to-nearest-even default mode

Usage example(s):

```
Time millisecondsToRun:[
    1000000 timesRepeat:[
        Float fromNumerator:12345678901234567890 denominator:987654321
    ].
]

|fraction|

fraction := 12345678901234567890/987654321.
Time millisecondsToRun:[
    1000000 timesRepeat:[
        fraction asFloat
    ].
]
```

new: aNumber
catch this message - not allowed for floats/doubles

random
( an extension from the stx:libbasic2 package )
Float random
Float32 random

read a float from a string

Usage example(s):

```
Float readFrom:'.1'
Float readFrom:'0.1'
Float readFrom:'0'

ShortFloat readFrom:'.1'
ShortFloat readFrom:'0.1'
ShortFloat readFrom:'0'

LongFloat readFrom:'.1'
LongFloat readFrom:'0.1'
LongFloat readFrom:'0'

LimitedPrecisionReal readFrom:'bla' onError:nil
Float readFrom:'bla' onError:nil
ShortFloat readFrom:'bla' onError:nil
```
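The `onError:` variants shown above can supply a fallback value for malformed input. The following sketch assumes that the `onError:` argument may also be a block whose value is answered on failure (the examples above only show nil); the variable names and the default of 0.0 are hypothetical.

```smalltalk
| input parsed |

input := 'bla'.    "hypothetical user input that is not a number"

"assumption: the block is evaluated when parsing fails and its value is answered"
parsed := Float readFrom: input onError: [ 0.0 ].
Transcript showCR: parsed printString.

"well-formed input is converted as usual"
Transcript showCR: (Float readFrom: '0.1' onError: [ 0.0 ]) printString.
```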

queries
decimalEmax
Answer how many digits of accuracy this class supports

Usage example(s):

```
ShortFloat emax
ShortFloat decimalEmax

Float emax
Float emin
Float decimalEmax

LongFloat emax
LongFloat emin
LongFloat decimalEmax
```

decimalPrecision
return the number of valid decimal digits

Usage example(s):

```
HalfFloat decimalPrecision  -> 3
ShortFloat decimalPrecision -> 7
Float decimalPrecision      -> 16
LongFloat decimalPrecision  -> 19
QuadFloat decimalPrecision  -> 34
OctaFloat decimalPrecision  -> 71
QDouble decimalPrecision    -> 61
```

defaultPrintPrecision
the default number of digits when printing

Usage example(s):

```
ShortFloat defaultPrintPrecision -> 5
Float defaultPrintPrecision      -> 6
LongFloat defaultPrintPrecision  -> 8
QDouble defaultPrintPrecision    -> 10
QuadFloat defaultPrintPrecision  -> 9
OctaFloat defaultPrintPrecision  -> 11
LargeFloat defaultPrintPrecision -> 12
```

defaultPrintfPrecision
the default number of digits when printing with printf's %f format.
Notice, that the C-language standard states that this should be 6;
however, we can adjust it on a per-class basis.

denormalized
Return whether the instances of this class can
represent values in denormalized format.

exactDecimalPrecision
return the exact number of decimal digits

Usage example(s):

```
HalfFloat exactDecimalPrecision  -> 3.612359947967774002
ShortFloat exactDecimalPrecision -> 7.224719895935548004
Float exactDecimalPrecision      -> 15.95458977019100184
LongFloat exactDecimalPrecision  -> 19.26591972249479468
QuadFloat exactDecimalPrecision  -> 34.01638951002987185
OctaFloat exactDecimalPrecision  -> 71.34410897236353654
QDouble exactDecimalPrecision    -> 61.41011911545215804
```

hasSharedInstances
return true if this class can share instances when stored binary,
that is, instances with the same value can be stored by reference.
Although not really shared, floats should be treated
so, to be independent of the implementation of the arithmetic methods.

isAbstract
Return if this class is an abstract class.
True is returned for LimitedPrecisionReal here; false for subclasses.

Usage example(s):

 `````` 1.0 class isAbstract ``````

isIEEEFormat
return true, if this machine represents floats in IEEE format.
Currently, no support is provided for non-IEEE machines
to convert their floats into this (which is only relevant
if such a machine wants to send floats as binary to some other
machine).
Machines with non-IEEE format are VAXen and IBM370-type systems
(among others). Today, practically every system uses IEEE format floats.

numBitsInExponent
return the number of bits in the exponent

** This method must be redefined in concrete classes (subclassResponsibility) **

numBitsInIntegerPart
answer the number of bits in the integer part of the mantissa.
I.e. 0 is returned if there is a hidden bit, 1 if not.
Most floating point formats are normalized to get rid of the extra bit.

numBitsInMantissa
return the number of bits in the mantissa (the significand).
Typically the precision is 1 more than this, due to the hidden bit;
the hidden bit is not counted here.

** This method must be redefined in concrete classes (subclassResponsibility) **

numHiddenBits
answer the number of hidden bits in the mantissa.
This will return 0 or 1; 0 if there is no hidden bit, 1 if there is.
Most floating point formats are normalized to get one extra bit of precision
and thus will return 1 here.

precision
answer the precision (the number of bits in the mantissa) of my elements (in bits).
If my elements are IEEE floats, where only the fraction from the normalized mantissa is stored,
there will be a hidden bit and the mantissa will actually be represented by 1 more binary digit
(i.e. the number returned is 1 plus the actual number of bits stored).
Any hidden bits are included here.

Usage example(s):

```
HalfFloatArray precision
ShortFloat precision
Float precision
LongFloat precision
QDouble precision
```

** This method must be redefined in concrete classes (subclassResponsibility) **
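The descriptions of numBitsInMantissa, numHiddenBits and precision imply that the precision is the number of stored mantissa bits plus any hidden bit. The sketch below simply prints the three values side by side so that this can be checked per class; it uses only the class-side queries documented on this page, and the choice of classes and the output layout are arbitrary. No results are listed here because they depend on the platform (notably for LongFloat).

```smalltalk
"for each class print: name, stored mantissa bits, hidden bits, precision"
{ ShortFloat . Float . LongFloat } do:[:cls |
    Transcript showCR:
        (cls name , ': '
            , cls numBitsInMantissa printString , ' + '
            , cls numHiddenBits printString , ' -> '
            , cls precision printString)
].
```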

Instance protocol:

Compatibility-Squeak
defaultNumberOfDigits
( an extension from the stx:libcompat package )
marked as obsolete by exept MBP at 13-11-2021

** This is an obsolete interface - do not use it (it may vanish in future versions) **

accessing
at: index

at: index put: aValue

arithmetic
* aNumber
return the product of the receiver and the argument.

+ aNumber
return the sum of the receiver and the argument, aNumber

- aNumber
return the difference of the receiver and the argument, aNumber

/ aNumber
return the quotient of the receiver and the argument, aNumber

// aNumber
return the integer quotient of dividing the receiver by aNumber with
truncation towards negative infinity.

ceiling
(comment from inherited method)
return the integer nearest the receiver towards positive infinity.

floor
(comment from inherited method)
return the receiver truncated towards negative infinity

timesTwoPower: anInteger
multiply self by a power of two,
i.e. self * (2 raisedTo: anInteger).
The implementation takes care of preserving the class and avoiding overflow/underflow
if possible; otherwise it returns infinity or zero.
Thanks to Nicolas Cellier for this code

Usage example(s):

```
(3 asShortFloat timesTwoPower:10)  -> 3072.0.
(3 asFloat timesTwoPower:10)       -> 3072.0.
(3 asShortFloat timesTwoPower:100) -> 3.802952e+30.
(3 asFloat timesTwoPower:100)      -> 3.80295180068469e+30.
(3 asShortFloat timesTwoPower:200) -> inf.
(3 asFloat timesTwoPower:200)      -> 4.82081413277697e+60.

(1 asShortFloat timesTwoPower: 3) class = ShortFloat.
(1 asLongFloat timesTwoPower: 1024).
(1 asFloat timesTwoPower: -1024) timesTwoPower: 1024.
(1 asLongFloat timesTwoPower: -1024) timesTwoPower: 1024.

(2.0 asShortFloat timesTwoPower: -150) timesTwoPower: 150
(2.0 asLongFloat timesTwoPower: -150) timesTwoPower: 150
(2.0 asFloat timesTwoPower: -150) timesTwoPower: 150

(2.0 asShortFloat timesTwoPower: -149) timesTwoPower: 149
(2.0 asLongFloat timesTwoPower: -149) timesTwoPower: 149
(2.0 asFloat timesTwoPower: -149) timesTwoPower: 149

(ShortFloat infinity timesTwoPower:10) -> inf
(LongFloat infinity timesTwoPower:10)  -> inf
(Float infinity timesTwoPower:10)      -> inf

Time millisecondsToRun:[
    1000000 timesRepeat:[
        (2.0 timesTwoPower: 150)
    ]
]
```

bytes access
digitBytes
answer the float's digit bytes in IEEE format.
Use the native machine byte ordering.

Usage example(s):

```
1.0 digitBytes
Float pi digitBytes
ShortFloat pi digitBytes
```

digitBytesMSB: msb
answer the float's digit bytes in IEEE format.
If msb == true, use MSB byte order, otherwise LSB byte order.

Usage example(s):

```
Float pi digitBytesMSB:false
Float pi digitBytesMSB:true
ShortFloat pi digitBytesMSB:false
ShortFloat pi digitBytesMSB:true
```
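For a value with a simple bit pattern the two byte orders are easy to recognize. This sketch assumes that Float is an IEEE double (as stated in the class description), so that 1.0 is stored as 16r3FF0000000000000; the byte values in the comments follow from that assumption.

```smalltalk
"1.0 as IEEE double: sign 0, biased exponent 1023, fraction 0"

"most significant byte first: 63 240 0 0 0 0 0 0"
Transcript showCR: (1.0 digitBytesMSB: true) printString.

"least significant byte first: 0 0 0 0 0 0 240 63"
Transcript showCR: (1.0 digitBytesMSB: false) printString.

"digitBytes uses the byte order of the machine it runs on, so it matches one of the two"
Transcript showCR: 1.0 digitBytes printString.
```

When bytes are exchanged between machines, passing the byte order explicitly avoids depending on the native order that digitBytes uses.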

coercing & converting
asFloat
(comment from inherited method)
return a float with same value

asFraction
This conversion uses the continued fraction method to approximate
a floating point number.
In contrast to #asTrueFraction, which returns exactly the value of the float,
this rounds in the last significant bit of the floating point number.

Usage example(s):

```
1.1 asFraction
1.2 asFraction
0.3 asFraction
0.5 asFraction
(1/5) asFloat asFraction
(1/8) asFloat asFraction
(1/13) asFloat asFraction
(1/10) asFloat asFraction
(1/10) asFloat asTrueFraction asFixedPoint scale:20
3.14159 asFixedPoint scale:20
3.14159 storeString
3.14159 asFraction asFloat storeString
1.3 asFraction
1.0 asFraction
1E6 asFraction
1E-6 asFraction
```
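The difference between the approximating and the exact conversion shows up with any value that has no finite binary representation. A minimal sketch using only `asFraction` and `asTrueFraction` as documented here; the variable name `f` is arbitrary, and the result of `asFraction` is deliberately not listed because it depends on the approximation.

```smalltalk
| f |

f := 0.1.

"exact value of the stored float: a fraction whose denominator is a power of two"
Transcript showCR: f asTrueFraction printString.

"continued fraction approximation: usually a much shorter fraction"
Transcript showCR: f asFraction printString.

"the exact value is not 1/10, because 0.1 cannot be represented exactly in binary; prints false"
Transcript showCR: (f asTrueFraction = (1/10)) printString.
```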

asIEEEFloat
( an extension from the stx:libbasic2 package )
return an IEEE soft float with same value as receiver

Usage example(s):

```
123 asFloat asIEEEFloat
0 asShortFloat asIEEEFloat
0.0 asIEEEFloat
Float NaN asIEEEFloat
Float positiveInfinity asIEEEFloat
Float negativeInfinity asIEEEFloat
ShortFloat NaN asIEEEFloat
ShortFloat positiveInfinity asIEEEFloat
ShortFloat negativeInfinity asIEEEFloat
QuadFloat NaN asIEEEFloat
QuadFloat positiveInfinity asIEEEFloat
QuadFloat negativeInfinity asIEEEFloat
```

asIEEEFloat: numBits
return an IEEE soft float with same value as receiver and numBits overAll
numBits should be a multiple of 8,
i.e. 32 for IEEE single, 64 for double, 128 for quadFloat, etc.)

Usage example(s):

```
123 asFloat asIEEEFloat
123 asFloat asIEEEFloat:32
123 asFloat asIEEEFloat:16
12 asFloat asIEEEFloat:8
12 asIEEEFloat:8
0 asShortFloat asIEEEFloat
0.0 asIEEEFloat
```

asInteger
return an integer with same value - might truncate.
Does not raise an error for non-finite numbers (NaN or INF)

Usage example(s):

```
12345.0 asInteger
1e15 asInteger
1e33 asInteger asFloat
1e303 asInteger asFloat
```

asLargeFloat
( an extension from the stx:libbasic2 package )
return a large float with (approximately) my value.
If the LargeFloat class is not present, a regular float is returned

asLargeFloatPrecision: n
( an extension from the stx:libbasic2 package )
return a large float with (approximately) my value.
If the largeFloat class is not present, a regular float is returned

Usage example(s):

 `````` 1.0 asLargeFloatPrecision:10 ``````

asLimitedPrecisionReal
return a float of any precision with same value

asLongFloat
(comment from inherited method)
return a longFloat with same value

asOctaFloat
( an extension from the stx:libbasic2 package )
(comment from inherited method)
return an octaFloat with same value

( an extension from the stx:libbasic2 package )

asRational
Same as asFraction, for ST-80 compatibility.

Usage example(s):

```
1.1 asRational
1.2 asRational
0.3 asRational
0.5 asRational
(1/5) asFloat asRational
(1/8) asFloat asRational
(1/13) asFloat asRational
3.14159 asRational
3.14159 asRational asFloat
1.3 asRational
1.0 asRational
```

asShortFloat
(comment from inherited method)
return a shortFloat with same value.
Does NOT raise an error if the receiver exceeds the float range.

asTrueFraction
Answer a fraction or integer that EXACTLY represents self,
an any-precision IEEE floating point number, consisting of:
numMantissaBits bits of normalized mantissa (i.e. with hidden leading 1-bit)
optional numExtraBits between mantissa and exponent (normalized flag for ext-real)
numExponentBits bits of 2s complement exponent
1 sign bit.
Taken from Float's asTrueFraction

Usage example(s):

 ``````(result asFloat = self) ifFalse: [self error: 'asTrueFraction validation failed']. ``````

Usage example(s):

```
1.0 asLongFloat asTrueFraction

0.3 asFloat asTrueFraction      (5404319552844595/18014398509481984)
0.3 asShortFloat asTrueFraction (5033165/16777216)
0.3 asLongFloat asTrueFraction  (5404319552844595/18014398509481984)
0.3 asQuadFloat asTrueFraction  (5404319552844595/18014398509481984)
0.3 asOctaFloat asTrueFraction  (5404319552844595/18014398509481984)

1.25 asTrueFraction              (5/4)
1.25 asShortFloat asTrueFraction (5/4)
1.25 asLongFloat asTrueFraction  (5/4)

0.25 asTrueFraction              (1/4)
0.25 asShortFloat asTrueFraction (1/4)
0.25 asLongFloat asTrueFraction  (1/4)

-0.25 asTrueFraction              (-1/4)
-0.25 asShortFloat asTrueFraction (-1/4)
-0.25 asLongFloat asTrueFraction  (-1/4)

3e37 asTrueFraction              30000000000000002158062836758597337088
3e37 asShortFloat asTrueFraction 30000001069098037760363920625477091328
3e37 asLongFloat asTrueFraction  30000000000000002158062836758597337088
3e37 asQuadFloat asTrueFraction  30000000000000002158062836758597337088
3e37 asOctaFloat asTrueFraction  30000000000000002158062836758597337088
3e37 asQDouble asTrueFraction    30000000000000002158062836758597337088

0 asLongFloat negated asTrueFraction
LongFloat NaN asTrueFraction
LongFloat infinity asTrueFraction
LongFloat negativeInfinity asTrueFraction

Float fmin asTrueFraction
Float fminDenormalized asTrueFraction
Float fmaxDenormalized asTrueFraction

LongFloat fmin asTrueFraction
LongFloat fminDenormalized asTrueFraction
LongFloat fmaxDenormalized asTrueFraction
```

comparing
< aNumber
return true if the argument is greater than the receiver

double dispatching
differenceFromFraction: aFraction
sent when a fraction does not know how to subtract the receiver

equalFromFraction: aFraction
sent when a fraction does not know how to compare with the receiver

lessFromFraction: aFraction
aFraction does not know how to compare to the receiver -
Return true if aFraction < self.

productFromFraction: aFraction
sent when a fraction does not know how to multiply the receiver

quotientFromFloat: aFloat
return the quotient of aFloat and the receiver.
Return aFloat / self

quotientFromFraction: aFraction
Return the quotient of the argument, aFraction and the receiver.
Sent when aFraction does not know how to divide by the receiver.

sumFromFraction: aFraction
sent when a fraction does not know how to add the receiver

sumFromTimestamp: aTimestamp
I am to be interpreted as seconds, return the timestamp this number of seconds
after aTimestamp

Usage example(s):

```
Timestamp now sumFromTimestamp:aTimestamp
100.0 sumFromTimestamp:Timestamp now

|t1 t2|

t1 := Timestamp now.
t2 := 1.5 sumFromTimestamp:t1.
t1 inspect. t2 inspect.
```

error reporting
errorUnsupported

inspecting
inspectorExtraAttributes
( an extension from the stx:libtool package )
extra (pseudo instvar) entries to be shown in an inspector.

printing & storing
commonPrintOn: aStream
a zero mantissa is impossible - except for zero and a few others

printOn: aStream
0.0 printOn:Transcript. Transcript cr.
PrintfScanf printf:'%g' on:Transcript argument:0.0. Transcript cr.

0.0 asIEEEFloat printOn:Transcript. Transcript cr.
PrintfScanf printf:'%g' on:Transcript argument:0.0 asIEEEFloat. Transcript cr.

0.0 asOctaFloat printOn:Transcript. Transcript cr.
PrintfScanf printf:'%g' on:Transcript argument:0.0 asOctaFloat. Transcript cr.

-0.0 printOn:Transcript. Transcript cr.
PrintfScanf printf:'%g' on:Transcript argument:-0.0. Transcript cr.

-0.0 asIEEEFloat printOn:Transcript. Transcript cr.
PrintfScanf printf:'%g' on:Transcript argument:-0.0 asIEEEFloat. Transcript cr.
PrintfScanf printf:'%-g' on:Transcript argument:-0.0 asIEEEFloat. Transcript cr.
PrintfScanf printf:'%+g' on:Transcript argument:0.0 asIEEEFloat. Transcript cr.
PrintfScanf printf:'%+g' on:Transcript argument:-0.0 asIEEEFloat. Transcript cr.
PrintfScanf printf:'% g' on:Transcript argument:-0.0 asIEEEFloat. Transcript cr.
PrintfScanf printf:'% g' on:Transcript argument:0.0 asIEEEFloat. Transcript cr.

1234.0 asIEEEFloat printOn:Transcript. Transcript cr.
PrintfScanf printf:'%g' on:Transcript argument:1234.0 asIEEEFloat. Transcript cr.

1e39 asIEEEFloat printOn:Transcript. Transcript cr.
PrintfScanf printf:'%g' on:Transcript argument:1e39 asIEEEFloat. Transcript cr.

PrintfScanf printf:'% g' on:Transcript argument:IEEEFloat NaN. Transcript cr.
PrintfScanf printf:'% g' on:Transcript argument:IEEEFloat infinity. Transcript cr.
PrintfScanf printf:'% g' on:Transcript argument:IEEEFloat negativeInfinity. Transcript cr.

printStringScientific
return a 'user friendly' scientific printString.
Notice: this returns a Text object with superscript digits,
which requires a font capable of displaying it correctly.
Also: the returned string is not meant to be read back - purely for GUIs

Usage example(s):

```
1.23456 printString                        -> '1.23456'
1.23456 printStringScientific              1.23456×10^0 (with superscript zero at end)
1.23e14 printStringScientific              1.23×10^14 (with superscript 14 at end)

PrintfScanf printf:'%e' argument:1.23456   -> '1.23456e0'
PrintfScanf printf:'%g' argument:1.23456   -> '1.23456'
PrintfScanf printf:'%f' argument:1.23456   -> '1.23456'

PrintfScanf printf:'%e' argument:1.234     -> '1.234e0'
PrintfScanf printf:'%g' argument:1.234     -> '1.234'
PrintfScanf printf:'%f' argument:1.234     -> '1.234'
```

printStringWithFormat: format
return a printed representation of the receiver;
format must be of the form: .nn, where nn is the number of digits.
To print 6 valid digits, use printStringWithFormat:'.6'
For Floats, the default used in printString is 15 (because it is a double);
for ShortFloats, it is 6 (because it is a float)

Usage example(s):

```
Float pi printStringWithFormat:'.20'               => '3.141592653589793116'
Float pi asQuadFloat printStringWithFormat:'.20'   => '3.14159265358978956320'
```
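Asking for more digits than the format can hold does not produce more digits of pi; it exposes the exact binary value that is stored. As a rough analogue (an assumption, not the ST/X code), Python's `'.Ng'` format prints N significant digits of a double; the helper name `print_with_digits` is made up for this sketch.

```python
import math

def print_with_digits(x: float, digits: int) -> str:
    """Format x with the given number of significant digits.

    Hypothetical stand-in for a '.nn' style format; Python's 'g'
    conversion drops trailing zeros.
    """
    return format(x, f".{digits}g")

six_digits = print_with_digits(math.pi, 6)
twenty_digits = print_with_digits(math.pi, 20)

print(six_digits)      # 3.14159
print(twenty_digits)   # 3.141592653589793116
```

Only about the first 16 digits agree with the mathematical constant; the remaining ones are the decimal expansion of the nearest double.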

private accessing
digitBytes: bytesLSB

queries
decimalEmax
Answer how many digits of exponent-accuracy this class supports

Usage example(s):

```
1.0 asShortFloat emax
1.0 asShortFloat decimalEmax
1.0 asFloat emax
1.0 asFloat emin
1.0 asFloat decimalEmax
1.0 asLongFloat emax
1.0 asLongFloat emin
1.0 asLongFloat decimalEmax
```

decimalPrecision
Answer how many significant decimal digits (accuracy) this instance supports

Usage example(s):

```
1.0 asShortFloat decimalPrecision                   -> 7
1.0 asFloat decimalPrecision                        -> 15
1.0 asLongFloat decimalPrecision                    -> 19
1.0 asQDouble decimalPrecision                      -> 61
1.0 asLargeFloat decimalPrecision                   -> 15
(1.0 asLargeFloatPrecision:200) decimalPrecision    -> 60
(1.0 asLargeFloatPrecision:400) decimalPrecision    -> 120
1.0 asQuadFloat decimalPrecision                    -> 34
1.0 asOctaFloat decimalPrecision                    -> 71
1.0 asIEEEFloat decimalPrecision                    -> 15
(1.0 asIEEEFloat:128) decimalPrecision              -> 34
(1.0 asIEEEFloat:256) decimalPrecision              -> 71
(1.0 asIEEEFloat:512) decimalPrecision              -> 148
(1.0 asIEEEFloat:1024) decimalPrecision             -> 302
1.0 asLongFloat asIEEEFloat decimalPrecision        -> 15
1.0 asShortFloat asIEEEFloat decimalPrecision       -> 15
```
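Several of the values listed above are consistent with the rule of thumb "decimal digits = floor(p × log10 2)", where p is the binary precision including any hidden bit. The Python sketch below only checks that arithmetic. The precisions 24, 53, 64, 113 and 237 are the standard IEEE / x87 values and are assumed here, and whether ST/X computes decimalPrecision this way is not confirmed by this page.

```python
import math

def decimal_digits(precision_bits: int) -> int:
    """Decimal digits carried by a binary mantissa of the given width.

    precision_bits is assumed to include the hidden bit, if there is one.
    """
    return math.floor(precision_bits * math.log10(2))

# Assumed binary precisions: single, double, x87 extended, quad, octuple.
for name, bits in [("single", 24), ("double", 53), ("extended", 64),
                   ("quad", 113), ("octuple", 237)]:
    print(name, bits, decimal_digits(bits))
```

The same rule also reproduces the LargeFloat entries for 200 and 400 bits (60 and 120 digits).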

defaultPrintPrecision
the default number of digits when printing

Usage example(s):

```
1.0 asFloat defaultPrintPrecision                        15
1.0 asLongFloat defaultPrintPrecision                    19
1.0 asShortFloat defaultPrintPrecision                   6
1.0 asQDouble defaultPrintPrecision                      60
1.0 asQuadFloat defaultPrintPrecision                    30
1.0 asOctaFloat defaultPrintPrecision                    70
(1.0 asLargeFloatPrecision:100) defaultPrintPrecision    29
(1.0 asLargeFloatPrecision:200) defaultPrintPrecision    59
(1.0 asLargeFloatPrecision:300) defaultPrintPrecision    79
```

defaultPrintfPrecision
the default number of digits when printing with printf's %f format.
Notice, that the C-language standard states that this should be 6;
however, we can adjust it on a per-class basis.

eBias
that is the offset of the zero exponent when stored
(i.e. the real exponent is exponentBits - eBias).
This is implemented on the instance side,
because of IEEEFloat, which has instance-specific representation.

Usage example(s):

```
1.0 numBitsInExponent    11
1.0 eBias                1023
1.0 emin                 -1022
1.0 emax                 1023
1.0 fmin                 2.2250738585072E-308
1.0 fmax                 1.79769313486232E+308
```

Usage example(s):

```
1.0 asLongFloat numBitsInExponent    15
1.0 asLongFloat eBias                16383
1.0 asLongFloat emin                 -16382
1.0 asLongFloat emax                 16383
1.0 asLongFloat fmin                 3.362103143112093506E-4932
1.0 asLongFloat fmax                 1.189731495357231765E+4932
```

Usage example(s):

```
1.0 asShortFloat numBitsInExponent    8
1.0 asShortFloat eBias                127
1.0 asShortFloat emin                 -126
1.0 asShortFloat emax                 127
1.0 asShortFloat fmin                 1.175494e-38
1.0 asShortFloat fmax                 3.402823e+38
```

Usage example(s):

```
1.0 asQuadFloat numBitsInExponent    15
1.0 asQuadFloat eBias                16383
1.0 asQuadFloat emin                 -16382
1.0 asQuadFloat emax                 16383
1.0 asQuadFloat fmin
1.0 asQuadFloat fmax
```

Usage example(s):

```
1.0 asIEEEFloat numBitsInExponent    15
1.0 asIEEEFloat eBias                16383
1.0 asIEEEFloat emin                 -16382
1.0 asIEEEFloat emax                 16383
1.0 asIEEEFloat fmin
1.0 asIEEEFloat fmax
```

emax
The largest exponent value allowed by instances like this.
The computation below assumes standard IEEE format.
This is also implemented on the instance side,
because of IEEEFloat, which has instance-specific representation.

Usage example(s):

```
Float emax        -> 1023
ShortFloat emax   -> 127
LongFloat emax    -> 16383
QuadFloat emax    -> 16383
OctaFloat emax    -> 262143
QDouble emax      -> 1023
```
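The eBias, emin and emax values shown in the examples follow the usual IEEE relationship between the exponent field width and the exponent range. A small Python sketch of that arithmetic follows; the function name `exponent_range` is made up, and this is the standard IEEE formula rather than a copy of the ST/X code.

```python
def exponent_range(num_exponent_bits: int) -> tuple[int, int, int]:
    """Return (bias, emin, emax) for an IEEE-style binary format.

    The all-zeros and all-ones exponent fields are reserved
    (zero/denormals and infinity/NaN), which is why emin is 1 - bias.
    """
    bias = 2 ** (num_exponent_bits - 1) - 1
    emin = 1 - bias
    emax = bias
    return bias, emin, emax

print(exponent_range(8))    # (127, -126, 127)
print(exponent_range(11))   # (1023, -1022, 1023)
print(exponent_range(15))   # (16383, -16382, 16383)
```

A 19-bit exponent field gives an emax of 262143, which matches the OctaFloat entry above.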

emin
The smallest exponent value allowed by (normalized) instances of this class.
The computation below assumes standard IEEE format.
This is implemented on the instance side,
because of IEEEFloat, which has instance-specific representation.

epsilon
return the maximum relative spacing of instances of mySelf
(i.e. the value-delta of the least significant bit)
according to ISO C standard;
Ada, C, C++ and Python language constants;
Mathematica, MATLAB and Octave; and various textbooks
see https://en.wikipedia.org/wiki/Machine_epsilon

exponent
generic; assumes IEEE float

Usage example(s):

```
1.0 exponent                            1
1.0 xexponent                           1
0.0 exponent                            0
0.0 xexponent                           0
Float fmin exponent                     -1021
Float fmin xexponent                    -1021
(Float fmin / 2) exponent               -1022
(Float fmin / 2) xexponent              -1022
(Float fmin / 4) exponent               -1023
(Float fmin / 4) xexponent              -1023
(Float fmin / 32) exponent              -1026
(Float fmin / 32) xexponent             -1026
(Float fminDenormalized) exponent       -1073
(Float fminDenormalized) xexponent      -1073
Float NaN exponent
Float infinity exponent
```

exponentBits
extract the biased exponentBits.
Assumes that subclasses are IEEE based (or at least can provide
an IEEE compatible byteArray for themselves)

Usage example(s):

```
0.0 mantissaBits                                0
0.0 exponentBits                                0

1.0 mantissaBits hexPrintString                 -> '0'
1.0 mantissaWithHiddenBits hexPrintString       -> '10000000000000'
1.0 exponentBits                                -> 1023 16r3FF

2.0 mantissaBits hexPrintString                 -> '0'
2.0 mantissaWithHiddenBits hexPrintString       -> '10000000000000'
2.0 exponentBits                                -> 1024 16r400

3.0 mantissaBits hexPrintString                 -> '8000000000000'
3.0 mantissaWithHiddenBits hexPrintString       -> '18000000000000'
3.0 exponentBits                                -> 1024 16r400

4.0 mantissaBits hexPrintString                 -> '0'
4.0 exponentBits                                -> 1025 16r401

5.0 mantissaBits hexPrintString                 -> '4000000000000'
5.0 exponentBits                                -> 1025 16r401

-5.0 mantissaBits hexPrintString                -> '4000000000000'
-5.0 exponentBits                               -> 1025 16r401

0.1 mantissaBits hexPrintString                 '1999999999999A'
0.1 exponentBits                                1019 16r3FB

0.3 mantissaBits hexPrintString                 '13333333333333'
0.3 exponentBits                                1021 16r3FD

0.3 asShortFloat mantissaBits                   10066330 16r99999A
0.3 asShortFloat exponentBits                   125 16r7D

0.3 asLongFloat mantissaBits                    11068046444225730560 16r9999999999999800
0.3 asLongFloat exponentBits                    16381 16r3FFD

0.3 asQDouble mantissaBits

Float fmin exponentBits                         1
Float fminDenormalized exponentBits
```
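For readers who want to reproduce the double-precision fields by hand, here is a minimal Python sketch that unpacks the raw binary64 bit pattern. It assumes the standard 1/11/52 layout; the function names `double_fields` and `mantissa_with_hidden_bit` are made up for this illustration and are not the ST/X selectors.

```python
import struct
import sys

def double_fields(x: float) -> tuple[int, int, int]:
    """Split a binary64 value into (sign, biased exponent, stored mantissa bits)."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent_bits = (bits >> 52) & 0x7FF          # 11-bit biased exponent
    mantissa_bits = bits & ((1 << 52) - 1)        # 52 stored bits, hidden bit excluded
    return sign, exponent_bits, mantissa_bits

def mantissa_with_hidden_bit(x: float) -> int:
    """Stored mantissa plus the implicit leading 1.

    Zero and denormals (exponent field 0) have no hidden bit;
    infinity and NaN are not handled specially in this sketch.
    """
    _, exponent_bits, mantissa_bits = double_fields(x)
    return mantissa_bits | (1 << 52) if exponent_bits != 0 else mantissa_bits

print(double_fields(1.0))                      # (0, 1023, 0)
print(double_fields(3.0))                      # (0, 1024, 2251799813685248), i.e. 16r8000000000000
print(hex(mantissa_with_hidden_bit(0.1)))      # 0x1999999999999a
print(double_fields(sys.float_info.min)[1])    # 1, the smallest normalized exponent field
```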

fmax
the largest finite value which can be represented
by normalized instances of this class;
this is implemented on the instance side,
because of IEEEFloat, which has instance-specific representation.

fmaxDenormalized
the largest denormalized value which can be represented
by instances of this class.
This is implemented on the instance side,
because of IEEEFloat, which has instance-specific representation.

fmin
the smallest non-zero value which can be represented
by normalized instances of this class;
this is implemented on the instance side,
because of IEEEFloat, which has instance-specific representation.

fminDenormalized
the smallest non-zero value which can be represented by instances of this class;
this is implemented on the instance side,
because of IEEEFloat, which has instance-specific representation.

fractionalPart
This has been renamed to #fractionPart for ST80 compatibility.

extract the after-decimal fraction part.
the float's value is
float truncated + float fractionalPart

** This is an obsolete interface - do not use it (it may vanish in future versions) **

hasIEEEFormat
HalfFloat isIEEEFormat true
ShortFloat isIEEEFormat true
Float isIEEEFormat true
LongFloat isIEEEFormat true
OctaFloat isIEEEFormat true
QDouble isIEEEFormat false
LargeFloat isIEEEFormat false

mantissa
extract a float's mantissa (as Float).
That is a float of the same type as the receiver,
such that:
(f mantissa) * (2 ^ f exponent) = f
This assumes that the mantissa is normalized to 0.5 .. 1.0

** This method must be redefined in concrete classes (subclassResponsibility) **
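The same normalization, with the mantissa in the range 0.5 (inclusive) to 1.0 (exclusive), is what C's `frexp` produces. The short Python sketch below uses `math.frexp` and `math.ldexp` to show the decomposition and its inverse; it illustrates the convention and is not the ST/X implementation.

```python
import math

# frexp splits a float into (mantissa, exponent) with 0.5 <= |mantissa| < 1.0
m13, e13 = math.frexp(13.0)
m104, e104 = math.frexp(104.0)
m1, e1 = math.frexp(1.0)

print(m13, e13)      # 0.8125 4
print(m104, e104)    # 0.8125 7
print(m1, e1)        # 0.5 1

# ldexp is the inverse: mantissa * 2**exponent gives back the original value
restored = math.ldexp(m13, e13)
print(restored)      # 13.0
```

Note that 13 and 104 share the mantissa 0.8125 and differ only in the exponent, since 104 = 13 × 2^3.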

mantissaBits
extract a float's mantissaBits (excl. any hidden bit).
I.e. this returns the normalized mantissaBits as an integer.
Assumes that subclasses are IEEE based (or at least can provide
an IEEE compatible byteArray for themselves)

Usage example(s):

```
0.0 mantissaBits
1.0 mantissaBits hexPrintString     -> '0'
2.0 mantissaBits hexPrintString     -> '0'
3.0 mantissaBits hexPrintString     -> '8000000000000'
4.0 mantissaBits hexPrintString     -> '0'
5.0 mantissaBits hexPrintString     -> '4000000000000'
10.0 mantissaBits hexPrintString    -> '4000000000000'
0.1 mantissaBits hexPrintString     -> '999999999999A'
0.3 mantissaBits hexPrintString     -> '3333333333333'

10.0 asShortFloat mantissaBits hexPrintString    -> '200000'
10.0 asLongFloat mantissaBits hexPrintString     -> 'A000000000000000'

10.0 mantissaWithHiddenBits hexPrintString                -> '14000000000000'
10.0 asShortFloat mantissaWithHiddenBits hexPrintString   -> 'A00000'
10.0 asLongFloat mantissaWithHiddenBits hexPrintString    -> 'A000000000000000'

0.3 asShortFloat mantissaBits    -> 1677722 16r19999A
0.3 asLongFloat mantissaBits     -> 29514790517935282176 16r19999999999999800
```

mantissaWithHiddenBits
extract a float's mantissaBits (incl. any hidden bit).
I.e. this returns the denormalized mantissaBits

Usage example(s):

```
0.0 mantissaBits                             0
0.0 mantissaWithHiddenBits                   0

1.0 mantissaBits hexPrintString              -> '0'
1.0 mantissaWithHiddenBits hexPrintString    -> '10000000000000'

2.0 mantissaBits hexPrintString              -> '0'
2.0 mantissaWithHiddenBits hexPrintString    -> '10000000000000'

0.1 mantissaBits hexPrintString              -> '999999999999A'
0.1 mantissaWithHiddenBits hexPrintString    -> '1999999999999A'

0.3 mantissaBits hexPrintString              -> '3333333333333'
0.3 mantissaWithHiddenBits hexPrintString    -> '13333333333333'

10.0 mantissaWithHiddenBits hexPrintString
    -> '14000000000000' / 2r10100000000000000000000000000000000000000000000000000
10.0 asShortFloat mantissaWithHiddenBits hexPrintString
    -> 'A00000' / 2r101000000000000000000000
10.0 asLongFloat mantissaWithHiddenBits hexPrintString
    -> 'A000000000000000' / 2r1010000000000000000000000000000000000000000000000000000000000000
10.0 asQuadFloat mantissaWithHiddenBits hexPrintString
    -> 'A000000000000000' / 2r1010000000000000000000000000000000000000000000000000000000000000

0.3 asShortFloat mantissaBits    -> 1677722 16r19999A
0.3 asLongFloat mantissaBits     -> 29514790517935282176 16r19999999999999800

Float fminDenormalized mantissaWithHiddenBits
```

nextFloat
answer the next representable float after myself

Usage example(s):

```
(1.0 nextFloat) storeString
(1.0 asShortFloat nextFloat) storeString
(67329.234 nextFloat) storeString
(67329.234 asShortFloat nextFloat) storeString
(10000000000.0 nextFloat) storeString
(10000000000.0 asShortFloat nextFloat) storeString
```
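For double precision, the notion of a "next representable float" can be explored with Python's `math.nextafter` (available from Python 3.9). This is offered as an analogue under the assumption that ST/X Floats are IEEE binary64 values; it is not the ST/X implementation.

```python
import math

# Step one representable double up and one down from 1.0.
above_one = math.nextafter(1.0, math.inf)
below_one = math.nextafter(1.0, -math.inf)

print(above_one - 1.0)   # gap above 1.0: 2**-52
print(1.0 - below_one)   # gap below 1.0: 2**-53, since spacing halves below a power of two

# Stepping up and then back down returns to the starting value.
round_trip = math.nextafter(above_one, -math.inf)
print(round_trip == 1.0)
```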

nextFloat: nUlps
answer the next representable float nUlps after myself

** This method must be redefined in concrete classes (subclassResponsibility) **

numBitsInExponent
answer the number of bits in the exponent
11 for double precision:
seeeeeee eeeemmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm
8 for single precision:
seeeeeee emmmmmmm mmmmmmmm mmmmmmmm
15 for long floats (x86):
00000000 00000000 seeeeeee eeeeeeee immmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm
15 for long floats (sparc):
seeeeeee eeeeeeee mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...
seeeeeee eeeeeeee mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...
19 for octuple floats:
seeeeeee eeeeeeee eeeemmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...
other for LargeFloats

numBitsInMantissa
answer the number of bits in the mantissa (the significand) of my instances;
any hidden bits are not counted.
11 for half precision:
seeeemmm mmmmmmmm
23 for single precision:
seeeeeee emmmmmmm mmmmmmmm mmmmmmmm
52 for double precision:
seeeeeee eeeemmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm
64 for longfloat precision (x86):
00000000 00000000 seeeeeee eeeeeeee immmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm
112 for longfloat precision (sparc):
seeeeeee eeeeeeee mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...
seeeeeee eeeeeeee mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...

Usage example(s):

```
1.0 numBitsInMantissa
1.0 asShortFloat numBitsInMantissa
1.0 asLongFloat numBitsInMantissa
```

numHiddenBits
answer the number of bits in the integer part of the mantissa.
Most floating point formats are normalized to get rid of the extra bit
(i.e. except for LongFloats and LargeFloats,
instances are normalized to exclude any integer bit).

precision
answer the precision of my elements, i.e. the number of bits in the mantissa.
If my elements are IEEE floats, where only the fraction of the normalized mantissa is stored,
there is a hidden bit, and the mantissa is actually represented by 1 more binary digit
(i.e. the number returned is 1 plus the actual number of bits stored).
Should be redefined in classes which allow per-instance precision specification.
The hidden bit is included here.

previousFloat
answer the previous representable float before myself

Usage example(s):

```
(1.0 previousFloat) storeString
(1.0 asShortFloat previousFloat) storeString
(67329.234 previousFloat) storeString
(67329.234 asShortFloat previousFloat) storeString
(10000000000.0 previousFloat) storeString
(10000000000.0 asShortFloat previousFloat) storeString
```

Typically, but not required to be, this will be 2
(as floats are usually represented as IEEE binary floats)

size
redefined since reals are kludgy (ByteArray)

ulp
answer the distance between me and the next representable number;
One exception here: for fmax, the distance to the previous float is returned

Usage example(s):

```
(1.0 nextFloat:1) storeString
(1.0 ulp) storeString
(10.0 nextFloat:1) storeString
(10.0 ulp) storeString
(-10.0 nextFloat:1) storeString
(-10.0 ulp) storeString
(-10.0 nextFloat:-1) storeString
(67329.234 nextFloat:1) storeString
(67329.234 ulp) storeString
(67329.234 asShortFloat nextFloat:1) storeString
(67329.234 asShortFloat ulp) storeString
Float NaN nextFloat:100000
Float infinity nextFloat:100000

1.0 ulp                              -> 2.22044604925031E-16
10000000000000000000000.0 ulp        -> 2097152.0
34.543 ulp storeString               -> '7.1054273576010019E-15'
-34.543 ulp storeString              -> '7.1054273576010019E-15'
Float NaN ulp                        -> nan
0.0 ulp                              -> 4.94065645841247E-324
0.0 asShortFloat ulp                 -> 1.401298e-45
Float infinity ulp                   -> nan
Double fmax previousFloat ulp        -> 1.99584030953472E+292
Double fmax ulp                      -> 1.99584030953472E+292
Double fmin ulp                      -> 4.94065645841247e-324
Double NaN ulp                       -> nan
```
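Several of the double-precision values above can be cross-checked with Python's `math.ulp` (Python 3.9 or later), which follows the same convention for the largest finite value. This is a sketch for comparison only; special cases such as infinity are handled differently by Python and are left out.

```python
import math
import sys

ulp_one = math.ulp(1.0)                  # spacing just above 1.0
ulp_big = math.ulp(1e22)                 # spacing grows with the magnitude
ulp_zero = math.ulp(0.0)                 # smallest positive denormal
ulp_neg = math.ulp(-34.543)              # the sign is ignored
ulp_max = math.ulp(sys.float_info.max)   # distance to the previous float

print(ulp_one)    # 2.220446049250313e-16
print(ulp_big)    # 2097152.0
print(ulp_zero)   # 5e-324
```

For every finite double below the maximum, `x + math.ulp(x)` is the next representable value in magnitude; at the maximum, the distance to the previous float is used instead.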

special access
partValues: aBlock
invoke aBlock with sign, exponent and abs(mantissa)

Usage example(s):

```
1.0 partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
2.0 partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
-1.0 partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
-2.0 partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].

1.0 asShortFloat partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
1.0 asLongFloat partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
1.0 asLargeFloat partValues:[:sign :exp :mantissa | Transcript showCR:'%1/%2/%3' with:sign with:exp with:mantissa].
```

testing
isFinite
return true, if the receiver is a finite float (not NaN and not +/-INF)

** This method must be redefined in concrete classes (subclassResponsibility) **

isFloat
return true, if the receiver is some kind of floating point number;
true is returned here.
Same as #isLimitedPrecisionReal, but a better name ;-)

isInfinite
return true, if the receiver is an infinite float (+Inf or -Inf).
These are not created by ST/X float operations (they raise an exception);
however, inline C-code could produce them.

Usage example(s):

```
1.0 isInfinite
(0.0 uncheckedDivide: 0.0) isInfinite
(1.0 uncheckedDivide: 0.0) isInfinite
```

isLimitedPrecisionReal
return true, if the receiver is some kind of limited precision real (i.e. floating point) number;
true is returned here - the method is redefined from Object.

isNaN
return true, if the receiver is an invalid float (NaN - not a number).
These are usually not created by ST/X float operations (they raise an exception);
however, inline C-code or proceeded exceptions or reading from a stream
could produce them.

** This method must be redefined in concrete classes (subclassResponsibility) **

isNegativeZero
many systems have two floating point zeros (+0.0 and -0.0)

Usage example(s):

```
0.0 asLongFloat isNegativeZero
-0.0 asLongFloat isNegativeZero
-1.0 asLongFloat isNegativeZero
1.0 asLongFloat isNegativeZero

0.0 asLargeFloat isNegativeZero
-0.0 asLargeFloat isNegativeZero
```
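Negative zero compares equal to positive zero, so an equality test cannot detect it; the sign bit has to be inspected. A minimal Python sketch of such a test follows. The function name `is_negative_zero` is made up for this illustration, and `math.copysign` is used to read the sign bit.

```python
import math

def is_negative_zero(x: float) -> bool:
    """True only for -0.0: the value is zero and the sign bit is set."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0

print(-0.0 == 0.0)              # True: both zeros compare equal
print(is_negative_zero(0.0))    # False
print(is_negative_zero(-0.0))   # True
print(is_negative_zero(-1.0))   # False: negative, but not zero
```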

numberOfBits
return the size (in bits) of the real;
typically, this is 64 for Floats and 32 for ShortFloats,
but who knows ...

** This method must be redefined in concrete classes (subclassResponsibility) **

positive
return true if the receiver is greater or equal to zero (not negative)

sign
return the sign of the receiver (-1, 0 or 1)

Usage example(s):

```
-1.0 sign
-0.0 sign
1.0 sign
0.0 sign
Infinity infinity sign
Infinity infinity negated sign
```

truncation & rounding
ceilingAsFloat
for protocol compatibility with floats;
returns the smallest integer which is greater or equal to the receiver as a float

Usage example(s):

 `````` 0.4 asLongFloat ceilingAsFloat ``````

floorAsFloat
for protocol compatibility with floats;
returns the receiver truncated towards negative infinity as a float

Usage example(s):

 `````` 0.4 asLongFloat floorAsFloat ``````

integerAndFractionParts
return the integer and the fraction part of the receiver as a pair
of floats (i.e. the result of the modf function).
Adding the parts gives the original value
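The modf function mentioned above is also available in Python as `math.modf`, which makes the behaviour easy to try. One difference to keep in mind: Python returns the fraction first, so the sketch below swaps the pair into (integer part, fraction part) order. The wrapper name `integer_and_fraction_parts` is made up for this illustration.

```python
import math

def integer_and_fraction_parts(x: float) -> tuple[float, float]:
    """Return (integer part, fraction part), both as floats.

    Both parts carry the sign of x, as with C's modf.
    """
    fraction, whole = math.modf(x)
    return whole, fraction

positive_parts = integer_and_fraction_parts(12.5)
negative_parts = integer_and_fraction_parts(-12.5)

print(positive_parts)         # (12.0, 0.5)
print(negative_parts)         # (-12.0, -0.5)
print(sum(negative_parts))    # -12.5: adding the parts gives the original value
```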

integerPart
return a float with value from digits before the decimal point
(i.e. the truncated value)

Usage example(s):

```
1234.56789 integerPart
1.2345e6 integerPart
12.5 integerPart
-12.5 integerPart
(5/3) integerPart
(-5/3) integerPart
(5/3) truncated
(-5/3) truncated
```

roundedAsFloat
for protocol compatibility with floats;
returns the receiver rounded to the nearest integer as a float

truncatedAsFloat
return the receiver truncated towards zero as a long float.
This is much like #truncated, but avoids a (possibly expensive) conversion
of the result to an integer.
It may be useful, if the result is to be further used in another
float-operation.

Usage example(s):

 `````` 0.4 asLongFloat truncatedAsFloat ``````

truncatedToPrecision
truncates to the precision of the float.
This is slightly different from truncated.
Taking for example 1e32,
the printed representation will be 1e32,
but the actual value, when truncating to an integer
would be 100000003318135351409612647563264.

This is due to the inaccuracy in the least significant bits,
and the way the print-converter compensates for this.
This method tries to generate an integer value which corresponds
to what is seen in the float's printString.

Here, a slow fallback (generating and rescanning the printString)
is provided, which should work on any float number.
Specialized versions in subclasses may be added for more performance
(however, this is probably only used rarely)

Usage example(s):

```
1e32 asShortFloat truncated
1e32 asShortFloat truncatedToPrecision
1.234e10 asShortFloat truncatedToPrecision
1234e-1 asShortFloat truncatedToPrecision

1e32 truncated
1e32 truncatedToPrecision
1.234e10 truncatedToPrecision
1234e-1 truncatedToPrecision

1e32 asLongFloat truncated
1e32 asLongFloat truncatedToPrecision
1.234e10 asLongFloat truncatedToPrecision
1234e-1 asLongFloat truncatedToPrecision
```
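The slow fallback described above (generate the printed form, then rescan it) can be sketched in Python for double precision. This is only an analogue: it uses Python's shortest round-trip `repr` rather than ST/X's print conversion, and the function name `truncated_to_precision` is borrowed for illustration.

```python
from decimal import Decimal

def truncated_to_precision(x: float) -> int:
    """Truncate the *printed* value of x towards zero.

    Sketch of the print-and-rescan idea: repr(x) gives the decimal text,
    Decimal parses that text exactly, and int() truncates towards zero.
    """
    return int(Decimal(repr(x)))

plain = int(1e32)                        # exact value of the stored double
printed = truncated_to_precision(1e32)   # value as seen in the printed form

print(plain == 10 ** 32)      # False: the double nearest to 1e32 is not 10**32
print(printed == 10 ** 32)    # True
print(truncated_to_precision(1234e-1))   # 123
```

The two results differ only when the stored binary value is not the integer its printed form suggests; for a value such as 1.234e10, which is exactly representable, both give 12340000000.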

visiting
acceptVisitor: aVisitor with: aParameter
dispatch for visitor pattern; send #visitFloat:with: to aVisitor.

ST/X 7.7.0.0; WebServer 1.702 at 20f6060372b9.unknown:8081; Tue, 16 Jul 2024 06:52:39 GMT