Smalltalk/X Webserver

Documentation of class 'OctaFloat':

Class: OctaFloat

Inheritance
Description
Class protocol
Instance protocol

Inheritance:

   Object
   |
   +--Magnitude
      |
      +--ArithmeticValue
         |
         +--Number
            |
            +--LimitedPrecisionReal
               |
               +--AbstractIEEEFloat
                  |
                  +--OctaFloat

Package: stx:libbasic2

Category: Magnitude-Numbers

Version: rev: 1.70 date: 2024/01/15 08:48:29; user: cg; file: OctaFloat.st directory: libbasic2; module: stx stc-classLibrary: libbasic2

Notice:
Unfinished, ongoing work.
Basic arithmetic should work, but rounding is not yet correct,
which affects some of the series approximations (e.g. the trigonometric functions).
Therefore, for the time being:
please only use them if you need to represent 256bit floats to be exchanged
with the external world (i.e. to get the bit representation).

If you need more precision than IEEE double precision, either use QDoubles,
which are faster and provide almost the same precision as OctaFloats,
or use LargeFloats, which provide arbitrary precision.

OctaFloats represent rational numbers with limited precision
and are mapped to IEEE octuple precision format (256bit),
also called binary256.

Notice that arithmetic is done by software emulation, which is much slower than hardware floats.
Thus only use them if you really need the additional precision;
if not, use Float (which are doubles) or LongFloats, which usually have IEEE extended precision (80bit).

OctaFloats give you definite 256 bit octuple floats,
thus code using octaFloats is guaranteed to be portable from one architecture to another.

Representation:
256bit octuple IEEE floats (32bytes);
237 bit mantissa (236 stored bits plus the hidden bit),
19 bit exponent,
71 decimal digits (approx.)
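
The figures above follow from the binary256 field widths. The following Python sketch (not ST/X code; all names are illustrative) checks that the sign, exponent and stored mantissa bits add up to 256 and derives the approximate decimal precision from the 237 bit significand.

```python
import math

# IEEE 754 binary256 field widths
SIGN_BITS = 1
EXPONENT_BITS = 19
STORED_MANTISSA_BITS = 236                   # the hidden bit is not stored
PRECISION_BITS = STORED_MANTISSA_BITS + 1    # 237, counting the hidden bit

total_bits = SIGN_BITS + EXPONENT_BITS + STORED_MANTISSA_BITS
decimal_digits = PRECISION_BITS * math.log10(2)

print(total_bits)                # 256
print(total_bits // 8)           # 32 (bytes)
print(round(decimal_digits, 1))  # 71.3
```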

Mixed mode arithmetic:
octaFloat op anyFloat -> octaFloat
anyFloat op octaFloat -> octaFloat

Range and precision of storage formats: see LimitedPrecisionReal >> documentation

[aliases:]
Float256

This software is furnished under a license and may be used
only in accordance with the terms of that license and with the
inclusion of the above copyright notice. This software may not
be provided or otherwise made available to, or used by, any
other person. No title to or ownership of the software is
hereby transferred.

Class protocol:

class initialization

initialize: an alias

coercing & converting

coerce: aNumber: convert the argument aNumber into an instance of the receiver (class) and return it.
generality: return the generality value - see ArithmeticValue>>retry:coercing:

constants

NaN

return an octaFloat which represents not-a-Number (i.e. an invalid number)

Usage example(s):

     NaN := nil.
     self NaN

e

return the constant e as octaFloat

eDigits has enough digits for 256bit IEEE octuple floats.
Do not use as a literal constant here - we cannot depend on the underlying C-compiler here...

Usage example(s):

     E := nil.
     OctaFloat e

Modified (comment): / 27-10-2021 / 11:54:11 / cg

halfPi

return the constant pi/2 as octaFloat

halfPiDigits has enough digits for 256bit IEEE octuple floats.

infinity

return an octaFloat which represents +INF

Usage example(s):

     PositiveInfinity := nil.
     self infinity

ln10

return the constant ln(10) (the natural logarithm of 10) as an octaFloat.

ln10Digits has enough digits for 256bit IEEE octuple floats.

Usage example(s):

     Ln10 := nil.
     OctaFloat ln10

ln2

return the constant ln(2) as octaFloat

ln2Digits has enough digits for 256bit IEEE octuple floats.

Usage example(s):

     Ln2 := nil.
     self ln2

negativeInfinity

return an octaFloat which represents -INF

Usage example(s):

     NegativeInfinity := nil.
     self negativeInfinity

phi

return the constant phi as octaFloat

phiDigits has enough digits for 256bit IEEE octuple floats.

Usage example(s):

     Phi := nil.
     self phi

pi

return the constant pi as octaFloat

piDigits has enough digits for 256bit IEEE octuple floats.
Do not use as a literal constant here - we cannot depend on the underlying C-compiler here...

Usage example(s):

     Pi := nil.
     self pi

sqrt2

return the constant sqrt(2) as OctaFloat

sqrt2Digits has enough digits for 256bit IEEE octuple floats.

Usage example(s):

     LongFloat sqrt2 -> 1.414213562373095049
     QuadFloat sqrt2 -> 1.4142135623730936799802295816596154
     OctaFloat sqrt2 -> 1.41421356237309504880168872420969807856967187537694807317667973799073249

sqrt3

return the constant sqrt(3) as OctaFloat

sqrt3Digits has enough digits for 256bit IEEE octuple floats.

Usage example(s):

     LongFloat sqrt3 -> 1.732050807568877294
     QDouble sqrt3   -> 1.73205080756888
     QuadFloat sqrt3 -> 1.7320508075688772935274463415058723
     OctaFloat sqrt3 -> 1.73205080756887729352744634150587236694280525381038062805580697945193301

unity

return the neutral element for multiplication (1.0) as OctaFloat

Usage example(s):

     OctaFloatOne := nil.
     self unity

zero

return the neutral element for addition (0.0) as OctaFloat

Usage example(s):

     OctaFloatZero := nil.
     self zero

error reporting

errorUnsupported: you may proceed from this error, to get a long float number result
(of course, with less than expected precision)

instance creation

basicNew

return a new octaFloat - here we return 0.0
- OctaFloats are usually NOT created this way ...
It's implemented here to allow things like binary store & load of octaFloats.
(but it is not a good idea to store the bits of a float - the reader might have a
totally different representation - so floats should be
binary stored in a device independent format).

basicNew: size

(comment from inherited method)
return an instance of myself with anInteger indexed variables.
If the receiver-class has no indexed instvars, this is only allowed
if the argument, anInteger is zero.
** Do not redefine this method in any class **

fromFloat: aFloat

return a new octaFloat, given a float value

Usage example(s):

     OctaFloat fromFloat:123.0
     123.0 asOctaFloat
     123 asOctaFloat

fromInteger: anInteger

return a new octaFloat, given an integer value

Usage example(s):

     self fromInteger:1
     self fromInteger:-1
     self fromInteger:2
     self fromInteger:1024 * 1024 * 1024 * 1024 * 1024 * 1024
     self fromInteger:1e20 asInteger
     self fromInteger:1e100 asInteger
     self fromInteger:2r1010101010101010101010101010101
     self fromInteger:2r1010101010101010101010101010101010101010101010101010101010101010
     self fromInteger:(2 raisedTo:10000)
     1 asIEEEFloat

Usage example(s):

     OctaFloat fromInteger:123
     123 asOctaFloat

fromLongFloat: aLongFloat

return a new octaFloat, given a long float value

Usage example(s):

     OctaFloat fromLongFloat:123.0 asLongFloat

fromShortFloat: aShortFloat

return a new octaFloat, given a float32 value

Usage example(s):

     OctaFloat fromShortFloat:123.0 asShortFloat

new: size

(comment from inherited method)
catch this message - not allowed for floats/doubles

queries

defaultExponentSizeForByteSize: nBytes

(comment from inherited method)
self defaultExponentSizeForByteSize:2   -> 5
self defaultExponentSizeForByteSize:4   -> 8
self defaultExponentSizeForByteSize:5   -> 8
self defaultExponentSizeForByteSize:8   -> 11
self defaultExponentSizeForByteSize:10  -> 15
self defaultExponentSizeForByteSize:16  -> 15
self defaultExponentSizeForByteSize:32  -> 19
self defaultExponentSizeForByteSize:64  -> 32

defaultPrintPrecision

the default number of digits when printing

defaultPrintfPrecision

the default number of digits when printing with printf's %f format.
Notice, that the C-language standard states that this should be 6;
however, we can adjust it on a per-class basis.

epsilon

return the maximum relative spacing of instances of mySelf
(i.e. the value-delta of the least significant bit)
according to ISO C standard;
Ada, C, C++ and Python language constants;
Mathematica, MATLAB and Octave; and various textbooks
see https://en.wikipedia.org/wiki/Machine_epsilon

Usage example(s):

     self epsilon
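
Under the definition cited above, epsilon is the gap between 1.0 and the next larger representable value, i.e. 2 raised to minus the number of stored mantissa bits. The Python sketch below models only that arithmetic; the helper name is hypothetical, and whether `OctaFloat epsilon` answers exactly this value is an assumption drawn from the description, not something verified against ST/X.

```python
import sys
from fractions import Fraction

def machine_epsilon(stored_mantissa_bits):
    """Gap between 1.0 and the next larger value: 2 ** -stored_mantissa_bits."""
    return Fraction(1, 2 ** stored_mantissa_bits)

eps_double = machine_epsilon(52)    # IEEE double: 52 stored mantissa bits
eps_octa = machine_epsilon(236)     # binary256: 236 stored mantissa bits

# For doubles this agrees with the constant Python exposes.
print(float(eps_double) == sys.float_info.epsilon)   # True
print(eps_octa == Fraction(1, 2 ** 236))             # True
```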

exponentCharacter

return the character used to print between mantissa and exponent.
Also used by the scanner when reading numbers.

isSupported

numBitsInExponent

answer the number of bits in the exponent.
This is a 256bit octuple float, where 19 bits are available in the exponent:
seeeeeee eeeeeeee eeeemmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...

Usage example(s):

     1.0 class numBitsInExponent -> 11
     1.0 asShortFloat class numBitsInExponent -> 8
     1.0 asLongFloat class numBitsInExponent -> 15
     1.0 asQuadFloat class numBitsInExponent -> 15
     1.0 asOctaFloat class numBitsInExponent -> 19

numBitsInMantissa

answer the number of bits in the mantissa (the significand).
The hidden bit is not counted here.
This is a 256bit octafloat,
where 236 bits are available in the mantissa:
seeeeeee eeeeeeee eeeemmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm mmmmmmmm...

Usage example(s):

     1.0 class numBitsInMantissa
     1.0 asShortFloat class numBitsInMantissa
     1.0 asLongFloat class numBitsInMantissa
     1.0 asQuadFloat class numBitsInMantissa
     1.0 asOctaFloat class numBitsInMantissa

radix

answer the radix of an OctaFloat's exponent.
This is an IEEE float, which is represented in binary (radix 2).

Instance protocol:

arithmetic

* aNumber

return the product of the receiver and the argument.

+ aNumber

return the sum of the receiver and the argument, aNumber

- aNumber

return the difference of the receiver and the argument, aNumber

/ aNumber

return the quotient of the receiver and the argument, aNumber

abs

return the absolute value of the receiver
reimplemented here for speed

Usage example(s):

     1.0 asOctaFloat       -> 1.0
     1.0 asOctaFloat abs   -> 1.0
     -1.0 asOctaFloat abs  -> 1.0

negated

return the receiver negated

Usage example(s):

     1.0 asOctaFloat
     1.0 asOctaFloat negated
     -1.0 asOctaFloat negated

rem: aNumber

return the floating point remainder of the receiver and the argument, aNumber

coercing & converting

asFloat

return a Float (i.e. an IEEE double) with same value as the receiver.
Does NOT raise an error if the receiver exceeds the float range or is non-finite.
Returns infinity if the receiver exceeds the float range.

Usage example(s):

     1.0 asOctaFloat asFloat

asOctaFloat

1.0 asOctaFloat asOctaFloat

generality

return the generality value - see ArithmeticValue>>retry:coercing:

comparing

< aNumber

return true, if the argument is greater than the receiver

= aNumber

return true, if the argument represents the same numeric value
as the receiver, false otherwise

hash

return a number for hashing; redefined, since floats compare
by numeric value (i.e. 3.0 = 3), therefore 3.0 hash must be the same
as 3 hash.

Usage example(s):

     1.2345 hash
     1.2345 asShortFloat hash
     1.2345 asLongFloat hash
     1.2345 asOctaFloat hash

     1.0 hash
     1.0 asShortFloat hash
     1.0 asLongFloat hash
     1.0 asOctaFloat hash

     0.5 hash
     0.5 asShortFloat hash
     0.5 asLongFloat hash
     0.5 asOctaFloat hash

     0.25 hash
     0.25 asShortFloat hash
     0.25 asLongFloat hash
     0.25 asOctaFloat hash
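
The rule described above (values that compare equal must hash equal, across numeric classes) is not specific to ST/X. As an analogy only, the Python sketch below shows the same contract between Python's own int, float and Fraction types; it says nothing about how OctaFloat computes its hash.

```python
from fractions import Fraction

# Equal numeric values hash equally, whatever their type.
print(3.0 == 3, hash(3.0) == hash(3))                            # True True
print(Fraction(1, 2) == 0.5, hash(Fraction(1, 2)) == hash(0.5))  # True True

# This is what makes mixed-type dictionary lookups work.
lookup = {3: "three"}
print(lookup[3.0])   # three
```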

double dispatching

differenceFromOctaFloat: anOctaFloat: sent when anOctaFloat does not know how to subtract the receiver, self
equalFromOctaFloat: anOctaFloat: sent when anOctaFloat does not know how to compare against the receiver, self
lessFromOctaFloat: anOctaFloat: sent when anOctaFloat does not know how to compare against the receiver, self
productFromOctaFloat: anOctaFloat: sent when anOctaFloat does not know how to multiply the receiver, self
quotientFromOctaFloat: anOctaFloat: sent when anOctaFloat does not know how to divide by the receiver, self
sumFromOctaFloat: anOctaFloat: sent when anOctaFloat does not know how to add the receiver, self

mathematical functions

exp: return e raised to the power of the receiver
ln: return the natural logarithm of the receiver.
log: return log base 10 of the receiver.
Alias for log:10.
log2: return the base 2 logarithm (logarithmus dualis) of the receiver.

printing

printOn: aStream: self commonPrintOn:aStream

private accessing

basicAt: index: return an internal byte of the float.
The value returned here depends on byte order, float representation etc.
Therefore, this method should be used strictly private.

Notice:
the need to redefine this method here is due to the
inability of many machines to store floats in non-double aligned memory.
Therefore, on some machines, the first 4 bytes of a float are left unused,
and the actual float is stored at index 5 .. 12.
To hide this at one place, this method knows about that, and returns
values as if this filler wasn't present.

basicAt: index put: value: set an internal byte of the float.
The value to be stored here depends on byte order, float representation etc.
Therefore, this method should be used strictly private.

Notice:
the need to redefine this method here is due to the
inability of many machines to store floats in non-double aligned memory.
Therefore, on some machines, the first 4 bytes of a float are left unused,
and the actual float is stored at index 5 .. 12.
To hide this at one place, this method knows about that, and returns
values as if this filler wasn't present.

basicSize: return the size in bytes of the float.

Notice:
the need to redefine this method here is due to the
inability of many machines to store floats in non-double aligned memory.
Therefore, on some machines, the first 4 bytes of a float are left unused,
and the actual float is stored at index 5 .. 12.
To hide this at one place, this method knows about that, and returns
values as if this filler wasn't present.

exponentSize: numBitsInExponent: I have a hard-coded exponentSize;
verify that instances are created correctly here

queries

eBias

Answer the exponent's bias;
that is the offset of the zero exponent when stored

Usage example(s):

     1.0 asOctaFloat eBias  -> 262143
     1.0 asQuadFloat eBias  -> 16383
     1.0 eBias              -> 1023
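
The bias values above follow the usual IEEE 754 rule: for an exponent field of w bits the bias is 2^(w-1) - 1. A short Python sketch (the function name is illustrative, not an ST/X selector) reproduces the three documented results from the exponent widths 11, 15 and 19:

```python
def exponent_bias(exponent_bits):
    """IEEE 754 exponent bias for an exponent field of the given width."""
    return 2 ** (exponent_bits - 1) - 1

for name, bits in (("Float", 11), ("QuadFloat", 15), ("OctaFloat", 19)):
    print(name, exponent_bias(bits))
# Float 1023
# QuadFloat 16383
# OctaFloat 262143
```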

emax

The largest exponent value allowed by instances like me.
This is also implemented on the instance side,
because of IEEEFloat, which has instance-specific representation.

Usage example(s):

     Float emax       -> 1023
     ShortFloat emax  -> 127
     LongFloat emax   -> 16383
     QuadFloat emax   -> 16383
     OctaFloat emax   -> 262143
     QDouble emax     -> 1023

emin

The smallest exponent value allowed by (normalized) instances of this class.
This is also implemented on the instance side,
because of IEEEFloat, which has instance-specific representation.

Usage example(s):

     Float emin
     OctaFloat emin
     OctaFloat emax

exponent

extract a normalized float's (unbiased) exponent.
The returned value depends on the float-representation of
the underlying machine and is therefore highly unportable.
This is not for general use.
This assumes that the mantissa is normalized to
0.5 .. 1.0 and the float's value is: mantissa * 2^exp

Usage example(s):

self eBias
    QuadFloat eBias => 16383
    OctaFloat eBias => 262143
     1.0 exponent                    => 1
     1.0 asOctaFloat exponent        => 1
     2.0 exponent                    => 2
     2.0 asOctaFloat exponent        => 2

     3.0 exponent                    => 2
     3.0 asOctaFloat exponent        => 2
     3.0 mantissa                    => 0.75
     3.0 asOctaFloat mantissa        => 0.75
     3.0 mantissa * (2 raisedTo:3.0 exponent) => 3.0
     3.0 asOctaFloat mantissa * (2 raisedTo:3.0 asOctaFloat exponent) => 3.0
     
     4.0 exponent                    => 3
     4.0 asOctaFloat exponent        => 3
     0.5 exponent                    => 0
     0.5 asOctaFloat exponent        => 0
     0.4 exponent                    => -1
     0.4 asOctaFloat exponent        => -1
     0.25 exponent                   => -1
     0.25 asOctaFloat exponent       => -1
     0.2 exponent                    => -2
     0.2 asOctaFloat exponent        => -2
     0.00000011111 exponent          => -23
     0.00000011111 asOctaFloat exponent => -23
     0.0 exponent                    => 0
     0.0 asOctaFloat exponent        => 0

     0.0 nextFloat               => 4.94065645841247e-324
     0.0 asOctaFloat nextFloat   wrong!
     0.0 nextFloat exponent              => -1073
     0.0 asOctaFloat nextFloat exponent  => -262377

     1e1000 exponent                -> error (INF)
     1Q1000 exponent                -> 3322
     OctaFloat fmax exponent        -> 262144
     OctaFloat fmin exponent        -> -262141
     OctaFloat NaN exponent         -> error
     OctaFloat infinity exponent    -> error
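
The normalization used here (mantissa in 0.5 .. 1.0, value = mantissa * 2^exponent) is the same convention as C's frexp. As a cross-check for ordinary doubles only, the Python sketch below uses `math.frexp`, which yields the same exponents as the Float examples listed above; it does not exercise OctaFloat itself.

```python
import math

for x in (1.0, 2.0, 3.0, 4.0, 0.5, 0.4, 0.25, 0.2, 0.0):
    mantissa, exponent = math.frexp(x)   # x == mantissa * 2**exponent
    # mantissa is 0.0 for zero, otherwise 0.5 <= mantissa < 1.0
    assert math.ldexp(mantissa, exponent) == x
    print(x, mantissa, exponent)

print(math.frexp(3.0))   # (0.75, 2)
```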

exponentBits

return the bits of my exponent.
These might be biased.

Usage example(s):

     1.0 exponentBits  -> 1023
     -1.0 exponentBits -> 1023
     10.0 exponentBits  -> 1026
     0.125 exponentBits -> 1020
     0.1 exponentBits   -> 1019

     1.0 asQuadFloat exponentBits  -> 16383
     -1.0 asQuadFloat exponentBits -> 16383
     10.0 asQuadFloat exponentBits  -> 16386
     0.125 asQuadFloat exponentBits -> 16380
     0.1 asQuadFloat exponentBits   -> 16379

     1.0 asOctaFloat exponentBits  -> 262143
     -1.0 asOctaFloat exponentBits -> 262143
     10.0 asOctaFloat exponentBits  -> 262146
     0.125 asOctaFloat exponentBits -> 262140
     0.1 asOctaFloat exponentBits   -> 262139
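
The three groups of results above differ only by the bias of the respective format. A minimal Python model, assuming the value is nonzero and normal in the target format (the helper name is hypothetical), derives the stored exponent field from the value and the exponent width:

```python
import math

def biased_exponent(x, exponent_bits):
    """Stored exponent field of a nonzero, normal value x in a binary
    IEEE format whose exponent field is exponent_bits wide."""
    bias = 2 ** (exponent_bits - 1) - 1
    _, e = math.frexp(abs(x))   # abs(x) == m * 2**e with 0.5 <= m < 1
    return (e - 1) + bias       # IEEE significand lies in [1, 2), hence e - 1

for x in (1.0, -1.0, 10.0, 0.125, 0.1):
    print(x, biased_exponent(x, 11), biased_exponent(x, 15), biased_exponent(x, 19))
# 1.0 1023 16383 262143
# -1.0 1023 16383 262143
# 10.0 1026 16386 262146
# 0.125 1020 16380 262140
# 0.1 1019 16379 262139
```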

isFinite

return true, if the receiver is a finite float (not NaN and not +/-INF)

Usage example(s):

     1.0 asOctaFloat isFinite            true
     OctaFloat fmin isFinite             true
     OctaFloat fmax isFinite             true
     self NaN isFinite                   false
     self infinity isFinite              false
     self negativeInfinity isFinite      false
     (0.0 uncheckedDivide: 0.0) isFinite false
     (1.0 uncheckedDivide: 0.0) isFinite false

isInfinite

return true, if the receiver is an infinite float (+Inf or -Inf).

Usage example(s):

     1.0 asOctaFloat isInfinite            false
     self NaN isInfinite                   false
     self infinity isInfinite              true
     self negativeInfinity isInfinite      true
     (0.0 uncheckedDivide: 0.0) isInfinite false
     (1.0 uncheckedDivide: 0.0) isInfinite true

isNaN

return true, if the receiver is an invalid float (NaN - not a number).
These are usually not created by ST/X float operations (they raise an exception);
however, inline C-code or proceeded exceptions or reading from a stream
could produce them.

Usage example(s):

     OctaFloat NaN isNaN              true
     self NaN isNaN                   true
     self infinity isNaN              false
     self negativeInfinity isNaN      false
     (0.0 uncheckedDivide: 0.0) isNaN true
     (1.0 uncheckedDivide: 0.0) isNaN false

isZero

return true, if the receiver is zero

Usage example(s):

     0 asOctaFloat isZero
     0 asOctaFloat negated isZero
     1 asOctaFloat isZero

mantissa

extract a normalized float's mantissa (as OctaFloat).
That is a float of the same type as the receiver,
such that:
(f mantissa) * (2 ^ f exponent) = f
The returned value depends on the float-representation of
the underlying machine and is therefore highly unportable.
This is not for general use.
This assumes that the mantissa is normalized to 0.5 .. 1.0

Usage example(s):

     1.0 exponent              -> 1
     1.0 asOctaFloat exponent  -> 1
     1.0 mantissa              -> 0.5
     1.0 asOctaFloat mantissa

     0.25 exponent
     0.25 asOctaFloat exponent
     0.25 mantissa
     0.25 asOctaFloat mantissa

     0.00000011111 exponent
     0.00000011111 mantissa

     1e1000 mantissa

testing

isFloat256: Answer whether the receiver is a 256bit octuple precision float.
Always true here.
isOctaFloat: return true, if the receiver is some kind of octuple floating point number (IEEE octuple precision)

trigonometric

cos: return the cosine of the receiver (interpreted as radians)
sin: return the sine of the receiver (interpreted as radians)
tan: return the tangent of the receiver (interpreted as radians)

trigonometric - hyperbolic

cosh: return the hyperbolic cosine of the receiver (interpreted as radians)
sinh: return the hyperbolic sine of the receiver (interpreted as radians)
tanh: return the hyperbolic tangent of the receiver (interpreted as radians)

truncation & rounding

ceiling

return the smallest integer which is greater than or equal to the receiver.

Usage example(s):

     0.5 asOctaFloat ceiling
     0.5 asOctaFloat ceilingAsFloat
     -0.5 asOctaFloat ceiling
     -0.5 asOctaFloat ceilingAsFloat

ceilingAsFloat

return the smallest integer-valued float greater than or equal to the receiver.
This is much like #ceiling, but avoids a (possibly expensive) conversion
of the result to an integer.
It may be useful, if the result is to be further used in another float-operation.

floor

return the integer nearest the receiver towards negative infinity.

Usage example(s):

     0.5 asOctaFloat floor
     0.5 asOctaFloat floorAsFloat
     -0.5 asOctaFloat floor
     -0.5 asOctaFloat floorAsFloat

floorAsFloat

return the integer nearest the receiver towards negative infinity as a float.
This is much like #floor, but avoids a (possibly expensive) conversion
of the result to an integer.
It may be useful, if the result is to be further used in another float-operation.

ST/X 7.7.0.0; WebServer 1.702 at 20f6060372b9.unknown:8081; Sat, 16 Aug 2025 10:57:06 GMT
