Documentation of class 'EncodedStream':



A stream which transparently encodes/decodes to/from an external encoding.

Use 'stream:encoding:', passing the name of the encoding (e.g. 'UTF8'),
or 'stream:encoder:', passing an encoder instance.

For automatic detection of the encoding, use 'decodedStreamFor:aStream',
which looks for '{ Encoding: xxx' near the beginning of the file.
This is especially targeted towards reading ST/X source files.


Class protocol:

o  on: aStream encodedBy: aStreamEncoder

instance creation
o  stream: streamArg encoder: encoder
s := EncodedStream stream:Transcript encoder:(CharacterEncoder encoderToEncodeFrom:#utf8 into:#unicode).
s nextPutAll:('öäü' utf8Encoded)

s := EncodedStream stream:('öäü' readStream) encoder:(CharacterEncoder encoderToEncodeFrom:#utf8 into:#unicode).
s next:3

o  stream: streamArg encoding: encodingSymbol
|baseStream s|
baseStream := '' readWriteStream.
s := EncodedStream stream:baseStream encoding:#utf8.
s nextPutAll:'öäü'.
baseStream reset; contents.

s contents

o  decodedStreamFor: aStream
given a positionable stream, guess its encoding (by reading the
first few lines, looking for a string with an encoding hint)
and return an appropriate encoded stream, which does the decoding
on the fly. Used mostly to read UTF8 files (source code).
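
A minimal sketch of reading a source file with automatic encoding detection. The file name 'MyClass.st' is hypothetical, and opening it via 'asFilename readStream' is an assumption about the surrounding ST/X file API rather than part of this class:

```smalltalk
|in decoded text|

"hypothetical file name; 'asFilename readStream' is assumed to answer a positionable stream"
in := 'MyClass.st' asFilename readStream.

"the encoding is guessed from a '{ Encoding: xxx' hint near the beginning of the file"
decoded := EncodedStream decodedStreamFor:in.

"characters are decoded on the fly while reading"
text := decoded upToEnd.
in close.
```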

o  encoderFor: anEncodingSymbol
no encoder needed

Instance protocol:

o  on: aStream encodedBy: aStreamEncoder
Initialize the receiver on aStream with aStreamEncoder.

o  encoder

o  encoder: aCharacterEncoder

o  encoding
(comment from inherited method)
for compatibility with encoded stream

o  inputStream
(comment from inherited method)
return the receiver.
for compatibility with filtering streams

o  lineNumber
the line number doesn't change when characters are decoded

o  pathName
if our base stream has a pathname, delegate...

o  readStream
read from self

o  stream

o  stream: aStream
Modified (format): / 28-05-2023 / 08:11:41 / cg

o  stream: aStream encoder: aCharacterEncoder

chunk input/output
o  nextChunk
reads a Smalltalk chunk.
As a side effect, checks for an encoding chunk.

obsolete positioning
o  position0Based
to be obsoleted - use position

** This is an obsolete interface - do not use it (it may vanish in future versions) **

o  position1Based
to be obsoleted - use position

** This is an obsolete interface - do not use it (it may vanish in future versions) **

o  contentsSpecies
(comment from inherited method)
this should return the class of which an instance is
returned by the #contents method. Here, Array is returned,
since the abstract Stream-class has no idea of the underlying
collection class.
It is redefined in some subclasses - for example, to return String.

o  isEncoderFor: encodingString

o  next
(comment from inherited method)
return the next element of the stream
- we do not know here how to do it, it must be redefined in subclass

stream protocol
o  atEnd
(comment from inherited method)
return true if the end of the stream has been reached;
- we do not know here how to do it, it must be redefined in subclass

o  close
(comment from inherited method)
close the stream - nothing done here.
Added for compatibility with external streams.

o  collection
return the underlying container; nil, if there is none (e.g. external streams).
Here we return nil, as the underlying collection (if any) is useless to the outside world

o  contents
caveat: this is ok when the receiver is used as a read stream;
it is an open question what to return when the collected data
of a write stream is to be fetched.

o  emphasis: anObject
(comment from inherited method)
ignored here
- allows Streams to be used interchangeably with text streams

o  flush
(comment from inherited method)
write out all buffered data - ignored here, but added
to make internalStreams protocol compatible with externalStreams

o  isEmpty
(comment from inherited method)
return true, if the contents of the stream is empty

o  next: nCharactersToRead
(comment from inherited method)
return the next count elements of the stream as aCollection,
which depends on the stream's type - (see #contentsSpecies).

o  nextPut: aCharacter
write the argument, aCharacter.
Answer aCharacter

o  nextPutAll: aCollection
Write each of the objects in aCollection to the receiver stream.
Answer the receiver.

o  nextPutAll: aCollection startingAt: start to: stop
append the elements from index start to index stop
of the argument, aCollection, onto the receiver (i.e. both outstreams)

o  nextPutAllUnicode: aCollection
Write each of the objects in aCollection to the receiver stream.
Answer the receiver.

This is sent by UnicodeString. Redefined, to avoid duplicate UTF encoding

o  nextPutUtf8: aCharacter
write the argument, aCharacter.
Answer aCharacter.

This is sent by UnicodeString. Redefined, to avoid duplicate UTF encoding

o  peek
(comment from inherited method)
return the next element of the stream without advancing (i.e.
the following send of next will return this element again.)
- we do not know here how to do it, it must be redefined in subclass

o  position
only use #position/#position: to restore a previous position.
Computing relative positions does not work!

Usage example(s):

#position: nils peekChar - make sure, that it positions before peekChar

o  position0Based: newPosition
to be obsoleted - use position

** This is an obsolete interface - do not use it (it may vanish in future versions) **

o  position1Based: newPosition
to be obsoleted - use position

** This is an obsolete interface - do not use it (it may vanish in future versions) **

o  position: newPosition
only use #position/#position: to restore a previous position.
Computing relative positions does not work!
Use #skip: to advance forward.
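
A minimal sketch of the positioning rule, assuming a UTF8-encoded in-memory string as the base stream (the sample text is arbitrary). A position is treated as an opaque value which is only saved and restored; moving forward is done with #skip:

```smalltalk
|s saved firstThree|

s := EncodedStream stream:('äöüxyz' utf8Encoded readStream) encoding:#utf8.

saved := s position.        "remember the position as an opaque value"
firstThree := s next:3.     "read three decoded characters"

s position:saved.           "ok: restore a previously fetched position"
s skip:3.                   "ok: advance forward by three characters"

"not ok: 's position:(saved + 3)' - relative positions are not reliable here"
```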

o  reset

o  setToEnd
Modified (comment): / 09-01-2018 / 17:50:27 / stefan

o  size
not always correct, but probably better than 0.
Better use #isEmpty.

Usage example(s):

self error:'size of input is unknown (due to decoding)'
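
Since #size is not always correct for a decoding stream, a reader can instead loop until #atEnd. A minimal sketch, assuming a UTF8-encoded in-memory string as the base stream:

```smalltalk
|s count|

s := EncodedStream stream:('äöü' utf8Encoded readStream) encoding:#utf8.

"count the decoded characters by reading them, instead of asking for the size"
count := 0.
[s atEnd] whileFalse:[
    s next.
    count := count + 1.
].
```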

o  skip: nrToSkip
can only skip forward
returns the receiver

o  sync
(comment from inherited method)
make sure, that the OS writes cached data to the disk
- ignored here, but added to make internalStreams protocol compatible with externalStreams

o  syncData
(comment from inherited method)
tell the OS to ensure that data is synced to disk - ignored here, but added
to make internalStreams protocol compatible with externalStreams

o  isEncodedStream
(comment from inherited method)
true, iff this is an encoder/decoder stream

o  isOpen
for compatibility with externalStream:
return true, if this stream is open.

o  isPositionable
(comment from inherited method)
return true, if the stream supports positioning (some do not).
Since this is an abstract class, false is returned here - just to make certain.

o  isReadable
(comment from inherited method)
return true, if reading is supported by the receiver.
This has to be redefined in concrete subclasses.

o  isUnicodeEncoded
return true, if the streamed collection is any string (8-, 16- or 32-bit),
which definitely is not using UTF-x or JIS or any other encoding.
I.e. 'self next' always returns a unicode character.

o  isUnicodeEncoded: aBoolean
do not set - it is determined implicitly by the encoder

o  isWritable
(comment from inherited method)
return true, if writing is supported by the receiver.
This has to be redefined in concrete subclasses.

o  skipEncodingChunk
if this is a valid chunk (i.e. not a comment or encoding-directive),

o  writeByteOrderMark
write the BOM bytes, that mark whether the stream is UTF8, UTF16-LE, UTF16-BE,... encoded.
Must be the first bytes in the stream/file.
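
A minimal sketch of writing a byte order mark, assuming an in-memory write stream as the base stream. The mark is written before any other output:

```smalltalk
|base es|

base := '' writeStream.
es := EncodedStream stream:base encoding:#utf8.

es writeByteOrderMark.      "must come before any other output"
es nextPutAll:'abc'.

"the base stream holds the encoded bytes, which are expected to begin with the BOM"
base contents
```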


|s es utfString decodedString|

s := '' writeStream.
es := EncodedStream stream:s encoding:#UTF8.
es nextPutAll:'abcäöüαβγ∆∇∀∃'.
utfString := es contents.

s := utfString readStream.
es := EncodedStream stream:s encoding:#UTF8.
decodedString := es upToEnd

