Class: TableData
Object
|
+--TableData
- Package: stx:libbasic2
- Category: Collections-Sequenceable-Table
- Version: rev: 1.11 date: 2022/06/30 13:46:07
- user: stefan
- file: TableData.st directory: libbasic2
- module: stx stc-classLibrary: libbasic2
Unfinished: ongoing work to support some algorithms on table data (such as CSV files).
Contains snippets to read and process tabular (CSV) data
and snippets to generate a plot.
The category DWD (Deutscher Wetterdienst / German weather service)
contains specific code to deal with historic weather data.
Plotting requires the R language to be installed (it uses the 'r' command).
[instance variables:]
rowData the actual data
columnNames name per column
columnTypes type per column (default is String)
tableName name - only used for labeling graphs
[class variables:]
copyright
COPYRIGHT (c) 2018 by eXept Software AG
All Rights Reserved
This software is furnished under a license and may be used
only in accordance with the terms of that license and with the
inclusion of the above copyright notice. This software may not
be provided or otherwise made available to, or used by, any
other person. No title to or ownership of the software is
hereby transferred.
instance creation
-
fromFile: filename
-
self fromFile:'/Users/cg/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt'
-
rows: rowData columnNames: names
-
-
rows: rowData columnNames: names tableName: tableName
-
accessing
-
getColumn: index
-
return a column (by index) as a vector
Usage example(s):
self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
getColumn:1.
|
-
getColumnNamed: name
-
return a column (by name) as a vector
Usage example(s):
self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
getColumnNamed:'TXK'.
|
-
getColumns: indexCollection
-
return multiple columns (by index vector) as a vector of columns
Usage example(s):
self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
getColumns:#(1 2 3).
|
-
getColumnsNamed: names
-
return multiple columns (by name vector) as a vector of columns
Usage example(s):
self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
getColumnsNamed:#('MESS_DATUM' 'TXK' 'TNK').
|
-
tableName: aString
-
set the table's name
-
tableNamePrefix: aString
-
prepend a prefix to the table's name
-
tableNameSuffix: aString
-
append a suffix to the table's name
analysis
-
addBincoSlidingMean3ForColumnNamed: colName
-
add a column with the sliding binco (binomial-coefficient weighted) mean,
using the weights 1/4, 1/2, 1/4.
As binco:3 is quite common,
this is a tuned version of addBincoSlidingMean:3 forColumnNamed:colName
Usage example(s):
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
meanPerYearOfColumnNamed:'TXK')
addBincoSlidingMean3ForColumnNamed:'Mean_Per_Year_TXK')
tableName:'Feldberg/Schwarzwald';
plot:{ #x -> 'MESS_DATUM' . #y -> { 'Mean_Per_Year_TXK' . 'Sliding_Binco_Mean_of_Mean_Per_Year_TXK' . }}.
|
-
addBincoSlidingMean: n forColumnNamed: colName
-
add a column with the sliding binco mean over a window of n values
(the weights fall off symmetrically from the center; for n = 3 they are 1/4, 1/2, 1/4)
Usage example(s):
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
meanPerYearOfColumnNamed:'TXK')
addBincoSlidingMean:5 forColumnNamed:'Mean_Per_Year_TXK')
tableName:'Feldberg/Schwarzwald';
plot:{ #x -> 'MESS_DATUM' .
#y -> { 'Mean_Per_Year_TXK' . 'Sliding_Binco_Mean_of_Mean_Per_Year_TXK' . }}.
|
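The weighting described above can be sketched outside the class. This is a standalone Python illustration, not the Smalltalk code of TableData. It assumes that "binco" stands for binomial coefficients, that a window of n values uses the weights C(n-1, k) / 2^(n-1) (which gives 1/4, 1/2, 1/4 for n = 3), and that positions without a full window stay empty. The names binco_weights and binco_sliding_mean are hypothetical.

```python
from math import comb


def binco_weights(n):
    """Return n binomial weights C(n-1, k) / 2**(n-1); they sum to 1."""
    total = 2 ** (n - 1)
    return [comb(n - 1, k) / total for k in range(n)]


def binco_sliding_mean(values, n):
    """Centered weighted mean over a window of n values (n must be odd).

    Positions without a full window yield None. This edge handling is an
    assumption; the class may treat the edges differently.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd for a centered window")
    half = n // 2
    weights = binco_weights(n)
    result = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half:i + half + 1]
        result[i] = sum(w * v for w, v in zip(weights, window))
    return result


print(binco_weights(3))
print(binco_sliding_mean([1, 2, 3, 4, 5], 3))
```

Because the window is symmetric around the current row, a peak in the input stays at the same row in the smoothed column.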
-
addSlidingMean: n forColumnNamed: colName
-
add a column with the sliding mean over n values.
The sliding mean looks smoother, but may introduce lag (phase shift),
which binco avoids.
Usage example(s):
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
meanPerYearOfColumnNamed:'TXK')
addSlidingMean:11 forColumnNamed:'Mean_Per_Year_TXK')
tableName:'Feldberg/Schwarzwald';
plot:{ #x -> 'MESS_DATUM' . #y -> { 'Mean_Per_Year_TXK' . 'Sliding_Mean_of_Mean_Per_Year_TXK' . }}.
|
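The lag mentioned above can be made visible with a small standalone Python sketch. It is not the class's implementation: whether addSlidingMean:forColumnNamed: uses a trailing or a centered window is not stated here, so the sketch assumes the lag stems from a trailing window and shows a centered window for comparison. Both function names are hypothetical.

```python
def trailing_mean(values, n):
    """Mean of the current value and the n-1 values before it."""
    result = [None] * len(values)
    for i in range(n - 1, len(values)):
        result[i] = sum(values[i - n + 1:i + 1]) / n
    return result


def centered_mean(values, n):
    """Mean of the current value and (n-1)/2 neighbours on each side (n odd)."""
    half = n // 2
    result = [None] * len(values)
    for i in range(half, len(values) - half):
        result[i] = sum(values[i - half:i + half + 1]) / n
    return result


ramp = [0, 1, 2, 3, 4, 5, 6]
print(trailing_mean(ramp, 3))   # values trail the ramp by one step
print(centered_mean(ramp, 3))   # values follow the ramp without delay
```

On a steadily rising series the trailing window reports a value that the input already passed (n-1)/2 rows earlier, which is the phase shift.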
-
extractRowsWhere: filterBlock
-
return a new table containing only rows for which filterBlock evaluates to true
Usage example(s):
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
meanPerMonthOfColumnNamed:'TXK')
extractRowsWhere:[:row | (row at:1) startsWith:'1945'])
plot
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
meanPerMonthOfColumnNamed:'TXK')
extractRowsWhere:[:row | ((row at:1) from:5 to:6) = '01'])
plot
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
meanPerMonthOfColumnNamed:'TXK')
extractMonth:8)
plot
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
meanPerMonthOfColumnNamed:'TXK')
extractMonth:9)
plot
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
meanPerMonthOfColumnNamed:'TXK')
extractMonth:10)
plot
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19530101_20171231_04931.txt')
meanPerMonthOfColumnNamed:'TXK')
extractMonth:10)
tableNameSuffix:'-Echterdingen';
plot
|
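The two filter blocks in the first examples can be restated as a standalone Python sketch (not the class's code). It assumes each row is a list whose first element is a 'yyyymm' date string, as produced by the per-month mean in those examples; the function name extract_rows_where is hypothetical. Note that Smalltalk's from:5 to:6 is 1-based and inclusive, which corresponds to the Python slice [4:6].

```python
def extract_rows_where(rows, predicate):
    """Return a new list with only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


rows = [["194501", 1.5], ["194502", 2.0], ["194601", -0.5]]

# all rows of the year 1945
year_1945 = extract_rows_where(rows, lambda row: row[0].startswith("1945"))

# all January rows: characters 5..6 (1-based) hold the month
januaries = extract_rows_where(rows, lambda row: row[0][4:6] == "01")

print(year_1945)
print(januaries)
```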
analysis - DWD
-
dwd_extractMonth: monthIndex
-
return a new table containing only rows for that month.
This is specific to DWD data
Usage example(s):
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
dwd_meanPerMonthOfColumnNamed:'TXK')
dwd_extractMonth:1 columnName:'MESS_DATUM' format:'%4y%2m')
plot
|
-
dwd_extractMonth: monthIndex columnName: monthColumnName
-
return a new table containing only rows for that month.
This is specific to DWD data
Usage example(s):
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
dwd_meanPerMonthOfColumnNamed:'TXK')
dwd_extractMonth:1 columnName:'MESS_DATUM' format:'%4y%2m')
plot
|
-
dwd_extractMonth: monthIndex columnName: monthColumnName format: dateFormat
-
return a new table containing only rows for that month.
This is specific to DWD data
Usage example(s):
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
dwd_meanPerMonthOfColumnNamed:'TXK')
dwd_extractMonth:1 columnName:'MESS_DATUM' format:'%4y%2m')
plot
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
dwd_meanPerMonthOfColumnNamed:'TXK')
dwd_extractMonth:2 columnName:'MESS_DATUM' format:'%4y%2m')
plot
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
dwd_meanPerMonthOfColumnNamed:'TXK')
dwd_extractMonth:8 columnName:'MESS_DATUM' format:'%4y%2m')
plot
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
dwd_meanPerMonthOfColumnNamed:'TXK')
dwd_extractMonth:9 columnName:'MESS_DATUM' format:'%4y%2m')
plot
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
dwd_meanPerMonthOfColumnNamed:'TXK')
dwd_extractMonth:10 columnName:'MESS_DATUM' format:'%4y%2m')
plot
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19530101_20171231_04931.txt')
dwd_meanPerMonthOfColumnNamed:'TXK')
dwd_extractMonth:10 columnName:'MESS_DATUM' format:'%4y%2m')
tableNameSuffix:'-Echterdingen';
plot
(((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19530101_20171231_04931.txt')
dwd_meanPerMonthOfColumnNamed:'TXK')
dwd_extractMonth:3 columnName:'MESS_DATUM' format:'%4y%2m')
tableNameSuffix:'-Echterdingen';
plot
|table|
table := (self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19530101_20171231_04931.txt').
1 to:12 do:[:month |
((table
dwd_meanPerMonthOfColumnNamed:'TXK')
dwd_extractMonth:month columnName:'MESS_DATUM' format:'%4y%2m')
tableNameSuffix:'-Echterdingen';
plot
]
|
-
dwd_extractMonth: month day: day
-
return a new table containing only rows for that day in month
Usage example(s):
((self fromFile:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
dwd_extractMonth:1 day:1)
tableName:'1st January, Feldberg';
plot:{ #x -> 'MESS_DATUM' . #y -> { 'TXK' . 'TNK' }}
|
-
dwd_extractMonth: month day: day columnName: monthColumnName format: dateFormat
-
return a new table containing only rows for that day in month
Usage example(s):
((self fromFile:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
dwd_extractMonth:1 day:1 columnName:'MESS_DATUM' format:'%4y%2m%2d')
plot:{ #x -> 'MESS_DATUM' . #y -> { 'TXK' . 'TNK' }}
|
-
dwd_meanPerMonthOfColumnNamed: colName
-
return a new table containing the arithmetic mean per month of a column
Usage example(s):
((self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
dwd_meanPerMonthOfColumnNamed:'TXK')
plot
|
-
dwd_meanPerYearOfColumnNamed: colName
-
return a new table containing the arithmetic mean per year of a column
Usage example(s):
((self new
read:'/Users/cg/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
dwd_meanPerYearOfColumnNamed:'TXK')
plot
|
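The per-month and per-year means can be pictured as grouping rows by a prefix of the date string. The following standalone Python sketch is an illustration only, not the class's code: it assumes dates are 'yyyymmdd' strings (as the DWD file names suggest), that the first 6 characters identify the month and the first 4 the year, and that every value is numeric. The function name mean_per_prefix is hypothetical.

```python
from collections import defaultdict


def mean_per_prefix(rows, date_index, value_index, prefix_len):
    """Group rows by the first prefix_len characters of the date column
    and return [key, arithmetic mean] pairs sorted by key."""
    groups = defaultdict(list)
    for row in rows:
        key = str(row[date_index])[:prefix_len]
        groups[key].append(float(row[value_index]))
    return [[key, sum(vals) / len(vals)] for key, vals in sorted(groups.items())]


rows = [
    ["19450101", "1.0"],
    ["19450102", "3.0"],
    ["19450201", "5.0"],
    ["19460101", "7.0"],
]
print(mean_per_prefix(rows, 0, 1, 6))  # one row per month
print(mean_per_prefix(rows, 0, 1, 4))  # one row per year
```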
analysis - masie
-
masie_meanPerMonthOfColumnNamed: colName
-
return a new table containing the arithmetic mean per month of a column
of masie data
Usage example(s):
((self new
readCSV:'/Users/exept/Downloads/masie_4km_allyears_extent_sqkm.csv' separator:$, skip:1)
masie_meanPerMonthOfColumnNamed:'(0) Northern_Hemisphere')
plot
|
helpers
-
indexOfColumnNamed: name
-
find a column index by name
Usage example(s):
self new
read:'/Users/cg/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
indexOfColumnNamed:'MESS_DATUM'.
|
-
indexOfColumnNamed: name ifAbsent: exceptionValue
-
find a column index by name;
if there is no such column, return the value of exceptionValue
Usage example(s):
self new
read:'/Users/cg/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
indexOfColumnNamed:'MESS_DATUM' ifAbsent:[0].
|
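The lookup with a fallback can be sketched in standalone Python (not the class's code). Two things are assumptions: the index is 1-based, as is usual for Smalltalk collections, and the variant without a fallback raises an error for an unknown name, which is not stated here. The function name index_of_column_named is hypothetical.

```python
def index_of_column_named(column_names, name, if_absent=None):
    """Return the 1-based index of name in column_names.

    If the name is missing, return if_absent() when a fallback is given,
    otherwise raise KeyError (assumed behaviour).
    """
    try:
        return column_names.index(name) + 1
    except ValueError:
        if if_absent is None:
            raise KeyError(name)
        return if_absent()


names = ["STATIONS_ID", "MESS_DATUM", "TXK"]
print(index_of_column_named(names, "MESS_DATUM"))
print(index_of_column_named(names, "NO_SUCH_COLUMN", if_absent=lambda: 0))
```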
instance creation
-
rows: rowDataArg columnNames: columnNamesArg
-
-
rows: rowDataArg columnNames: columnNamesArg tableName: tableNameArg
-
plotting
-
plot
-
|data tmax|
data := self fromFile:'/Users/cg/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt'.
tmax := data withColumnsNamed:#('MESS_DATUM' 'TXK' 'TNK').
tmax plot:{ #x -> 'MESS_DATUM' . #y -> { 'TXK' . 'TNK' }}.
-
plot: optionalSpec
-
plot the table via R; optionalSpec names the x column and the y column(s),
e.g. { #x -> 'MESS_DATUM' . #y -> { 'TXK' . 'TNK' } }
-
plot: optionalSpec file: fileName
-
Usage example(s):
|data tmax|
data := self fromFile:'/Users/cg/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt'.
tmax := data withColumnsNamed:#('MESS_DATUM' 'TXK' 'TNK').
tmax tableName:'Feldberg'.
tmax plot:{ #x -> 'MESS_DATUM' . #y -> { 'TXK' . 'TNK' }}.
|
Usage example(s):
|data tmax|
data := self fromFile:'/Users/exept/Desktop/klima/monatswerte_KL_01050_19240101_20181231_hist/produkt_klima_monat_19240101_20181231_01050.txt'.
tmax := data withColumnsNamed:#('MESS_DATUM_BEGINN' 'MO_TT').
tmax tableName:'Feldberg'.
tmax plot:{ #x -> 'MESS_DATUM_BEGINN' . #y -> { 'MO_TT' }}.
|
-
plotFile: fileName
-
|data tmax|
data := self fromFile:'/Users/cg/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt'.
tmax := data withColumnsNamed:#('MESS_DATUM' 'TXK').
tmax tableName:'Test'.
tmax plot.
processing
-
removeColumn: index
-
destructively remove a column
Usage example(s):
self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
removeColumn:1;
inspect.
|
-
removeColumnNamed: name
-
destructively remove a column
Usage example(s):
self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
removeColumnNamed:'STATIONS_ID';
removeColumnNamed:'QN_3';
removeColumnNamed:'FX';
removeColumnNamed:'FM';
removeColumnNamed:'RSK';
removeColumnNamed:'RSKF';
removeColumnNamed:'SDK';
removeColumnNamed:'SHK_TAG';
removeColumnNamed:'NM';
removeColumnNamed:'VPM';
removeColumnNamed:'PM';
removeColumnNamed:'TMK';
removeColumnNamed:'UPM';
removeColumnNamed:'eor';
inspect.
|
-
withColumns: indexCollection
-
return a new TableData instance, containing only the given columns
Usage example(s):
(self fromFile:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
withColumns:#(1 2 3).
|
-
withColumnsNamed: nameCollection
-
return a new TableData instance, containing only the given columns
Usage example(s):
(self fromFile:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
withColumnsNamed:#('MESS_DATUM' 'TXK').
|
-
withoutColumn: index
-
return a new TableData instance, without the given column
Usage example(s):
self new
readCSV:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
withoutColumn:1.
|
-
withoutColumns: indexCollection
-
return a new TableData instance, without the given columns
Usage example(s):
(self fromFile:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt')
withoutColumns:#(1 2 3).
|
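The difference between the destructive removeColumn: family and the copying with.../without... family can be illustrated with a standalone Python sketch (not the class's code). It assumes a table is a list of column names plus a list of rows, and it selects columns by name only. The function names are hypothetical.

```python
def with_columns(column_names, rows, keep):
    """Return (names, rows) restricted to the columns named in keep.
    The inputs are left untouched."""
    indices = [column_names.index(name) for name in keep]
    return list(keep), [[row[i] for i in indices] for row in rows]


def without_columns(column_names, rows, drop):
    """Return (names, rows) with the columns named in drop left out."""
    keep = [name for name in column_names if name not in drop]
    return with_columns(column_names, rows, keep)


names = ["STATIONS_ID", "MESS_DATUM", "TXK"]
rows = [["1346", "19450101", "1.5"], ["1346", "19450102", "2.5"]]

print(with_columns(names, rows, ["MESS_DATUM", "TXK"]))
print(without_columns(names, rows, ["STATIONS_ID"]))
print(names)  # the original table still has all three columns
```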
reading
-
readCSV: filename
-
self new
readCSV:'/Users/cg/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
inspect.
Usage example(s):
self new
readCSV:'/Users/exept/Downloads/masie_4km_allyears_extent_sqkm.csv';
inspect.
|
-
readCSV: filename separator: separatorCharacter
-
self new
readCSV:'/Users/cg/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
inspect.
-
readCSV: filename separator: separatorCharacter skip: numLinesToSkip
-
self new
readCSV:'/Users/cg/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
inspect.
-
readCSV: filename skip: numLinesToSkip
-
self new
readCSV:'/Users/cg/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt';
inspect.
Usage example(s):
self new
readCSV:'/Users/exept/Downloads/masie_4km_allyears_extent_sqkm.csv' skip:1;
inspect.
|
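The roles of the separator and skip arguments can be sketched in standalone Python (not the class's code). It assumes that skip counts lines to discard before the header line, that the first remaining line holds the column names, and that all cells stay strings. The default separator ';' in the sketch is also an assumption. The function name read_csv is hypothetical.

```python
import csv
import io


def read_csv(stream, separator=";", skip=0):
    """Return (column_names, rows) read from a text stream."""
    for _ in range(skip):
        stream.readline()  # drop leading non-data lines
    reader = csv.reader(stream, delimiter=separator)
    column_names = [name.strip() for name in next(reader)]
    rows = [[cell.strip() for cell in row] for row in reader if row]
    return column_names, rows


sample = io.StringIO(
    "# comment line to skip\n"
    "MESS_DATUM, TXK, TNK\n"
    "19450101, 1.5, -3.0\n"
    "19450102, 2.5, -1.0\n"
)
names, rows = read_csv(sample, separator=",", skip=1)
print(names)
print(rows)
```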
-
readDemoCSV
-
writing
-
writeCSVToStream: aStream
-
save myself as CSV onto a stream
Usage example(s):
|data tmax|
data := self fromFile:'~/Documents/klima/data/DWD/daily/kl/historical/produkt_klima_tag_19450101_20171231_01346.txt'.
tmax := data withColumnsNamed:#('MESS_DATUM' 'TXK' 'TNK').
String streamContents:[:s | tmax writeCSVToStream:s]
|
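Writing onto a stream rather than a file name means the same code serves files and in-memory strings, as the String streamContents: example shows. A standalone Python sketch of that idea follows (not the class's code). The ';' separator, the header line with column names, and the function name write_csv_to_stream are assumptions.

```python
import csv
import io


def write_csv_to_stream(column_names, rows, stream, separator=";"):
    """Write a header line and all rows as CSV onto the given text stream."""
    writer = csv.writer(stream, delimiter=separator, lineterminator="\n")
    writer.writerow(column_names)
    writer.writerows(rows)


buffer = io.StringIO()
write_csv_to_stream(["MESS_DATUM", "TXK"], [["19450101", 1.5]], buffer)
print(buffer.getvalue())
```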
self new
readCSV:'/Users/exept/Downloads/masie_4km_allyears_extent_sqkm.csv';
inspect.
((self new
readCSV:'/Users/exept/Downloads/masie_4km_allyears_extent_sqkm.csv' separator:$, skip:1)
masie_meanPerMonthOfColumnNamed:'(0) Northern_Hemisphere')
plot
'/tmp/xxx.csv' asFilename writingFileDo:[:s |
((self new
readCSV:'/Users/exept/Downloads/masie_4km_allyears_extent_sqkm.csv' separator:$, skip:1)
masie_meanPerMonthOfColumnNamed:'(0) Northern_Hemisphere')
writeCSVToStream:s
]
|data tmax|
data := self fromFile:'/Users/exept/Desktop/klima/monatswerte_KL_04927_17920101_19840731_hist/produkt_klima_monat_17920101_19840731_04927.txt'.
tmax := data withColumnsNamed:#('MESS_DATUM_BEGINN' 'MO_TT').
tmax tableName:'Stuttgart 04927'.
tmax plot:{ #x -> 'MESS_DATUM_BEGINN' . #y -> { 'MO_TT' }}.
|data|
data := self fromFile:'/Users/exept/Desktop/klima/monatswerte_KL_04927_17920101_19840731_hist/produkt_klima_monat_17920101_19840731_04927.txt'.
data := data withColumnsNamed:#('MESS_DATUM_BEGINN' 'MO_TT').
data := data dwd_meanPerYearOfColumnNamed:'MO_TT'.
data tableName:'Stuttgart 04927'.
data plot:{ #x -> 'MESS_DATUM' . #y -> { 'Mean_Per_Year_MO_TT' }}.
|data|
data := self fromFile:'/Users/exept/Desktop/klima/monatswerte_KL_05792_19000801_20181231_hist/produkt_klima_monat_19000801_20181231_05792.csv'.
data := data withColumnsNamed:#('MESS_DATUM_BEGINN' 'MO_TT').
data := data dwd_meanPerYearOfColumnNamed:'MO_TT'.
data tableName:'Zugspitze 05792'.
data plot:{ #x -> 'MESS_DATUM' . #y -> { 'Mean_Per_Year_MO_TT' }}.
|