The talk was delivered by Ying Zhang at the First International Array Databases Workshop, co-located with the EDBT/ICDT 2011 Joint Conference, on March 25, 2011 in Uppsala, Sweden.
Publication: http://bit.ly/zyQPBq
Abstract:
Scientific applications are still poorly served by contemporary relational database systems. At best, the system provides a bridge towards an external library using user-defined functions, explicit import/export facilities or linked-in Java/C# interpreters. The time has come to rectify this with SciQL, a SQL query language for scientific applications with arrays as first-class citizens. It provides a seamless symbiosis of array, set, and sequence interpretation using a clear separation of the mathematical object from its underlying implementation. A key innovation is to extend value-based grouping in SQL:2003 with structural grouping, i.e., fixed-sized and unbounded groups based on explicit relationships between their dimension attributes. It leads to a generalization of window-based query processing with wide applicability in science domains. This paper is focused on the language features, extensively illustrated with examples of its intended use.
SciQL, A Query Language for Science Applications
1. SciQL
A Query Language for Science Applications
M. Kersten, Y. Zhang, M. Ivanova, N. Nes
CWI Amsterdam
Array Database Workshop
March 25th, 2011
2. Who needs arrays anyway?
Seismology – 1-D time-series, 3-D spatial data
Astronomy – temporal ordered rasters
Climate simulation – temporal ordered grid
Remote sensing – images of 2-D or higher
Genomics – ordered DNA strings
Scientists love arrays:
HDF5, NETCDF, FITS, MSEED, …
but also use:
lists, tables, XML, ...
2011-03-25 Array Database workshop 2
3. Arrays In DBMS
Research issues already in the 80’s
SQL language extension (add notion of order):
RasQL, AQuery, SRQL, ...
SQL:1999, SQL:2003
collection type, C-style arrays
Algebraic frameworks
(S)RAM, AQL, AML, ...
4. Arrays In DBMS
DBMS support
OODB, multi-dimensional DBMS, Sequence DBMS, ...
the Longhorn Array Database
RasDaMan
Array in chunks as BLOB
Array query optimisation on top of DBMS
Known to work up to 12 TBs!
PostgreSQL 8.1
SciDB
Array DBMS from scratch
Overlapping chunks for parallel execution
5. What is the problem with RDBMS?
Appropriate array denotations?
Functionally complete operation set?
Size limitations due to (BLOB) representations?
Existing foreign files?
Scale?
...
6. SciQL
An extension of SQL:2003 (pronounced as ‘cycle’)
Arrays as first-class citizens of the DBMS
Seamless integration of tables and arrays
Named dimensions with constraints
Flexible structure-based grouping
Seismology use case
10. Array Definitions
Fixed array
    CREATE ARRAY A1 (
        x INT DIMENSION[0:4:1],
        y INT DIMENSION[0:4:1],
        v FLOAT DEFAULT 0.0
    );
  y
  3   0.0  0.0  0.0  0.0
  2   0.0  0.0  0.0  0.0
  1   0.0  0.0  0.0  0.0
  0   0.0  0.0  0.0  0.0
      0    1    2    3    x
  (null outside the declared bounds)
Unbounded array (implicit size)
    CREATE ARRAY A2 (
        x INT DIMENSION,
        y INT DIMENSION,
        v FLOAT DEFAULT 0.0
    );
    INSERT INTO A2 VALUES
        (1,0,5.5), (1,1,0.4), (2,2,4.5);
  y
  3   null null null null
  2   null 0.0  4.5  null
  1   null 0.4  0.0  null
  0   null 5.5  0.0  null
      0    1    2    3    x
11. Array Dimensions
    CREATE ARRAY A1 (
        x INT DIMENSION[0:4:1],
        y INT DIMENSION[0:4:1],
        v FLOAT DEFAULT 0.0
    );
    CREATE ARRAY A2 (
        x INT DIMENSION,
        y INT DIMENSION,
        v FLOAT DEFAULT 0.0
    );
Fixed dimensions: [start:final:step]
INT dimension: [size]
Unbounded dimensions: [(start|*) : (final|*) : (step|*)]
Dimension data type: scalar data types
Time series:
    CREATE ARRAY Experiment (
        time TIMESTAMP DIMENSION [TIMESTAMP '2011-03-25' : * : INTERVAL '1' MINUTE],
        data FLOAT
    );
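The dimension notation above can be illustrated with a small Python sketch. It is not SciQL and not part of the talk; the helper name `dimension_values` is hypothetical. It assumes `final` is exclusive (the A1 grid for `[0:4:1]` shows indices 0 to 3), assumes a positive step, and models the `*` placeholder for an open upper bound as `None`.

```python
from datetime import datetime, timedelta
from itertools import islice


def dimension_values(start, final, step):
    """Yield the index values of a dimension [start:final:step].

    Assumptions (not taken from a SciQL specification): `final` is
    exclusive, `step` is positive, and final=None stands for '*'.
    """
    value = start
    while final is None or value < final:
        yield value
        value += step


# Fixed dimension, as in x INT DIMENSION[0:4:1]
fixed_x = list(dimension_values(0, 4, 1))

# Unbounded time dimension, as in the Experiment array: one tick per minute
ticks = dimension_values(datetime(2011, 3, 25), None, timedelta(minutes=1))
first_ticks = list(islice(ticks, 3))  # take only the first three ticks
```

Because the unbounded case is a generator, nothing is materialised until a consumer such as `islice` asks for values.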
12. Array versus Table
    CREATE ARRAY A1 (
        x INT DIMENSION[0:4:1],
        y INT DIMENSION[0:4:1],
        v FLOAT DEFAULT 0.0
    );
    CREATE TABLE T1 (
        x INT,
        y INT, PRIMARY KEY (x,y),
        v FLOAT DEFAULT 0.0
    );
SELECT * FROM A1;
  x  y  v
  0  0  0.0
  0  1  0.0
  0  2  0.0
  0  3  0.0
  1  0  0.0
  1  1  0.0
  1  2  0.0
  1  3  0.0
  2  0  0.0
  2  1  0.0
  2  2  0.0
  2  3  0.0
  3  0  0.0
  3  1  0.0
  3  2  0.0
  3  3  0.0
SELECT * FROM T1;
  x  y  v
  (empty)
14. Array versus Table
    CREATE ARRAY A1 (
        x INT DIMENSION[0:4:1],
        y INT DIMENSION[0:4:1],
        v FLOAT DEFAULT 0.0
    );
Array:
A collection of a priori defined tuples
To be updated with INSERT/DELETE (and UPDATE)
Indexed by dimension expressions
Default value for non-dimensional attributes (i.e., cells)
    CREATE TABLE T1 (
        x INT,
        y INT, PRIMARY KEY (x,y),
        v FLOAT DEFAULT 0.0
    );
Table:
A collection of tuples
Explicitly create/remove with INSERT/DELETE
Indexed by a (primary) key
Default value for each column
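The contrast between an array and a table can be mimicked with two Python dictionaries. This is only a sketch of the semantics listed above, not how any SciQL engine stores data; the function names and the inserted tuple `(1, 1, 0.5)` are made up for illustration.

```python
def create_array(x_range, y_range, default):
    """Array model: every cell exists up front and holds the default."""
    return {(x, y): default for x in x_range for y in y_range}


def create_table():
    """Table model: no tuples exist until they are inserted."""
    return {}


a1 = create_array(range(0, 4), range(0, 4), 0.0)  # like A1
t1 = create_table()                               # like T1
t1[(1, 1)] = 0.5                                  # hypothetical INSERT INTO T1

# Equivalent of SELECT * on each object, ordered by (x, y)
array_rows = sorted((x, y, v) for (x, y), v in a1.items())
table_rows = sorted((x, y, v) for (x, y), v in t1.items())
```

The array yields one row per declared cell even though nothing was inserted, while the table yields only what was explicitly inserted.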
16. Array & Table Coercions
    CREATE ARRAY A1 (
        x INT DIMENSION[0:4:1],
        y INT DIMENSION[0:4:1],
        v FLOAT DEFAULT 0.0
    );
  y
  3   0.0  0.0  0.0  0.0
  2   0.0  0.0  0.0  0.0
  1   0.0  0.0  0.0  0.0
  0   0.0  0.0  0.0  0.0
      0    1    2    3    x
  (null outside the declared bounds)
SELECT x, y, v FROM A1;
  x  y  v
  0  0  0.0
  0  1  0.0
  0  2  0.0
  0  3  0.0
  1  0  0.0
  1  1  0.0
  1  2  0.0
  1  3  0.0
  2  0  0.0
  2  1  0.0
  2  2  0.0
  2  3  0.0
  3  0  0.0
  3  1  0.0
  3  2  0.0
  3  3  0.0
full materialisation!
18. Array & Table Coercions
    CREATE TABLE T2 (
        x INT, y INT, v FLOAT
    );
    INSERT INTO T2 VALUES
        (1,0,5.5), (1,1,0.4),
        (2,2,4.5), (1,1,1.3);
  x  y  v
  1  0  5.5
  1  1  0.4
  2  2  4.5
  1  1  1.3
SELECT [x], [y], v FROM T2;
  y
  3   0.0  0.0  0.0  0.0
  2   0.0  0.0  4.5  0.0
  1   0.0  0.4  0.0  0.0
  0   0.0  5.5  0.0  0.0
      0    1    2    3    x
An unbounded array
min/max of dimensions are derived from the minimal bounding rectangle
non-dimensional attributes inherit default column values
duplicates are overwritten
20. Array Modifications
    CREATE ARRAY A1 (
        x INT DIMENSION[0:4:1],
        y INT DIMENSION[0:4:1],
        v FLOAT DEFAULT 0.0
    );
    DELETE FROM A1 WHERE x = 1;
  y
  3   0.0  null 0.0  0.0
  2   0.0  null 0.0  0.0
  1   0.0  null 0.0  0.0
  0   0.0  null 0.0  0.0
      0    1    2    3    x
  (null outside the declared bounds)
DELETE creates holes in the array
22. Array Modifications
    CREATE ARRAY A1 (
        x INT DIMENSION[0:4:1],
        y INT DIMENSION[0:4:1],
        v FLOAT DEFAULT 0.0
    );
    INSERT INTO A1 VALUES (1,1,0.5), (2,1,0.5), (3,1,0.5);
  y
  3   0.0  null 0.0  0.0
  2   0.0  null 0.0  0.0
  1   0.0  0.5  0.5  0.5
  0   0.0  null 0.0  0.0
      0    1    2    3    x
  (null outside the declared bounds)
INSERT sets (changes) the values of cells
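The two modification slides can be replayed in plain Python. The sketch below only mimics the behaviour shown in the grids (DELETE leaves a null hole, INSERT overwrites a cell); the helper names are hypothetical and `None` stands in for null.

```python
def create_array(size, default):
    """A size x size array whose cells all start at the default value."""
    return {(x, y): default for x in range(size) for y in range(size)}


def delete_where_x(array, x_value):
    """DELETE ... WHERE x = x_value: cells stay in place but become None."""
    for (x, y) in list(array):
        if x == x_value:
            array[(x, y)] = None


def insert_values(array, rows):
    """INSERT INTO ... VALUES: set (change) the value of existing cells."""
    for x, y, v in rows:
        array[(x, y)] = v


def row(array, y, size=4):
    """Values of one grid row, ordered by x."""
    return [array[(x, y)] for x in range(size)]


a1 = create_array(4, 0.0)
delete_where_x(a1, 1)
after_delete = row(a1, 1)
insert_values(a1, [(1, 1, 0.5), (2, 1, 0.5), (3, 1, 0.5)])
after_insert = row(a1, 1)
untouched_row = row(a1, 0)
```

The number of cells never changes: deletion and insertion only change what a cell holds.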
29. Array Views
    CREATE ARRAY A1 (
        x INT DIMENSION[0:4:1],
        y INT DIMENSION[0:4:1],
        v FLOAT DEFAULT -1.0
    );
    INSERT INTO A1 VALUES
        (1,1,0.5), (2,1,0.5), (3,1,0.5);
  y
  3   -1.0 -1.0 -1.0 -1.0
  2   -1.0 -1.0 -1.0 -1.0
  1   -1.0  0.5  0.5  0.5
  0   -1.0 -1.0 -1.0 -1.0
       0    1    2    3    x
  (null outside the declared bounds)
    CREATE ARRAY VIEW A2 (
        x INT DIMENSION [-1:5:1],
        y INT DIMENSION [-1:5:1],
        w FLOAT DEFAULT 0.0
    ) AS
        SELECT x-1, y, v FROM A1 WHERE x > 1
        UNION
        SELECT x, y, 1.0 FROM A1 WHERE x = 3;
[Figure: the A2 grid over x, y = -1..4. Every cell starts at the default 0.0; the cells selected from A1 (shifted by x-1) and the constant 1.0 cells are overlaid on it.]
38. Array Tiling
    CREATE ARRAY A1 (
        x INT DIMENSION[0:4:1],
        y INT DIMENSION[0:4:1],
        v FLOAT DEFAULT 0.0
    );
    INSERT INTO A1 VALUES
        (1,1,0.5), (2,1,0.5), (3,1,0.5);
    SELECT [x], [y], AVG(v) FROM A1
    GROUP BY A1[x:x+2][y:y+2];
  y
  3   0.0  0.0  0.0  0.0
  2   0.0  0.0  0.0  0.0
  1   0.0  0.5  0.5  0.5
  0   0.0  0.0  0.0  0.0
      0    1    2    3    x
  (null outside the declared bounds)
[Figure annotation: the anchor point of the tile is marked on the grid]
tiling ≠ windowing
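A rough Python emulation of the tiling query may help to see what is being grouped. It assumes that every cell of A1 acts as an anchor point, that the tile `A1[x:x+2][y:y+2]` covers x..x+1 and y..y+1 (exclusive upper bound, as in the dimension notation), and that cells falling outside the array are simply ignored by AVG. These are assumptions for illustration, not statements about a SciQL implementation; all function names are hypothetical.

```python
def make_a1():
    """A1 from the slide: 4 x 4 cells, default 0.0, three cells set to 0.5."""
    cells = {(x, y): 0.0 for x in range(4) for y in range(4)}
    for x, y, v in [(1, 1, 0.5), (2, 1, 0.5), (3, 1, 0.5)]:
        cells[(x, y)] = v
    return cells


def tile_average(cells, x, y, width=2, height=2):
    """AVG(v) over the tile anchored at (x, y), skipping missing cells."""
    values = [cells[(i, j)]
              for i in range(x, x + width)
              for j in range(y, y + height)
              if cells.get((i, j)) is not None]
    return sum(values) / len(values) if values else None


def overlapping_tiling(cells):
    """One (possibly overlapping) tile per anchor point in the array."""
    return {anchor: tile_average(cells, *anchor) for anchor in cells}


averages = overlapping_tiling(make_a1())
```

Each anchor produces its own group, so neighbouring tiles share cells; tiles at the upper edges are averaged over fewer cells under the stated assumption.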
47. Array Tiling
    CREATE ARRAY A1 (
        x INT DIMENSION[0:4:1],
        y INT DIMENSION[0:4:1],
        v FLOAT DEFAULT 0.0
    );
    INSERT INTO A1 VALUES
        (1,1,0.5), (2,1,0.5), (3,1,0.5);
    SELECT [x], [y], AVG(v) FROM A1
    GROUP BY DISTINCT A1[x:x+2][y:y+2];
  y
  3   0.0  0.0  0.0  0.0
  2   0.0  0.0  0.0  0.0
  1   0.0  0.5  0.5  0.5
  0   0.0  0.0  0.0  0.0
      0    1    2    3    x
  (null outside the declared bounds)
[Figure annotation: the anchor point of the tile is marked on the grid]
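For the DISTINCT variant, a comparable Python sketch is given below. It assumes that DISTINCT restricts the anchor points so that the 2 x 2 tiles do not overlap, starting from the lower bound of each dimension; that reading is an assumption made for this illustration, and the function names are hypothetical.

```python
def make_a1():
    """A1 from the slide: 4 x 4 cells, default 0.0, three cells set to 0.5."""
    cells = {(x, y): 0.0 for x in range(4) for y in range(4)}
    for x, y, v in [(1, 1, 0.5), (2, 1, 0.5), (3, 1, 0.5)]:
        cells[(x, y)] = v
    return cells


def tile_average(cells, x, y, width, height):
    """AVG(v) over the tile anchored at (x, y), skipping missing cells."""
    values = [cells[(i, j)]
              for i in range(x, x + width)
              for j in range(y, y + height)
              if cells.get((i, j)) is not None]
    return sum(values) / len(values) if values else None


def distinct_tiling(cells, size=4, width=2, height=2):
    """Non-overlapping tiles: anchors advance by the tile size."""
    anchors = [(x, y)
               for x in range(0, size, width)
               for y in range(0, size, height)]
    return {anchor: tile_average(cells, *anchor, width, height)
            for anchor in anchors}


distinct_averages = distinct_tiling(make_a1())
```

Under this assumption the 4 x 4 array yields four groups instead of sixteen, and every cell contributes to exactly one of them.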
49. Array Tiling
    CREATE ARRAY A1 (
        x INT DIMENSION[0:4:1],
        y INT DIMENSION[0:4:1],
        v FLOAT DEFAULT 0.0
    );
    INSERT INTO A1 VALUES
        (1,1,0.5), (2,1,0.5), (3,1,0.5);
    SELECT [x], [y], AVG(v) FROM A1[1:*][1:*]
    GROUP BY DISTINCT A1[x-1][y], A1[x][y-1],
        A1[x][y], A1[x+1][y], A1[x][y+1];
  y
  3   0.0  0.0  0.0  0.0
  2   0.0  0.0  0.0  0.0
  1   0.0  0.5  0.5  0.5
  0   0.0  0.0  0.0  0.0
      0    1    2    3    x
  (null outside the declared bounds)
[Figure annotation: the anchor point of the tile is marked on the grid]
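The cell-list form of the GROUP BY describes a cross-shaped tile around the anchor point. The Python sketch below averages that five-cell neighbourhood for every anchor with x >= 1 and y >= 1, mirroring the `A1[1:*][1:*]` slice. It deliberately does not model the effect of DISTINCT on which anchors are kept, and it assumes that neighbours outside the array are ignored; both are simplifications, and the function names are hypothetical.

```python
def make_a1():
    """A1 from the slide: 4 x 4 cells, default 0.0, three cells set to 0.5."""
    cells = {(x, y): 0.0 for x in range(4) for y in range(4)}
    for x, y, v in [(1, 1, 0.5), (2, 1, 0.5), (3, 1, 0.5)]:
        cells[(x, y)] = v
    return cells


def cross_average(cells, x, y):
    """AVG(v) over the anchor and its four direct neighbours."""
    pattern = [(x - 1, y), (x, y - 1), (x, y), (x + 1, y), (x, y + 1)]
    values = [cells[p] for p in pattern if cells.get(p) is not None]
    return sum(values) / len(values) if values else None


def cross_tiling(cells, size=4):
    """Anchors restricted to x >= 1 and y >= 1, like A1[1:*][1:*]."""
    return {(x, y): cross_average(cells, x, y)
            for x in range(1, size)
            for y in range(1, size)}


cross_averages = cross_tiling(make_a1())
```

The same pattern of relative offsets is what makes this kind of grouping structural rather than value-based.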
50. Seismology Use Case
Recent aftershock in Chile
2TB waveform data at 100Hz
detecting seismic events using STA/LTA (e.g., 2 sec / 15 sec)
remove false positives
window-based 3 min. cuts
heuristic tests
Current problems
accessing waveform files too slow
unpacking and positioning MSEED data takes too long
51. Seismology Use Case
    CREATE TABLE MSeed (
        station VARCHAR(10),
        ts ARRAY (
            tick TIMESTAMP DIMENSION [* : * : INTERVAL '0.01' SECOND],
            data DECIMAL(8,6)
        )
    );
52. Seismology Use Case
    -- avg of 2 sec. windows:
    SELECT A.station, A.ts.tick, AVG(A.ts.data)
    FROM MSeed AS A
    GROUP BY
        A.ts[tick - INTERVAL '2' SECOND : tick];
53. Seismology Use Case
    CREATE TABLE Event (
        station STRING,
        tick TIMESTAMP,
        ratio FLOAT
    ) AS
    SELECT A.station, A.ts.tick,
           AVG(A.ts.data)/AVG(B.ts.data) AS ratio
    FROM MSeed AS A, MSeed AS B
    WHERE A.station = B.station
      AND A.ts.tick = B.ts.tick
    GROUP BY
        A.ts[tick - INTERVAL '2' SECOND : tick],
        B.ts[tick - INTERVAL '15' SECOND : tick]
    HAVING AVG(A.ts.data)/AVG(B.ts.data) > ?delta
    WITH DATA;
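The short-term over long-term average (STA/LTA) test in this query can be sketched in Python on a plain list of samples. The sketch follows the query in taking the plain mean of the data; practical detectors often average absolute or squared amplitudes instead. Windows are expressed in sample counts rather than timestamps (at 100Hz, 2 sec would be 200 samples and 15 sec 1500), they end at and include the current sample, and they are clipped at the start of the series. All of these are simplifying assumptions, and the function names and test data are made up.

```python
def trailing_mean(samples, end, length):
    """Mean of the `length` samples ending at index `end` (inclusive)."""
    start = max(0, end - length + 1)
    window = samples[start:end + 1]
    return sum(window) / len(window)


def sta_lta_events(samples, short_len, long_len, delta):
    """Return (index, ratio) for every sample whose STA/LTA exceeds delta."""
    events = []
    for i in range(len(samples)):
        lta = trailing_mean(samples, i, long_len)
        if lta == 0:
            continue  # ratio undefined; skipped in this sketch
        ratio = trailing_mean(samples, i, short_len) / lta
        if ratio > delta:
            events.append((i, ratio))
    return events


# Synthetic trace: quiet background with a two-sample burst
trace = [1.0] * 10 + [5.0] * 2 + [1.0] * 3
events = sta_lta_events(trace, short_len=2, long_len=5, delta=1.5)
event_indices = [i for i, _ in events]
```

The threshold `delta` plays the role of the `?delta` parameter in the HAVING clause.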
54. Seismology Use Case
    -- detect isolated errors by direct environment
    -- using wave propagation statistics
    CREATE TABLE Neighbors (
        head STRING,
        tail STRING,
        delay TIMESTAMP,
        weight FLOAT
    );
55. Seismology Use Case
    -- detect false positives:
    SELECT A.station, A.tick
    FROM Event AS A, Event AS B, Neighbors AS N
    WHERE A.station = N.head
      AND B.station = N.tail
      AND B.tick = A.tick + N.delay
      AND A.ratio > B.ratio * N.weight;
    -- remove the false positives from Event
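The self-join that flags false positives can be written out in Python to make the join conditions explicit. In this sketch ticks and delays are plain integers, at most one event per (station, tick) is assumed, and the station names and numbers are invented for the example.

```python
def false_positives(events, neighbors):
    """Events whose ratio exceeds the weighted ratio of the neighbour's
    event observed `delay` ticks later.

    events:    list of (station, tick, ratio)
    neighbors: list of (head, tail, delay, weight)
    """
    ratio_at = {(station, tick): ratio for station, tick, ratio in events}
    hits = []
    for station, tick, ratio in events:
        for head, tail, delay, weight in neighbors:
            if head != station:
                continue  # A.station = N.head
            # B.station = N.tail AND B.tick = A.tick + N.delay
            other = ratio_at.get((tail, tick + delay))
            # A.ratio > B.ratio * N.weight
            if other is not None and ratio > other * weight:
                hits.append((station, tick))
    return hits


# Hypothetical data: S2 sees waves from S1 three ticks later
events = [("S1", 100, 4.0), ("S2", 103, 1.5),
          ("S1", 197, 2.0), ("S2", 200, 3.0)]
neighbors = [("S1", "S2", 3, 2.0)]
flagged = false_positives(events, neighbors)
```

An event at the head station with no matching event at the tail station is not returned, just as the inner join in the query would drop it.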
56. Seismology Use Case
    -- pass time series to a UDF, written in, e.g., C:
    SELECT A.station, myfunction(A.ts)
    FROM MSeed A, Event B
    WHERE A.station = B.station
      AND A.ts.tick = B.tick
    GROUP BY DISTINCT
        A.ts[tick - INTERVAL '3' MINUTE : tick];
57. Conclusion
Appropriate array denotations
Functionally complete operation set
Size limitations due to (blob) representations
Existing foreign files?
Scale?
An Array DBMS for sciences
Symbiosis of relational and array paradigms