Disy Tech-Blog

Querying the exact extent of spatial data in Oracle

Querying the exact extent of spatial data in Oracle

25.08.2017 | Eva-Maria Kramer

For most geodata operations or to visualize the spatial data with GIS like Cadenza you need to build a spatial index. And therefore you need a geo metadata entry.

SELECT * FROM USER_SDO_GEOM_METADATA;

Well, table_name, column_name and srid are the easier ones, but what is a proper dim info?

There isn’t just one way of doing it. I was curious and wanted to compare some different ways the query the exact extent of spatial data. Espacially with big tables, and geometries with a huge amount of vertices, I hope to get a relevant performance impact. For better readability, this blog post is focused on only two dimensions. But if you do have 3D data, the dim info needs the proper upper and lower boundary information for the third dimension as well. You can easily expand the statements by yourself.

First, lets collect some ideas to calculate the extent. All of the following statements lead to the same result:

Statement 1:

One Simple way is using the aggregate mbr function. Let the database do the magic all by itself ;)

SELECT SDO_AGGR_MBR(your_geom_column_name) mbr FROM your_table_name;

Statement 2:

Another way is, to directly query the min and max of all geom vertices using SDO_UTIL.GETVERTICES:

SELECT MIN(t.x) min_x,
       MAX(t.x) max_x,
       MIN(t.y) min_y,
       MAX(t.y) max_y
  FROM your_table_name a, 
       TABLE(SDO_UTIL.GETVERTICES(a.your_geom_column_name)) t;

Statement 3 (mbr variation of 2):

Sometimes its a good idea to help the database. First get the SDO_GEOM_MBR for each geometry, and then select the resulting vertices with SDO_UTIL.GETVERTICES.

SELECT MIN(t.x) min_x,
       MAX(t.x) max_x,
       MIN(t.y) min_y,
       MAX(t.y) max_y
  FROM your_table_name a, 
       TABLE(SDO_UTIL.GETVERTICES(SDO_GEOM_MBR(a.your_geom_column_name))) t;

Statement 4:

There is a function in Oracle, SDO_GEOM.SDO_MIN_MBR_ORDINATE. Its simliar to SDO_GEOM_MBR, but you can choose if you want the upper or lower boundary and the ordinate position. So, there is a good chance that the database calculates the mbr per each geometry four times. But to be fair we want to give it a try.

SELECT MIN(SDO_GEOM.SDO_MIN_MBR_ORDINATE(your_geom_column_name, 1)) min_x,
       MAX(SDO_GEOM.SDO_MAX_MBR_ORDINATE(your_geom_column_name, 1)) max_x,
       MIN(SDO_GEOM.SDO_MIN_MBR_ORDINATE(your_geom_column_name, 2)) min_y,
       MAX(SDO_GEOM.SDO_MAX_MBR_ORDINATE(your_geom_column_name, 2)) max_y
  FROM your_table_name; 

The following table is showing the duration in seconds. We have tested different characteristic 2D spatial data tables. Some only containing points, other simple rectangle polygon and other way more complex polygons. (Just to mention, there is no difference if single or multi, points, linestring or polygon, the amount of vertices per geometry matters)

Statement 1: Statement 2: Statement 3: Statement 4: COUNT_OBJECTS SUM_VERTICES AVG_VERTICES
31,18 3,02 7,34 33,97 10000 10000 1
121,49 10,70 21,40 97,70 200000 200000 1
108,65 11,46 22,46 99,68 200000 400000 2
53,86 8,46 13,88 60,24 157500 787500 5
109,53 17,21 23,25 101,86 200000 2000000 10
178,57 185,74 69,69 204,42 200000 37200000 186
1,52 2,56 0,38 2,03 2191 585699 267,32
220,94 330,89 95,62 280,42 200000 72400000 362
0,22 0,83 0,07 0,40 172 203078 1180,69

The most performant result is marked as bold; the second fastest as cursive.

In conclusion, its a good idea to use SDO_UTIL.GETVERTICES and then get the min and max values with normal numeric aggregate functionallity. If the geometries are more complex and have many vertices, you need to reduce the amount of vertices per each geometry with SDO_GEOM_MBR to get the full performance boost.

There is an additional and really performant way to query the table extent in milliseconds using SDO_TUNE.EXTENT_OF. But there will be an ora exception if your table does not have a spatial index yet.

Here is a generic script, generating the different statements for each table/geometry column of current schema:

set timing on;

select 
'
PROMPT Table/Data overview --'||column_name||' IN '||table_name||'
SELECT COUNT(*) count_objects,
       SUM(SDO_UTIL.GETNUMVERTICES('||column_name||')) sum_vertices,
       AVG(SDO_UTIL.GETNUMVERTICES('||column_name||')) avg_vertices,
       '''||column_name||''' column_name,
       '''||table_name||''' table_name,
       TO_CHAR(current_timestamp, ''yyyy-mm-dd hh24:mi:ss'') execution_time
  FROM '||table_name||'
;

PROMPT statement 1 '||column_name||' IN '||table_name||'
SELECT SDO_AGGR_MBR('||column_name||') mbr FROM '||table_name||';

PROMPT statement 2 '||column_name||' IN '||table_name||'
SELECT MIN(t.x) min_x,
       MAX(t.x) max_x,
       MIN(t.y) min_y,
       MAX(t.y) max_y
  FROM '||table_name||' a, 
       TABLE(SDO_UTIL.GETVERTICES (a.'||column_name||')) t
;

PROMPT statement 3 '||column_name||' IN '||table_name||'
SELECT MIN(t.x) min_x,
       MAX(t.x) max_x,
       MIN(t.y) min_y,
       MAX(t.y) max_y
  FROM '||table_name||' a, 
       TABLE(SDO_UTIL.GETVERTICES (SDO_GEOM_MBR(a.'||column_name||'))) t
;

PROMPT statement 4 '||column_name||' IN '||table_name||'
SELECT MIN(SDO_GEOM.SDO_MIN_MBR_ORDINATE('||column_name||', 1)) min_x,
       MAX(SDO_GEOM.SDO_MAX_MBR_ORDINATE('||column_name||', 1)) max_x,
       MIN(SDO_GEOM.SDO_MIN_MBR_ORDINATE('||column_name||', 2)) min_y,
       MAX(SDO_GEOM.SDO_MAX_MBR_ORDINATE('||column_name||', 2)) max_y
 FROM '||table_name||'; 
'

FROM USER_TAB_COLS WHERE data_type LIKE 'SDO_GEOMETRY' AND column_name NOT LIKE '%$%';

The title image was published by Jessica Ruscello under the Unsplash License.