Project

General

Profile

Working with the database » History » Version 10

Rafael Bailon-Ruiz, 2020-12-08 12:40
Add SQL injection warning

1 1 Rafael Bailon-Ruiz
h1. Working with the database
2
3
The feature database is CAMS managing the storage and access of vector data, pieces of information like sensor measurements that can be described in space with a geometric figure such as a points, lines or polygons. See http://wiki.gis.com/wiki/index.php/Vector_data_modelto read more about vector data models.
4
5 2 Rafael Bailon-Ruiz
The basic piece of information of this data model is the _feature_, defined by a _geometry_ that indicates unequivocally its position and shape in the world, and a set of _attributes_ which are the characteristics associated to that location. Related features sharing a common geometry definition and attribute set are usually grouped together.
6 1 Rafael Bailon-Ruiz
7 3 Rafael Bailon-Ruiz
{{toc}}
8 2 Rafael Bailon-Ruiz
9 1 Rafael Bailon-Ruiz
h2. Data model
10
11
The CAMS database model is takes inspiration from the _OGR data model_ and the _OGC OpenPackage specification_.
12
13
h3. Dataset
14
15
A dataset is encompasses a set of feature collections stored in the same database or file.
16
17
18
h3. Collection
19
20
A collection describes the characteristics of features of the same kind or category. I.e.: "Wind", "UAV state", "Liquid water content", etc. It corresponds roughly to a table in a relational database or a layer in many geographic information models.
21
22
A collection is defined by the following parameters:
23
# A computer id (name_id),
24
# A human-readable name,
25 2 Rafael Bailon-Ruiz
# An coordinate reference system as an "EPSG code":https://en.wikipedia.org/wiki/EPSG_Geodetic_Parameter_Dataset,
26 1 Rafael Bailon-Ruiz
# A geometry type (As of 12/2020 only the "point" geometry is supported),
27 3 Rafael Bailon-Ruiz
# A ordered set of attributes and corresponding types. Attributes can be of type _int_, _str_, _float_, or _datetime_.
28 1 Rafael Bailon-Ruiz
# And, optionally, a long description.
29
30
31
h3. Feature
32
33
!vector_feature.png!
34
35 3 Rafael Bailon-Ruiz
The collection field is used to identify the collection to with a particular field belongs; thus determining the geometry and attribute set. 
36 1 Rafael Bailon-Ruiz
37 3 Rafael Bailon-Ruiz
General attributes, *t* (time) and *producer* , are mandatory for features generated repeatedly by UAV sensors. The time attribute is represented by a date and time (datetime.datetime in python) in Coordinated Universal Time (UTC). The producer attribute is a string.
38 1 Rafael Bailon-Ruiz
39 3 Rafael Bailon-Ruiz
Specific attributes are unique to a particular collection. All features of the same collection must have the same attributes, but features of different collections do not need to share specific attributes unlike general ones. For instance, a _"wind"_ collection can have the _"east"_ and _"west"_ attributes to describe the wind vector components.
40 1 Rafael Bailon-Ruiz
41 4 Rafael Bailon-Ruiz
h2. Code architecture
42 1 Rafael Bailon-Ruiz
43 7 Rafael Bailon-Ruiz
The GeoPacakgeDatabase and MemoryDatabase provide two alternative feature storage strategies for the FeatureDatabase. The first uses the GDAL/OGR library to write and read GeoPackage files and the second implements a memory-backed database tailor-made to provide fast access to common simple queries. Depending on the request complexity, the FeatureDatabase _query_ method chooses one of the storage backends will use, the MemoryDatabase when possible or the GeoPackageDatabase otherwise. 
44
45
The GeoPackageDatabase class use Sqlite transactions that can be slow for writing or reading small pieces of data. When writing features it is advised to use the _register_features_ method to delay disk I/O operations and reduce the number of transactions. Anyway, fetching information from the database triggers a write transaction beforehand to ensure data integrity.
46
47
The DataServer class receives AircraftStatus and SensorSample objects from the add_sample and add_status events and converts them to database features.
48
49
50 4 Rafael Bailon-Ruiz
h3. Class diagram
51
!db%20diagram.png!
52 1 Rafael Bailon-Ruiz
53 7 Rafael Bailon-Ruiz
h2. Code examples
54 8 Rafael Bailon-Ruiz
55 9 Rafael Bailon-Ruiz
<code>nephelae_base/unittests/test_feature_database.py</code> provides many examples on using the CAMS database.
56 1 Rafael Bailon-Ruiz
57 10 Rafael Bailon-Ruiz
p{border: solid 1px #8B0000; padding: 1em; margin: 1em; background: #FEE}. %{color:red; font-weight: bold; font-size: large}Important:% *You should be aware that neither CAMS or GDAL/OGR sanitize SQL statements. Your program may be the target of "SQL injection":https://en.wikipedia.org/wiki/SQL_injection attacks by malicious users resulting on important data loss and/or serious denial-of-service.*
58 9 Rafael Bailon-Ruiz
59 7 Rafael Bailon-Ruiz
<pre><code class="python">
60
fdb = FeatureDatabase("database.gpkg")  # Create a FeatureDatabase with memory and geopackage storage backends
61 1 Rafael Bailon-Ruiz
62 7 Rafael Bailon-Ruiz
lwc_attrs = (("t", "datetime"), ("producer", "str"), ("humidity", "float"))
63
lwc_collection = CollectionSchema(
64
    "lwc", "Liquid Water Content", 32631, "point", lwc_attrs,
65
    description="The liquid water content measurements")  # epsg:32631 corresponds to WGS84/UTM31N
66 1 Rafael Bailon-Ruiz
67 7 Rafael Bailon-Ruiz
fdb.add_collection(lwc_collection)
68
</code></pre>
69 1 Rafael Bailon-Ruiz
70 7 Rafael Bailon-Ruiz
<pre><code class="python">
71
# Define some features from a liquid water content sensor on UAV "200"
72
lwc_feature = feature = ('lwc', (360347.0, 4813681.0, 300.0), {
73
    "t": datetime.datetime(2020, 3, 5, 14, 35, 20, int(123.0 * 1000)),
74
    "producer": "200",
75
    "humidity": 0.0125})
76
lwc_feature2 = ('lwc', (360347.0, 4813681.0, 300.0), {
77
    "t": datetime.datetime(2020, 3, 5, 14, 35, 20, int(123.0 * 1000)),
78
    "producer": "202",
79
    "humidity": 0.0125})
80
lwc_feature3 = ('lwc', (361347.0, 4814681.0, 300.0), {
81
    "t": datetime.datetime(2020, 3, 5, 14, 35, 22, int(123.0 * 1000)),
82
    "producer": "200",
83
    "humidity": 0.0125})
84 1 Rafael Bailon-Ruiz
85 7 Rafael Bailon-Ruiz
# Add them to the database
86
fdb.insert(lwc_feature, lwc_feature2, lwc_feature3)
87
</code></pre>
88 8 Rafael Bailon-Ruiz
89 7 Rafael Bailon-Ruiz
<pre><code class="python">
90 1 Rafael Bailon-Ruiz
# Get all featres from the "lwc" collection
91 7 Rafael Bailon-Ruiz
result_iter = fdb.query("lwc") 
92 1 Rafael Bailon-Ruiz
93 7 Rafael Bailon-Ruiz
# The result is an iterator (the actual reading operation is performed 
94 9 Rafael Bailon-Ruiz
# lazily and makes it easier to combine with further filtering code
95 1 Rafael Bailon-Ruiz
# without extra memory usage.
96 8 Rafael Bailon-Ruiz
list_of_lwc = list(result_iter)  # But you can have a list if needed
97 7 Rafael Bailon-Ruiz
98 8 Rafael Bailon-Ruiz
# complex_r is a complex request that requires an sql engine to be processed
99 1 Rafael Bailon-Ruiz
complex_r = list(empty_feature_db.query(
100 7 Rafael Bailon-Ruiz
            "lwc", where="\"producer\" == \"200\"", 
101
            order_by="t", direction="asc"))
102
103 8 Rafael Bailon-Ruiz
104 7 Rafael Bailon-Ruiz
# (minx, miny, minz, maxx, maxy, maxz)
105
bbox = (lwc_feature[1][0] - 0.1, lwc_feature[1][1] - 0.1,
106
        -math.inf,
107
        lwc_feature[1][0] + 0.1, lwc_feature[1][1] + 0.1,
108
        math.inf)
109
# Simple bounding box request. Fast result from the memory database
110
bbox_r = = list(empty_feature_db.query("lwc", bounding_box=bbox))
111
</code></pre>
112 4 Rafael Bailon-Ruiz
113 3 Rafael Bailon-Ruiz
h2. Post-mission analysis
114
115
While GeoPackage .gpkg files generated by CAMS during a mission can be read using this software, it is better to use general purpose geographic information systems or more mature GIS libraries to process the information.
116
117 1 Rafael Bailon-Ruiz
Popular python libraries are "fiona":https://fiona.readthedocs.io/en/stable/README.html &mdash;a pythonic style interface to the popular GDAL/OGR library&mdash; and "geopandas":https://geopandas.org/, extending the python pandas library model to geographic data. "QGIS":https://www.qgis.org/en/site/ is an easy option for non-developpers to visualize geospatial data and visually combine the information with other sources.