Working with the database » History » Version 8

Rafael Bailon-Ruiz, 2020-12-04 12:25

h1. Working with the database

The feature database is the CAMS component that manages the storage of and access to vector data: pieces of information, such as sensor measurements, that can be described in space with a geometric figure such as a point, a line or a polygon. See http://wiki.gis.com/wiki/index.php/Vector_data_model to read more about vector data models.

The basic piece of information in this data model is the _feature_, defined by a _geometry_ that unambiguously indicates its position and shape in the world, and a set of _attributes_, which are the characteristics associated with that location. Related features sharing a common geometry definition and attribute set are usually grouped together.

{{toc}}

h2. Data model

The CAMS database model takes inspiration from the _OGR data model_ and the _OGC GeoPackage specification_.

h3. Dataset

A dataset encompasses a set of feature collections stored in the same database or file.

h3. Collection

A collection describes the characteristics of features of the same kind or category, for example "Wind", "UAV state" or "Liquid water content". It corresponds roughly to a table in a relational database or a layer in many geographic information models.

A collection is defined by the following parameters:

# A computer id (name_id),
# A human-readable name,
# A coordinate reference system as an "EPSG code":https://en.wikipedia.org/wiki/EPSG_Geodetic_Parameter_Dataset,
# A geometry type (as of 12/2020 only the "point" geometry is supported),
# An ordered set of attributes and corresponding types. Attributes can be of type _int_, _str_, _float_, or _datetime_.
# And, optionally, a long description.
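
The parameters listed above can be pictured as a small record with a couple of sanity checks. The sketch below is only an illustration: @CollectionSketch@, @ALLOWED_TYPES@ and @SUPPORTED_GEOMETRIES@ are hypothetical names and do not reproduce the actual CAMS @CollectionSchema@ class.

<pre><code class="python">
from dataclasses import dataclass

# Hypothetical names for illustration only; this is NOT the CAMS CollectionSchema.
ALLOWED_TYPES = {"int", "str", "float", "datetime"}
SUPPORTED_GEOMETRIES = {"point"}  # only "point" is supported as of 12/2020


@dataclass(frozen=True)
class CollectionSketch:
    name_id: str          # computer id
    name: str             # human-readable name
    epsg: int             # coordinate reference system as an EPSG code
    geometry_type: str
    attributes: tuple     # ordered (attribute_name, type_name) pairs
    description: str = ""

    def __post_init__(self):
        # Reject geometry and attribute types outside the documented sets
        if self.geometry_type not in SUPPORTED_GEOMETRIES:
            raise ValueError(f"unsupported geometry type: {self.geometry_type!r}")
        for attr_name, type_name in self.attributes:
            if type_name not in ALLOWED_TYPES:
                raise ValueError(
                    f"attribute {attr_name!r} has unknown type {type_name!r}")


wind = CollectionSketch(
    "wind", "Wind", 32631, "point",
    (("t", "datetime"), ("producer", "str"),
     ("east", "float"), ("west", "float")))
</code></pre>

Keeping the attributes as an ordered tuple of pairs, rather than a dictionary, mirrors the "ordered set" wording of the parameter list.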

h3. Feature

!vector_feature.png!

The collection field identifies the collection to which a particular feature belongs, thus determining its geometry and attribute set.

The general attributes, *t* (time) and *producer*, are mandatory for features generated repeatedly by UAV sensors. The time attribute is a date and time (datetime.datetime in Python) in Coordinated Universal Time (UTC). The producer attribute is a string.

Specific attributes are unique to a particular collection. All features of the same collection must have the same attributes, but, unlike general attributes, specific attributes do not need to be shared between collections. For instance, a _"wind"_ collection can have the _"east"_ and _"west"_ attributes to describe the wind vector components.
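
As a rough illustration of these rules, the sketch below checks a feature written as a @(collection, geometry, attributes)@ tuple, the layout used in the code examples on this page. The helper @check_sensor_feature@ is hypothetical and not part of CAMS, and it does not verify that the time is actually expressed in UTC.

<pre><code class="python">
import datetime

GENERAL_ATTRIBUTES = ("t", "producer")  # mandatory for UAV sensor features


def check_sensor_feature(feature, specific_attributes):
    """Hypothetical helper: check the general and specific attributes."""
    collection, geometry, attributes = feature
    expected = GENERAL_ATTRIBUTES + tuple(specific_attributes)
    missing = [name for name in expected if name not in attributes]
    if missing:
        raise ValueError(f"missing attributes: {missing}")
    if not isinstance(attributes["t"], datetime.datetime):
        raise TypeError("'t' must be a datetime.datetime")
    if not isinstance(attributes["producer"], str):
        raise TypeError("'producer' must be a string")
    return collection


wind_feature = ("wind", (360347.0, 4813681.0, 300.0), {
    "t": datetime.datetime(2020, 3, 5, 14, 35, 20),
    "producer": "200",
    "east": 1.5,
    "west": -0.5})

checked = check_sensor_feature(wind_feature, ("east", "west"))
</code></pre>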

h2. Code architecture

The GeoPackageDatabase and MemoryDatabase classes provide two alternative feature storage strategies for the FeatureDatabase. The first uses the GDAL/OGR library to write and read GeoPackage files; the second implements a memory-backed database tailor-made to provide fast access for common simple queries. Depending on the complexity of the request, the FeatureDatabase _query_ method chooses which storage backend to use: the MemoryDatabase when possible, or the GeoPackageDatabase otherwise.
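
The backend choice can be pictured with a small dispatch function. This is a sketch under an assumption: the exact rule used by @FeatureDatabase.query@ is not documented here, so the function below simply treats any request carrying a @where@ or @order_by@ clause as complex. The name @select_backend@ is hypothetical.

<pre><code class="python">
def select_backend(bounding_box=None, where=None, order_by=None):
    """Hypothetical dispatch rule, assumed for illustration.

    Requests that need an SQL engine (a where clause or an ordering) go to
    the GeoPackage backend; everything else is served from memory.
    """
    if where is not None or order_by is not None:
        return "geopackage"
    return "memory"


simple = select_backend(bounding_box=(0.0, 0.0, 0.0, 1.0, 1.0, 1.0))
complex_request = select_backend(where='"producer" == "200"', order_by="t")
</code></pre>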

The GeoPackageDatabase class uses SQLite transactions, which can be slow when writing or reading small pieces of data. When writing features, it is advisable to use the _register_features_ method to delay disk I/O operations and reduce the number of transactions. Even so, fetching information from the database triggers a write transaction beforehand to ensure data integrity.
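
The cost of many small transactions can be illustrated with Python's standard @sqlite3@ module alone. The sketch below does not use CAMS or _register_features_; the @samples@ table and both function names are made up for the example. It contrasts one transaction per row with a single transaction for the whole batch.

<pre><code class="python">
import sqlite3


def insert_one_by_one(conn, rows):
    # One transaction (and one commit) per row: slow for many small writes
    for row in rows:
        with conn:
            conn.execute("INSERT INTO samples VALUES (?, ?)", row)


def insert_batched(conn, rows):
    # A single transaction for the whole batch: one commit at the end
    with conn:
        conn.executemany("INSERT INTO samples VALUES (?, ?)", rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (producer TEXT, humidity REAL)")

rows = [("200", 0.0125), ("202", 0.0130), ("200", 0.0127)]
insert_batched(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
</code></pre>

Both functions store the same rows; they differ only in how many commits are issued.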

The DataServer class receives AircraftStatus and SensorSample objects from the add_sample and add_status events and converts them to database features.

h3. Class diagram

!db%20diagram.png!

h2. Code examples

nephelae_base/unittests/test_feature_database.py provides many examples of how to use the CAMS database.

<pre><code class="python">
# Create a FeatureDatabase with memory and GeoPackage storage backends
fdb = FeatureDatabase("database.gpkg")

# Ordered (attribute name, type) pairs, general attributes first
lwc_attrs = (("t", "datetime"), ("producer", "str"), ("humidity", "float"))
lwc_collection = CollectionSchema(
    "lwc", "Liquid Water Content", 32631, "point", lwc_attrs,
    description="The liquid water content measurements")  # epsg:32631 corresponds to WGS84/UTM31N

fdb.add_collection(lwc_collection)
</code></pre>

<pre><code class="python">
import datetime

# Define some features from liquid water content sensors on UAVs "200" and "202".
# Each feature is a (collection, geometry, attributes) tuple.
lwc_feature = ('lwc', (360347.0, 4813681.0, 300.0), {
    "t": datetime.datetime(2020, 3, 5, 14, 35, 20, int(123.0 * 1000)),
    "producer": "200",
    "humidity": 0.0125})
lwc_feature2 = ('lwc', (360347.0, 4813681.0, 300.0), {
    "t": datetime.datetime(2020, 3, 5, 14, 35, 20, int(123.0 * 1000)),
    "producer": "202",
    "humidity": 0.0125})
lwc_feature3 = ('lwc', (361347.0, 4814681.0, 300.0), {
    "t": datetime.datetime(2020, 3, 5, 14, 35, 22, int(123.0 * 1000)),
    "producer": "200",
    "humidity": 0.0125})

# Add them to the database
fdb.insert(lwc_feature, lwc_feature2, lwc_feature3)
</code></pre>

<pre><code class="python">
import math

# Get all features from the "lwc" collection
result_iter = fdb.query("lwc")

# The result is an iterator: the actual reading operation is performed
# lazily, which makes it easier to combine with further filtering code
# without extra memory usage.
list_of_lwc = list(result_iter)  # But you can build a list if needed

# complex_r is a complex request that requires an SQL engine to be processed
complex_r = list(fdb.query(
    "lwc", where="\"producer\" == \"200\"",
    order_by="t", direction="asc"))

# Bounding box: (minx, miny, minz, maxx, maxy, maxz)
bbox = (lwc_feature[1][0] - 0.1, lwc_feature[1][1] - 0.1,
        -math.inf,
        lwc_feature[1][0] + 0.1, lwc_feature[1][1] + 0.1,
        math.inf)

# Simple bounding box request. Fast result from the memory database
bbox_r = list(fdb.query("lwc", bounding_box=bbox))
</code></pre>

h2. Post-mission analysis

While the GeoPackage .gpkg files generated by CAMS during a mission can be read using this software, it is better to process them with general-purpose geographic information systems or more mature GIS libraries.

Popular Python libraries are "fiona":https://fiona.readthedocs.io/en/stable/README.html, a Pythonic interface to the popular GDAL/OGR library, and "geopandas":https://geopandas.org/, which extends the pandas data model to geographic data. "QGIS":https://www.qgis.org/en/site/ is an easy option for non-developers to visualize geospatial data and visually combine it with other sources.
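
Before turning to those tools, it can be handy to check which layers a mission file contains. A GeoPackage is an SQLite database whose @gpkg_contents@ table lists its layers, so Python's standard library is enough for a quick look. The helper name @list_gpkg_layers@ below is hypothetical, and the sketch assumes the file already exists and is a valid GeoPackage.

<pre><code class="python">
import sqlite3


def list_gpkg_layers(path):
    """Return (table_name, data_type, srs_id) for each layer in a GeoPackage.

    Note: sqlite3.connect creates an empty file if `path` does not exist,
    in which case the query fails because gpkg_contents is missing.
    """
    conn = sqlite3.connect(path)
    try:
        return conn.execute(
            "SELECT table_name, data_type, srs_id "
            "FROM gpkg_contents ORDER BY table_name").fetchall()
    finally:
        conn.close()
</code></pre>

For a mission file, @list_gpkg_layers("database.gpkg")@ would return one row per collection stored in it.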