Working with the database » History » Version 8
Rafael Bailon-Ruiz, 2020-12-04 12:25
1 | 1 | Rafael Bailon-Ruiz | h1. Working with the database |
---|---|---|---|
2 | |||
3 | The feature database is CAMS managing the storage and access of vector data, pieces of information like sensor measurements that can be described in space with a geometric figure such as a points, lines or polygons. See http://wiki.gis.com/wiki/index.php/Vector_data_modelto read more about vector data models. |
||
4 | |||
5 | 2 | Rafael Bailon-Ruiz | The basic piece of information of this data model is the _feature_, defined by a _geometry_ that indicates unequivocally its position and shape in the world, and a set of _attributes_ which are the characteristics associated to that location. Related features sharing a common geometry definition and attribute set are usually grouped together. |
6 | 1 | Rafael Bailon-Ruiz | |
7 | 3 | Rafael Bailon-Ruiz | {{toc}} |
8 | 2 | Rafael Bailon-Ruiz | |
9 | 1 | Rafael Bailon-Ruiz | h2. Data model |
10 | |||
11 | The CAMS database model is takes inspiration from the _OGR data model_ and the _OGC OpenPackage specification_. |
||
12 | |||
13 | h3. Dataset |
||
14 | |||
15 | A dataset is encompasses a set of feature collections stored in the same database or file. |
||
16 | |||
17 | |||
18 | h3. Collection |
||
19 | |||
20 | A collection describes the characteristics of features of the same kind or category. I.e.: "Wind", "UAV state", "Liquid water content", etc. It corresponds roughly to a table in a relational database or a layer in many geographic information models. |
||
21 | |||
22 | A collection is defined by the following parameters: |
||
23 | # A computer id (name_id), |
||
24 | # A human-readable name, |
||
25 | 2 | Rafael Bailon-Ruiz | # An coordinate reference system as an "EPSG code":https://en.wikipedia.org/wiki/EPSG_Geodetic_Parameter_Dataset, |
26 | 1 | Rafael Bailon-Ruiz | # A geometry type (As of 12/2020 only the "point" geometry is supported), |
27 | 3 | Rafael Bailon-Ruiz | # A ordered set of attributes and corresponding types. Attributes can be of type _int_, _str_, _float_, or _datetime_. |
28 | 1 | Rafael Bailon-Ruiz | # And, optionally, a long description. |
29 | |||
30 | |||
31 | h3. Feature |
||
32 | |||
33 | !vector_feature.png! |
||
34 | |||
35 | 3 | Rafael Bailon-Ruiz | The collection field is used to identify the collection to with a particular field belongs; thus determining the geometry and attribute set. |
36 | 1 | Rafael Bailon-Ruiz | |
37 | 3 | Rafael Bailon-Ruiz | General attributes, *t* (time) and *producer* , are mandatory for features generated repeatedly by UAV sensors. The time attribute is represented by a date and time (datetime.datetime in python) in Coordinated Universal Time (UTC). The producer attribute is a string. |
38 | 1 | Rafael Bailon-Ruiz | |
39 | 3 | Rafael Bailon-Ruiz | Specific attributes are unique to a particular collection. All features of the same collection must have the same attributes, but features of different collections do not need to share specific attributes unlike general ones. For instance, a _"wind"_ collection can have the _"east"_ and _"west"_ attributes to describe the wind vector components. |
40 | 1 | Rafael Bailon-Ruiz | |
41 | 4 | Rafael Bailon-Ruiz | h2. Code architecture |
42 | 1 | Rafael Bailon-Ruiz | |
43 | 7 | Rafael Bailon-Ruiz | The GeoPacakgeDatabase and MemoryDatabase provide two alternative feature storage strategies for the FeatureDatabase. The first uses the GDAL/OGR library to write and read GeoPackage files and the second implements a memory-backed database tailor-made to provide fast access to common simple queries. Depending on the request complexity, the FeatureDatabase _query_ method chooses one of the storage backends will use, the MemoryDatabase when possible or the GeoPackageDatabase otherwise. |
44 | |||
45 | The GeoPackageDatabase class use Sqlite transactions that can be slow for writing or reading small pieces of data. When writing features it is advised to use the _register_features_ method to delay disk I/O operations and reduce the number of transactions. Anyway, fetching information from the database triggers a write transaction beforehand to ensure data integrity. |
||
46 | |||
47 | The DataServer class receives AircraftStatus and SensorSample objects from the add_sample and add_status events and converts them to database features. |
||
48 | |||
49 | |||
50 | 4 | Rafael Bailon-Ruiz | h3. Class diagram |
51 | !db%20diagram.png! |
||
52 | 1 | Rafael Bailon-Ruiz | |
53 | 7 | Rafael Bailon-Ruiz | h2. Code examples |
54 | 8 | Rafael Bailon-Ruiz | |
55 | 7 | Rafael Bailon-Ruiz | nephelae_base/unittests/test_feature_database.py provides many examples on using the CAMS database. |
56 | 1 | Rafael Bailon-Ruiz | |
57 | 7 | Rafael Bailon-Ruiz | <pre><code class="python"> |
58 | fdb = FeatureDatabase("database.gpkg") # Create a FeatureDatabase with memory and geopackage storage backends |
||
59 | 1 | Rafael Bailon-Ruiz | |
60 | 7 | Rafael Bailon-Ruiz | lwc_attrs = (("t", "datetime"), ("producer", "str"), ("humidity", "float")) |
61 | lwc_collection = CollectionSchema( |
||
62 | "lwc", "Liquid Water Content", 32631, "point", lwc_attrs, |
||
63 | description="The liquid water content measurements") # epsg:32631 corresponds to WGS84/UTM31N |
||
64 | 1 | Rafael Bailon-Ruiz | |
65 | 7 | Rafael Bailon-Ruiz | fdb.add_collection(lwc_collection) |
66 | </code></pre> |
||
67 | 1 | Rafael Bailon-Ruiz | |
68 | 7 | Rafael Bailon-Ruiz | <pre><code class="python"> |
69 | # Define some features from a liquid water content sensor on UAV "200" |
||
70 | lwc_feature = feature = ('lwc', (360347.0, 4813681.0, 300.0), { |
||
71 | "t": datetime.datetime(2020, 3, 5, 14, 35, 20, int(123.0 * 1000)), |
||
72 | "producer": "200", |
||
73 | "humidity": 0.0125}) |
||
74 | lwc_feature2 = ('lwc', (360347.0, 4813681.0, 300.0), { |
||
75 | "t": datetime.datetime(2020, 3, 5, 14, 35, 20, int(123.0 * 1000)), |
||
76 | "producer": "202", |
||
77 | "humidity": 0.0125}) |
||
78 | lwc_feature3 = ('lwc', (361347.0, 4814681.0, 300.0), { |
||
79 | "t": datetime.datetime(2020, 3, 5, 14, 35, 22, int(123.0 * 1000)), |
||
80 | "producer": "200", |
||
81 | "humidity": 0.0125}) |
||
82 | 1 | Rafael Bailon-Ruiz | |
83 | 7 | Rafael Bailon-Ruiz | # Add them to the database |
84 | fdb.insert(lwc_feature, lwc_feature2, lwc_feature3) |
||
85 | </code></pre> |
||
86 | 8 | Rafael Bailon-Ruiz | |
87 | 7 | Rafael Bailon-Ruiz | <pre><code class="python"> |
88 | 1 | Rafael Bailon-Ruiz | # Get all featres from the "lwc" collection |
89 | 7 | Rafael Bailon-Ruiz | result_iter = fdb.query("lwc") |
90 | 1 | Rafael Bailon-Ruiz | |
91 | 7 | Rafael Bailon-Ruiz | # The result is an iterator (the actual reading operation is performed |
92 | # lazily and makes it easier to combien with further filtering code |
||
93 | 1 | Rafael Bailon-Ruiz | # without extra memory usage. |
94 | 8 | Rafael Bailon-Ruiz | list_of_lwc = list(result_iter) # But you can have a list if needed |
95 | 7 | Rafael Bailon-Ruiz | |
96 | 8 | Rafael Bailon-Ruiz | # complex_r is a complex request that requires an sql engine to be processed |
97 | 1 | Rafael Bailon-Ruiz | complex_r = list(empty_feature_db.query( |
98 | 7 | Rafael Bailon-Ruiz | "lwc", where="\"producer\" == \"200\"", |
99 | order_by="t", direction="asc")) |
||
100 | |||
101 | 8 | Rafael Bailon-Ruiz | |
102 | 7 | Rafael Bailon-Ruiz | # (minx, miny, minz, maxx, maxy, maxz) |
103 | bbox = (lwc_feature[1][0] - 0.1, lwc_feature[1][1] - 0.1, |
||
104 | -math.inf, |
||
105 | lwc_feature[1][0] + 0.1, lwc_feature[1][1] + 0.1, |
||
106 | math.inf) |
||
107 | # Simple bounding box request. Fast result from the memory database |
||
108 | bbox_r = = list(empty_feature_db.query("lwc", bounding_box=bbox)) |
||
109 | </code></pre> |
||
110 | 4 | Rafael Bailon-Ruiz | |
111 | 3 | Rafael Bailon-Ruiz | h2. Post-mission analysis |
112 | |||
113 | While GeoPackage .gpkg files generated by CAMS during a mission can be read using this software, it is better to use general purpose geographic information systems or more mature GIS libraries to process the information. |
||
114 | |||
115 | 1 | Rafael Bailon-Ruiz | Popular python libraries are "fiona":https://fiona.readthedocs.io/en/stable/README.html —a pythonic style interface to the popular GDAL/OGR library— and "geopandas":https://geopandas.org/, extending the python pandas library model to geographic data. "QGIS":https://www.qgis.org/en/site/ is an easy option for non-developpers to visualize geospatial data and visually combine the information with other sources. |