System Design

The system design emphasizes first, data integrity, and second, low long-term cost. The database engine chosen, PostgreSQL, supports concurrent multi-user data entry and retrieval, which minimizes the amount of inter-user coordination required and enhances usability.

These design goals lead to the following design elements.

Data integrity is enforced within the database. This allows any program to be used to interact with the database and update database content. Costs are kept down because generic Open Source user interfaces may be used to interact with the database. No matter the tool used, the integrity of the data is maintained.

A web-based user interface, including a wiki which allows rapid web-page development, maximizes availability. The primary method of interacting with the database is SQL, the industry standard relational query language. The amount of SQL which must be learned can be, depending on the amount of development effort spent, reduced to an absolute minimum through the use of views – in short, pre-packaged queries.

Data is kept secure though industry-standard practices. These include the encryption of communications, the association of accounts with individuals, the secure authentication required for account access, and the use of in-database access controls to limit the permissions of user accounts. Because individual people are granted direct, but controlled, access to the database itself there is no “middleware” which, when bypassed, has unlimited access to the data.

A minimal number of bespoke programs limits the amount of code development required, and, even more significantly, limits the long-term maintenance costs. The Open Source licensing of the SokweDB system minimizes cost by sharing long-term development of those portions of the system used by more than one institution.

Costs are kept down by minimizing the amount of user-interaction available through bespoke programs. Interacting with a person, particularly reporting errors which arise, requires a lot of programming. Therefore the system is designed around bulk input and output.

Individual accounts are given their own, private, workspaces (schemas). This separates private from shared data, which allows for better long-term data maintenance.

About Databases

In PostgreSQL a database is a stand-alone data store. Queries can easily interact with and combine all data kept within a single database. Access to data outside a database, from within the database, is possible but requires additional work that depends upon the data source.

About Tables

Databases store data in tables. Related singleton datum, such as a single name, a single birthdate, a mother, are kept together in a single row of a table constructed to hold this particular kind of data. Data of the same kind kept within a single table are are found in a column of the table. Columns have names, like “name”, “birthdate”, and “mother_id”.

So a table is a grid containing (classically) a single value in each cell of the grid. Each row of the table represents a physical thing, such as a chimpanzee, or an abstract thing, such as the distance to some designated chimpanzee. E.g., a row with the 2 columns: a distance in meters, and the id of the 2nd chimpanzee. Each column of a table is expected to contain the same “kind” of data; a name should not go in the “birthdate”column.

The SokweDB design endeavors to name tables in the plural, as they hold multiple rows. Column names are singular, as each column of each table holds a single value.

About Views

Views appear to be tables but they are not. Views are virtual, when queried they deliver the results of a query run against the database’s actual tables. An SQL query can freely intermix the use of tables and views. When setup to do so, changing the data in a view can change data in the database’s underlying tables.

Views make it easy to reuse complex or commonly used queries, or portions of queries. They allow a database designed around the capabilities of the computer to be interacted with in a fashion that makes sense to people. Although the views do not appear in the entity relationship diagrams that document the underlying database, and so are omitted from the high level overview these diagrams provide, most users will greatly benefit if they take the time to understand how the views fit into the overall database. Where views exist, most will usually find it easier to work with the views than with the underlying tables.

Views that have the structure, the corresponding columns, of the data after collection in the field and entry into electronic form, are used to upload data into the database. Inserting data into these views distributes the uploaded data into the underlying tables. These sorts of views may or may not be useful when retrieving data from the database for analysis. Investigate to see if some other view or query is better suited rather than automatically using a view created for data upload to do analysis.

About Schemas

Schemas partition databases.[1] [2] They work like directories or folders do in filesystems, but can be only one level deep. A schema cannot contain another schema.

Schemas organize database content. One purpose is to allow a user to focus on the content of some schema(s) and ignore what is in other schemas.

Footnotes

Page last generated: 2025-04-01 22:52:03 UTC