RESTful APIs: Persistency & MongoDB
July 17th, 2020
This is the fourth article in a five-part series on RESTful APIs.
While building apps, we often need to save the application state, and we cannot rely on volatile memory alone because the server might crash or restart. Just as in a video game, where we want to continue playing from where we left off yesterday, we need to persist the state of an application.
To do that, we need to move server data from process memory to disk. Using a file might suffice for a simple application, but for more complex applications, a database might be needed.
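For the simple case, file-based persistence can be sketched in a few lines of Node.js. This is a minimal illustration, not a production approach: the file name app-state.json, the helper names saveState and loadState, and the sample state are all assumptions made up for the example.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical location of the state file; any writable path would do.
const STATE_FILE = path.join(os.tmpdir(), "app-state.json");

// Write the whole state to disk as JSON.
function saveState(state, file = STATE_FILE) {
  fs.writeFileSync(file, JSON.stringify(state, null, 2), "utf8");
}

// Read the state back, or return a fallback when the file does not exist yet.
function loadState(file = STATE_FILE, fallback = {}) {
  if (!fs.existsSync(file)) return fallback;
  return JSON.parse(fs.readFileSync(file, "utf8"));
}

// Simulate a restart: save, then load from disk instead of from memory.
saveState({ level: 3, score: 1200 });
const restored = loadState();
console.log(restored.level, restored.score);
```

Rewriting the whole file on every change works for small states, but it does not handle concurrent writers or partial updates, which is where a database becomes the better choice.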
We will use MongoDB, a NoSQL database that is fast, flexible, scalable, and simple to use.
What is MongoDB?
MongoDB is a database application with a client-server architecture. The core concepts in MongoDB are:
A document represents a single entity and is similar to a row from SQL databases.
A collection represents a set of documents and it is similar to a table from SQL databases.
A database represents a set of collections and it is similar to a database from SQL databases.
The MongoDB server (the mongod binary) stores the data it receives from clients. A client can be a Node.js app, a Ruby app, or even a CLI app. MongoDB ships with a default command-line shell. The client communicates with the server over MongoDB's own wire protocol.
Every document in MongoDB has a default '_id' property, which uniquely identifies the document within its collection.
MongoDB is schemaless by default, meaning that one collection can hold documents with different structures, properties, or sizes.
Because MongoDB documents are JSON-like (they are stored as BSON, a binary form of JSON), they map directly to the objects in the application, and no object-relational mapping is needed.
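To make the schemaless idea concrete, here is a small sketch that uses plain JavaScript objects in place of real documents. The users array, the findById helper, and the '_id' strings are placeholders invented for this example; MongoDB itself would generate ObjectId values.

```javascript
// Plain objects standing in for documents in a hypothetical "users" collection.
// The _id strings are made-up placeholders, not real ObjectId values.
const users = [
  { _id: "u1", name: "Ana", email: "ana@example.com" },
  { _id: "u2", name: "Bob", age: 42, languages: ["en", "ro"] },
  { _id: "u3", name: "Cleo", address: { city: "Cluj", zip: "400000" } },
];

// Look up a document by its unique _id; returns null when nothing matches.
function findById(collection, id) {
  return collection.find((doc) => doc._id === id) || null;
}

// Each document carries its own set of fields; only _id is common to all.
const fieldSets = users.map((doc) => Object.keys(doc).sort().join(","));
console.log(fieldSets);
console.log(findById(users, "u2"));
```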
MongoDB is an ACID-compliant database. ACID is a set of properties for database transactions, and the acronym stands for atomicity, consistency, isolation, and durability. ACID-compliant databases are desirable because they guarantee the validity of the database state in case of exceptions, errors, or power failures.
Atomicity means that a transaction is indivisible: either all of it occurs or none of it does, so the database is never left in a half-updated state.
Consistency means that a transaction can only move the database from one valid state to another, changing data according to a predefined set of rules.
Isolation determines how and when the changes made by one transaction become visible to other users or systems, so that concurrent transactions do not interfere with each other.
Durability means that once a transaction is committed, its data survives crashes and power failures.
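Atomicity is the easiest of the four to show in code. The sketch below is only an in-memory illustration of the all-or-nothing idea, not how MongoDB implements transactions; runTransaction, transfer, and the account data are hypothetical names made up for the example.

```javascript
// Apply every operation to a copy of the state and commit only if all succeed.
function runTransaction(state, operations) {
  const draft = JSON.parse(JSON.stringify(state)); // work on a copy
  for (const op of operations) {
    op(draft); // an operation throws to signal failure
  }
  return draft; // reached only when every operation succeeded
}

// A transfer is two steps that must happen together: debit, then credit.
function transfer(from, to, amount) {
  return [
    (accounts) => {
      if (accounts[from] < amount) throw new Error("insufficient funds");
      accounts[from] -= amount;
    },
    (accounts) => {
      if (!(to in accounts)) throw new Error("unknown account: " + to);
      accounts[to] += amount;
    },
  ];
}

let accounts = { alice: 100, bob: 50 };
accounts = runTransaction(accounts, transfer("alice", "bob", 30)); // commits

try {
  // The debit succeeds on the draft, the credit fails, so nothing is committed.
  accounts = runTransaction(accounts, transfer("alice", "carol", 10));
} catch (err) {
  console.log("rolled back:", err.message);
}
console.log(accounts);
```

The failed transfer leaves the balances exactly as they were after the first one, which is the guarantee atomicity describes.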
Due to the flexibility of MongoDB, there are two types of data modeling: embedded and normalized.
In embedded mode, all the data that is related and usually used together lives in a single document. The advantages are that we don’t need complicated queries and that it's easier to debug. The disadvantage is that it may take more space, because data can be duplicated. Space is cheap nowadays, so the tradeoff is easy. ACID properties are preserved because single-document operations are atomic.
In normalized mode, you get the space optimizations but you lose simplicity. It is similar to SQL: instead of having all the data embedded, a document holds just a reference to the id of another document. This way, when you need the complete data, you perform multiple queries. ACID properties are preserved through multi-document transactions, which are available since MongoDB 4.0.
Let’s say we have to design a blog. A post should have an id, title, description, URL, likes, author, and timestamps. A comment should have a post_id, text, author, timestamps, likes, and comments. We also want to add tags; a tag list should have an id, post_id, and tag. In an SQL database, this would take at least three tables. In MongoDB, it can be done very easily with a single collection and one document for each post.
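The two modeling styles can be compared side by side for the blog example. This is a sketch with plain JavaScript objects rather than real database calls, and every value in it (ids, titles, authors, the loadPost helper) is made up for illustration.

```javascript
// Embedded: one document holds the post together with its comments and tags.
const embeddedPost = {
  _id: "p1",
  title: "Persistency & MongoDB",
  description: "Saving application state",
  url: "/posts/persistency-mongodb",
  likes: 12,
  author: "ana",
  createdAt: new Date("2020-07-17"),
  tags: ["mongodb", "rest"],
  comments: [
    { text: "Great read", author: "bob", likes: 2, comments: [] },
  ],
};

// Normalized: three separate collections linked by post_id, as in SQL.
const posts = [{ _id: "p1", title: "Persistency & MongoDB", likes: 12, author: "ana" }];
const comments = [{ _id: "c1", post_id: "p1", text: "Great read", author: "bob", likes: 2 }];
const tags = [
  { _id: "t1", post_id: "p1", tag: "mongodb" },
  { _id: "t2", post_id: "p1", tag: "rest" },
];

// Reading a complete post from normalized data takes one lookup per collection.
function loadPost(postId) {
  const post = posts.find((p) => p._id === postId);
  if (!post) return null;
  return {
    ...post,
    comments: comments.filter((c) => c.post_id === postId),
    tags: tags.filter((t) => t.post_id === postId).map((t) => t.tag),
  };
}

console.log(embeddedPost.tags); // embedded: already there
console.log(loadPost("p1").tags); // normalized: assembled from three lookups
```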
Always design the schema according to the problem you are trying to solve. Combine objects if you plan to use them together. Don’t be afraid to duplicate the data. Optimize for most frequent use cases.
We use Mongoose, a Node.js package, to give our database some structure.
Mongoose is an ODM (object-document mapper), the document-database counterpart of an ORM: it maps the models in our application to the documents in the database. This makes it simpler to query the database. Mongoose also validates documents against the schema before saving them, so we don’t need to worry too much about this aspect.
Mongoose logic happens in the application layer, not in the database, so we can still keep the flexibility of MongoDB if needed.
Mongoose maintains a default global connection, so there is no need to pass the connection object around.
Creating a schema sets up the rules for your data: field types, default values, and validation rules (for example, a creation date that defaults to the current time).
That's all for now. Next up: Security Tips.