In this article, I will talk about how to query documents in Apache CouchDB via views.
What is Apache CouchDB?
First, a short introduction to CouchDB for those who don't know it. Apache CouchDB is an open-source, document-oriented NoSQL database implemented in Erlang. It is very easy to use because CouchDB is built on the ubiquitous HTTP protocol and the JSON data format. Do check out the official website for more detail.
Alright, back to our main topic today.
First of all, before we talk about what a view is, I need to introduce two important things in CouchDB.
Query Server
The first thing to introduce is the CouchDB query server. What is a query server? According to the official documentation:
The Query server is an external process that communicates with CouchDB by JSON protocol through stdio interface and processes all design functions calls, such as JavaScript views.
By default, CouchDB has a built-in JavaScript query server running on Mozilla SpiderMonkey. That means we can define a JavaScript function to tell CouchDB which documents we want to query.
Note: If you are not comfortable with JavaScript, you can use a query server for another programming language, such as Python, Ruby, or Clojure. The query server settings are described in the CouchDB configuration documentation.
Okay, then where do we define the JavaScript function? That brings us to the second thing to introduce.
Design Document
A design document is a special document within a CouchDB database. You can use design documents to build indexes, validate document updates, format query results, and filter replications. Below is an example of the design document structure.
{
  "_id": "_design/example",
  "views": {
    "view-number-one": {
      "map": "function (doc) {/* function code here */}"
    },
    "view-number-two": {
      "map": "function (doc) {/* function code here */}",
      "reduce": "function (keys, values, rereduce) {/* function code here */}"
    }
  },
  "updates": {
    "updatefun1": "function(doc,req) {/* function code here */}",
    "updatefun2": "function(doc,req) {/* function code here */}"
  },
  "filters": {
    "filterfunction1": "function(doc, req){ /* function code here */ }"
  },
  "validate_doc_update": "function(newDoc, oldDoc, userCtx, secObj) { /* function code here */ }",
  "language": "javascript"
}
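As a minimal sketch of how such a design document could be stored: CouchDB accepts it through a plain HTTP PUT to /{db}/_design/{name}. The helper name buildDesignDocRequest, the host http://localhost:5984, and the database name blog are assumptions made for illustration, and authentication is left out.

```javascript
// Hypothetical helper: prepares the HTTP request for saving a design document.
// It only builds the request; pass the result to fetch() to actually send it.
function buildDesignDocRequest(baseUrl, dbName, designDoc) {
  // designDoc._id already has the form "_design/<name>", so it maps onto the path.
  const url = `${baseUrl}/${encodeURIComponent(dbName)}/${designDoc._id}`;
  return {
    url,
    options: {
      method: "PUT",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(designDoc),
    },
  };
}

const designDoc = {
  _id: "_design/example",
  views: {
    "view-number-one": {
      // Functions are stored as strings inside the design document.
      map: "function (doc) { emit(doc._id, null); }",
    },
  },
  language: "javascript",
};

const request = buildDesignDocRequest("http://localhost:5984", "blog", designDoc);
console.log(request.url);
// To send it (Node 18+ or a browser): await fetch(request.url, request.options);
```

When updating a design document that already exists, the body also needs the current _rev value, just like any other CouchDB document.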
Let's break it down chunk by chunk.
1. CouchDB's document ID.
_id is a reserved property key that holds the ID of the JSON document you save in the database. If the ID starts with _design/, the document is a design document.
"_id": "_design/example",
2. View functions
This is where we define the query logic of our views. It is mostly written as JavaScript functions, since JavaScript is the default query server language. We will go into more detail on view functions later.
"views": {
  "view-number-one": {
    "map": "function (doc) {/* function code here */}"
  },
  "view-number-two": {
    "map": "function (doc) {/* function code here */}",
    "reduce": "function (keys, values, rereduce) {/* function code here */}"
  }
},
3. Update functions
Update functions are pieces of logic stored on the CouchDB server that we can invoke with a request to create or update a document.
"updates": {
  "updatefun1": "function(doc,req) {/* function code here */}",
  "updatefun2": "function(doc,req) {/* function code here */}"
},
4. Filter functions
Filter functions are used to filter the database changes feed.
"filters": {
  "filterfunction1": "function(doc, req){ /* function code here */ }"
},
5. Validate Document Update Function
As the name suggests, you can define validation rules in this function; CouchDB runs it to validate a document whenever one is created or updated.
"validate_doc_update": "function(newDoc, oldDoc, userCtx, secObj) { /* function code here */ }",
6. Language
The language property tells CouchDB which query server (that is, which programming language) the functions in this design document are written for.
"language": "javascript"
I won't dive deep into update, filter, and validation functions, as our focus today is the view function. If you are interested, leave a message below to let me know, and I can share a post about how to use update functions too.
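Still, for a rough idea of the shape of these three function types, here is a minimal sketch written as plain functions so it can be tried outside CouchDB. The function names (touch, byType, requireTitle) and the hand-made request objects are assumptions for illustration; in a real design document each function is stored as a string under updates, filters, or validate_doc_update.

```javascript
// Update function: receives the existing document (or null) and the request,
// and returns [documentToSave, response].
function touch(doc, req) {
  if (!doc) {
    // No document with the requested ID yet: create one.
    // CouchDB supplies req.uuid as a freshly generated ID.
    doc = { _id: req.id || req.uuid, type: "post" };
  }
  doc.updatedAt = new Date().toISOString();
  return [doc, "saved " + doc._id];
}

// Filter function: return true to let a change through the changes feed.
function byType(doc, req) {
  return doc.type === req.query.type;
}

// Validation function: throw an object with a "forbidden" key to reject a write.
function requireTitle(newDoc, oldDoc, userCtx, secObj) {
  if (newDoc._deleted) return; // always allow deletions
  if (newDoc.type === "post" && !newDoc.title) {
    throw { forbidden: "A post needs a title." };
  }
}

// Local dry run with hypothetical request objects.
const [saved, message] = touch(null, { id: "post-1", uuid: "unused" });
console.log(message);
console.log(byType({ type: "post" }, { query: { type: "post" } }));
try {
  requireTitle({ type: "post" }, null, {}, {});
} catch (err) {
  console.log(err.forbidden);
}
```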
Back to Views
What is a View?
A view in Apache CouchDB is a little bit similar to a view in a normal SQL database.
A database view is a subset of a database and is based on a query that runs on one or more database tables.
The difference is that a CouchDB view is based on Map Reduce.
As the example design document above shows, a view consists of two property keys, map and reduce: one holds the map function and the other holds the reduce function. (The reduce function is optional.)
1. Map function
Map functions accept a single document as the argument and (optionally) emit() key/value pairs that are stored in a view.
Let's say we have a list of blog post documents saved in our CouchDB database.
[
  {
    _id: "c2ec3b79-d9ac-45a8-8c68-0f05cb3adfac",
    title: "Post One Title",
    content: "Post one content.",
    author: "John Doe",
    status: "submitted",
    date: "2021-10-30T14:57:05.547Z",
    type: "post"
  },
  {
    _id: "ea885d7d-7af2-4858-b7bf-6fd01bcd4544",
    title: "Post Two Title",
    content: "Post two content.",
    author: "Jane Doe",
    status: "draft",
    date: "2021-09-29T08:37:05.547Z",
    type: "post"
  },
  {
    _id: "4a2348ca-f27c-427f-a490-e29f2a64fdf2",
    title: "Post Three Title",
    content: "Post three content.",
    author: "John Doe",
    status: "submitted",
    date: "2021-08-02T05:31:05.547Z",
    type: "post"
  },
  ...
]
If we want to query posts by status, we can create a JavaScript map function as below:
function (document) {
  emit(document.status, document);
}
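To see what this map function produces, here is a rough in-memory stand-in for the indexing step. It is only a sketch: buildView and the global emit defined here are hypothetical helpers (inside CouchDB, emit is provided by the query server), the document IDs are shortened placeholders, and plain string comparison is used instead of CouchDB's own collation rules.

```javascript
// Stand-in for the emit() that CouchDB's query server provides to map functions.
let rows = [];
let currentDocId = null;
function emit(key, value) {
  rows.push({ id: currentDocId, key, value });
}

// The map function from the article, unchanged.
const mapByStatus = function (document) {
  emit(document.status, document);
};

// Hypothetical helper: run the map function over every document and
// keep the emitted rows sorted by key, as a view index does.
function buildView(docs, mapFn) {
  rows = [];
  for (const doc of docs) {
    currentDocId = doc._id;
    mapFn(doc);
  }
  return rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const posts = [
  { _id: "post-1", title: "Post One Title", status: "submitted" },
  { _id: "post-2", title: "Post Two Title", status: "draft" },
  { _id: "post-3", title: "Post Three Title", status: "submitted" },
];

const view = buildView(posts, mapByStatus);
// Each row carries the emitting document's id, the emitted key, and the emitted value.
console.log(view.map((row) => `${row.key} -> ${row.value.title}`));
```

Because the rows are kept sorted by key, looking up one key or a range of keys does not require scanning every document.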
The whole design document will look like this:
{
  "_id": "_design/posts",
  "views": {
    "byStatus": {
      "map": "function (document) { emit(document.status, document); }"
    }
  },
  "language": "javascript"
}
After we save this design document into CouchDB, CouchDB builds the view index (depending on the version and configuration, either in the background or on the first query). That's it, we have created a CouchDB view successfully.
To use the view, just send an HTTP GET request to the URL below:
http://{YOUR_COUCHDB_HOST}:5984/{YOUR_DATABASE_NAME}/_design/posts/_view/byStatus
If we want to get all the posts with status "draft", we add the parameter key="draft" to the request, and it will return only the posts with status "draft".
http://{YOUR_COUCHDB_HOST}:5984/{YOUR_DATABASE_NAME}/_design/posts/_view/byStatus?key="draft"
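The double quotes around draft matter: key, startkey, and endkey are JSON values, and they should be URL-encoded when the request is built in code. A minimal sketch of a URL builder follows; viewUrl, the host http://localhost:5984, and the database name blog are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: builds a view URL, JSON-encoding the parameters that
// CouchDB expects as JSON values and URL-encoding everything.
function viewUrl(baseUrl, dbName, designName, viewName, params = {}) {
  const jsonParams = new Set(["key", "startkey", "endkey"]);
  const query = Object.entries(params)
    .map(([name, value]) => {
      const raw = jsonParams.has(name) ? JSON.stringify(value) : String(value);
      return `${name}=${encodeURIComponent(raw)}`;
    })
    .join("&");
  const path = `${baseUrl}/${dbName}/_design/${designName}/_view/${viewName}`;
  return query ? `${path}?${query}` : path;
}

const url = viewUrl("http://localhost:5984", "blog", "posts", "byStatus", {
  key: "draft",
});
console.log(url);
// To run the query (Node 18+ or a browser):
//   const response = await fetch(url);
//   const { rows } = await response.json();
```

The response is a JSON object with total_rows, offset, and a rows array, where each row has the shape { id, key, value }.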
Let's say we have another view, byDate, whose map function emits documents by date:
function (document) {
  emit(document.date, document);
}
Then we can query blog posts by date range.
http://{YOUR_COUCHDB_HOST}:5984/{YOUR_DATABASE_NAME}/_design/posts/_view/byDate?startkey=""&endkey="2021-09-29\uffff"
In the query above, I defined a start date via startkey and an end date via endkey, so CouchDB will return the posts whose keys fall between startkey and endkey. However, my startkey is an empty string, meaning that I don't care about the start date: just give me everything from the first post document up to the date of the endkey.
Tip: If you want to reverse the order of the results, just add the parameter descending=true. Note that with descending=true the range is walked from high to low, so startkey and endkey need to be swapped.
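The range query and the descending tip can be tried out on a plain array. This is only a sketch under simplifying assumptions: queryRange is a hypothetical helper, the rows are hand-written from the sample posts, and JavaScript string comparison stands in for CouchDB's collation.

```javascript
// Hypothetical stand-in for a view range query over rows already sorted by key.
function queryRange(rows, { startkey, endkey, descending = false } = {}) {
  const ordered = descending ? [...rows].reverse() : rows;
  // In descending mode the range runs from the higher key down to the lower one.
  const low = descending ? endkey : startkey;
  const high = descending ? startkey : endkey;
  return ordered.filter(
    ({ key }) =>
      (low === undefined || key >= low) && (high === undefined || key <= high)
  );
}

const rows = [
  { key: "2021-08-02T05:31:05.547Z", value: "Post Three Title" },
  { key: "2021-09-29T08:37:05.547Z", value: "Post Two Title" },
  { key: "2021-10-30T14:57:05.547Z", value: "Post One Title" },
];

// "\uffff" compares higher than any character in these date strings, so every
// timestamp on 2021-09-29 falls inside the range.
const upToSept29 = queryRange(rows, { startkey: "", endkey: "2021-09-29\uffff" });
console.log(upToSept29.map((row) => row.value));

// Same range, newest first: startkey and endkey trade places.
const newestFirst = queryRange(rows, {
  startkey: "2021-09-29\uffff",
  endkey: "",
  descending: true,
});
console.log(newestFirst.map((row) => row.value));
```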
2. Reduce/Rereduce
The reduce function is optional for a view. It operates on the result of the map function, so you can perform a SUM, a COUNT, or custom logic to derive the result you want.
Let's say we have a map function whose result is a list of (month, expenses) pairs:
function (document) {
  emit(document.month, document.expenses);
}
If we want to get the February expenses only, we add the parameter key="february", and it will return only the February expenses.
Based on the map result, we can add a reduce function to sum the February expenses for us.
function(keys, values, rereduce) {
  return sum(values);
}
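As a minimal sketch of what this reduce step computes, the snippet below runs the same function over hand-written map rows. The expense amounts and document IDs are invented for illustration, and sum is defined locally because it is normally provided by CouchDB's JavaScript query server.

```javascript
// sum() exists inside CouchDB's JavaScript query server; defined here so the
// sketch runs on its own.
const sum = (values) => values.reduce((total, n) => total + n, 0);

// The reduce function from the article.
function reduceFn(keys, values, rereduce) {
  return sum(values);
}

// Hypothetical map output: one row per expense document.
const mapRows = [
  { id: "e1", key: "february", value: 120 },
  { id: "e2", key: "february", value: 30.5 },
  { id: "e3", key: "january", value: 80 },
];

// Rough equivalent of querying the view with key="february".
const february = mapRows.filter((row) => row.key === "february");
const total = reduceFn(
  february.map((row) => [row.key, row.id]), // keys arrive as [key, docId] pairs
  february.map((row) => row.value),
  false
);

// A reduced view response carries the result in a single row.
console.log({ rows: [{ key: null, value: total }] });
```

For a plain sum, CouchDB also ships a built-in reducer that can be used by setting "reduce": "_sum" in the design document.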
That's it. We get the sum quickly no matter how many documents we have in the database, because CouchDB stores reduce results in the view index instead of recomputing them for every query. This is the power of Map Reduce. CouchDB can even rereduce, meaning it runs the reduce function a second time on the results of earlier reduce calls. For more detail, you may check out the official documentation.
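Rereduce is easiest to see with a counting function, where the first pass and the second pass must behave differently. The sketch below is an illustration only: countFn is a hypothetical reduce function, the values are invented, and the split into two chunks is arbitrary, since CouchDB decides on its own how rows are grouped.

```javascript
// Hypothetical reduce function that counts rows.
function countFn(keys, values, rereduce) {
  if (rereduce) {
    // Second pass: values are counts returned by earlier reduce calls.
    return values.reduce((total, n) => total + n, 0);
  }
  // First pass: values are the values emitted by the map function.
  return values.length;
}

const emitted = [120, 30.5, 80, 45, 10];

// First pass over two arbitrary chunks of the map output.
const chunks = [emitted.slice(0, 3), emitted.slice(3)];
const partialCounts = chunks.map((chunk) =>
  countFn(chunk.map((_, i) => ["key", "doc-" + i]), chunk, false)
);

// Rereduce pass: keys is null and values holds the partial results.
const finalCount = countFn(null, partialCounts, true);
console.log(partialCounts, finalCount);
```

A function like sum(values) needs no special case, because adding up partial sums gives the same answer as adding up the original values.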
In Conclusion
CouchDB views are very powerful, flexible, and super fast at querying results, in the same Map Reduce style as Hadoop. However, CouchDB only supports one layer of map reduce derivation. If you do not understand what Map Reduce is, you may check out an introductory video about it on YouTube.
Thank you for reading.