Shout out to @dequanchen who figured out most of the material below.
Overview
Apache HBase provides realtime random read/write access to large datasets. HBase is built on top of Apache Hadoop and can scale to billions of rows and millions of columns. One of the capabilities of Apache HBase is a REST server, previously called Stargate. This REST server makes it possible to interact with HBase from any programming language. As features get added to HBase, they are then implemented in the REST API.
Apache HBase and Atomic Operations
Atomic operations in Apache HBase are important since they reduce the number of round trip calls between the client and server. They also avoid the complicated locking that would otherwise be required when there are multiple clients. Two atomic operations (checkAndPut and checkAndDelete) were added to the REST server as part of HBASE-4720. These atomic operations are among the undocumented features of the REST server. The gap has been tracked by an open HBase JIRA, HBASE-7129, since November 2012. Recently my team figured out how the atomic operations capability works and will be providing a patch back to the Apache HBase community.
Using the Apache HBase REST API
The Apache HBase REST API has historically been documented in two places: the Apache HBase Reference Guide and the JavaDocs. For current versions of Apache HBase, the documentation has mostly been moved to the reference guide.
The REST API supports several formats:
- Plain Text - application/octet-stream
- XML - text/xml
- JSON - application/json
- Protocol Buffers - application/x-protobuf
Each of these formats can be specified as part of the Accept header. This blog post will focus on XML and JSON since they are easiest to work with directly.
Many of the Apache HBase REST API endpoints require the use of base64 encoding. base64 encoding ensures that the data can be transmitted across REST without any issues. Keep in mind that a stray newline or any other extra character changes the base64 output. Both checkAndPut and checkAndDelete require that the check value match exactly, so be careful to avoid extra characters.
When using the Apache HBase REST APIs, the literal value in the URL typically has to match the base64 encoded value in the request body. The examples in this blog show what values need to be base64 encoded.
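The newline pitfall is easy to reproduce from a shell. The sketch below assumes a POSIX shell with the base64 utility (GNU coreutils or macOS); the helper name matches_literal is hypothetical and not part of HBase. It decodes a body value and compares it with the literal used in the URL, which shows why a value encoded through echo does not match.

```shell
#!/bin/sh
# matches_literal LITERAL ENCODED  (hypothetical helper, not part of HBase)
# Succeeds only when ENCODED decodes to exactly LITERAL.
matches_literal() {
  # The trailing sentinel "x" keeps command substitution from hiding a newline.
  decoded=$(printf '%s' "$2" | base64 --decode; printf 'x')
  [ "$decoded" = "$1x" ]
}

good=$(printf '%s' rowkey | base64)   # printf '%s' adds no trailing newline
bad=$(echo rowkey | base64)           # echo appends a newline before encoding

echo "printf: $good"
echo "echo:   $bad"

matches_literal rowkey "$good" && echo "printf-encoded value matches the URL literal"
matches_literal rowkey "$bad" || echo "echo-encoded value does not match the URL literal"
```

The two encoded strings differ only by the encoded newline, which is enough to make an exact-match check fail.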
Example Base64 Encoded Values
The base64 encoded values below are used in the examples in this blog post.
Literal Text | Base64 Encoded |
---|---|
rowkey | cm93a2V5 |
columnfamily:qualifier | Y29sdW1uZmFtaWx5OnF1YWxpZmllcg== |
checkvalue | Y2hlY2t2YWx1ZQ== |
newvalue | bmV3dmFsdWU= |
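The table can be regenerated from a shell, which is handy when substituting your own row keys and values. This is a minimal sketch assuming a POSIX shell with the base64 utility; the helper name b64 is hypothetical.

```shell
#!/bin/sh
# b64: hypothetical helper that base64-encodes its argument exactly as given.
# printf '%s' adds no trailing newline; tr removes any line wrapping from base64.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Print one table row per literal used in this post.
for literal in rowkey columnfamily:qualifier checkvalue newvalue; do
  printf '%s | %s |\n' "$literal" "$(b64 "$literal")"
done
```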
Apache HBase REST API - checkAndPut
checkAndPut checks the value of the latest version of a cell and, if there is a match, puts new data into the same cell.
checkAndPut Single Qualifier
This checkAndPut call checks a specific qualifier value and, if it matches, puts a new value into that qualifier for the row key specified in the URL. Both values are passed in the request body; in the example body, the cell holding the new value is listed first and the cell holding the check value second. Below is the HTTP method and endpoint, followed by an example request body with explanation and then specific curl examples for XML and JSON.
# HTTP Method and Endpoint
PUT http://localhost:8084/namespace:table/rowkey/?check=put
# Example XML Request Body with Explanation
<CellSet>
  <Row key="Base64 Encoded RowKey">
    <Cell column="Base64 column family : qualifier">Base64 new value</Cell>
    <Cell column="Base64 column family : qualifier">Base64 check value</Cell>
  </Row>
</CellSet>
Content-Type: text/xml
curl -i -H 'Accept: text/xml' -H 'Content-Type: text/xml' \
  -XPUT 'http://localhost:8084/namespace:table/rowkey/?check=put' \
  -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CellSet>
  <Row key="cm93a2V5">
    <Cell column="Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==">bmV3dmFsdWU=</Cell>
    <Cell column="Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==">Y2hlY2t2YWx1ZQ==</Cell>
  </Row>
</CellSet>'
Content-Type: application/json
curl -i -H 'Accept: application/json' -H 'Content-Type: application/json' \
  -XPUT 'http://localhost:8084/namespace:table/rowkey/?check=put' \
  -d '
{
  "Row": [
    {
      "key": "cm93a2V5",
      "Cell": [
        {"column": "Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==", "$": "bmV3dmFsdWU="},
        {"column": "Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==", "$": "Y2hlY2t2YWx1ZQ=="}
      ]
    }
  ]
}'
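Hand-encoding every value gets tedious, so the JSON body can be generated from the literal values instead. The sketch below assumes a POSIX shell with the base64 utility; the helper names b64 and check_and_put_body are hypothetical. It follows the cell order used in this post's example, with the new value first and the check value second.

```shell
#!/bin/sh
# b64: hypothetical helper that base64-encodes its argument without a newline.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# check_and_put_body ROW COLUMN NEW CHECK  (hypothetical helper)
# Prints a compact JSON CellSet: the new value cell first, the check value cell second.
check_and_put_body() {
  printf '{"Row":[{"key":"%s","Cell":[{"column":"%s","$":"%s"},{"column":"%s","$":"%s"}]}]}\n' \
    "$(b64 "$1")" "$(b64 "$2")" "$(b64 "$3")" "$(b64 "$2")" "$(b64 "$4")"
}

check_and_put_body rowkey columnfamily:qualifier newvalue checkvalue

# Possible use against the endpoint from this post (not run here):
# check_and_put_body rowkey columnfamily:qualifier newvalue checkvalue |
#   curl -i -H 'Content-Type: application/json' \
#     -XPUT 'http://localhost:8084/namespace:table/rowkey/?check=put' -d @-
```

Base64 output only contains letters, digits, `+`, `/`, and `=`, so the encoded values need no further JSON escaping.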
Apache HBase REST API - checkAndDelete
checkAndDelete checks the value of a cell and, if it matches, deletes a specific version of a qualifier, all versions of a qualifier, a column family, or a row.
checkAndDelete Qualifier Single Version
This checkAndDelete call checks a specific qualifier value as specified in the request body and, if it matches, deletes the single version of the qualifier specified in the URL. Below is the HTTP method and endpoint, followed by an example request body with explanation and then specific curl examples for XML and JSON.
# HTTP Method and Endpoint
DELETE http://localhost:8084/namespace:table/rowkey/columnfamily:qualifier/version/?check=delete
# Example XML Request Body with Explanation
<CellSet>
  <Row key="Base64 Encoded RowKey">
    <Cell column="Base64 column family : qualifier">Base64 check value</Cell>
  </Row>
</CellSet>
Content-Type: text/xml
curl -i -H 'Accept: text/xml' -H 'Content-Type: text/xml' \
  -XDELETE 'http://localhost:8084/namespace:table/rowkey/columnfamily:qualifier/version/?check=delete' \
  -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CellSet>
  <Row key="cm93a2V5">
    <Cell column="Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==">Y2hlY2t2YWx1ZQ==</Cell>
  </Row>
</CellSet>'
Content-Type: application/json
curl -i -H 'Accept: application/json' -H 'Content-Type: application/json' \
  -XDELETE 'http://localhost:8084/namespace:table/rowkey/columnfamily:qualifier/version/?check=delete' \
  -d '
{
  "Row": [
    {
      "key": "cm93a2V5",
      "Cell": [
        {"column": "Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==", "$": "Y2hlY2t2YWx1ZQ=="}
      ]
    }
  ]
}'
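The checkAndDelete body carries a single cell holding the check value, so it is simple to generate. This is a minimal sketch assuming a POSIX shell with the base64 utility; the helper names b64 and check_and_delete_xml are hypothetical. It prints the XML declaration as the very first bytes of the body, since XML does not allow anything before the declaration.

```shell
#!/bin/sh
# b64: hypothetical helper that base64-encodes its argument without a newline.
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# check_and_delete_xml ROW COLUMN CHECK  (hypothetical helper)
# Prints an XML CellSet whose only cell holds the value to check.
check_and_delete_xml() {
  printf '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
  printf '<CellSet><Row key="%s"><Cell column="%s">%s</Cell></Row></CellSet>\n' \
    "$(b64 "$1")" "$(b64 "$2")" "$(b64 "$3")"
}

check_and_delete_xml rowkey columnfamily:qualifier checkvalue
```

The same body works for every checkAndDelete scope in this post; only the URL changes.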
checkAndDelete Qualifier All Versions
This checkAndDelete call checks a specific qualifier value as specified in the request body and, if it matches, deletes all versions of the qualifier specified in the URL. Below is the HTTP method and endpoint, followed by an example request body with explanation and then specific curl examples for XML and JSON.
# HTTP Method and Endpoint
DELETE http://localhost:8084/namespace:table/rowkey/columnfamily:qualifier/?check=delete
# Example XML Request Body with Explanation
<CellSet>
  <Row key="Base64 Encoded RowKey">
    <Cell column="Base64 column family : qualifier">Base64 check value</Cell>
  </Row>
</CellSet>
Content-Type: text/xml
curl -i -H 'Accept: text/xml' -H 'Content-Type: text/xml' \
  -XDELETE 'http://localhost:8084/namespace:table/rowkey/columnfamily:qualifier/?check=delete' \
  -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CellSet>
  <Row key="cm93a2V5">
    <Cell column="Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==">Y2hlY2t2YWx1ZQ==</Cell>
  </Row>
</CellSet>'
Content-Type: application/json
curl -i -H 'Accept: application/json' -H 'Content-Type: application/json' \
  -XDELETE 'http://localhost:8084/namespace:table/rowkey/columnfamily:qualifier/?check=delete' \
  -d '
{
  "Row": [
    {
      "key": "cm93a2V5",
      "Cell": [
        {"column": "Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==", "$": "Y2hlY2t2YWx1ZQ=="}
      ]
    }
  ]
}'
checkAndDelete Column Family
This checkAndDelete call checks a specific qualifier value as specified in the request body and, if it matches, deletes the column family specified in the URL. Below is the HTTP method and endpoint, followed by an example request body with explanation and then specific curl examples for XML and JSON.
# HTTP Method and Endpoint
DELETE http://localhost:8084/namespace:table/rowkey/columnfamily/?check=delete
# Example XML Request Body with Explanation
<CellSet>
  <Row key="Base64 Encoded RowKey">
    <Cell column="Base64 column family : qualifier">Base64 check value</Cell>
  </Row>
</CellSet>
Content-Type: text/xml
curl -i -H 'Accept: text/xml' -H 'Content-Type: text/xml' \
  -XDELETE 'http://localhost:8084/namespace:table/rowkey/columnfamily/?check=delete' \
  -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CellSet>
  <Row key="cm93a2V5">
    <Cell column="Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==">Y2hlY2t2YWx1ZQ==</Cell>
  </Row>
</CellSet>'
Content-Type: application/json
curl -i -H 'Accept: application/json' -H 'Content-Type: application/json' \
  -XDELETE 'http://localhost:8084/namespace:table/rowkey/columnfamily/?check=delete' \
  -d '
{
  "Row": [
    {
      "key": "cm93a2V5",
      "Cell": [
        {"column": "Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==", "$": "Y2hlY2t2YWx1ZQ=="}
      ]
    }
  ]
}'
checkAndDelete Row
This checkAndDelete call checks a specific qualifier value as specified in the request body and, if it matches, deletes the row specified in the URL. Below is the HTTP method and endpoint, followed by an example request body with explanation and then specific curl examples for XML and JSON.
# HTTP Method and Endpoint
DELETE http://localhost:8084/namespace:table/rowkey/?check=delete
# Example XML Request Body with Explanation
<CellSet>
  <Row key="Base64 Encoded RowKey">
    <Cell column="Base64 column family : qualifier">Base64 check value</Cell>
  </Row>
</CellSet>
Content-Type: text/xml
curl -i -H 'Accept: text/xml' -H 'Content-Type: text/xml' \
  -XDELETE 'http://localhost:8084/namespace:table/rowkey/?check=delete' \
  -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CellSet>
  <Row key="cm93a2V5">
    <Cell column="Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==">Y2hlY2t2YWx1ZQ==</Cell>
  </Row>
</CellSet>'
Content-Type: application/json
curl -i -H 'Accept: application/json' -H 'Content-Type: application/json' \
  -XDELETE 'http://localhost:8084/namespace:table/rowkey/?check=delete' \
  -d '
{
  "Row": [
    {
      "key": "cm93a2V5",
      "Cell": [
        {"column": "Y29sdW1uZmFtaWx5OnF1YWxpZmllcg==", "$": "Y2hlY2t2YWx1ZQ=="}
      ]
    }
  ]
}'
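The four checkAndDelete variants differ only in how much of the path is present in the URL. The sketch below builds each endpoint from its parts, mirroring the URL layout used in this post; the helper name check_and_delete_url is hypothetical, and it assumes every argument is already URL-safe since no percent-encoding is done.

```shell
#!/bin/sh
# check_and_delete_url BASE TABLE ROW [COLUMN] [VERSION]  (hypothetical helper)
# The more path segments are given, the narrower the delete scope.
check_and_delete_url() {
  url="$1/$2/$3"
  if [ -n "${4:-}" ]; then url="$url/$4"; fi
  if [ -n "${5:-}" ]; then url="$url/$5"; fi
  printf '%s/?check=delete\n' "$url"
}

base='http://localhost:8084'
# Single version of a qualifier ("version" is the placeholder used in this post)
check_and_delete_url "$base" namespace:table rowkey columnfamily:qualifier version
# All versions of a qualifier
check_and_delete_url "$base" namespace:table rowkey columnfamily:qualifier
# Whole column family
check_and_delete_url "$base" namespace:table rowkey columnfamily
# Whole row
check_and_delete_url "$base" namespace:table rowkey
```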
What is next?
As stated above, my team will be putting together a documentation patch for HBASE-7129 to improve the HBase REST server documentation. Besides checkAndPut and checkAndDelete, the REST API supports CheckAndMutate, IncrementColumnValue, and AppendValue. These three endpoints aren’t supported in the version of HBase we use and also aren’t documented. If we upgrade, we may look at them. We hope others will benefit from this capability, since the atomic operations support has been in the REST API for over 5 years now.