2. Motivation
• Distributed and parallel computation over data stored within HBase/Bigtable.
• Architecture: {HBase + MapReduce} vs. {HBase with Coprocessor}
– {Loosely coupled} vs. {Built in}
– E.g., simple additive or aggregating operations like summing, counting, and the like –
pushing the computation down to the servers where it can operate on the data
directly without communication overheads can give a dramatic performance
improvement over HBase’s already good scanning performance.
• To be a flexible, generic framework for extending HBase and for running distributed
computation directly within the HBase server processes.
– Arbitrary code can run at each tablet in each HBase server.
– Provides a very flexible model for building distributed services.
– Automatic scaling, load balancing, request routing for applications.
3. Motivation (cont.)
• To be a Data-Driven distributed and parallel service platform.
– Distributed parallel computation framework.
– Distributed application service platform.
• High-level call interface for clients
– Calls are addressed to rows or ranges of rows and the coprocessor client library
resolves them to actual locations;
– Calls across multiple rows are automatically split into multiple parallelized RPCs.
• Origin
– Inspired by Google’s Bigtable Coprocessors.
– Jeff Dean gave a talk at LADIS’09
• http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf, page 66-67
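The row-addressed call interface described above can be sketched in plain Java. This is a simplified model, not the HBase client implementation: the RegionRouter class, its method names, and the server names rs1/rs2 are hypothetical, and row keys are modelled as strings. A single-row call resolves to the region whose start key is the greatest key not larger than the row; a range call is split into one call per overlapping region.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified model of row-to-region routing; not the HBase client implementation.
class RegionRouter {
    // Maps each region's start key to a (hypothetical) server name.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String server) {
        regionsByStartKey.put(startKey, server);
    }

    // A single-row call goes to the region with the greatest start key <= row.
    String locate(String row) {
        Map.Entry<String, String> entry = regionsByStartKey.floorEntry(row);
        if (entry == null) {
            throw new IllegalArgumentException("no region covers row " + row);
        }
        return entry.getKey();
    }

    // A range call is split into one call per region overlapping [startRow, endRow].
    List<String> regionsInRange(String startRow, String endRow) {
        String first = locate(startRow);
        return new ArrayList<>(
            regionsByStartKey.subMap(first, true, endRow, true).keySet());
    }

    // Four regions on two servers; the start keys are illustrative.
    static RegionRouter sample() {
        RegionRouter router = new RegionRouter();
        router.addRegion("", "rs1");
        router.addRegion("bbbb", "rs1");
        router.addRegion("cccc", "rs2");
        router.addRegion("dddd", "rs2");
        return router;
    }

    public static void main(String[] args) {
        RegionRouter router = sample();
        System.out.println("row bzzz -> region starting at '" + router.locate("bzzz") + "'");
        System.out.println("range [bzzz, cccd] -> regions " + router.regionsInRange("bzzz", "cccd"));
    }
}
```

Each region returned by regionsInRange would receive its own RPC, which is what lets the client issue the calls in parallel.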
4. HBase vs. Google Bigtable
• HBase coprocessors are a framework that provides a library and runtime
environment for executing user code within the HBase
region server and master processes.
• Google coprocessors in contrast run co-located with the
tablet server but outside of its address space.
– https://issues.apache.org/jira/browse/HBASE-4047
6. Overview of HBase Coprocessor
• Two scopes
– System : loaded globally on all tables and regions.
– Per-table: loaded on all regions for a table.
• Two types
– Observers
• Like triggers in conventional databases.
• The idea behind observers is that we can insert user code by overriding upcall methods provided by the
coprocessor framework. The callback functions are executed from core HBase code when certain events occur.
– Endpoints
• Dynamic RPC endpoints that resemble stored procedures.
• One can invoke an endpoint at any time from the client. The endpoint implementation will then be executed
remotely at the target region or regions, and results from those executions will be returned to the client.
• Difference between the two types
– Only endpoints return results to the client.
7. Observers
• Currently, three observer interfaces are provided
– RegionObserver
• Provides hooks for data manipulation events, Get, Put, Delete, Scan, and so on. There is
an instance of a RegionObserver coprocessor for every table region and the scope of
the observations they can make is constrained to that region.
– WALObserver
• Provides hooks for write-ahead log (WAL) related operations. This is a way to observe
or intercept WAL writing and reconstruction events. A WALObserver runs in the context
of WAL processing. There is one such context per region server.
– MasterObserver
• Provides hooks for DDL-type operations, e.g., create, delete, or modify table. The
MasterObserver runs within the context of the HBase master.
• Multiple Observers are chained to execute sequentially by order of
assigned priorities.
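A minimal sketch of priority-ordered chaining, written in plain Java rather than against the HBase API. The ObserverChain class, the GetObserver interface, and the trace list are hypothetical, and the sketch assumes that a lower priority number runs first. Each registered observer's preGet hook runs before the operation itself, and any observer may veto the operation by throwing.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Simplified model of priority-ordered observer chaining; all names are hypothetical.
class ObserverChain {
    interface GetObserver {
        // Called before a Get is served; may throw to veto the operation.
        void preGet(String row, List<String> trace);
    }

    private static final class Entry {
        final int priority;
        final GetObserver observer;

        Entry(int priority, GetObserver observer) {
            this.priority = priority;
            this.observer = observer;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Assumption in this sketch: a lower priority number runs earlier.
    void register(int priority, GetObserver observer) {
        entries.add(new Entry(priority, observer));
        entries.sort(Comparator.comparingInt(e -> e.priority));
    }

    // Runs every pre-hook in priority order, then the operation itself.
    List<String> get(String row) {
        List<String> trace = new ArrayList<>();
        for (Entry entry : entries) {
            entry.observer.preGet(row, trace);
        }
        trace.add("get:" + row);
        return trace;
    }

    static ObserverChain sample() {
        ObserverChain chain = new ObserverChain();
        chain.register(20, (row, trace) -> trace.add("audit"));
        chain.register(10, (row, trace) -> {
            if (row.startsWith("secret")) {
                throw new SecurityException("access denied: " + row);
            }
            trace.add("access-check");
        });
        return chain;
    }

    public static void main(String[] args) {
        System.out.println(sample().get("row1"));
    }
}
```

Because the access check has the lower priority number, it runs before the audit hook even though it was registered second.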
9. Observers: Example Code
package org.apache.hadoop.hbase.coprocessor;

import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;

// Sample access-control coprocessor. It extends BaseRegionObserver and
// overrides the preXXX() hooks to check the user's privileges for the given
// table and column family.
// AccessDeniedException and permissionGranted() are assumed to be defined
// elsewhere; the slide does not show them.
public class AccessControlCoprocessor extends BaseRegionObserver {
  @Override
  public void preGet(final ObserverContext<RegionCoprocessorEnvironment> c,
      final Get get, final List<KeyValue> result) throws IOException {
    // Check permissions before the Get is served.
    if (!permissionGranted()) {
      throw new AccessDeniedException("User is not allowed to access.");
    }
  }
  // Override prePut(), preDelete(), etc. in the same way.
}
10. Endpoint
• Resembling stored procedures
• Invoke an endpoint at any time from the client.
• The endpoint implementation will then be executed
remotely at the target region or regions.
• Result from those executions will be returned to the client.
• Code implementation
– Endpoint is an interface for dynamic RPC extension.
11. Endpoints:
How to implement a custom Coprocessor?
• Define a new protocol interface that extends CoprocessorProtocol.
• Implement the Endpoint interface and the new protocol interface. The
implementation will be loaded into and executed from the region
context.
– Extend the abstract class BaseEndpointCoprocessor. This convenience class
hides some internal details that the implementer need not necessarily be
concerned about, such as coprocessor class loading.
• On the client side, the Endpoints can be invoked by two new HBase
client APIs:
– Executing against a single region:
• HTableInterface.coprocessorProxy(Class<T> protocol, byte[] row)
– Executing against a range of regions:
• HTableInterface.coprocessorExec(Class<T> protocol, byte[] startKey, byte[]
endKey, Batch.Call<T,R> callable)
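One way to picture the single-region call is as a dynamic proxy: the client receives an object that implements the protocol interface, and every method call on it is forwarded to the region that holds the given row. The sketch below models this with java.lang.reflect.Proxy from the JDK. It is not the HBase client code: GreetingProtocol and Transport are hypothetical, the static coprocessorProxy method is only named after the client call, and the row is a plain string.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Simplified model of a per-row protocol proxy; not the HBase client code.
class ProtocolProxySketch {
    // Hypothetical protocol interface, standing in for a CoprocessorProtocol.
    interface GreetingProtocol {
        String greet(String name);
    }

    // Hypothetical transport: delivers one call to the region holding the row.
    interface Transport {
        Object send(String row, String methodName, Object[] args);
    }

    // Returns an object implementing the protocol; each call is forwarded
    // through the transport to the region that holds the given row.
    static <T> T coprocessorProxy(Class<T> protocol, String row, Transport transport) {
        InvocationHandler handler =
            (proxy, method, args) -> transport.send(row, method.getName(), args);
        Object instance = Proxy.newProxyInstance(
            protocol.getClassLoader(), new Class<?>[] {protocol}, handler);
        return protocol.cast(instance);
    }

    public static void main(String[] args) {
        // A fake transport that answers locally instead of sending an RPC.
        Transport fake = (row, method, callArgs) ->
            method + "(" + callArgs[0] + ") executed at the region holding " + row;
        GreetingProtocol remote = coprocessorProxy(GreetingProtocol.class, "row-42", fake);
        System.out.println(remote.greet("hbase"));
    }
}
```

The application code only sees the protocol interface; where the call actually runs is decided by the row key passed when the proxy is created.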
12. Endpoints: Example
Client code:
new Batch.Call<ColumnAggregationProtocol, Long>() {
  public Long call(ColumnAggregationProtocol instance) throws IOException {
    return instance.sum(FAMILY, QUALIFIER);
  }
}
Map<byte[], Long> sumResults =
    table.coprocessorExec(ColumnAggregationProtocol.class, startRow, endRow)
[Figure: the client creates a new Batch.Call, and HTable.coprocessorExec(...) runs it
against the ColumnAggregationProtocol endpoint on all regions:
Region Server 1: regions tableA,,12345678 and tableA,bbbb,12345678
Region Server 2: regions tableA,cccc,12345678 and tableA,dddd,12345678
The batch results come back to the client as Map<byte[], Long> sumResults.]
• Note that the HBase client has the responsibility for dispatching parallel endpoint invocations to the
target regions, and for collecting the returned results to present to the application code.
• Like a lightweight MapReduce job: The “map” is the endpoint execution performed in the region
server on every target region, and the “reduce” is the final aggregation at the client.
• Hides the distributed systems programming details behind a clean API.
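The "lightweight MapReduce" analogy above can be modelled in a few lines of plain Java. This is only a sketch under stated assumptions: regions are in-memory arrays keyed by a start key, a thread pool stands in for the parallel RPCs, and the ScatterGatherSum class and its method names are hypothetical. The "map" is regionSum, run once per region; the "reduce" adds up the per-region results on the client.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Simplified model of parallel endpoint invocation; not the HBase client code.
class ScatterGatherSum {
    // "Map" step: each region sums its own values (stands in for the endpoint).
    static long regionSum(long[] values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return sum;
    }

    // Client side: dispatch one task per region in parallel, then collect
    // the per-region results keyed by region start key.
    static Map<String, Long> execOnAllRegions(Map<String, long[]> regions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, regions.size()));
        try {
            Map<String, Future<Long>> pending = new LinkedHashMap<>();
            for (Map.Entry<String, long[]> region : regions.entrySet()) {
                final long[] values = region.getValue();
                Callable<Long> task = () -> regionSum(values);
                pending.put(region.getKey(), pool.submit(task));
            }
            Map<String, Long> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Long>> p : pending.entrySet()) {
                results.put(p.getKey(), p.getValue().get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("region call failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // "Reduce" step: final aggregation at the client.
    static long reduce(Map<String, Long> perRegion) {
        long total = 0;
        for (long partial : perRegion.values()) {
            total += partial;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, long[]> regions = new LinkedHashMap<>();
        regions.put("", new long[] {1, 2, 3});
        regions.put("bbbb", new long[] {10});
        regions.put("cccc", new long[] {});
        regions.put("dddd", new long[] {100, 200});
        System.out.println("total = " + reduce(execOnAllRegions(regions)));
    }
}
```

Only one number per region crosses the (simulated) network, which is the point of pushing the aggregation down to where the data lives.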
13. Step-1: Define protocol interface
/**
 * A sample protocol for performing aggregation at regions.
 */
public interface ColumnAggregationProtocol extends CoprocessorProtocol
{
  /**
   * Perform aggregation for a given column at the region. The aggregation
   * will include all the rows inside the region. It can be extended to allow
   * passing start and end rows for a fine-grained aggregation.
   *
   * @param family the column family
   * @param qualifier the column qualifier
   * @return Aggregation of the column.
   * @throws IOException if the aggregation fails.
   */
  public long sum(byte[] family, byte[] qualifier) throws IOException;
}
14. Step-2: Implement endpoint and the interface
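import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.util.Bytes;

public class ColumnAggregationEndpoint extends BaseEndpointCoprocessor
    implements ColumnAggregationProtocol
{
  @Override
  public long sum(byte[] family, byte[] qualifier) throws IOException
  {
    // aggregate at each region
    Scan scan = new Scan();
    scan.addColumn(family, qualifier);
    long sumResult = 0;

    InternalScanner scanner =
        ((RegionCoprocessorEnvironment) getEnvironment()).getRegion()
            .getScanner(scan);
    try
    {
      List<KeyValue> curVals = new ArrayList<KeyValue>();
      boolean done = false;
      do
      {
        curVals.clear();
        // next() returns true while more rows remain to be scanned
        done = scanner.next(curVals);
        // guard against an empty batch before reading the first value
        if (!curVals.isEmpty())
        {
          KeyValue kv = curVals.get(0);
          sumResult += Bytes.toLong(kv.getBuffer(), kv.getValueOffset());
        }
      } while (done);
    }
    finally
    {
      scanner.close();
    }
    return sumResult;
  }
}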
15. Step-3 Deployment
• Two choices
– Load from configuration (hbase-site.xml, restart HBase)
– Load from table attribute (disable and enable table)
• From shell
16. Step-4: Invoking
• On the client side, invoking the endpoint:
HTable table = new HTable(util.getConfiguration(), TEST_TABLE);
Map<byte[], Long> results;
// scan: for all regions
results =
    table.coprocessorExec(ColumnAggregationProtocol.class,
        ROWS[rowSeperator1 - 1], ROWS[rowSeperator2 + 1],
        new Batch.Call<ColumnAggregationProtocol, Long>()
        {
          public Long call(ColumnAggregationProtocol instance)
              throws IOException
          {
            return instance
                .sum(TEST_FAMILY, TEST_QUALIFIER);
          }
        });
long sumResult = 0;
long expectedResult = 0;
// add up the per-region sums at the client
for (Map.Entry<byte[], Long> e : results.entrySet())
{
  sumResult += e.getValue();
}
17. Server side execution
• The region server provides an environment to execute custom
coprocessors in the region context.
• Exec
– Custom protocol name
– Method name
– Method parameters
public interface HRegionInterface
    extends VersionedProtocol, Stoppable, Abortable
{
  …
  ExecResult execCoprocessor(byte[] regionName, Exec call)
      throws IOException;
  …
}
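How a server could turn such an Exec into an actual method call can be sketched with plain Java reflection. This is a simplified model under stated assumptions, not the region server implementation: the ExecDispatchSketch class, the AddProtocol interface, and the fields of the local Exec class are hypothetical and only mirror the three items listed above (protocol name, method name, method parameters).

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Simplified model of server-side call dispatch; not the HBase region server code.
class ExecDispatchSketch {
    // Hypothetical protocol, standing in for a custom CoprocessorProtocol.
    interface AddProtocol {
        long add(long a, long b);
    }

    // Carries the protocol name, the method name, and the method parameters.
    static final class Exec {
        final String protocolName;
        final String methodName;
        final Class<?>[] parameterTypes;
        final Object[] parameters;

        Exec(String protocolName, String methodName,
             Class<?>[] parameterTypes, Object[] parameters) {
            this.protocolName = protocolName;
            this.methodName = methodName;
            this.parameterTypes = parameterTypes;
            this.parameters = parameters;
        }
    }

    private final Map<String, Class<?>> protocols = new HashMap<>();
    private final Map<String, Object> implementations = new HashMap<>();

    // Registers the implementation that serves a protocol on this server.
    <T> void register(Class<T> protocol, T implementation) {
        protocols.put(protocol.getName(), protocol);
        implementations.put(protocol.getName(), implementation);
    }

    // Looks up the protocol by name, finds the method, and invokes it.
    Object execCoprocessor(Exec call) {
        Class<?> protocol = protocols.get(call.protocolName);
        if (protocol == null) {
            throw new IllegalArgumentException("unknown protocol: " + call.protocolName);
        }
        try {
            Method method = protocol.getMethod(call.methodName, call.parameterTypes);
            return method.invoke(implementations.get(call.protocolName), call.parameters);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("call failed: " + call.methodName, e);
        }
    }

    static ExecDispatchSketch sample() {
        ExecDispatchSketch server = new ExecDispatchSketch();
        server.register(AddProtocol.class, (a, b) -> a + b);
        return server;
    }

    public static void main(String[] args) {
        Exec call = new Exec(AddProtocol.class.getName(), "add",
            new Class<?>[] {long.class, long.class}, new Object[] {2L, 3L});
        System.out.println(sample().execCoprocessor(call));
    }
}
```

The client never needs the implementation class; the three fields of the call are enough for the server to locate and run the right method.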
18. Coprocessor Management
• Build your own Coprocessor
– Write server-side coprocessor code as in the example above, then compile and
package it as a jar file.
• CoprocessorProtocol (e.g. ColumnAggregationProtocol)
• Endpoint implementation (e.g. ColumnAggregationEndpoint)
• Coprocessor Deployment
– Load from Configuration (hbase-site.xml, restart HBase)
• The jar file must be on the classpath of the HBase servers.
• Global for all regions of all tables (system coprocessors).
– Load from table attribute (from shell)
• On a per-table basis.
• The jar file must first be put into HDFS or onto the HBase servers’ classpath, and then set in
the table attribute.
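To make the table-attribute route concrete, the sketch below parses an attribute value of the form jar-path|class-name|priority|key=value,... in plain Java. The pipe-separated layout is an assumption about how the attribute is written in the HBase shell; check the documentation for the HBase version in use. The CoprocessorSpecSketch class, its DEFAULT_PRIORITY value, the example jar path, and the treatment of an empty jar path as "already on the server classpath" are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified parser for a table-attribute coprocessor spec; format is assumed.
class CoprocessorSpecSketch {
    // Hypothetical default used when the priority field is left empty.
    static final int DEFAULT_PRIORITY = Integer.MAX_VALUE / 2;

    final String jarPath;      // empty means "already on the server classpath"
    final String className;
    final int priority;
    final Map<String, String> args;

    private CoprocessorSpecSketch(String jarPath, String className,
                                  int priority, Map<String, String> args) {
        this.jarPath = jarPath;
        this.className = className;
        this.priority = priority;
        this.args = args;
    }

    // Parses "jarPath|className|priority|k1=v1,k2=v2"; the last two fields are optional.
    static CoprocessorSpecSketch parse(String spec) {
        String[] parts = spec.split("\\|", -1);
        if (parts.length < 2 || parts[1].trim().isEmpty()) {
            throw new IllegalArgumentException("class name is required: " + spec);
        }
        String priorityText = parts.length > 2 ? parts[2].trim() : "";
        int priority = priorityText.isEmpty()
            ? DEFAULT_PRIORITY
            : Integer.parseInt(priorityText);
        Map<String, String> args = new LinkedHashMap<>();
        String argText = parts.length > 3 ? parts[3].trim() : "";
        if (!argText.isEmpty()) {
            for (String pair : argText.split(",")) {
                String[] kv = pair.split("=", 2);
                args.put(kv[0].trim(), kv.length > 1 ? kv[1].trim() : "");
            }
        }
        return new CoprocessorSpecSketch(parts[0].trim(), parts[1].trim(), priority, args);
    }

    public static void main(String[] args) {
        CoprocessorSpecSketch spec =
            parse("hdfs:///cp/agg.jar|ColumnAggregationEndpoint|1001|mode=test");
        System.out.println(spec.className + " from " + spec.jarPath
            + " at priority " + spec.priority + " with " + spec.args);
    }
}
```

The system-wide route needs no such value: the class is named in hbase-site.xml and applies to all regions of all tables after a restart.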
19. Future Work based on Coprocessors
• Parallel Computation Framework (our first goal!)
– Higher level of abstraction
– E.g., APIs similar to MapReduce.
– Integrate and implement Dremel and/or the Dremel computation model in HBase.
• Distributed application service platform (our second goal!?)
– Higher level of abstraction
– Data-driven distributed application architecture.
– Avoid building similar distributed architectures repeatedly.
– …
• HBase system enhancements
– HBase internal measurements and statistics for administration.
• Support applications like Percolator
– Observers and notifications.
• Others
– External Coprocessor Host (HBASE-4047)
• separate processes
– Code Weaving (HBASE-2058)
• protect against malicious actions or faults accidentally introduced by a coprocessor.