Kafka REST Proxy Now Available

Instaclustr is pleased to announce the availability of the Kafka REST Proxy as an option when deploying Kafka on the Instaclustr Managed Platform.

The Kafka REST Proxy allows applications to connect and communicate with a Kafka cluster over HTTP. The service exposes a set of REST endpoints to which applications can make REST API calls to connect, write and read Kafka messages. This capability enables organisations to use these standard REST interfaces to quickly start using Kafka without having to invest a lot of time in implementing and integrating a dedicated Kafka client into their technology stack.
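As a rough illustration of what "writing a message over HTTP" looks like, here is a minimal Python sketch that builds a produce request using the v2 JSON format of the Confluent REST Proxy API. The host name and topic are placeholders we made up for this example, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical endpoint and topic, used only for illustration.
PROXY_URL = "https://kafka-rest.example.com"
TOPIC = "orders"


def build_produce_request(base_url, topic, values):
    """Build (but do not send) a v2 produce request carrying JSON values."""
    body = json.dumps({"records": [{"value": v} for v in values]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=body,
        method="POST",
        headers={
            # Embedded JSON data format of the v2 API.
            "Content-Type": "application/vnd.kafka.json.v2+json",
            "Accept": "application/vnd.kafka.v2+json",
        },
    )


request = build_produce_request(PROXY_URL, TOPIC, [{"order_id": 1, "status": "new"}])

# Sending is left commented out because the host above is a placeholder:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Any language with an HTTP library can issue the same request, which is the main appeal of the proxy.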

Instaclustr offers this support using Confluent’s implementation of the Kafka REST Proxy, which is open source and licensed under Apache 2.0. We used the latest stable version of the repo at the time we started developing this capability: version 5.0.0, which is built using Kafka 2.0 client libraries. Confluent recently announced that future releases of its Kafka REST Proxy implementation will use a more restrictive license, so Instaclustr has forked the latest repo and will continue to maintain it under the Apache 2.0 license. The repo can be found at https://github.com/instaclustr/kafka-rest. We welcome all interested community members to contribute and enhance its capabilities.

Provisioning of the Kafka REST Proxy uses Instaclustr’s standard approach for provisioning and securing HTTPS-based interfaces:

  • DNS entries in the cnodes.com domain are automatically created for the endpoints.
  • Fully trusted certificates for the endpoints are automatically generated using the Let’s Encrypt public certificate authority. A dedicated certificate is issued for each Kafka cluster.
  • Nginx running on the node is used as a standard proxy to terminate the TLS connection and provide HTTP authentication.
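From the client side, the setup above means a call needs two things: a TLS connection that validates the certificate, and credentials in the request. The sketch below assumes HTTP Basic authentication, which is our assumption for illustration; the host name, user name and password are placeholders, and nothing is sent.

```python
import base64
import ssl
import urllib.request


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header (assumed scheme)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


# The default context verifies the certificate chain and the host name,
# which is what a publicly trusted certificate makes possible without
# distributing a custom CA to clients.
context = ssl.create_default_context()

# Hypothetical host and credentials; GET /topics lists the topic names.
request = urllib.request.Request(
    "https://kafka-rest.example.com/topics",
    headers={
        "Authorization": basic_auth_header("example-user", "example-password"),
        "Accept": "application/vnd.kafka.v2+json",
    },
)

# Not executed here because the host is a placeholder:
# with urllib.request.urlopen(request, context=context) as response:
#     print(response.read())
```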

The Kafka REST Proxy

There are a few reasons why organisations may want to use the Kafka REST proxy interface over a dedicated Kafka client.

  • Legacy applications that need to use Kafka, where integrating a dedicated Kafka client into the existing stack is difficult and takes too much time and effort.
  • Small teams that want to use Kafka but may not have the programming skills needed to integrate a dedicated Kafka client.
  • Building an MVP to validate the technical feasibility of an application, product or feature and to receive early feedback from customers on those new capabilities. At this stage, scalability and performance may not yet be a priority, although the application may eventually need to move to dedicated Kafka clients as it scales.
  • An application that does not rely on Kafka for its performance and scalability benefits but needs to communicate with Kafka occasionally for non-business-critical purposes. Using the Kafka REST proxy interface can be a convenient way to reduce development time.

The key disadvantage of using the Kafka REST interface instead of native Kafka clients (Java, Python, and others), which are built and optimised to get the most out of a Kafka cluster, is performance. Published benchmarks indicate that the proxy achieves ~67% of the write throughput and ~50% of the read throughput achieved by Java clients. That’s a significant penalty for a business-critical real-time application. The impact could be offset by using more powerful Kafka nodes, but that comes at a higher cost.

Additionally, the REST proxy exposes only a limited set of client configuration options compared with the comprehensive configuration control you get with Kafka clients. That limits how your applications produce messages to and consume messages from the Kafka cluster.

Another quirk of the REST proxy is that it maintains the state of consumers, which is a departure from a pure (stateless) REST implementation. If the node running the proxy goes down, the state of its consumers is lost and they need to be recreated. The benefit of this compromise is a simpler, lightweight implementation with fewer server-side complexities to handle; the responsibility for handling such failures rests with the application using the REST interface.
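To make the last point concrete, here is one way an application might cope with lost consumer state: if fetching records fails because the proxy no longer knows the consumer instance, create and subscribe a new one, then retry once. This is a sketch under assumptions, not a reference implementation. The paths follow the v2 consumer API of the Confluent REST Proxy, the function and class names are our own, and treating every HTTP 404 as "consumer lost" is a simplification.

```python
import json
import urllib.error
import urllib.request

V2 = "application/vnd.kafka.v2+json"
V2_JSON = "application/vnd.kafka.json.v2+json"


class ConsumerGone(Exception):
    """Raised when the proxy no longer knows the consumer instance."""


def http_send(method, url, body=None, accept=V2):
    """Send one request; map HTTP 404 to ConsumerGone (a simplification)."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Accept": accept}
    if data is not None:
        headers["Content-Type"] = V2
    req = urllib.request.Request(url, data=data, method=method, headers=headers)
    try:
        with urllib.request.urlopen(req) as resp:
            raw = resp.read()
            return json.loads(raw) if raw else None
    except urllib.error.HTTPError as err:
        if err.code == 404:
            raise ConsumerGone(url) from err
        raise


def create_consumer(send, base_url, group, name, topics):
    """Create a consumer instance, subscribe it, and return its URI."""
    created = send("POST", f"{base_url}/consumers/{group}",
                   {"name": name, "format": "json", "auto.offset.reset": "earliest"})
    # Build the URI from instance_id so requests keep going through the
    # public endpoint rather than whatever address the proxy reports.
    instance_uri = f"{base_url}/consumers/{group}/instances/{created['instance_id']}"
    send("POST", f"{instance_uri}/subscription", {"topics": topics})
    return instance_uri


def poll_with_recovery(send, base_url, group, name, topics, instance_uri=None):
    """Fetch records, recreating the consumer once if the proxy has lost it."""
    if instance_uri is None:
        instance_uri = create_consumer(send, base_url, group, name, topics)
    try:
        records = send("GET", f"{instance_uri}/records", None, V2_JSON)
    except ConsumerGone:
        instance_uri = create_consumer(send, base_url, group, name, topics)
        records = send("GET", f"{instance_uri}/records", None, V2_JSON)
    return instance_uri, records
```

In real use, `http_send` would be passed as the `send` argument; keeping the transport as a parameter makes the recovery logic easy to exercise without a running cluster. Where the recreated consumer resumes reading depends on the offsets committed for the group, so an application using this pattern should be prepared for that.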

So, in summary, this feature is a nice addition to Instaclustr’s Managed Kafka offering at no extra cost for our customers, but technology teams should be aware of the trade-offs before choosing to use the Kafka REST proxy. If performance and reliability are important for your application and business, then it is worth investing in integrating one of the dedicated Kafka clients into your technology stack.

Since adding Kafka to the Instaclustr Platform early this year, we have been excited by the strong take-up of our Kafka offerings. Customers have recognised the value of bringing the reliability and operational maturity that Instaclustr is famous for in the Cassandra world to managing Kafka. We plan to continue to add many features and capabilities to our Kafka offering in 2019.

Please get in touch at info@instaclustr.com if you have any queries or specific features you are interested in.
