Apache Pulsar is an open-source distributed pub-sub messaging system, developed under the stewardship of the Apache Software Foundation.
This talk will show how its unique architecture enables Pulsar to seamlessly support both streaming and messaging use cases in a single unified platform.
We will also show where Pulsar fits within the broader ecosystem of data streaming technologies and the interoperability it offers out of the box, which makes it a particularly good choice for any data platform where versatility, interoperability and scalability are key requirements.
Q&A
Question: It looks to me like Pulsar has been designed with elastic infrastructure in mind, since you’ve separated out compute and storage, unlike other systems, which are designed to run on their own cluster. Is this a fair assessment? Do you see people taking advantage of this?
Answer: Yes, it’s part of normal operations. It’s useful at any cluster size.
Other possibilities, like scaling serving and storage independently, are mostly useful only once a cluster reaches a certain size.
Question: What’s the feedback around the cost savings, scalability benefits etc that users are experiencing?
Answer: Yes, the main driving factor for separating the storage layer (bookies) and the serving layer (brokers) is to avoid tying the data of a particular topic to one specific node.
The two layers allow for that, in conjunction with segmenting the data.
Not having the data tied to one node is key, because it makes it easy to add more nodes, take down nodes, and so on, without the need for expensive rebalancing operations.
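To make the point about rebalancing concrete, here is a minimal sketch of segment-based placement. It is a toy model, not Pulsar or BookKeeper code: the class `ToySegmentStore`, its methods, and node names such as `bookie-1` are hypothetical and exist only for illustration. It assumes a topic is stored as a sequence of segments and that each new segment goes to the least-loaded storage nodes.

```python
from collections import defaultdict
from itertools import count


class ToySegmentStore:
    """Toy model (hypothetical): a topic is a sequence of segments,
    and each segment is placed on a small set of storage nodes."""

    def __init__(self, nodes, replicas=2):
        self.nodes = list(nodes)
        self.replicas = replicas
        self.placement = {}  # segment id -> tuple of node names
        self._ids = count()

    def roll_segment(self):
        """Open a new segment on the currently least-loaded nodes."""
        load = defaultdict(int)
        for holders in self.placement.values():
            for node in holders:
                load[node] += 1
        # Sort by (segments held, name) so the choice is deterministic.
        chosen = sorted(self.nodes, key=lambda n: (load[n], n))[: self.replicas]
        seg_id = next(self._ids)
        self.placement[seg_id] = tuple(chosen)
        return seg_id

    def add_node(self, node):
        """Register a new node; existing segments are not moved."""
        self.nodes.append(node)


store = ToySegmentStore(["bookie-1", "bookie-2"])
for _ in range(3):
    store.roll_segment()
store.add_node("bookie-3")   # no data is copied here
newest = store.roll_segment()  # the empty node picks up the next segment
```

In this model, adding a node moves no existing data. The new node simply becomes a candidate for the next segment, so load evens out as new segments are created.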
Also, high write availability is only possible if you decouple serving from storage, because when storage nodes fail, new writes can immediately be switched to healthy nodes.
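The write-availability argument can be sketched the same way. Again this is a simplified toy model under stated assumptions, not the real broker or bookie logic: `ToyTopicWriter`, `mark_failed`, and the node names are hypothetical. It assumes that when a node holding the current segment fails, the writer stops using that segment and opens a new one on healthy nodes.

```python
class ToyTopicWriter:
    """Toy model (hypothetical): appends go to the current segment;
    if one of its nodes fails, a new segment is opened elsewhere."""

    def __init__(self, nodes, ensemble_size=2):
        self.healthy = set(nodes)
        self.ensemble_size = ensemble_size
        self.segments = []  # each item: {"nodes": tuple, "entries": list}
        self._open_segment()

    def _open_segment(self):
        if len(self.healthy) < self.ensemble_size:
            raise RuntimeError("not enough healthy storage nodes")
        nodes = tuple(sorted(self.healthy)[: self.ensemble_size])
        self.segments.append({"nodes": nodes, "entries": []})

    def mark_failed(self, node):
        """Record that a storage node is no longer available."""
        self.healthy.discard(node)

    def write(self, message):
        """Append a message and return the nodes that stored it."""
        current = self.segments[-1]
        if not set(current["nodes"]) <= self.healthy:
            # A node of the current segment failed: leave the old segment
            # as it is and continue writing on healthy nodes only.
            self._open_segment()
            current = self.segments[-1]
        current["entries"].append(message)
        return current["nodes"]


writer = ToyTopicWriter(["bookie-1", "bookie-2", "bookie-3"])
writer.write("m1")
writer.mark_failed("bookie-1")
writer.write("m2")  # lands in a new segment on the remaining nodes
```

Earlier entries stay where they were written, and the next write proceeds without waiting for the failed node to recover. That is only possible in this model because the writer is not tied to any particular storage node.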
I think that the benefits of this architecture are more visible in the scalability and operability of the system than directly in cost savings, although, yes, automatically scaling the cluster up and down does result in using infrastructure on demand.
YOU MAY ALSO LIKE:
Apache Pulsar and the Streaming Ecosystem
Matteo Merli
Co-creator and PMC Chair
Apache Pulsar