Please log in to watch this conference skillscast.
Databases are everywhere, but did you ever wonder what goes on inside the box? In this talk we’ll dive into the internals of Neo4j - a popular graph database - and see how its designers deal with distributed systems challenges now and in the future. Borrowing heavily from the academic literature, we'll see why computers are far too easy to program and why oppositely distributed systems are far too hard. We'll follow that with some approaches to making distributed systems safer and contrast that with conflicting approaches that make systems more scalable! If that doesn't sound nightmarish enough, we'll finish up by showing how we can build systems that are safe and scalable by borrowing and gluing together a bunch of ideas from folks who are smarter than me. Come experience the last 10 years of my harrowing day job in less than an hour. You might even enjoy it, or at least empathise!
Q&A
Question: How critical is clock sync?
Answer: Clock sync is not important for these classes of algorithms. If we had accurate distributed clocks (and they're coming) then we can make many simplifying assumptions about ordering and perhaps use simpler algorithms where coordination is reserved for worst-case scenarios (see: Google Spanner)
I understand hardware manufacturers are soon able to create accurate GPS clocks nowadays on commodity servers. That was Google-level science fiction until recently.
Question: Awesome progress over the last few years - “amazing and terrible” - tell us the terrible bits?
Answer: Distributed algorithms are really sneaky. You think you designed one that works, but then there's some really, really subtle thing that means in reality the whole thing is unsafe.
Repeat that numerous times.
Of course now we think we have several that work, but that nagging doubt that you've missed something never leaves.
Question: how do you go about testing these?
Answer: We use proofs, which is OK. We'd like to move to using automated model checkers (e.g. TLA+) but we find the effort of doing that is about as high as building the software itself. Perhaps that's because we're unskilled at formal methods?
Question: Has this been used in any real-world application?
Answer: Yeah, the Linear Transactions protocol has been used in a KV store. But the mixture of LT and Raft I think is novel.
One other terrible thing is communicating with management around deadlines. It's hard to explain why getting two computers to agree is so hard and takes so long! honestly it does sound easy to get to computers to agree on a number, right?
Question: Do you think these Spanner-type databases will soon become a more mainstream choice because you no longer need to trade off consistency for scalability and availability?
Answer: I think for cloud providers, yes. They have the hardware, while there definitely are trade-offs, ownership of the full stack means you can actively manage those trade offs.
Question: now we can only hope that Spanner becomes more economical for smaller use cases. but there are other hosted options like Yugabyte and Cockroach.
Answer: Yeah, those are interesting. Cockroach in particular has a novel transactional setup (raft again) that obviates the need for clocks, but trades off Spanner performance in the edge cases.
Question: I love all this stuff... I'm crazy enough for my mind to idly contemplate coordination challenges while going through the motions of life but I'll never be smart enough to contribute to the field. Thank you very much Jim.
Answer: You have the smarts - it's no different from single computer programming in a way, you have to wallow in the field, spend time being petrified, and then take tentative steps into building terrible algorithms. Over time, you get a bit better.
Question: Are the Raft capabilities available via projects like SpringData-Neo4j, or only when working directly with the drivers?
Answer: They're built into the database servers, so however you access the database, this infrastructure is activated (unless you explicitly use a single server). The causal clustering protocol is handled by drivers as you hint, but those drivers are used by the Spring Data, so you're all good.
Question: General graph DB question (for someone who hasn't used them in anger) - have you ever had a situation where you thought - hmmm I can't do this in a graph DB I need to switch back to a common-or-garden-variety DB?
Answer: yeah, graphs are pretty general purpose, but I wouldn't use one for bulk storage, blob storage, etc because there aren't relationships in those kinds of data to be exploited.
Looking back at the times I used RDBMS when I worked for ThoughtWorks (hi, folks) it seems to me that most of those times a graph would have been a more humane and performant tool to use, if only they'd been around.
Question: I guess then conversely: what are the signs to look out for that would let you know that you should be considering a graphDB? like the London Underground example perhaps obviously looks like something with nodes but maybe ppl just don't know when or how to change their view of a problem?
Answer: E.g. If you have a RDB which is join heavy, that's probably a graph.
We call this the "graph problem problem" at Neo4j, and we're trying to solve it through education. E.g. there's a new, free "For Dummies" book out on graph dbs that you can get from the Neo4j web site.
Question:I think the consensus algorithms you went through were general case … Is there any scope for consensus algorithms tailored to specific contexts to achieve better X,Y,Z? Don’t know if I have a specific example.
Answer: I don't know of any off-hand, I suspect you might be able to take advantage of knowing the number of parties to coordinate, or timeliness (e.g. coming from compliance) etc to simplify/streamline the protocol.
Question: I work on the client side of distributed simulation but have thought about what distributed systems concepts go into the simulation algorithm. Consensus around time is important, as is some knowledge about how many messages each connected simulation application has sent.
Answer: Oh very cool. I've never done any distributed simulations, but did some work over the last few years simulating what goes wrong when your consensus (actually your consistency model) is weak. Spoiler: it turns to poop.
In that setup, we used an (expensive) simulation to calibrate a (cheap) numerical approximation for figuring out how quickly your data would rot. Quite useful.
YOU MAY ALSO LIKE:
- A Humane Presentation about Graph Database Internals (SkillsCast recorded in September 2020)
- Scala Days 2023 (Online Conference on 1st - 30th December 2023)
- Teaching Haskell...To High Schoolers! (SkillsCast recorded in December 2022)
- Teaching Haskell...To High Schoolers! (SkillsCast recorded in December 2022)
A Humane Presentation about Graph Database Internals
Jim Webber
Dr. Jim Webber is Chief Scientist with Neo Technology, the company behind the popular open source graph database Neo4j, where he works on R&D for highly scalable graph databases and writes open source software. His proven passion for microservices ecosystems and REST translate into highly engaging workshops that foster collaboration and discussion.