Tuesday, 8th - Wednesday, 9th December, Online Conference

15 experts spoke.

YOW! Conference is designed by developers for developers, and each speaker has been invited because of their development expertise by our independent international program committee. At YOW! Conference you'll get straight tech talk by world-class experts and networking with like-minded developers.


Launching an Internal Developer Community

Community is nothing new to developers. Just look at the open source movement, the numerous conferences, and the growth of collaborative resources like GitHub and Stack Overflow. Developers know the best solutions come from collaboration. In many respects, community is a superpower for innovation and problem solving.

So why is community so difficult to establish inside companies? It would seem like a natural opportunity given the collective nature and shared vision of being part of an organization. Yet when I visit most companies, community and collaboration are sorely lacking, and sometimes even discouraged.

In this talk, I share practices from my journey at Stack Overflow and in other communities I have built over the past decade, to help all of us build healthy and thriving internal developer communities.


Question: What do you do if the community grows too big (even if following Metcalfe's law) such that it is no longer engaging like it was when it was small?

Do you look at splitting the community into sub-communities?

Answer: Yes, splitting is ideal when the interactions become distant (people have formed cliques and do not engage as freely). This is often done in the mode of communities of practice to identify specialty interests. There is this thing called Dunbar's number, which suggests that the maximum number of stable, trusted relationships any one person can maintain is about 150.

This is why bigger communities start to become unwieldy (and companies too). You simply cannot implicitly trust and maintain relationships with so many people. I talked about the trust layer of communities, and it starts to fray once the community gets too large. The way to address that is to create sub-communities, still connected to the global community, but smaller so that trusted relationships can be maintained. As an example, I had a meetup several years ago that would regularly get 70 to 80 people. Then I had an event of over 200 people. It had a completely different feel and the social interactions were actually muted. Much less serendipity.

The point is not so much the specific number, but the context of how Dunbar impacts your community / organization. For example, my "Dunbar Number" for in-person events was around 75. Anything over 100 people became unwieldy and lowered the serendipitous engagements between attendees. So the real Dunbar Number can be very different for you based on what outcomes you are seeking to drive from your community.


Question: I was wondering about community burn out?  At what point should you change things up?

Answer: Good callout on burnout, it is something I have faced. To avoid burnout, make sure you are recruiting volunteers to get involved; they will be the next generation of leaders over time. Also, never launch a community on your own. Get colleagues to help.

Question: Obviously, StackOverflow does a private offering. Are there any other good knowledge-sharing/community-building platforms that you've seen work well for developer communities?

Answer: Yes, I have actually seen internal Slack and Discord groups in companies work. I also like Discourse for knowledge sharing. Discord is free and often used in the gaming world. Discourse is a SaaS offering (launched by Jeff Atwood, the other co-founder of Stack Overflow). I break it down this way: Slack is often already in use in many companies (unless it is a Microsoft shop, that is), so adoption is easier. Discord is great and totally free, but may lose people who do not like the interaction aspects of the UX. Discourse is for canonical knowledge and conversations you want to keep and find later, so it is more forum or knowledge base oriented.

Question: Any specific tips on creating a global community (say within a company that has offices around the world)?

Answer: For a global company, start local or regional at first. Going global brings language and cultural issues that can tank adoption and engagement if you do not iron them out early on.

Question: I'm curious about what you think the number one thing is that stops internal communities developing?

Answer: The number one thing is culture. Some companies are stuck in old modes of management and hierarchical structures. But then again, those organizations often do not have great engineering teams and culture anyway.

Question: Can you talk a little more on getting leadership involvement? Is it different to getting them involved as "fans" or as community managers?

Answer: The best way is to build a business case: treat it like a real program and outline the goals, metrics, and outcomes to be expected. Then shop it to your leadership. In terms of leaders getting involved, once they give their support to the community, get agreement from a few of the senior leaders to participate in structured ways. The challenge of communities is that there are peaks and valleys of engagement, so create "events" that bring the leader(s) into the community to interact. The best types of events are AMA style talks / discussions, online town halls, and hackathons. Why this works is that these are things that can be put in the leader's calendar. Get access to a leader's calendar and you have a lot of power to shape community involvement.

Question: Some communities start from the idea or effort of one person. How can these communities avoid revolving around these central figures so that they don't collapse when those figures are no longer there?

Answer: It is hard. I made that mistake where my persona overwhelmed the community and stunted the development of other leaders. I talk about this in the book as one of the traps faced. If you make a conscious effort to recruit fellow leaders / co-founders early, you can avoid this "cult of personality" that sometimes happens.

Question: Do you have any specific examples of communities that you feel are doing an awesome job? Those are at the smaller (50-500) end at the moment.

Answer: There was one quant trading firm that I worked with that did an amazing job and had under 200 people. What was the difference maker? Incorporating the community into the onboarding process and a member story of how the community helped that resonated with everyone. The story hit the WIIFM that I talked about and once that was clear (in this case, get answers to any question fast), the community took off.

Question: When developing your community you mentioned the 5-10% high participation. Is there a risk that you end up with only those voices dominating the community? How do you balance keeping their useful contribution to content vs potentially drowning out or discouraging others' contributions?

Answer: This is a common issue, in smaller communities, a few "louder" voices can take over. It will happen and does so in every community. A small number of people will dominate the conversation. That is actually not a bad thing because that ensures there is "liquidity" of engagement and content in the community. What you can do to ensure that others can also engage and feel safe is to have a Code of Conduct and ensure that communication about the community promotes content and contributions from the group of casual community members.

Question: Do you have any recommendations on how to build a productive community around open source software? By productive I mean a community that continuously improves the project's code base and is constantly engaged with its roadmap.

Answer: For any community, it comes down to "do people really care about it?" That is the first step in what I outlined when building a community. Start there and build your community from the steps I outline. It does not differ whether you are building a dev group for APIs, a community of practice on engineering practices, a group for K8s, or a community around an open source project. All 8 steps to building community remain valid.

Mark Birch

Principal Startup Advocate

O’ Mice An’ Men -- Rescuing a Project Gang Agley

You might have noticed that the world is suffering a pandemic at the moment, and it might have disrupted your software development plans. At least you’ve got a good excuse, though I’ve heard rumors of managers being sacked for not foreseeing the pandemic and including it in their schedules. I’ve not heard that sacking the managers has either made the pandemic go away, or rescued a software development schedule. I’ll leave that first problem to the epidemiologists and virologists, but when circumstances make a laughingstock of your schedule, what can you do? I can’t tell you how to do the impossible, but I can help you make the best of the situation. To do that, we’ll use that much maligned and oft misused tool, estimation. Come with me and we’ll explore ways to use your estimates to guide your response to unforeseen disruptions–meeting near term needs to the extent possible, and future proofing your longer term plans.


Question: How do you deal with leaders who want proxy measures to make them feel comfortable that the team is busy?

Answer: Those leaders need to consider which is more important, busyness or the business. The interpersonal stuff can, of course, be tricky. The last chapter in my book is about that. It's the most important stuff in the book, to my mind. I wrote the book because I hated to see people beating each other up over estimates.

People who want to see full utilization tend to be middle managers rather than senior level. They're just doing what they were told was right and important for a manager. It's really more of a foreman approach than a management approach. Helping them learn and grow can be tricky.

Budget is fine. 100% utilization is madness.

Q. What do you call a highway with 100% utilization?

A. A parking lot.

You need slack to have flow. Progress comes from flow.
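
The parking-lot point can be illustrated with a little queueing arithmetic. The sketch below is not from the talk: it assumes the simplest textbook model (a single-server M/M/1 queue), in which the average wait before work starts, measured in multiples of the average task duration, is u / (1 - u) for utilization u. The function name relative_wait is hypothetical.

```cpp
#include <limits>
#include <stdexcept>

// Average time a task waits in the queue before work starts, expressed as a
// multiple of the average task duration. Uses the M/M/1 queueing model, which
// is an assumption made for illustration; real teams are messier than this.
double relative_wait(double utilization) {
    if (utilization < 0.0 || utilization > 1.0) {
        throw std::invalid_argument("utilization must be between 0 and 1");
    }
    if (utilization == 1.0) {
        // No slack at all: the queue grows without bound (the "parking lot").
        return std::numeric_limits<double>::infinity();
    }
    return utilization / (1.0 - utilization);
}
```

Under this model, relative_wait(0.5) is 1 (work waits about as long as it takes to do), relative_wait(0.9) is about 9, and the wait grows without limit as utilization approaches 100%.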

Question: Anyway my real point to my question was that those “leaders” are likely seen as leaders in name only. How much hope is there to actually turn such a situation around? How often do you see a significant shift through “helping them learn and grow” and those people (re)-gaining true respect before the productive team members start leaving (or get so disillusioned by life that they decide to become middle managers themselves)?

Answer: Improving such a situation is a job in itself. You need to start with the question: what are the other person's needs? That can be hard to determine, especially if they have positional power over you. Think about your own goals. If you can't improve things, do you want to stick around? I generally feel it's worth giving it my best shot before leaving. YMMV

Thinking about your own goals is to decide how much you want to put into the situation. To improve the situation, you need to consider your own needs, the other person's needs, and the needs of the context. It's a dynamic balance.

Question: How would you deal with organisations which have a fixed price quote financial model, but rely on agile teams for delivery?

Answer: The fixed-price quote model doesn't really change when using Agile delivery. Estimate the job based on the historical data you have. Then use Agile delivery mechanisms to keep the project running smoothly, and see the progress toward that goal line.

Question: People always estimate for an iteration or more, then validate or judge the velocity of the team. I feel that measuring velocity depends on the nature of the work, complexity, skill set, experience, etc. What is the best way to guess the velocity? I am sure it won't be accurate, but it would at least give the delivery manager some idea.

Answer: My favorite tool is the burn-up chart. It lets you visualize the progress, and the changes in velocity, in ways that numbers tend to obscure. Mike Cohn's book gave people the impression that everything should be broken down to the story level up front, but

  • that's too much work
  • that locks in the design when you know the least
  • and a number of other difficulties.

If you were asking how to guess velocity before the project starts, I don't think you necessarily have to do that. Start working, and see what the capacity of the team is. I talked about this in http://blog.gdinwiddie.com/2011/05/25/avoiding-iteration-zero/
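
To make the burn-up idea concrete, here is a minimal sketch of the arithmetic behind such a chart, written in C++. It is an illustration only: the function names burn_up and iterations_remaining are hypothetical, and forecasting from the average observed velocity is an assumption of this sketch, not a method given in the answer.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Cumulative completed work after each iteration: the "done" line of a
// burn-up chart.
std::vector<double> burn_up(const std::vector<double>& completed_per_iteration) {
    std::vector<double> cumulative(completed_per_iteration.size());
    std::partial_sum(completed_per_iteration.begin(),
                     completed_per_iteration.end(), cumulative.begin());
    return cumulative;
}

// Rough forecast of the iterations still needed to reach the scope line,
// based on the average observed velocity.
// Returns -1 when there is no data or no progress yet.
int iterations_remaining(const std::vector<double>& completed_per_iteration,
                         double total_scope) {
    if (completed_per_iteration.empty()) return -1;
    const double done = std::accumulate(completed_per_iteration.begin(),
                                        completed_per_iteration.end(), 0.0);
    if (done >= total_scope) return 0;
    const double velocity = done / completed_per_iteration.size();
    if (velocity <= 0.0) return -1;
    return static_cast<int>(std::ceil((total_scope - done) / velocity));
}
```

With three iterations that completed 5, 8, and 6 units, the burn-up line is 5, 13, 19. Against a scope of 40 units the sketch forecasts 4 more iterations. Recomputing after every iteration shows the changes in velocity that a single number would hide.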

George Dinwiddie

George is a software development consultant and coach with over thirty years of experience creating software ranging from small embedded systems to corporate enterprise systems.

How to Read Complex Code without Getting a Headache

Research shows that on average developers spend about 58 percent of their time reading code! However, we are not explicitly taught to read code in school or in boot camps, and we rarely practice code reading either.

Maybe you have never thought about it, but reading code can be confusing in many ways. Code in which you do not understand the variable names causes a different type of confusion from code that is tightly coupled to other code. In this talk, Felienne Hermans, associate professor at Leiden University, will first dive into the cognitive processes that play a role when reading code. She will then show you theories for reading code, and close the talk with some hands-on techniques that can be used to read any piece of code with more ease and fewer headaches!


Question: My Thesis was on Intelligent Tutoring Systems.

It was a real eye opener that no one teaching method suits everybody

  • some like starting with examples
  • some like syntax first
  • some like a verbose explanation

Have you found the same applies to reading code?

  • do some people prefer to start with the tests
  • do some prefer to start with the main

Answer: Yes, I think this is true for code reading too, although it might be related less to personality style and more to prior knowledge. We know from research that more experienced programmers read following the call stack, while beginners tend to read top to bottom.

Question: Do you think we’d naturally write better (more readable) code if education focused more on teaching reading code vs just writing it?

Answer: That is a great question, I do think it would lead to better code and indeed there might also be people like managers or product owners that could just read but not write code which can be helpful. Great additions to the talk!

Question: Is this why REPLs are good, because they help with that working memory? Are REPLs a good or bad thing for learning how to read and understand code?

Answer: Wow I do not have a simple answer there! It is a deep question. I think they could be great tools for learning when used deliberately, for example, you get code, first you predict what it does (also a very good way of practicing reading!) and then you verify. But they also encourage “just try it out” which as a culture might harm learning as I said in the talk.

Question: So if used correctly they could support learning? But can also be a crutch. Like an IDE too I guess, using one without understanding is not great for improving your skill in a language.

Answer: That indeed!

Question: Should we be refactoring code so that it contains a summary?

Answer: Yes, that’s what I say in the book as well: if you make summaries for cognitive purposes, you can commit them back to the code as docs.

Question: The only risk with a summary that is not directly linked to code is that they tend to get outdated. Any suggestions on how to ensure the summary is always relevant?

Answer: You can of course place it in the code, or link to it from there. But yeah, that is a risk! I’d like to add that even if it is executable it might still decay over time.

Question: Are there any natural language preprocessors that could be used to record the summary and it would be compiled into the actual calls to the real code?

This way you have your summary but it has to be kept up to date in order for the code to still work.

Answer: Interesting idea! But I think the most important thing about the summaries in my talk is that they help people understand the code. Using them as docs is extra/additional. So it can be done many times. You might even imagine that new people coming into the code base make their own summaries, compare them with what is there and update what is needed. (which is helpful for the code base and for the new programmer)

Question: Does your book also give ways to write code so that it reads easier from a cognitive perspective? We already have clean code practices which should help a great deal, but with this framework in mind, there could be more to be done

Answer: Yes, it has a whole chapter on Code Smells and their relation to cognitive processing of code!

Question: Have you incorporated any research around the styles in which people learn, such as visual, audio, feel, doing? It would be interesting to see the breakdown of long term, short term, etc. and what effect different styles have on how people absorb and learn. Great talk btw.

Answer: I am not an expert in learning styles, but from what I know, the research into them seems to say they do not really work: https://www.psychologicalscience.org/news/releases/learning-styles-debunked-there-is-no-evidence-supporting-auditory-and-visual-learning-psychologists-say.html

It seems people might be bad at choosing their own style also, shying away from needed practice (e.g. bad readers will choose visual style, leading them to read even less)

Question: I may have missed it, but was it posted where we would be able to order "The Programmer’s Brain"?

Answer: When it comes out (which should be this week!) the link will be on felienne.com/book

Just got the news the book is online!!! See: https://www.manning.com/books/the-programmers-brain

Felienne Hermans

Associate professor in software engineering
Delft University of Technology

How to improve Developer Productivity

Wouldn't it be great if we could rank developers based on their productivity, reward the best ones, and make sad noises at the others? Don't be silly, developer productivity isn't an innate ability that you can measure and rank! This talk will discuss how to think about and improve productivity based on the longest-running academically rigorous research investigation into the practices and capabilities that drive high performance in software delivery. Find out what drives productivity, how you can do more of it, and why it's important. Discover how tools and culture both impact productivity, and how to make a research-based argument for reducing technical debt.


Question: How do you do continuous delivery with software that gets installed on the client rather than the server? e.g. native mobile apps, which may need app store approval

Answer: Continuous delivery for apps is the same as CD for services (Remember, CD is about making releases boring, not releasing all the time). Make sure you're doing CI, have automated tests, and comprehensive configuration management in place (https://continuousdelivery.com/foundations/) and implement a deployment pipeline. 

The final stage of the deployment pipeline is going to be the release to the app store. You can (and should) also do things like upgrade testing as part of your deployment pipeline. Back in 2008-2009 I was product manager for go.cd, which is user-installed software - we used CD to build go.cd, even though we only released 3-4x per year. There's also a nice case study for CD for firmware for printers (which again was only released a few times a year): https://continuousdelivery.com/evidence-case-studies/#the-hp-futuresmart-case-study

Question: What are your thoughts around considering “Lead time for changes” as

“the time between merging a PR until it is live in production” vs

“the time between starting to code and the change being live in production”?

ie: Do you track “development + deployment” or just the “deployment” process?

Answer: Lead time is defined from the time you start to code (at least, for the part of the domain we care about). The reason for that is we want to optimize the part of the process from writing code to getting it merged too. Our research (which has been reproduced) shows that working in small batches off trunk drives higher performance. In particular, teams do best when they:

  • Have three or fewer active branches in the application's code repository.
  • Merge branches to trunk at least once a day.
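
As an illustration of how these measures could be computed, here is a minimal C++ sketch. Everything in it is an assumption made for the example: the Change record layout, the use of the first commit as the moment the clock starts, the choice of the median as the summary statistic, and the function names. It is not the instrument used in the research.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// One change, with timestamps in seconds since the Unix epoch
// (hypothetical record layout).
struct Change {
    long long first_commit_s;  // when work on the change was first committed
    long long deployed_s;      // when the change went live in production
};

// Median lead time in hours across a set of deployed changes.
// Returns 0.0 for an empty input.
double median_lead_time_hours(const std::vector<Change>& changes) {
    if (changes.empty()) return 0.0;
    std::vector<double> hours;
    hours.reserve(changes.size());
    for (const Change& c : changes) {
        hours.push_back((c.deployed_s - c.first_commit_s) / 3600.0);
    }
    std::sort(hours.begin(), hours.end());
    const std::size_t mid = hours.size() / 2;
    return hours.size() % 2 == 1 ? hours[mid]
                                 : (hours[mid - 1] + hours[mid]) / 2.0;
}

// The two trunk-based thresholds listed above: three or fewer active
// branches, and merging to trunk at least once a day.
bool meets_trunk_based_thresholds(int active_branches,
                                  double merges_to_trunk_per_day) {
    return active_branches <= 3 && merges_to_trunk_per_day >= 1.0;
}
```

Measuring from the first commit rather than from the merge keeps the time spent on long-lived branches visible in the number, which is the part of the process the answer says it wants to optimize.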


Question: I think they’re all important in their own ways, and may reveal bottlenecks in different parts of the full SDLC. I’d even go so far to suggest measuring the time between ideation (hey, I have this idea) and change deployed in production. 

Answer: There are actually two domains we're interested in when doing software delivery:

| Product Design and Development | Product Delivery (Build, Testing, Deployment) |
| --- | --- |
| Create new products and services that solve customer problems using hypothesis-driven delivery, modern UX, design thinking. | Enable fast flow from development to production and reliable releases by standardizing work, and reducing variability and batch sizes. |
| Feature design and implementation may require work that has never been performed before. | Integration, test and deployment must be performed continuously as quickly as possible. |
| Estimates are highly uncertain. | Cycle times should be well-known and predictable. |
| Outcomes are highly variable. | Outcomes should have low variability. |

(from Accelerate p15)

Here we're talking about delivery, and we want to start the clock ticking the moment we start checking into version control.

Question: Do we have a measure of how much "stuff" we've delivered per unit of time? Lead time to deploy and deploy frequency don't seem adequate.

Answer: No, we don't, and that's the biggest difference between software and manufacturing, and the reason why lean needs to be adapted when moving between the two. That's part of what I was getting at when I was talking about lines of code and outcome vs output. Obviously nobody is going to pay for a tiny or half-built car when they're expecting a regular one. But people will absolutely pay for a solution delivered in half the lines of code of an existing one if it better solves their problem. All the ways of measuring "stuff" in software are arbitrary and subjective and, perhaps most importantly, have not been shown to drive organizational outcomes we care about such as profitability, market share, productivity, customer satisfaction. However, the four metrics I've shared do drive those outcomes, so we know they matter.

Question: Can you give your opinion of Corporations using tools like BlueOptima to measure personal productivity?

Answer: Yeah as you can imagine I hate it, both because I think it's unethical and awful, and also because it's actually stupid because it won't drive the outcomes organizations want. I've talked a bunch about the importance and measurable impact of psychological safety and a mission culture: I can't think of anything better designed to destroy that than being told you're a cog in a machine and having your output (not outcomes!) continuously monitored by HR. And just to be clear, I've no experience with BlueOptima in particular, I am just strongly opposed to tools that purport to measure individual productivity.

Question: Do you have tips for communicating these lessons to execs in an organisation to try and adjust software development practices? Is there a TL/DR version of the devops reports I can share with execs?

Answer: Well since you asked we do have a book targeted at execs and managers: https://itrevolution.com/book/accelerate/

However we did design the reports so that you could print out the appropriate pages and leave them on the desk of an exec :).

Question: Do you have any more information, reading etc on how to avoid burnout, or to better limit the amount of burnout a team can feel?

Answer: Our research in this area is based on the work of Dr Christina Maslach. You can see a talk she gave at DevOps Enterprise Summit here: https://www.youtube.com/watch?v=SVlL9TnvphA

Question: Are there any tools/techniques that you've observed developers using at a personal level for their own productivity?

Answer: So I don't think I can give a good answer to this because my observation is that it's quite personal, and different things work for different people. I think the journey is as important as the destination, by which I mean it's important to experiment and see what works for you. Sorry I don't have a better answer. What I would say is that in the talk I discuss the things (like search and reduced tech debt) that drive productivity, and there are definitely great and well-known tools to help with those things.

Question: When you say well known tools for search (I was reading about the hours of waste looking for information), are you referring to collaboration tools like wikis and being able to search BitBucket repos etc? I mean for 'external' search there's some fairly obvious tools :-)

Answer: Here's what we say in the 2019 State of DevOps report: "Internal search: Investments that support document and code creation as well as effective search for company knowledge bases, code repositories, ticketing systems, and other docs contribute to engineering productivity. Those who used internal knowledge sources were 1.73 times more likely to be productive. Providing developers, sysadmins, and support staff with the ability to search internal resources allows them to find answers that are uniquely suited to the work context (for example, using “find similar” functions) and apply solutions faster. In addition, internal knowledge bases that are adequately supported and fostered create opportunities for additional information sharing and knowledge capture." (2019, p62).

What I meant by well known is in the context of tech debt things such as refactoring.

Question: If you were a software development group in the very early stages of the devops journey, what would be the first few things you would focus on?

Answer: Sorry but that's a classic It Depends answer :). I actually co-founded a startup (DORA) where our whole business model was surveying your company to find out where the best place to start was, and it was different for everyone. You can read a case study here: https://services.google.com/fh/files/misc/capitalonecasestudy.pdf. So what I would do is start by discovering your constraint. One useful technique for this is value stream mapping: https://cloud.google.com/solutions/devops/devops-process-work-visibility-in-value-stream

Question: As someone who has seen the power of a devops transformation at two different companies (even leading one), I find it hard when I meet people still sceptical of the benefits. Given the work you do with DORA and Google showing the benefits of devops, do you have an opinion on whether we should be looking to encourage mass awareness of these principles (e.g. by including the topic as standard during university degree), or is organic growth the way to go (e.g. company transformation after company transformation, person by person)? (Or option C - something else?)

Answer: It's a good question and a hard one to answer. I certainly share your experience of frustration at people who are skeptical of the benefits, particularly after having spent 6 years on a rigorous program of scientific research. I even have people who still tell me (proper) CI and trunk-based development can't work, what, 20 years in? I am a bit skeptical of teaching it at college because it's such an advanced topic. I teach graduate-level classes at UC Berkeley such as product management, and I'm not sure how I'd put together a class on this (doesn't mean it can't be done). I do wish that we put more of an emphasis on continuing education (rather than certification) in our industry. Part of the problem is systemic: it's possible to spend years in the industry without even coming across some of the capabilities I've talked about. And people think, well, I've done OK so far, it can't be that important! That's partly because our industry is so young and undeveloped in its methods and approaches (manufacturing is about 50 years ahead of us so that's an industry that's interesting to watch) and also because, if I can be permitted to be snarky for a moment, it's an industry full of people who've been told they're the smartest in the room, so it is sometimes hard for us to take a step back and think perhaps what we've been doing isn't actually the best way.

Hidden Features and Traps of C++ Move Semantics

Move semantics, introduced with C++11, has become a hallmark of modern C++ programming. However, it also complicates the language in many ways. Even after several years of support, experienced programmers struggle with all details of move semantics.

While I took the time to write up all the facts and details in my new book "C++ Move Semantics - The Complete Guide" (cppmove.com), I learned a lot I wasn't aware of (note that the final book has 260 pages).

This talk is about a very simple class to demonstrate some remarkable aspects of move semantics. We will see some tricky features and traps to understand C++ better.


Question: Is it a compiler error if you use noexcept but the code could throw?

Answer: No. If you specify noexcept, you are expected to know what you are doing. The compiler could check, but note that you know better whether exceptions might be thrown. You might already know that enough memory is allocated or that an index is valid. So if the compiler gave a warning, you could get a lot of false positives.

If AT RUNTIME an exception is thrown so that your guarantee is broken, std::terminate() is called, which by default calls abort() to end the program.
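
A minimal sketch of this behavior follows. The function names element_at and element_at_checked are hypothetical and not taken from the talk. The first function compiles even though std::vector::at() can throw, and the noexcept operator reports only what the declaration promises.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Declared noexcept although vector::at() can throw std::out_of_range.
// This compiles: the compiler trusts the promise. If an exception did escape
// at run time, std::terminate() would be called.
int element_at(const std::vector<int>& v, std::size_t i) noexcept {
    return v.at(i);
}

// The same body without the promise: an exception propagates to the caller.
int element_at_checked(const std::vector<int>& v, std::size_t i) {
    return v.at(i);
}

// The noexcept operator reports the declaration, not what the body might do.
static_assert(noexcept(element_at(std::declval<const std::vector<int>&>(), 0)),
              "element_at is declared noexcept");
static_assert(!noexcept(element_at_checked(std::declval<const std::vector<int>&>(), 0)),
              "element_at_checked makes no such promise");
```

Calling element_at with a valid index behaves like any other function. Calling it with an index past the end would end the program through std::terminate() instead of letting a caller catch std::out_of_range, so the promise is only worth making when the caller really has checked the index.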

Question: "C++ is tricky, I know" - you just said.  As a non-C++ user, it looks terrifying!  Can C++ copy the goodness of Rust and make me safer without so much brain space requirement?

Answer: Well, use Rust if you can and it makes more sense for you! C++ is very very fast but still backward compatible to C (so almost 50 years old now). That comes with a price. Newer programming languages can of course learn from C++ and make it better. And yes, we know that we have things we would better not have in C++, but backward compatibility is an important argument.

So far according to the number of programmers (more than 4 million) and tool support and so on there is still no new general purpose language that replaces C++. But that might change and in your context you might be able to use something else.

Question: I'm at least half serious about "Can C++ absorb Rust?" because I know that C++11 changed how C++ programmers should use C++ and every new version does the same.  So far, nobody has deprecated the original C++ features, but could you take the Rust "Edition" idea and turn off backward compatibility when the source has a "use only new stuff" switch in it?  Then after some time there would only be new C++ code and the amount of things a programmer has to get correct for C++ code to work would be much smaller.  

Answer: Believe me, we all would LOVE to go that path. The problem is that too much code is already written. So we would have at least a period of 10 years of overlap for each replaced feature. We can deprecate and remove things. But everybody asks to keep another thing alive. And where do we stop? It is a huge problem. And we don't want to have the Python effect, having de facto two Pythons now, especially as our world is far more complex (regarding code dependencies). We started at a few places to replace old stuff. For example, we have a new basic thread type and new mutex types. And the new model of executors will replace old approaches to multithreading in upcoming versions. We discussed introducing std2 with C++20 ranges, but it turned out to introduce a long chain of consequences (e.g. do we need a new string type, and what happens with the old one...). IMO, this discussion is over.

Question: I also liked the speed comparison you did between the copy and move.

So often in other languages, developers treat speed as an afterthought.

Answer: I think we in the standard committee know that speed and performance is THE reason to use C++. So, there is definitely no goal to become like Java or Python. The goal is to focus on the strength of the language and at the same time try the best we can to ensure that the language is still usable and not a problem in itself. Yes, sometimes we fail. Usually, for good reasons. You are invited to help. This is a community-driven language. We don't have a chief architect. So in the end it is all your fault: you didn't help us to make things better ;)

The C++ standards committee is open for everybody.

Question: I think some of us have guessed at how much work it would be and have "bravely run away"!

Answer: Oh yes, it is. But sometimes it's not... And in these times, even in C++ there are no easy answers (as in politics). It needs careful work and effort. There is one thing I have learned over all these years working on C++ with people all over the world (including Australia): complex good things need input from everybody and take time.

Nicolai Josuttis

Technical manager, Systems architect, Senior consultant
System Integration

Building Adaptive Systems For a Fast Flow of Change

In a world of rapid changes and increasing uncertainties, organizations have to continuously adapt and evolve to remain competitive and excel in the market. 

In such a dynamic business landscape organizations need to design for adaptability. Organizations need to aim for building systems and team organizations aligned to the business needs and business strategy and evolving them for adaptability to new changes and unknown environments.

In this talk, I am going to highlight how the combination of  Wardley Maps, Domain-Driven Design, and Team Topologies can provide a holistic, powerful toolset to design, build and evolve adaptive systems and team structures for a fast flow of change.


Question: This is a very interesting mash-up of techniques @Susanne Kaiser. Have you implemented this structure at an organisation or is it still in the theory stage?

Answer: At my clients, I am using those different perspectives at different levels - currently mostly Wardley Maps and DDD and next is Team Topologies.

Question: Could you share here the resources you mentioned at the end of your talk?

Answer: Of course, I will share my slides later as well, but here are the resources I mentioned:


Domain-Driven Design:

  • Eric Evans: "Domain-Driven Design: Tackling Complexity in the Heart of Software"
  • Vaughn Vernon: "Implementing Domain-Driven Design" and "Domain-Driven Design Distilled"
  • Vladik Khononov: "What is Domain-Driven Design?"

Wardley Maps:

Team Topologies:

  • Matthew Skelton, Manuel Pais: "Team Topologies"

Question: I was wondering if you had an opinion about separating teams across verticals of a domain - for example API versus UI for a single domain/bounded context?

Answer: I am a big fan of micro-frontends, which increase the autonomy of the team owning that bounded context, including its UI. However, due to "historic" decisions, most of my current projects are organized with one UI using several backend APIs. I hope to practice micro-frontends in future projects; then I could share hands-on experience ;) I think MicroCPH (https://twitter.com/MicroCPH) is planning a micro-frontend related conference (online) for next year.

Question: Should a stream-aligned team be responsible for a single domain model or multiple? How important is it to invest in core subdomains in the custom-built stage, particularly one that has already been quite successful, versus ones in the genesis stage?

Answer: A stream-aligned team can own multiple domain models/bounded contexts, depending on their complexity and the resulting cognitive load on the team. If the team's cognitive load would be greatly exceeded, I would recommend splitting it.

In general, I would suggest investing in both of your core subdomains (genesis and custom-built): experimenting with and exploring the genesis one, and watching how the custom-built core subdomains might evolve over time, e.g. are there opportunities for custom-built components to move into product + rental later on?

Susanne Kaiser

Tech Consultant

Jepsen 13

We trust databases to store our data, but should we? Jepsen combines generative testing techniques with fault injection to verify the safety of distributed databases. We'll learn the basics of distributed systems testing, and show how those techniques found consistency errors in MongoDB, PostgreSQL, and Redis-Raft. Finally, we'll conclude with advice for testing your own systems.


Question: What do you think of implementing Read Committed using snapshots instead of pessimistic concurrency locking?

Answer: So Postgres does this internally, and I think it's a natural fit--once you choose snapshot isolation it's relatively straightforward. I think it especially makes sense in a distributed systems context because locks generally require some sort of communication. Surprisingly, you can actually build RC in a totally coordination-free way. All you need to do is... not expose uncommitted values to clients. There's no requirement that you show them timely or even isolated values! See Peter Bailis' VLDB paper for more.
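
A minimal sketch of the "don't expose uncommitted values" idea, in Python. It assumes a single in-memory process, the class and method names are hypothetical, and it is not how Postgres implements Read Committed:

```python
class ReadCommittedStore:
    """Toy key-value store: writes stay private to a transaction until commit."""

    def __init__(self):
        self._committed = {}  # values visible to every reader
        self._pending = {}    # txn id -> {key: value}, invisible to readers

    def begin(self, txn_id):
        self._pending[txn_id] = {}

    def write(self, txn_id, key, value):
        # Buffer the write; nothing is published yet.
        self._pending[txn_id][key] = value

    def read(self, key):
        # Readers only ever see committed values, which may be stale.
        return self._committed.get(key)

    def commit(self, txn_id):
        # Publish the buffered writes.
        self._committed.update(self._pending.pop(txn_id))

    def abort(self, txn_id):
        # Discard the buffered writes; no reader ever saw them.
        self._pending.pop(txn_id, None)
```

Readers never take a lock here, and nothing guarantees that what they see is recent; the only promise is that an aborted or in-flight write is never returned.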

Question: Is it possible to plug tools like Elle or Jepsen into an existing cloud infrastructure and discover the inherent flaws in a system?

Answer: Sort of. Elle is intentionally very general: it needs "a concurrent history" which you can record any way you like. Jepsen records those histories. You could point Jepsen at a cloud service like S3 or Dynamo, and that would work just fine! But you wouldn't get fault injection. So you can only measure the happy case, unless you can find a way to either 1.) trigger faults, or 2.) wait long enough to see faults in the wild. (I've been thinking about doing this for S3 myself, actually)
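
As a rough illustration of what "recording a concurrent history" can look like, here is a Python sketch that wraps each client call and logs when it was invoked and how it completed. The event layout is only loosely modelled on Jepsen-style invoke/ok/info operations; the field names and the HistoryRecorder class are assumptions, not the format Elle actually consumes:

```python
import threading
import time


class HistoryRecorder:
    """Collects invocation and completion events from concurrent clients."""

    def __init__(self):
        self._lock = threading.Lock()
        self.events = []

    def record(self, process, kind, op, value):
        with self._lock:
            self.events.append({
                "time": time.monotonic(),
                "process": process,
                "type": kind,
                "op": op,
                "value": value,
            })

    def call(self, process, op, fn, *args):
        # Log the invocation before touching the system under test.
        self.record(process, "invoke", op, args)
        try:
            result = fn(*args)
        except Exception as exc:
            # An error does not tell us whether the operation took effect,
            # so it is logged as indeterminate rather than as a failure.
            self.record(process, "info", op, repr(exc))
            raise
        self.record(process, "ok", op, result)
        return result
```

Pointing something like this at a cloud service only captures the happy case described above, since nothing here injects faults.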

Question: I've always been fascinated by these tools, but they are a bit daunting to use, and infinitely more difficult to go DIY/NIH and would it be easier to inject those faults yourself?

Answer: I'm not entirely sure. Like.... how do you figure out which nodes in S3 even store your data? How do you get into Amazon's network to mess with them? Generally, we don't have any access into cloud services, so fault injection is probably off the table. There are some cases where AWS exposes some APIs for triggering failovers, like in RDS, right? That might be a place to look at a fault. But you're kind of at the mercy of Amazon's designers there--having to trust that the fault is actually meaningful and maps to something... like an accident.

Question: That assumes AWS as the provider. Azure and GCP are two completely different beasts, as well

Answer: Quite right! You'd have to ask each one for permission to mess with their internals.

Question: Not surprised by these issues with NoSQL - Kyle do you have a comparison between Mongo and something like Dynamo?

Answer: I mean, if there's something I'd take away here, it's that SQL is, if anything, a harder problem! I've done several SQL databases and found safety violations in every one. In particular, SQL allows a very expressive range of operations with complex interactions, all (supposedly) at strong safety levels. Predicate reads, for example! Object stores are, I think, easier to implement. Mongo and Dynamo (the paper) are pretty much polar opposites: Mongo requires a majority quorum, Dynamo is totally available. Mongo aims for SI/linearizable, Dynamo is eventually consistent with vclocks. DynamoDB, though, is a whole other beast, right? I actually don't have a lot of insight into its behavior or design.

Question:  Thanks Kyle - yeah I've always seen the push for noSQL usages in general to not worry as much about the data and possible data loss but other issues like crossing read flows sound particularly concerning if they'd happen in any database

Answer: Yeah. I mean... I think you had no choice in some early NoSQL systems but to give up things like serializability, and that's still very true in things like Cassandra today. But there's no reason why object stores can't be transactional, and we're seeing folks like Cockroach, Yugabyte, Mongo work on those problems today. One thing that bugs me is that the traditional SQL databases I love, like Postgres, have... AFAICT, no safe replication story. I'm hoping that gets addressed someday.

Question: You mention above you don’t have much insight into Dynamo. Do you test opaque cloud services like it at all? E.g. do you have more insight into Spanner or CosmosDb?

Answer: Not really, no! I could absolutely test them from the outside, but most of the interesting behavior in Jepsen comes when we do fault injection: partitions, crashes, etc. It's certainly possible that we'd find out that, I dunno, Spanner is actually violating linearizability on a regular basis. Might be worth trying!

Question: Any thoughts on Raft vs Paxos? Would you say one has inherently fewer issues?

Answer: Good question! So like... there's no question that Raft has been revolutionary, right? Diego did the industry a huge service: we've seen an explosion in systems built on Raft instead of winging their own consensus algorithm. That said, Raft isn't necessarily the end-all-be-all of consensus systems. It forces a total order over decisions, for instance, which is great for writing a simple state machine, but not great when those decisions are logically independent. For those, some kind of Paxos might actually be preferable. 

What we see in practice is folks running hundreds of Raft instances, and then building a transaction protocol between them--that addresses the ordering bottleneck by sharding the order into many Raft instances, and writing transactions on top of linearizable/sequential components is a lot easier than doing them from first principles. 

Another issue with Raft is that the reliance on a stable leader means paying an extra round-trip if you're not on the leader itself. If the leader is in Chicago and you're in Beirut, that might add up! But you can imagine that a sliiiight tweak to Paxos would allow you to get away with only one round trip, rather than the 2 that Raft (or Paxos with leaders) requires. 

Heidi Howard clued me into this, and you should definitely look at her papers/talks--she knows way more than I ever will about Paxos, haha.

Question: “What we see in practice is folks running hundreds of Raft instances” - Would you be talking about Consul?

Answer: (Ah, yes, to be clear, these are generally hidden from the end user) Oh, um... if you're running hundreds of Consuls I also have questions, haha. I was thinking specifically about systems like YugaByte, Cockroach, and MongoDB, which run a Raft (or raft-alike) per shard, plus a txn protocol between shards. Users don't see the separate Raft instances--they're managed by the database. 

Question: Do you think one day when every device has an atomic clock in it some of those issues could go away? 

Answer:  Atomic clocks help! But they aren't the end-all-be-all of consensus. Spanner still does Paxos! Truetime just lets them skip going to a timestamp oracle for a txn timestamp. 

The Percolator paper might be good reading--you can kinda see how Spanner may have evolved from it.  

One other thing that might be helpful is a talk I gave on distributed txn architectures: https://www.youtube.com/watch?v=wzYYF3-iSo - I am pretty sure I got the spanner part of this WRONG in some way, so take that with a grain of salt.  It's based on reading the paper and DMing friends a bunch and playing telephone with members of the spanner team so like... ????  But it might be wrong in a somewhat helpful way, is what I'm saying, haha

Question: Have you run into any limitations of Clojure's transactions or other Clojure systems when writing Elle?

Answer: I don't use the Clojure STM (in-memory transactions)... pretty much ever. I do use atoms, promises, and futures heavily, plus some of j.u.concurrent's primitives. Those work really well! Elle is surprisingly fast: it'll do 100K transactions in a minute. If I were to rewrite it, though, I'd be looking to reduce allocation pressure on the GC, tighter structs, etc. I'm sure it could be more efficient, but there's a complexity tradeoff there. The stuff Elle does around provenance (WHY can I prove that these txns are a cycle) relies heavily on persistent data structures and some heavy libraries--they've been well-optimized, but maybe we could do better by going mutable. (I have ALL kinds of feelings about Elle's design BTW, happy to chat about that all day)

I didn't mention this in the talk, but if you're wondering about using Elle to test your own systems, take a look at the repo and paper:

https://github.com/jepsen-io/elle


And here are the reports on Mongo (https://jepsen.io/analyses/mongodb-4.2.6), Postgres (https://jepsen.io/analyses/postgresql-12.3), and Redis-Raft (https://jepsen.io/analyses/redis-raft-1b3fbf6).

Kyle Kingsbury

Kyle Kingsbury, a.k.a "Aphyr", is a computer safety researcher working as an independent consultant. He is the author of the Riemann monitoring system, the Clojure from the Ground Up introduction to programming, and the Jepsen series on distributed systems correctness. He grills databases in the American Midwest.

Ends in Data

The internet was built on the principle of avoiding deletion. Its origin was a bomb-proof server back-up in the Cold War age. This established a philosophy of protecting data indefinitely, maximizing data, and championing more is better. Now overwhelmed and flooded with data, do we need a more balanced approach? Do we need an end for data?

Joe Macleod talks about a recent project with Markus Buhmann and Ana Lopez Niharra looking at ends in data, sharing the benefits and opportunities of data purging as a solution for some of society's biggest problems. The talk provides arguments from a technical, business, and consumer experience perspective. It recommends a variety of techniques, models, and solutions to help balance the bomb-proof data obsession.


Question: OMG, cartridges. mobile phones are slightly better, but not a lot. (not to mention the printers themselves which seem to die almost more often than the cartridges run out) Has anyone here recycled a printer cartridge? Do you recycle your batteries?
I believe they count as e-waste so in Victoria at least you’re not allowed to dispose of them in the normal rubbish bin anymore, you can take them to your council’s waste transfer stations but they were closed during Stage 4 lockdown.

Answer: It is quite common to have endings framed in legislation. For me this means the ending has failed to be kept inside the consumer experience and has fallen to the responsibility of society. The company (for the printer ink cartridge) has distanced themselves from the relationship beforehand, leaving the consumer to resolve the delivery/off-boarding to society for the disposal.

Question: I know that's what we're not meant to do, but do people actually do this? Or have we been trained into disposable consumerism and just chuck it in the regular bin anyway?

Answer: Between 60% and 80% end up in landfill, which is scary.

Question: Who here has thought about the offboarding experience in your products? (Or are you now suddenly finding yourself contemplating it?) Mind you, for some products, how do you know if your customer has even left you? They bought something from you once, does that mean you consider they're your customer forever?!

Answer: That is a great question. If we had better off-boarding experiences, then we would develop better tools that can give us that important data. Right now so many tools only look at and value usage.

Question: We make it super easy for our customers to end their relationship with us. Because customers certainly think about that before signing on. Nobody likes to be locked in.

Answer: That is very true. I have a whole section in another presentation about how endings increase customer satisfaction, retention and sales. It could be argued that Netflix's success over the last couple of years has been because of their no-hassle leave and return.

Question: I’ve always wondered what “lifetime access”, “lifetime support” etc mean in a service context. whose lifetime? yours or the vendor’s?

Answer: Just one lifetime would be a good start. :) There was a local bank account promotion in the US in the 70s that promised Free Banking For Life. After decades of mergers in the banking sector, that bank ended up with Bank of America, which decided to end those free accounts. One person took BOA to court to clarify what "lifetime" means. BOA lost and had to give all the accounts back.

Question: In tech we also seem very reluctant to sunset digital products/services… or even deleting source code for an old product/service that has been decommissioned (or never used) that is not planned to ever come back again..

Answer: One of the most common questions I get is how to sunset a product. And I am REALLY surprised how many companies prefer to keep their old site running alongside a new one, only to find their customers don't ever want to move. Then they are trapped running two product generations.

The failure I find isn’t in the execution of this. It happened far earlier in the communication of endings to the customer.

Off-boarding happens across the consumer lifecycle. Not just at the end.

Question: Do you have a mechanism for working out where your endings even are? Using a privacy impact analysis on your data feeds will maybe throw some light on what you should be thinking about, but that's a real bottom up approach. I thought maybe you had some better tooling/ways of thinking to get at the problem from a product-first approach?

Part of our prep work for GDPR involved working out what data we had (we were looking at personal information specifically, but for looking at lifecycle endings you can consider this data as a proxy for your customer), and looking at how it entered the system, where it got stored, in what format, for how long, which elements were shared with downstream vendors and for what purposes (and under what contracts to ensure no misuse!). And then when the customer or our system hit certain customer-triggered or automated actions (did not pay, upgraded plan, went over quota, sent too much spam) then what happened to that information.

From doing this analysis, I was thinking you could work out what some of those non-obvious endings were. When a customer deletes their account, that's a pretty clear ending, but if they haven't logged in for a while, or we think their account has been compromised, or someone contacts us to tell us it's a deceased estate and they have PoA to take control of the account... those are all the other endings which get forgotten, but are interesting for managing data storage and privacy and for the customer offboarding experience.

Answer: Thanks for that Nicola. I have not heard of that.

Because a lot of the end work is on the customer experience level, getting into specifics about a product, or company can be difficult. So it is great to have feedback and examples like this.

Joe Macleod

Joe has decades of product development experience across digital, physical and service sectors. Previously Head of Design at the award-winning studio Ustwo. He then spent 3 years on the Closure Experiences project researching, writing and publishing the Ends book. He is now founder of andEnd, a business helping companies end their customer relationships.

Inside Every Calculus Is A Little Algebra Waiting To Get Out

Because of deep learning, there has been a surge in interest in automatic differentiation, especially from the functional programming community. As a result, there are many recent papers that look at automatic differentiation from a Category Theory perspective. However, Category Theorists have already been looking at differentiation and calculus in general since the late ’60s in the context of Synthetic Differential Geometry, but it seems that this work is largely ignored by those interested in AD. In this talk, we will provide a gentle introduction to the ideas behind SDG, by relating them to dual numbers, and show how it provides a simple axiomatic and purely algebraic approach to (automatic) differentiation and integration. And no worries if you suffer from arithmophobia, there will be plenty of Kotlin code that turns the math into something fun you can play with for real.


Question: Is there a connection between Dual numbers and Complex numbers? Both have a similar form: (a + Ae) and (Real + i Imaginary)

Answer: Absolutely. Instead of i^2 = -1 we have e^2 = 0.

A complex number is a pair of numbers a+bi where i^2 = -1, but what is special about -1? Why not 0 or even 1?
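
To make the e^2 = 0 rule concrete, here is a small Python sketch of dual numbers (the talk itself used Kotlin; the Dual class and derivative helper are illustrative names, not code from the talk). Multiplying out (a + be)(c + de) and dropping the e^2 term leaves ac + (ad + bc)e, so the e coefficient carries the derivative:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A number a + b*e with e*e = 0."""
    real: float
    eps: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.real + other.real, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # (a + b e)(c + d e) = ac + (ad + bc) e, because e^2 = 0
        return Dual(self.real * other.real,
                    self.real * other.eps + self.eps * other.real)

    __rmul__ = __mul__


def _lift(x):
    # Treat a plain number as a dual number with no e part.
    return x if isinstance(x, Dual) else Dual(float(x))


def derivative(f, x):
    # Evaluate f at x + e and read the derivative off the e coefficient.
    return f(Dual(x, 1.0)).eps


print(derivative(lambda x: x * x * x + 2 * x, 3.0))  # 29.0
```

The example differentiates x^3 + 2x at x = 3, where the derivative 3x^2 + 2 equals 29.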


Question: Could you explain Skolemization again?

Answer: https://www.cs.toronto.edu/~sheila/384/w11/Lectures/csc384w11-KR-tutorial.pdf

If you remember logic programming, it is a technique used there as well.

Basically, you are eliminating existentially quantified variables by a function of the universally quantified variables on which that existentially quantified variable depends.
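
As a small worked example of that description (the predicate P and the function symbol f are placeholders chosen for illustration): the existential variable y depends on the universal variable x, so it is replaced by a fresh function of x.

```latex
\[
\forall x\, \exists y\, P(x, y)
\;\longmapsto\;
\forall x\, P(x, f(x))
\]
```

If the existential variable depends on no universally quantified variable, the function has no arguments and becomes a fresh constant.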

Question: Can you provide the book names that you showed in the talk

Answer: https://users-math.au.dk/kock/sdg99.pdf


Question: Could you elaborate a bit further on some examples of programming 1.0 (algebra) and 2.0 (calculus), and how the concepts in the talk related to programming? from my current (naive) viewpoint, this feels super abstract.

Answer: Often Software 2.0 is defined as differentiable programming, i.e. all your programs are differentiable. I.e. you do calculus.

Traditional programming is algebra https://books.google.com/books/about/AlgebraofProgramming.html?id=P5NQAAAAMAAJ

Question: Could you explain how these algebraic methods apply to deep learning? What's the link?

Answer: Training a NN uses automatic differentiation. To train a NN you use backpropagation to learn the parameters. Backpropagation is (backwards) differentiation. But you can "learn" any parameter of any (differentiable) program, as long as it is differentiable.
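
A minimal sketch of "learning" one parameter of a differentiable program, in Python. The program is y = w * x, the loss is the mean squared error, and the derivative with respect to w is written out by hand here; automatic differentiation would produce the same derivative mechanically. The function name, learning rate, and data are illustrative assumptions:

```python
def learn_slope(points, steps=200, rate=0.01):
    """Fit w in y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    for _ in range(steps):
        # d/dw of (w*x - y)^2 is 2*x*(w*x - y); average it over the data.
        grad = sum(2 * x * (w * x - y) for x, y in points) / len(points)
        w -= rate * grad
    return w


# Data generated by y = 3x, so the learned w should approach 3.
print(learn_slope([(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]))
```

Backpropagation does the same thing for programs with many parameters, computing all the derivatives in one backwards pass.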

Question: Are you saying one of the points of this talk was to be able to identify the classes of program that can be replaced with a NN?

Answer: Not really, but I would argue that many programs have some of their parameters be learned. 


https://hardmath123.github.io/moire.html


Question: When you're talking about software 2.0, are you talking more about self-learning software? For those less initiated with calculus and the topics covered, what would be some examples of using the  maths presented?

Answer: Here is a really nice example https://fluxml.ai/blog/2019/03/05/dp-vs-rl.html

Yes, it is all about learning parameters (read initializing variables) from examples.

Question: This talk inspires me to re-learn calculus. Has anyone here used calculus as part of their day to day software development? If yes, please share.

Answer: Nice! All of deep learning relies on this stuff. But they make a big deal of it. What I am trying to show is that it is not that different from what you already know.

Question: When the building blocks of differentiation are then available, are you progressing to work on higher-level constructs such as defining distributions and then allowing operations on distributions, such as marginalisation of posterior parameters as well as repeated draws such as in tools like Stan?

Answer: Yup, that is Probabilistic programming!

As you know, that is really all about integration. The sad thing here is that the axiomatization allows you to derive all the theory of integration, but it does not suggest an implementation. For that you need to do Monte Carlo simulation.
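
For readers who have not met it, a minimal Monte Carlo integration sketch in Python: it estimates an integral by averaging the function at uniformly random points. The function name, sample count, and fixed seed are assumptions for illustration, not part of any PPL mentioned in the talk:

```python
import random


def monte_carlo_integral(f, a, b, samples=100_000, seed=0):
    """Estimate the integral of f over [a, b] from uniform random samples."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = sum(f(rng.uniform(a, b)) for _ in range(samples))
    # Average value of f on [a, b] times the width of the interval.
    return (b - a) * total / samples


# The exact integral of x^2 over [0, 1] is 1/3.
print(monte_carlo_integral(lambda x: x * x, 0.0, 1.0))
```

The estimate only converges at a rate of one over the square root of the sample count, which is part of why efficient inference is the hard part.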

Question: Is it leading towards a general domain-specific language for probabilistic programming or would be rather more leading towards lower-level building blocks enabling those more domain-specific applications?

Answer: For probabilistic programming, the trend is more towards DSLs because that makes it easier to implement inference efficiently.

We went from adding PPL primitives to Hack/PHP to defining a DSL, Bean Machine, in Python.

Question: So the underlying methods you're working on will eventually make this a fundamental part of the language instead of a hack on top of the different libraries? Do you have a view on what is happening in the Swift for TensorFlow API, or is that also a hack on top of the other libraries (and perhaps even Python)?

Answer: https://ai.facebook.com/blog/paving-the-way-for-software-20-with-kotlin/

We are building something similar to Swift for TensorFlow. Our PPL is more like a DSL: https://pgm2020.cs.aau.dk/wp-content/uploads/2020/09/tehrani20.pdf

Erik Meijer

Erik Meijer is a Dutch computer scientist and entrepreneur. He received his Ph.D. from Nijmegen University in 1992 and has contributed to both academic institutions and major technology corporations.

Erik's research has included the areas of functional programming (particularly Haskell), compiler implementation, parsing, programming language design, XML, and foreign function interfaces. He has worked as an associate professor at Utrecht University, adjunct professor at the Oregon Graduate Institute, part-time professor of Cloud Programming within the Software Engineering Research Group at Delft University of Technology, and Honorary Professor of Programming Language Design at the School of Computer Science of the University of Nottingham, associated with the Functional Programming Laboratory.

From 2000 to early 2013 Erik was a software architect for Microsoft where he headed the Cloud Programmability Team. His work at Microsoft included C#, Visual Basic, LINQ, Volta, and the reactive programming framework (Reactive Extensions) for .NET. He founded Applied Duality Inc. in 2013 and since 2015 has been a Director of Engineering at Facebook.

The Science of Queues: Performance Monitoring for Theme Parks and Distributed Systems

Performance monitoring is an important part of running a successful theme park. Like a distributed system, theme parks have separate components (attractions), each with a queue of work to get through. How can we find out which of them are the least efficient? Which ones are slowing us down? Where should we spend time optimizing?

Join Mike for a roller-coaster ride through distributed system performance monitoring. Find out which measurements tell you the most about your system and how to optimize it. As an added bonus, you'll learn how to run a successful theme park! Mike has 20 years of experience developing and monitoring complex systems. In that time, he has visited some of the world's greatest theme parks.


Question: Given SMTP itself is a (very cheap, lightweight, global scale) store-and-forward (queueing and work-stealing) mechanism, what's the difference between that queue server being unresponsive and the SMTP server being unresponsive?

Answer: Queuing systems are simple and tend not to fail very often. Once we've written to a queue, we can assume that the email will eventually be sent.

Question: I'm a bit more confused then. I can understand if it was: app → smtp (store and forward) ⇶ N x smtp (AV scanned) ⇛ smtp (external mail exchange) But that doesn't change the failure condition of the app if the queue server is down and the message can't make it into a queue (whether a "message queue" system or smtp).

Answer: You are correct. If the queue server is down, then the app server is still unable to handle work. However, we found that the queue server was a lot more robust than the smtp server and this approach would work whether we were dealing with SMTP, a remote web service, a disk, anything.
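
A minimal sketch of the pattern being described, in Python: the app only has to enqueue, and a separate drain step retries delivery whenever the downstream dependency (SMTP, a web service, a disk) is healthy again. The OutboundQueue class is hypothetical and keeps messages in memory; the queue server discussed in the talk would be a separate, durable process:

```python
from collections import deque


class OutboundQueue:
    """Holds messages until a possibly unreliable sender accepts them."""

    def __init__(self, send):
        self._send = send        # callable delivering one message; may raise
        self._pending = deque()

    def enqueue(self, message):
        # The caller's work is done once the message is queued.
        self._pending.append(message)

    def drain(self):
        """Try to deliver in order; stop at the first failure and keep the rest."""
        delivered = 0
        while self._pending:
            message = self._pending[0]
            try:
                self._send(message)
            except Exception:
                break  # downstream is unhealthy; retry on the next drain
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)
```

As the answer notes, this only moves the failure point: if enqueueing itself fails, the app is still stuck, so the approach pays off when the queue is more robust than whatever sits behind it.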

Question: Architecture diagrams are great for communicating with stakeholders

Answer: Agreed. I highly recommend the work of Simon Brown and his C4 architecture diagramming technique. I also gave a talk on it about 4 years ago.

Question: Did you use the Universal Scalability Law in your observations and subsequent modelling and optimizations? (e.g. you mentioned "deadlocks", did you measure contention and coherence as well as duration, concurrency, throughput?) 

Answer: I did not at the time. I might have saved an awful lot of time if I had.

Question: “Unlike a theme park, we can’t just close the gates” - good advice for Kmart’s web team - It's far from ideal, but isn't it a viable solution if you don't have the ability to increase capacity in the short term?

Answer: Yes. If it's the only lever you have, you need to pull it. The alternative is that the system gets less and less responsive until it appears to have stopped.

Question: Could you also slow the additions to a queue? In the case of Disneyland, if there was a mini attraction in the queue then they take longer to join the main queue. As a result the perceived time spent is reduced and hence increases satisfaction. This has been done at some stores for Santa photos. You queue up briefly. You then go into Santa’s cave and have something to do before joining the main queue. Wait time is similar but you feel better about the wait.

Answer: Yes, for sure. There's a lot of psychology that goes into the queues at Disneyland. They do a great job of making it seem like you are closer to the front than you really are. You can do that with a software system as well by identifying a subset of the process or message load and routing it to a different endpoint that you tune separately.

I did write a blog post about some of these metrics if you're looking for more (or want to go over it again).

There's also a demo you can download and run (Windows only I am afraid) which allows you to tweak duration and concurrency and see the impact on queue wait time and throughput.
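
For a feel of how duration and concurrency drive throughput and queue wait time, here is a back-of-the-envelope Python sketch. It is a deterministic model that ignores randomness in arrivals and processing times, the function names are made up for illustration, and it is not the demo mentioned above:

```python
def max_throughput(concurrency, duration_s):
    """Messages per second when every worker is always busy."""
    return concurrency / duration_s


def backlog_after(arrival_rate, concurrency, duration_s, elapsed_s):
    """Queue length after elapsed_s seconds of a steady arrival rate."""
    surplus = arrival_rate - max_throughput(concurrency, duration_s)
    return max(0.0, surplus * elapsed_s)


def queue_wait(backlog, concurrency, duration_s):
    """Seconds a newly arrived message waits behind the current backlog."""
    return backlog / max_throughput(concurrency, duration_s)


# 4 workers at 0.5 s per message can handle 8 messages per second.
# At 10 arrivals per second, the backlog after one minute is 120 messages,
# so a new message waits about 15 seconds before processing starts.
print(max_throughput(4, 0.5), backlog_after(10, 4, 0.5, 60), queue_wait(120, 4, 0.5))
```

Halving the duration or doubling the concurrency has the same effect on capacity in this model, which is why both are worth measuring.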

Mike Minutillo

Particular Software.

Hiding the Lead

Information hiding, coupling, and cohesion, microservices-style

The terms coupling and cohesion come from the world of structured programming, but they are also thrown about in the context of microservices. In this session, I look at the applicability of these terms to microservice architecture and also do a deep dive into the different types of coupling to explore how ideas from the 1970s still have a lot of relevance to the types of systems we build today.


Question: How does your desire for explicit schemata (certainly for ingested entities) measure up against Postel's Maxim (the Robustness principle)?

Answer: Postel's maxim describes a situation where we have untrusted parties sending us stuff. In a world where I can't trust the APIs I call to be sensible in changing their APIs, I might consider it, but that doesn't describe most microservice organisations I've worked in. The problem is this places the responsibility on the consumer, not the producer, to maintain backwards compatibility. That is to some extent unworkable if you want to ensure you can safely make a change to a microservice and release it. As the maintainer of a microservice, do I have to test the various consumers' tolerant readers? So I'd consider the use of a tolerant reader with untrusted parties, but I wouldn't adopt this as a generic solution to the evolvability of microservice interfaces within an organisation.
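
To illustrate the contrast, a small Python sketch with a hypothetical two-field contract. The producer-side check fails loudly when a published field goes missing, while the consumer-side tolerant reader quietly carries on with a gap. Field names and helper names are assumptions made for the example:

```python
# Hypothetical contract the producer has published for its response.
PUBLISHED_SCHEMA = {"id": int, "email": str}


def violates_schema(response):
    """Producer-side check: published fields that are missing or mistyped."""
    return [name for name, kind in PUBLISHED_SCHEMA.items()
            if not isinstance(response.get(name), kind)]


def tolerant_read(response):
    """Consumer-side tolerant reader: take what we need, ignore the rest."""
    return {"id": response.get("id"), "email": response.get("email")}


# The producer accidentally renames "email" to "email_address".
changed = {"id": 7, "email_address": "a@example.com"}
print(violates_schema(changed))  # ['email']
print(tolerant_read(changed))    # {'id': 7, 'email': None}
```

With the explicit schema the producer finds the break before release; with only tolerant readers, each consumer discovers it separately.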

Question: I've been looking at Consumer-Driven Contracts for a while, but the tricky bit is semantics. The vast majority of the times we've had issues due to accidental changes it's been semantics not schema/type things. I haven't really found a good way of doing that

Answer: Honestly, it's just tricky. Fundamentally, the idea is simple - you want the consumer to be able to be explicit about the behaviour they expect of your microservice. The more complex your microservice, or the more nuanced its behaviour, the more cases you might need to cover to catch all cases. Of course, there are other things you can/should consider doing around your release processes to mitigate accidental breakages. Canary rollouts are one example - at least then if you do encounter a breaking change in prod, its impact will likely be limited.
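
A minimal sketch of a consumer stating its behavioural expectations, in plain Python rather than any particular contract-testing tool. The billing consumer, the order fields, and the get_order callable are all hypothetical; the point is that the check covers meaning (the total is in cents and matches the lines) and not just field types:

```python
def billing_contract(get_order):
    """Expectations a hypothetical billing consumer has of an order service."""
    order = get_order("order-1")
    failures = []
    expected_total = sum(line["price_cents"] * line["quantity"]
                         for line in order["lines"])
    if order["total_cents"] != expected_total:
        failures.append("total_cents must equal the sum of the order lines")
    if order["status"] not in {"open", "paid", "cancelled"}:
        failures.append("status must be open, paid, or cancelled")
    return failures
```

The producer would run this against each candidate release, passing in a function that calls the new version. A change that keeps the field an integer but switches it from cents to dollars passes a type check and fails here.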

Question: You suggested only exposing the things that your consumers currently need, and avoid exposing everything in case they need it in the future. How do you avoid the problem where as you get more consumers, you end up adding more and more endpoints that effectively do the same thing, but return slightly different data according to the different new consumer’s needs?

Answer: Being consumer first doesn't mean you do everything a consumer asks for. Treat your service interface like a user interface. It isn't graphical, but in terms of how you evolve it, use similar ideas as to how you design a user interface. You need to try and accommodate everyone, but you have to balance the cost. Minor variations on a theme should result in a change to an existing interface, and such a change should be made in a backwards compatible way (e.g. expansion changes). 

If a type of consumer needs a radically different service interface - perhaps a fundamentally different style of communication (e.g. you provide request/response, but the consumer wants event-driven), then you might need to deliver a totally different style of endpoint.

There are also use cases where you might want to give different types of consumers different endpoints - e.g. internal vs external APIs, or presenting endpoints that don't expose PII to most consumers, but do expose PII to selected consumers.
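
As a rough illustration of the backwards-compatible "expansion change" mentioned above, the sketch below adds a field to an existing response without touching the fields that current consumers already read. The function names, the `currency` field and its `"AUD"` default are assumptions made for this example only.

```python
def order_v1(order: dict) -> dict:
    """Original response shape that existing consumers rely on."""
    return {"id": order["id"], "total": order["total"]}

def order_v2(order: dict) -> dict:
    """Expansion change: keeps every v1 field unchanged and only adds a new one."""
    response = order_v1(order)
    response["currency"] = order.get("currency", "AUD")  # assumed default
    return response

def is_expansion_of(old: dict, new: dict) -> bool:
    """True if every field of the old response survives with the same value."""
    return all(key in new and new[key] == value for key, value in old.items())
```

A check like `is_expansion_of` could run in the producer's build to flag a change that removes or alters a field some consumer may depend on.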

Question: Any pointers, references, thoughts on specifics around designing interfaces up front for later backward compatibility changes?

Answer: The best advice I can give here is don't. Designing a service interface without knowing who the consumers will be will, unfortunately, inevitably lead to you (a) exposing more than you need and (b) still delivering an interface which isn't exactly what the consumer wants anyway.

If work is going on in parallel - e.g. you're creating the microservice whilst the consumer isn't ready, at the very least attempt to work with the consumer to do the best job you can of coming up with an interface that they want, and work to that, with the understanding that you might get it wrong.

Question: I particularly liked the call-out that coupling and cohesion are closely related.

It's made me realise that lots of discussions I've had in the past few years have mainly been around coupling in isolation, rather than including cohesion in the discussion. This has resulted in some interesting architectures that didn't really help improve cohesion and independent deployments. I'll definitely be making sure that both are discussed together in future!

Answer: Glad it was useful! I also like a quote I heard from Mike Nygard - "Cohesion is coupling we like". We get to make decisions about what code we choose to group together - or don't. But we should factor into that thinking how our software will be worked on. Hope that helps!

Question: The same ideas hold at a class level as well: high cohesion / low coupling is desirable.

Answer: Yep - that's one of the main points I was hoping would come across in the talk - these ideas are not new, they come from work at the code level. I'm just reframing these ideas for service-based interactions.

Question: Do you think heavy use of design patterns in a single microservice can be a smell of tight coupling? A sign that we may need to revisit the domain modelling?

Answer: Not necessarily a sign of tight coupling, but the heavy use of very overt language around design patterns (or the overly heavy use of a certain pattern) can sometimes make me worried that someone hasn't thought about the domain, they've just tried throwing patterns at the screen till it works. For me, the domain of the system should be more obvious to us than the patterns used to implement it.

So it's a sign of general concern, rather than a specific concern around tight coupling.

Question: When we start up a microservice, it is a big challenge to define each of the microservice boundaries. A clear boundary could help us to reach high cohesion and low coupling :). May I have your thoughts about the boundaries of microservices?

Answer: If you're a startup, my general advice is NOT to create microservices. Stick with a simple single-process monolithic deployment. The monolith, as an architecture, is not inherently a bad thing. In fact, I think for most people it's likely the most sensible architectural choice. I have specific concerns, though, about the use of microservices by startups.

These concerns are outlined here: https://samnewman.io/blog/2015/04/07/microservices-for-greenfield/ and here: https://www.youtube.com/watch?v=aAbKULcthw0

but I can summarise the concerns thus:

a. As a startup, you have limited resources. Those limited resources should be focused on finding market fit, not building a distributed system

b. As a startup, what you are building is likely going to change, perhaps drastically. Your domain is highly fluid - that makes finding stable boundaries difficult. Expect lots of breaking changes across microservice boundaries with a highly fluid domain

c. Microservices can be useful to help you scale an application that is already successful. Scale in terms of handling more users, or supporting the concurrent development by a larger dev team. But as a startup, you don't know if you'll be successful, nor if you do become successful can you forecast exactly what constraint will need to be addressed. So knowing what sort of microservice architecture might be required is difficult

Now, there are always exceptions, but I think most startups should keep the software simple, and focus energy on finding market fit and making customers happy. And remember, a good product with "bad" tech will outperform a bad product with "good" tech any day of the week. Once you get successful, you'll have plenty of time to migrate to microservices later.

Sam Newman

Sam Newman is a techie interested in cloud, continuous delivery and microservices. Aside from other things, he has committed sporadically to open source projects, spoken at more than a few conferences, and written some things, including the book Building Microservices for O'Reilly.

Solving Problems like a Game Designer

Did you know that the first bullet by an enemy in a first-person action shooter always misses you? Would you like to know why a game once had to stitch a whole train to act as the head of a player character? Did you know that in third person games, the first two-thirds of your health bar is worth fewer points than the last third?

Game Designers are some of the most creative and potent problem solvers in the tech field - many solutions being odd, surprising and most importantly: Focused on the user experience like no other. We are teachers, storytellers, therapists, matchmakers in ways that are surprising and innovative for anybody who needs to solve complex problems for humans or human and tech interaction. This talk aims to talk about and teach some of the ways Game Designers work, go over some of the most fascinating solutions we have found for our games and why they exist, and hopefully reframe ways people in adjacent industries can learn from the approach, just as we learn from other tech fields.


Question: I really wish we did more work with user experience level and brought that into many of our everyday websites/apps

Answer: YES! I wish the question of what we want people to FEEL was more important than what we want them to DO. Because one should lead the other.

Question: Have you done much with translating some of these into good design principles for everyday apps?

Answer: I have not, but I would love to do a co-developed talk with someone who works in that field. I'd feel disingenuous to do this by myself because I'm not an expert in that field.

Question: I've always wondered why games tend to balance guns, etc differently, instead of using a base and just changing the appearance, thereby not having to actually balance anything, and it just being perception, and for example TTK to be the same across the board.

Answer: I can answer this! It wouldn't scale well and sometimes adjusting numbers is easier than making new models and sounds. :)  

I can think of a lot of things that would benefit from focusing on human perception and catering to it, whether through "fooling" or other means.

Question: How do these mechanics contrast to games such as Demon's Souls, which prides itself in punishing its players for sport?

Answer: Okay so I have given whole other talks about the psychology of difficulty and Souls games are so interesting for that... I don't think Souls games are more difficult than others. I think Souls games have a progression system that is culturally viewed as intimidating: learning by doing. 

Demon's Souls requires players to learn how to beat enemies by fighting them and dying their way through it. We perceive that as punishing because learning by doing is an intimidating way to learn. So I think it's inaccurate to just call them more difficult. It's just a different progression system, really.

Question: Are there any interesting studies that you recommend for the psychology of difficulty? It sounds fascinating.

Answer: Not sure about difficulty specifically, but I recommend anything that Celia Hodent writes. She writes books on game UX and player psychology and they are so good. Difficulty is a bit of a "new" topic for people to look into. I've started giving talks on why I believe we should ditch that terminology and talk about experience flavours instead.

Question: How do you feel about the "experience flavours" in TLOU Part 2 - I think that's the first game that's given such a wide range of options for modifying difficulty for a player, and letting you as the player create a tailored difficulty experience. A trend I hope to see more often.

Answer: I think it's a good start but I don't think it's quite there yet. When I've given talks about difficulty, I've argued that we should try and stop guessing what kind of modes people want and instead let people modify it themselves.

Difficulty modes are impossibly difficult to make and predict for hundreds of thousands of players with different perceptions of what is difficult. So instead, I advocate for just giving people options to toggle on and off according to what they struggle with. Pick and choose how much info is shown, pick and choose how AI reacts, toggle god mode etc.

I love deliberate design but I also want to trust my users to know what they need when engaging with my product, especially when it comes to something as personal as difficulty.

Jennifer Scheurle

Jennifer Scheurle is a multi-award-winning game designer best known for the Earthlight franchise, which received the Game of the Year award at the Australian Game Developer Awards in 2017. In collaboration with NASA’s Hybrid Reality Lab, parts of her work are used to develop training for astronauts in VR. Jennifer’s work on the physical controller set for Flat Earth Games’ Objects in Space was nominated for the Alt.Ctrl.GDC award in 2017. In 2017 and 2018, she made MCV Pacific’s 30 under 30 list for her passion and public appearances on game design UX, diversity in games and educating audiences on game development processes. Her work on hidden game design has been published by major outlets such as Polygon, Rolling Stone Magazine and Variety. In 2018, Jennifer signed a book deal with CRC Press to write an advanced book on hidden game design techniques.

Scaling Your Architecture With Services and Events

This session is a deep dive into the modern best practices around asynchronous decoupling, resilience, and scalability that allow us to implement a large-scale software system from the building blocks of events and services, based on the speaker's experiences implementing such systems at Google, eBay, and other high-performing technology organizations.

We will outline the various options for handling event delivery and event ordering in a distributed system. We will cover data and persistence in an event-driven architecture. Finally, we will describe how to combine events, services, and so-called "serverless" functions into a powerful overall architecture.

You will leave with practical suggestions to help you accelerate your development velocity and drive business results.


Question: When pulling out tables from a shared database, what would happen to the foreign keys?

Answer: You manage foreign keys explicitly in the application layer. You don't get automated referential integrity.
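
A minimal sketch of what "managing foreign keys in the application layer" can look like once the tables live in different services. `OrderService` and the `customer_exists` callback are hypothetical names standing in for a real client call to another service; this is an illustration, not a prescribed design.

```python
class OrderService:
    """Owns the orders data; customers live in another service's database."""

    def __init__(self, customer_exists):
        # customer_exists is a hypothetical client call to the customer service.
        self._customer_exists = customer_exists
        self._orders = {}

    def create_order(self, order_id: str, customer_id: str) -> None:
        # The check a database FOREIGN KEY constraint used to perform.
        if not self._customer_exists(customer_id):
            raise ValueError(f"unknown customer {customer_id!r}")
        self._orders[order_id] = {"customer_id": customer_id}

    def orphaned_orders(self) -> list:
        # A customer can disappear after the check above, so dangling
        # references also have to be detected by the application.
        return [order_id for order_id, order in self._orders.items()
                if not self._customer_exists(order["customer_id"])]
```

The second method is there because nothing stops the referenced row from being removed later, which is exactly the automated referential integrity that is lost.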

Question: In that case, is there a period of time as the saga plays out in which you could have broken data?

Answer: That's what I was trying to get at by talking about explicitly modeling the "intermediate states". You will be able to observe those intermediate ("pending", etc.) states. It's not broken; it's just not completed yet.
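
To make the idea of explicitly modelled intermediate states concrete, here is a small sketch of a saga whose "pending" states are ordinary, observable values. The state names and the `charge`, `ship` and `refund` callbacks are assumptions for illustration; a real saga would run these steps asynchronously across services.

```python
from enum import Enum

class OrderState(Enum):
    PENDING_PAYMENT = "pending_payment"
    PENDING_SHIPMENT = "pending_shipment"
    COMPLETED = "completed"
    CANCELLED = "cancelled"

def run_order_saga(charge, ship, refund) -> list:
    """Run the steps in order and return every state an observer could have seen."""
    states = [OrderState.PENDING_PAYMENT]
    if not charge():
        states.append(OrderState.CANCELLED)
        return states
    states.append(OrderState.PENDING_SHIPMENT)
    if ship():
        states.append(OrderState.COMPLETED)
    else:
        refund()  # compensating action for the earlier charge
        states.append(OrderState.CANCELLED)
    return states
```

An order sitting in `PENDING_SHIPMENT` is a valid, queryable state of the system rather than a half-written row.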

Question: Is there a good read about use cases for workflows and sagas, with examples of how to implement one?

Answer: There are some excellent talks. One by Caitie McCaffrey at J on the Beach in 2018. Many talks by Chris Richardson on the Saga Pattern. Chris Richardson also has an excellent book called Microservices Patterns, which goes over the Saga pattern in detail, and his website has some good stuff on it too: https://microservices.io/patterns/data/saga.html

Question: When distributed, how do you support db restores without losing the events sent from another service in the meantime?

Answer: Don't do that :). Restoring data that loses previously committed work would be a big problem whether you are using events or not. Typical storage patterns at large scale write to multiple disks or replicas to avoid data loss.

Question: For the saga model you showed are there not issues with creating the bi-directional dependencies?

Answer: Not necessarily. Theoretically one could imagine a workflow that went back and forth between two services in a series of "handoffs" of the workflow. I wouldn't recommend this approach, though. If two services were so tightly coupled, they really should be one service.

Question: A question that always sticks with eventual consistency is how to ensure what's important is processed in the right order. In your customer address example, if the customer updates their address and clicks on purchase, how do we ensure we're not shipping to the old address, as that eventual consistency may get there after the shipment service has taken its stale copy?

Answer: Excellent question. For this particular example, I would have a validity start and end time for the address, since we would probably want to remember the history of someone's address changes. So then you could say that all packages sent after the new address's start time would use the new address. The generalization of this idea is to make it clear this data is valid as of XYZ time. This helps a lot in analytic use-cases where we are showing data that is "as of" a certain date, and helps to make clear that the data might be stale.
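
The "valid as of XYZ time" idea can be sketched as an address history that is queried with a point in time. This assumes each address record carries the time from which it is valid; the `AddressHistory` class and its method names are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional

class AddressHistory:
    """Keeps every address together with the time from which it is valid."""

    def __init__(self):
        self._valid_from = []
        self._addresses = []

    def record(self, valid_from: datetime, address: str) -> None:
        # Insert in time order so lookups can use binary search.
        index = bisect_right(self._valid_from, valid_from)
        self._valid_from.insert(index, valid_from)
        self._addresses.insert(index, address)

    def as_of(self, moment: datetime) -> Optional[str]:
        """Return the address that was valid at the given moment, if any."""
        index = bisect_right(self._valid_from, moment)
        return self._addresses[index - 1] if index else None
```

A shipment service would then ask for the address as of the purchase time, rather than for "the current address" from whatever copy it happens to hold.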

Question: Another example here might be that we have some data changed in a CRM system and a processing system directly after, but the processing depends on the data that has been stored in the CRM system.  Any tips on how to handle this?

Answer: I think I'd have to understand the use-case in more detail to make a good recommendation. It sounds like the problem is that you update the CRM, but the CRM is eventually consistent itself, and returns stale data to the processing system? If there is any way to get the CRM system to fire off an event when it has the update, that might help?

Question: OK, let me explain this better. So, given the use case of onboarding an investor, where we create the investor/person in the CRM service. We then call an Account Service to create the account for the investor, but due to eventual consistency between the CRM and Account service, the investor details are not in the Account service yet. Hope this makes better sense. Any tips on how to handle this?

Answer: Sounds like InvestorOnboarding should be a workflow instead of a succession of synchronous calls that are all expected to immediately succeed. There is clearly a state machine here: started -> personal-info-available -> account-created -> etc. Asking yourself how you know when the details are available in the CRM will help you answer how to transition from state to state. I hope this helps.
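
The state machine described above could be sketched roughly as follows. The state names come from the answer; the event names (`investor_created_in_crm`, `account_created`) are assumptions, standing in for whatever signal tells you the details are available.

```python
# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    ("started", "investor_created_in_crm"): "personal-info-available",
    ("personal-info-available", "account_created"): "account-created",
}

class InvestorOnboarding:
    def __init__(self):
        self.state = "started"

    def handle(self, event: str) -> bool:
        """Advance on an expected event; ignore anything that arrives out of turn."""
        next_state = TRANSITIONS.get((self.state, event))
        if next_state is None:
            return False
        self.state = next_state
        return True
```

The workflow only attempts account creation once it has moved to `personal-info-available`, so it never depends on the two services being consistent at the same instant.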

Question: How do you deal with missed events? Do you use event backplanes that can be trusted, or do you deal with it in other ways? e.g. listen to events and maintain a cache, but synchronize periodically, just in case?

Answer: Great question. There are a number of approaches I have used:

  1. Use a "reliable" transport
  2. If you don't trust (1), use a periodic "reconciliation batch" to find and synchronize inconsistencies. Financial institutions do this all the time with each other.

Strong point of view here: You need to be using a reliable transport like Kafka. I have spent seemingly years of my life debugging non-reliable event systems like Rabbit. The fact that it mostly works doesn't help when it doesn't. 
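
For option 2, a periodic "reconciliation batch" can be as simple as comparing two snapshots and repairing the differences. The sketch below assumes both sides can be read as key-to-value dictionaries and that the source is authoritative; `reconcile` is a hypothetical helper, not part of any event platform.

```python
def reconcile(source: dict, replica: dict) -> dict:
    """Compare two key->value snapshots and repair the replica in place.

    Returns a report of what was found, so drift can be monitored.
    """
    missing = [key for key in source if key not in replica]
    stale = [key for key in source if key in replica and replica[key] != source[key]]
    extra = [key for key in replica if key not in source]
    for key in missing + stale:
        replica[key] = source[key]
    for key in extra:
        del replica[key]
    return {"missing": missing, "stale": stale, "extra": extra}
```

The report matters as much as the repair: a steadily non-empty report is a sign that events are being lost somewhere upstream.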

Question: Do you try and put a distributed transaction around the creation of Materialised View records?

Answer: Essentially you use Sagas. I.e. When you see the first event in a workflow you create a “record” in your materialised view, then wait for the child events to come in to fill in the details. You leave a "hole" for the other side of the join to come in and fill.
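
A minimal sketch of a materialised view that leaves a "hole" for the other side of the join. The `OrderView` class, its event handlers and its field names are assumptions for illustration; here whichever event arrives first creates the row.

```python
class OrderView:
    """Materialised view joining order events with payment events by order id."""

    def __init__(self):
        self.rows = {}

    def _row(self, order_id):
        # Whichever event arrives first creates the row, holes included.
        return self.rows.setdefault(order_id, {"customer": None, "paid_amount": None})

    def on_order_created(self, order_id, customer):
        self._row(order_id)["customer"] = customer

    def on_payment_received(self, order_id, amount):
        self._row(order_id)["paid_amount"] = amount

    def is_complete(self, order_id) -> bool:
        row = self.rows.get(order_id)
        return row is not None and None not in row.values()
```

No transaction spans the two events; readers simply see a row with a `None` field until the second event fills it in.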

Question: Are there any ‘event contract’ standards that are winning out as the predominant industry standard, the way Swagger transformed RESTful APIs?

Answer: AsyncAPI is one of them: https://www.asyncapi.com/

Question: Do you have any good practices for tracing events as they go up and down through an entire system? For example, if data enters the system through the APIs at the front gate, do you have any recommendations to be able to trace that data at scale, all the way to the data store at the other end of that pipeline, especially if you have events that number in the billions?

Answer: As others suggest, the way you phrase the question lends itself to a correlation / request id which you pass along.
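
A small sketch of passing a correlation id along: the id is minted once at the front gate and copied onto every event derived from that request. The event shape and the function names are assumptions made for this example.

```python
import uuid

def accept_request(payload: dict) -> dict:
    """Front gate: wrap incoming data in an event with a new correlation id."""
    return {"correlation_id": str(uuid.uuid4()), "type": "request.accepted", "data": payload}

def derive_event(parent: dict, event_type: str, data: dict) -> dict:
    """Downstream stages copy the parent's correlation id onto every event they emit."""
    return {"correlation_id": parent["correlation_id"], "type": event_type, "data": data}

def trace(events: list, correlation_id: str) -> list:
    """Collect the event types that belong to one original request."""
    return [event["type"] for event in events if event["correlation_id"] == correlation_id]
```

At very large volumes the filtering in `trace` would be done by a log or tracing backend rather than in memory, but the propagation rule stays the same.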

Question: Any pitfalls to watch out for when making use of CDC?

Answer: Depending on how complex your DB schema is, CDC might be too low-level. If you are writing or updating multiple rows across several tables as part of a single semantic operation, it might be hard to "reconstruct" the real event from those individual DB changes.

On the plus side, CDC is dead simple.
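
To illustrate the pitfall of having to "reconstruct" a semantic event from row-level changes, the sketch below groups changes by transaction and maps each group to a domain event. It assumes every change record carries a transaction id (`txid`) and that changes from one transaction arrive contiguously; the table names and the mapping rule are hypothetical.

```python
from itertools import groupby

def group_by_transaction(changes: list) -> list:
    """Group row-level changes by transaction id (assumed present and contiguous)."""
    return [list(rows) for _, rows in groupby(changes, key=lambda change: change["txid"])]

def to_domain_event(rows: list) -> dict:
    """Hypothetical mapping: an insert into orders plus its order_items rows
    is treated as one OrderPlaced event."""
    if any(row["table"] == "orders" and row["op"] == "insert" for row in rows):
        item_count = sum(1 for row in rows if row["table"] == "order_items")
        return {"type": "OrderPlaced", "item_count": item_count}
    return {"type": "Unknown", "tables": sorted({row["table"] for row in rows})}
```

Every new kind of write needs another rule in `to_domain_event`, which is where the approach gets harder as the schema grows more complex.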

Randy Shoup

VP Engineering and Chief Architect

Tune in to C#

As you were looking the other way, C# became a cross-platform, open-source, high-performance, general-purpose, hyphenated-buzzword programming language. It is also very popular! I’ll take you on a journey of language design nerdery, targeted equally at C# newbies and oldies. Let’s see how some of our recent features take on the null menace, immutability, and value semantics in the context of object-oriented programming, and peek at some of the next ideas we’re tinkering with.


Question: About 6(!) C# versions ago, Anders Hejlsberg swatted down the idea of AOP in C#, and now with Source Generators, even if the code generated is immutable, isn't this dipping into C# metaprogramming in itself? And if developers are going to be allowed to hook new extensions into the C# compiler such as Source Generators, why not do the Full Monty (TM) and open the compiler up for more extensions from the community?

Answer: I get what you're saying, but we really like the separation between programming and metaprogramming. With source generators we think that we got that to a pretty elegant place! To make sure that generated source fits in well with manually written source, we have somewhat extended the "partial methods" feature to better merge ("weave" ;-)) things together.

Question: I assume you can nest records, and if so is it still good comparing parent records?

Answer: You can definitely nest them! Equality on the top record just calls whatever equality is on its nested objects, so if they are also records it will be recursive.

Question: Do C# records have “Copy on write”, like Swift?

Answer: Not really. When you do a "with" expression you immediately get a copy, and any changes in the object initializer are immediately applied. Unless you cheated and made the record mutable (which you can), you will have no more changes after that point.

So in a sense, you have the copy and the write immediately!

Question: I was thinking of a deeply nested record structure. If you change one of the properties of the “parent” object (using a “with”), do all of its member records get copied too, even if they are not changed?


record Name(string First, string Last);
record Person(Name Name, int Age);
var x = new Person(new Name("Dominic", "Godwin"), 21); // Not really 21 :)
var y = x with { Age = 22 };

Does x.Name get copied, or does y.Name point to the same object?

Answer: The copy is shallow. It is closely modelled over the copy that happens when a struct is assigned; i.e. reference types are copied by reference, value types by value.

We're thinking about allowing nested non-destructive mutation, but that didn't make it into C# 9.0
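
The same shallow-copy behaviour can be observed in a Python analogue. This is not C# and says nothing about how the C# compiler implements records; it only uses `dataclasses.replace` on frozen dataclasses as a stand-in for a with-expression, to show what "shallow" means for a nested value.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Name:
    first: str
    last: str

@dataclass(frozen=True)
class Person:
    name: Name
    age: int

x = Person(Name("Dominic", "Godwin"), 21)
y = replace(x, age=22)  # new Person; unchanged fields are passed through as they are

# The nested object is shared between x and y, not copied.
shares_name = y.name is x.name
# Equality is field by field, and nested dataclasses compare the same way.
equal_by_value = x == Person(Name("Dominic", "Godwin"), 21)
```

Because both objects are immutable, sharing the nested `Name` is safe: neither `x` nor `y` can observe a change through the other.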

Question: Also, is there any point to C# structs, now that we have records?

Answer: Structs and records have the same equality semantics, and the copy that happens in a with-expression on records is the same that happens on structs by any assignment.

Structs don't require heap allocation, so they have very different runtime, performance and memory properties.

Question: Anything interesting to consider with using records for EF Core/JSON serializers etc?

Answer: The tricks that allow deserialization into even immutable data models (with readonly fields inside) still work.

The ORM aspect of EF, tracking the state of mutable objects, that really works better with a mutable object model!

Question: Are there any plans to add extension properties?

Answer: Yes. They would come in as part of the extensions I showed in the last part of the talk.

Question: Any plans to add the JavaScript spread operator to C#?

Answer: Not currently. Should we?

Question: Here is a small example using JavaScript syntax.

I wasn’t sure if you could add this sort of syntactic sugar to C# records?

const person = { firstName: 'Gary', lastName: 'Butler' };
const modifiedPerson = {...person, middleName: 'Fred'}; // Add middleName
const anotherModifiedPerson = {...modifiedPerson, middleName: 'John'}; // Replace middleName
const copyOfPerson = {...person};

Answer: Got it! The interesting trait here is that this use of the spread operator actually creates new types (with more members), and then copies existing values in. This is a great fit for a structural type system like TypeScript's, but kind of grates against a nominal type system like what C# has today.

That said, we have been talking about data transformation scenarios, where record-like data moves through filters and projections. We already have something like that in LINQ (language integrated queries), where we introduced anonymous types to represent "shapes" that are ephemeral and don't need a type name. Those are quite limited today, and can't cross assembly ("module") boundaries, but we are wondering if it's time to generalize things here. Ways of creating shapes from other shapes (such as the spread operator) could be handy there - thanks for bringing that up!

Question: Do Roles == Shapes?

Answer: Roles are an evolution of Shapes, if you look at the GitHub issues on them. See "Exploration: Roles, extension interfaces and static interface members" (Issue #1711 in dotnet/csharplang on GitHub).

Question: Does C# have higher-kinded types yet?

Answer: No. It's not super high on our priority list. 

Question: Is there any sort of sugar to make it easier to access an IMonoid that's exported by some library and also access the IntMonoid role at the same time (assuming the library has exported both the interface and the implementations of it for some common types)? Or is the solution just to import both?

Answer: That's a level of design details we haven't gotten to yet. If you look at current extension methods, they are brought in with using directives, and I could imagine bringing in a namespace that has both the interface and its extensions for well known types in one fell swoop.

Question: You spoke at the end about extending/modifying the .NET runtime to be able to support new language features, to what extent is F# involved in this - are we likely to see any language improvements enabled by new runtime capabilities?

Answer: We talk to the F# team regularly, and we always think about how to cross-pollinate the features, especially when they affect metadata or runtime. Same goes for VB.

Mads Torgersen

Mads Torgersen is the language PM for C# at Microsoft, where he runs the C# language design process and maintains the language specification. He is also on the design team for TypeScript and Visual Basic, and he contributes to the Roslyn project, which has reinvented the C# and VB compilers and taken them open source.

Many years ago Mads was a professor, and also contributed to a language starting with J.

Organization - A Tool for Software Architects

Conway's Law, domain-driven design, microservices - the most important modern software architecture approaches use the organization as a tool for architecture. But software architects often have only limited influence on the organization. And teams should be self-organized - so how can you even influence them at all?

This presentation shows what exactly it means to use the organization as a tool for architecture and how software architects can use it concretely. Because even if you are not a manager: organizations are people - and you can go out and work with them!


Question: Is there also the problem that if we question everything, nothing really gets done? That tends to be the issue I run into in the teams I work in, with those people who say "you must do what I say".

Answer: Yes, you need to stay away from pointless or "religious" debates. But a team that doesn't get anything done is dysfunctional. To me, the question is whether an architect should be the one who is supposed to fix that. Certainly, they can help, but I personally don't think I would be able to solve that problem myself.

Question: I always get sucked in as I am passionate about reaching the "best solution" (lack of a better word)

Answer: I think that is quite natural. However, nowadays I try to figure out whether the decision that was made solves the problem at hand and if it does, it is fine by me. So I prioritize consensus and self-organization over the best technical solution. That is tough, though. It also helps me to realize that I might be wrong and in fact, the chosen solution might not be just "good enough" but actually the best....

Question: Decision-making purely by collaboration can lead to blockages on important decisions though - we have a senior technical staff for a reason.

Answer: Good point. However, often I think that the decision itself is not the problem - but enforcing it. At least I often get the question "How do you enforce the rules" - and I think if you convince people beforehand because they take part in the decision process, that problem might be easier to solve. And also your decision might be better because there is more expertise available. But in some cases that might be overdoing it and sometimes it feels slow and cumbersome.

Question: This assumes some form of collaboration is required--but how do you deal with cases where the culture says "You're the architect", and refuses to make any decisions?

Answer: Yes, that is a great question! I would even argue that if you have the title "software architect", you can't really escape that responsibility. So if there is a decision that in your opinion will make the project fail, you will need to step in and probably overrule it. If you have the responsibility, you will be able to do that. However, it will hurt self-organisation and the decision-making process, so it is the last resort. I believe architects overrule decisions even when there is no such risk, which is why I focus on the advice to rely on self-organization in the talk.

I have to admit that teams not wanting to make decisions is not something I see frequently. I guess I would try to understand the reasons behind that and consult with e.g. the Scrum master to figure out what to do. IMHO it's a social problem. I would try to stay away from just making the decision myself for as long as possible.

Eberhard Wolff

Fellow INNOQ
