Q&A with Stefan Priebsch, an IT consultant

"Pretty much everybody today seems to be on a journey from monolithic systems to service-oriented architectures."

05. Sep 2022

Stefan Priebsch is a computer scientist and IT consultant, as well as the co-founder of thePHP.cc, a consultancy for PHP and related open-source technologies. We spoke with him about his business and his upcoming live event on devmio, and he shared some of his expertise with us.

Tell us a bit about yourself and how you came to establish thePHP.cc.

Hello, I'm Stefan. I have been working with computers for about 40 years now, and have been an IT consultant for close to 30 years. The story behind thePHP.cc is actually interwoven with Software & Support Media: once I started speaking at the International PHP Conference, I was subsequently asked to write a book for entwickler.press.

When Entwickler Akademie was founded (I believe that was in 2007), they asked me whether I would do the PHP-related training for them. After a while, I brought in two other guys, Sebastian Bergmann and Arne Blankerts. That is how the three of us started working together, and that collaboration would later become The PHP Consulting Company (thePHP.cc).

You will be hosting a live event on devm.io titled “How To Get Distributed Systems Right.” Could you tell our readers a bit more about what you will be talking about? What will they have learned by the end of the workshop?

Pretty much everybody today seems to be on a journey from monolithic systems to service-oriented architectures. With cloud computing and containerization, this makes perfect sense, of course, and in essence that "just" means applying well-known OOP best practices like encapsulation or single responsibility at an architectural level.

Comparing and discussing various solutions to make distributed systems more resilient will be one of the main topics of my workshop.

My workshop takes the attendees on a journey where we start with a simple code example, and then evolve it into a more complex, distributed, and asynchronous system. We'll discuss the problems that arise, and how to properly solve them. Attendees will leave with a good understanding of the problems that distributed systems bring along, and potential solutions to those problems.

Do you think PHP is the best programming language to use for creating distributed systems? What benefits does it offer compared to other languages?

I love PHP because it may be the most "open-source" language there is. PHP is not owned or controlled by any company, and its development is a true community effort. In my opinion, PHP is well suited to building web-related systems, and web-related nowadays to a large extent also means distributed. After all, the web itself is a distributed system, and nobody builds stand-alone software on the web today. Every new system needs to be integrated with existing systems in one way or another.

Do I think that PHP is the “best” language for creating distributed systems? I don't think that there can be a single "best" language for anything. All languages have some strengths and also some weaknesses, just like PHP. What "best" means depends largely on the use case and also on personal preferences.

Much more important than the choice of programming language are the developer's skills. You can write bad code in every language, and learning to build good solutions takes time and effort. That is why my workshop is not at all PHP-specific: it teaches solutions and patterns that help you design and build adequate, working software, regardless of what programming language you prefer to use.

How do synchronous and asynchronous systems differ?

They differ in the way they communicate. If we have a conversation on the phone, we are both on the line at the same time, and each one is waiting for the other's response. We communicate synchronously.

If we have a conversation by email, one sends out an email, and then goes off to do something else. The other one will answer the email by writing a reply at their earliest convenience. This is asynchronous communication: there is no need for both parties to be online at the same time, but error handling gets far more complicated. What if you did not receive my message? If I am waiting for your reply, I never know whether it's going to arrive any minute, or if you will never respond. Asynchronous systems are easier to scale, but are far more difficult to handle, especially when it comes to debugging. In practice, it should be a very conscious and informed decision whether to use synchronous or asynchronous communication, and that decision should be explicitly made for each use case.
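The phone-versus-email contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `handle`, `call_sync`, and `Mailbox` are hypothetical names, and an in-memory queue stands in for whatever messaging infrastructure a real system would use.

```python
import queue


def handle(message: str) -> str:
    """Hypothetical receiver logic: produce a reply for a message."""
    return f"re: {message}"


def call_sync(message: str) -> str:
    """Phone call: the caller blocks until the reply is available."""
    return handle(message)


class Mailbox:
    """Email-style exchange: sending and answering happen at different times."""

    def __init__(self) -> None:
        self._inbox = queue.Queue()
        self._replies = queue.Queue()

    def send(self, message: str) -> None:
        # Returns immediately; the receiver does not have to be available now.
        self._inbox.put(message)

    def process_pending(self) -> None:
        # The receiver answers all waiting messages whenever it gets to them.
        while not self._inbox.empty():
            self._replies.put(handle(self._inbox.get()))

    def wait_for_reply(self, timeout: float):
        # The sender cannot tell "late" from "lost", so it waits with a timeout
        # and gets None if nothing has arrived by then.
        try:
            return self._replies.get(timeout=timeout)
        except queue.Empty:
            return None
```

With the synchronous call, the reply is simply the return value. With the mailbox, the sender has to decide what a missing reply means, which is where the extra error handling comes from.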

What are legacy systems and why are they still around?

There are various definitions for the term "legacy system". One would be "software without tests," meaning systems that are hard to change because there is not enough test automation. Grady Booch once said: "legacy systems are systems that work," suggesting that every system that runs in production can be considered "legacy." Whatever definition we agree upon, there is so much existing software in the world, and it has taken us decades to build all that software. There is no way in the world we could suddenly replace all this existing software, even if we tried.

So legacy systems are still around because nobody can afford to replace them. Even if you had enough money to spend, there would not be enough developers available. The situation is pretty similar to the one we are facing with regard to housing these days: a lot of old houses take a lot of energy to heat, so replacing them with new energy-efficient houses would be a good idea. But we are lacking the resources to do so.

What are some of the most difficult aspects of replacing a legacy system with a new one?

Number one: nobody really knows what the system does. I am not talking about mainstream use cases like "create an invoice," but all the special cases and edge cases that have been built into a system over time. For legacy systems, those special cases are mostly not covered by automated tests, so changing the implementation is risky, because nobody can tell for sure whether the software will still work as expected after a change.

The number two reason is that businesses tend to adjust their daily work to existing systems. A caseworker quickly gets used to doing things in the way the software expects. After a while, they no longer distinguish between the process and the software. So when you consider replacing an existing system and ask a caseworker what they need, the answer often is "make the new system work like the old one." This makes it very difficult to innovate, not least because it takes us back to problem number one.

On the other hand, asking somebody to completely change the way they do their daily work is challenging. A replacement system might not allow certain workarounds that were used to deal with weird edge cases. And if you have a customer on the phone or waiting in line, browsing the documentation or calling the help desk is not really a viable option.

So the really difficult part in replacing an existing system is getting everybody on board, and balancing innovation and conservation. Gradual migration is often better than a big bang launch of a new system, and thankfully web software makes this extremely easy to achieve, at least compared with embedded software, for instance. If we can create new software that works alongside the existing systems, and we gradually take over use cases in the new software, that makes it much easier for caseworkers, because they don't have to relearn everything at once. And as software developers, we get to analyse requirements and implement them in smaller portions, so agile development becomes much more feasible than when trying to build a big system at once.
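One way to picture gradual migration is a small router that sends each use case either to the old or to the new system (an approach often described as the strangler fig pattern). The sketch below is an assumption-laden illustration, not a method described in the interview: `MigrationRouter`, `legacy_system`, and `new_system` are hypothetical names.

```python
from typing import Callable

Handler = Callable[[str, dict], str]


def legacy_system(use_case: str, payload: dict) -> str:
    """Stand-in for the existing system (hypothetical)."""
    return f"legacy handled {use_case}"


def new_system(use_case: str, payload: dict) -> str:
    """Stand-in for the replacement system (hypothetical)."""
    return f"new handled {use_case}"


class MigrationRouter:
    """Routes each use case to the new system once it has been migrated."""

    def __init__(self, legacy: Handler, new: Handler) -> None:
        self._legacy = legacy
        self._new = new
        self._migrated: set[str] = set()

    def migrate(self, use_case: str) -> None:
        # Take over one use case at a time instead of switching everything.
        self._migrated.add(use_case)

    def handle(self, use_case: str, payload: dict) -> str:
        target = self._new if use_case in self._migrated else self._legacy
        return target(use_case, payload)
```

Everything that has not been migrated keeps working exactly as before, so each use case can be analysed, implemented, and rolled out on its own.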

What are the benefits of the Self-contained System (SCS) approach?

To me, a self-contained system is a system that can process one use case independently, without requiring interaction with another system. Others might have a different definition for this term, but this is what I work with.

The obvious benefit is that there is no dependency on other systems when processing a web request. Distributed systems generally become less reliable with each additional service. That is because the more services there are, the less likely it gets that all services are available at a given point in time. If you build systems that depend on a large number of other services at runtime, you are likely to get into trouble in production, for example, because one service is down, or maybe just slow in responding.
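The reliability argument can be made concrete with a little arithmetic. Assuming, as a simplification, that services fail independently and that a request needs all of them, the combined availability is the product of the individual availabilities. The function name below is hypothetical.

```python
import math
from typing import Iterable


def combined_availability(availabilities: Iterable[float]) -> float:
    """Probability that all services are up at the same time.

    Assumes independent failures, which real systems only approximate.
    Each value is a fraction between 0 and 1 (0.999 means 99.9%).
    """
    return math.prod(availabilities)


# One service at 99.9% versus a request that needs ten such services.
single = combined_availability([0.999])
ten = combined_availability([0.999] * 10)
print(f"1 service:   {single:.4f}")
print(f"10 services: {ten:.4f}")
```

Under these assumptions, ten services at 99.9% each are jointly available only about 99.0% of the time, so every additional runtime dependency lowers the figure further.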

What is the CQRS principle and what domains is it best used for?

CQRS (Command/Query Responsibility Segregation) is a generalisation of the CQS (Command/Query Separation) principle, which is a fundamental OOP design principle that we use all the time. A method either asks a question, meaning that it retrieves object state (a query), or it is a mutator that changes object state (a command).
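At the level of a single class, command/query separation might look like the following minimal sketch. The `Account` class is a hypothetical example, not something taken from the interview.

```python
class Account:
    """Toy example of command/query separation on one object."""

    def __init__(self) -> None:
        self._balance = 0

    def balance(self) -> int:
        # Query: returns state and changes nothing, so it is safe to repeat.
        return self._balance

    def deposit(self, amount: int) -> None:
        # Command: changes state and returns nothing.
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount
```

A method such as `deposit_and_return_balance()` would blur the two roles, because calling it just to read the balance would also change the account.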

We find the same type of separation in HTTP: a GET request, for example, retrieves state and is idempotent, while a POST request tells the server to change state. This separation allows for heavy caching of content, which is what makes the web perform well in the first place.

CQRS means applying this separation (or segregation) principle to your application design as well: write code that deals with answering queries, and write different code that deals with state changes (commands). A lot of ORM-centric software that still gets built these days blatantly ignores this separation, leading to a lot of unnecessary follow-up problems like bloat, cache inconsistency, difficult cache invalidation, and poor performance in general.
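Applied to application design, the same split could be sketched as separate write-side and read-side code. All names here (`PlaceOrder`, `OrderCommandHandler`, `CustomerTotalsView`) are hypothetical, and the read model is updated directly for brevity; a real system would more likely feed it through events, possibly asynchronously.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceOrder:
    """Command: a request to change state."""
    order_id: str
    customer: str
    total_cents: int


class CustomerTotalsView:
    """Read side: a precomputed view that only answers queries."""

    def __init__(self) -> None:
        self._totals: dict[str, int] = {}

    def apply(self, command: PlaceOrder) -> None:
        # Called by the write side to keep the view up to date.
        current = self._totals.get(command.customer, 0)
        self._totals[command.customer] = current + command.total_cents

    def total_for(self, customer: str) -> int:
        return self._totals.get(customer, 0)


class OrderCommandHandler:
    """Write side: validates commands and changes state."""

    def __init__(self, view: CustomerTotalsView) -> None:
        self._orders: dict[str, PlaceOrder] = {}
        self._view = view

    def handle(self, command: PlaceOrder) -> None:
        if command.order_id in self._orders:
            raise ValueError("duplicate order")
        self._orders[command.order_id] = command
        self._view.apply(command)
```

Because the query code never touches the write model, the view can be cached or scaled independently of the code that processes commands.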

CQRS makes the most sense where you read more often than you write, which is true of most web applications. When you write more often than you read, for example when logging, the archetypical CQRS architecture makes less sense, but a clear separation of concerns in the CQRS or CQS sense at the code level still pays off.

When it comes to PHP, is there anything you hope the language will add in the future?

How about some "always keep your code clean and readable" magic? But wait, that should not only be added to PHP, but to all other programming languages as well (smiles).

Thank you for taking the time to answer our questions! We are looking forward to your live event!

Stefan Priebsch will share his knowledge in a live event on September 15 at 2 PM CEST. With our Full-stack Access, you can join the event for free!
