CrowdStrike, eBPF, Kafka... What is this, a crossover episode? – JVM Weekly vol. 94
I'm still somewhat in vacation mode, but it's high time to return to publishing, especially since today we have a real crossover episode.
1. A "short" story about what eBPF is and how it relates to Java, inspired by a mishap from CrowdStrike
If you haven't been living under a rock, you probably know that about a week and a half ago, there was a massive global outage affecting major airlines, TV broadcasters, banks, and other critical services. The culprit was a defective update to CrowdStrike's Falcon platform, which caused Blue Screen of Death errors on Windows computers around the world.
CrowdStrike published a Post Mortem of the entire event. For context, CrowdStrike Falcon is an endpoint protection platform that combines signatures with AI and machine learning models to monitor and analyze endpoint activity in real time and detect threats. New detection techniques, including behavioral heuristics, can be delivered to the platform without changing the sensor code. An example is the new template type introduced in sensor version 7.11, which better detects attacks that abuse interprocess communication mechanisms in Windows. The problem arose because that template expected 21 input parameters while only 20 were provided, and reading the missing one crashed the system. The incident prompted CrowdStrike to improve its testing and validation processes.
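To make the 21-versus-20 mismatch concrete, here is a minimal Java sketch of the kind of count check that catches such a gap before any field is read. It is purely illustrative: the class, the method names, and the use of a string array are hypothetical and have nothing to do with CrowdStrike's actual code. Keep in mind that in a kernel-mode driver written in C or C++ the failure mode is an invalid memory read, not a tidy Java exception.

```java
import java.util.Arrays;

public class TemplateInputCheck {
    // From the incident description: the template expected 21 input parameters.
    static final int EXPECTED_FIELDS = 21;

    // Hypothetical validator: checks the field count before anything is read.
    static boolean isValid(String[] inputs) {
        return inputs != null && inputs.length == EXPECTED_FIELDS;
    }

    // Hypothetical accessor that refuses to touch malformed input.
    static String field(String[] inputs, int index) {
        if (!isValid(inputs)) {
            throw new IllegalArgumentException(
                "expected " + EXPECTED_FIELDS + " fields, got "
                    + (inputs == null ? 0 : inputs.length));
        }
        return inputs[index];
    }

    public static void main(String[] args) {
        String[] supplied = new String[20]; // only 20 values were provided
        Arrays.fill(supplied, "*");
        System.out.println("valid: " + isValid(supplied));
        try {
            field(supplied, 20); // asks for the 21st field, which does not exist
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the sketch is the ordering: the count is validated once, up front, so no later code path can reach for a field that was never supplied.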
Microsoft also published a Post Mortem, highlighting the dangers of over-reliance on kernel-mode drivers in security tools. These drivers increase the risk of severe problems, as seen in the recent outage. Microsoft suggests that security vendors can achieve a better balance by limiting kernel space usage and leveraging user-mode security and interfaces offered by Windows. Additionally, Microsoft is working on strengthening Windows security, including support for the Rust programming language in the kernel, to reduce dependence on kernel-mode drivers.
After the security disaster with CrowdStrike on Windows, security experts and developers are seeking safer methods to run low-level security programs. One possible solution is eBPF. Brendan Gregg, a system performance expert and Intel Fellow, in his article No More Blue Fridays, suggests that eBPF can prevent future disasters like the one with CrowdStrike. Acting as a Swiss Army knife, eBPF allows software to run in a virtual machine (VM) within the Linux kernel, providing performance through a JIT compiler and security through a verification mechanism.
eBPF (Extended Berkeley Packet Filter) is a technology built into the operating system kernel that enables safe and efficient code execution in kernel space. Its main advantage is the ability to monitor and analyze system behavior in real time with a much lower risk of crashing the machine than a traditional kernel driver. eBPF is used for tasks such as security, network monitoring, and system resource management. It was initially available only on Linux, but eBPF support for Windows is being developed, which should allow broader use of this technology on Microsoft platforms.
Gregg explains that eBPF programs cannot cause system-wide crashes because they are safety-checked by a software verifier and run in a sandbox. If the verifier finds unsafe code, the program is rejected. Cisco, Google, and Meta already use eBPF in their production systems to detect and stop bad actors.
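The "verify first, run later" idea can be illustrated with a toy example. The sketch below is not the real eBPF verifier and does not use real eBPF bytecode: the three-instruction set, the 64-byte buffer, and every name in it are assumptions made up for illustration. It only shows the principle of rejecting a program before it runs if it could read out of bounds or loop forever.

```java
import java.util.List;

public class ToyVerifier {
    // Hypothetical mini instruction set, far simpler than real eBPF.
    interface Insn {}
    record Load(int offset) implements Insn {}  // read one byte from a fixed buffer
    record Jump(int target) implements Insn {}  // unconditional jump to an instruction index
    record Exit() implements Insn {}            // end of program

    static final int BUFFER_SIZE = 64;          // assumed size of the only readable buffer

    // Accepts a program only if every load is in bounds, every jump goes
    // strictly forward to an existing instruction, and the last instruction is Exit.
    // Forward-only jumps plus a final Exit guarantee that the program terminates.
    static boolean verify(List<Insn> program) {
        if (program.isEmpty() || !(program.get(program.size() - 1) instanceof Exit)) {
            return false;
        }
        for (int pc = 0; pc < program.size(); pc++) {
            Insn insn = program.get(pc);
            if (insn instanceof Load load
                    && (load.offset() < 0 || load.offset() >= BUFFER_SIZE)) {
                return false; // out-of-bounds read
            }
            if (insn instanceof Jump jump
                    && (jump.target() <= pc || jump.target() >= program.size())) {
                return false; // backward jump (possible loop) or jump outside the program
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Insn> safe = List.of(new Load(0), new Jump(2), new Exit());
        List<Insn> outOfBounds = List.of(new Load(64), new Exit());
        List<Insn> loop = List.of(new Load(0), new Jump(0), new Exit());
        System.out.println("safe program accepted: " + verify(safe));
        System.out.println("out-of-bounds program accepted: " + verify(outOfBounds));
        System.out.println("looping program accepted: " + verify(loop));
    }
}
```

The real verifier tracks far more state, such as register types and value ranges, but the contract is the same: a program that fails the check is never loaded.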
You might ask, why such a topic in a Java newsletter? Johannes Bechberger has published the latest part of his series of articles on eBPF, “Hello eBPF: Developing eBPF Apps in Java”. The series explores and documents the development of a Java library for writing eBPF programs, which until now have been supported in languages such as C++, Rust, Go, Python, and Lua, but not Java.
Johannes is translating examples from Liz Rice's book Learning eBPF into Java using the new Foreign Function API (Project Panama) and the bcc library, enabling Java programmers to create eBPF programs in their favorite environment. The project's primary goal is to create a comprehensive Java API for eBPF that closely matches the existing BCC API for Python, making it easier to use and transition to Java. The initial implementation focuses on translating Python code into Java, with plans to support the newer libbpf library and provide the API widely via Maven Central.
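For readers who have not touched the Foreign Function API yet, here is a minimal sketch of the mechanism itself. It is not taken from the hello-ebpf project and does not call bcc: it simply invokes the C library's `getpid` function from Java. It assumes JDK 22 or newer (where the API is final) and a Unix-like system where `getpid` is available through the default lookup. The class and method names are made up for this example.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

public class NativePid {
    // Calls the C function `int getpid(void)` through the Foreign Function API.
    static int nativePid() {
        Linker linker = Linker.nativeLinker();
        // Look up the symbol in the libraries the linker exposes by default.
        MemorySegment address = linker.defaultLookup().find("getpid")
            .orElseThrow(() -> new IllegalStateException("getpid not found"));
        // Describe the native signature: returns a C int, takes no arguments.
        MethodHandle getpid = linker.downcallHandle(
            address, FunctionDescriptor.of(ValueLayout.JAVA_INT));
        try {
            return (int) getpid.invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException("native call failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println("native getpid(): " + nativePid());
        System.out.println("ProcessHandle pid: " + ProcessHandle.current().pid());
    }
}
```

A library binding works on the same principle, just with many more functions and with structs described as memory layouts. Newer JDKs may print a warning about restricted native access unless the program is started with `--enable-native-access=ALL-UNNAMED`.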
However, is eBPF really the solution for everyone needing commercial software with kernel drivers? The New Stack analyzes this topic in detail. Currently, eBPF is not ready for production on Windows, but Brendan Gregg believes it is only a matter of time. Other researchers doubt whether eBPF is the ideal security platform. Yashin Manraj, CEO of Pivotal Technologies, emphasizes that while eBPF offers a safer environment for running kernel code, it is not a magic solution. Complex BPF programs can contain unforeseen bugs, leading to service instability even though they do not cause system-wide crashes. Additionally, eBPF programs can be vulnerable to exploits, and debugging them is difficult. Furthermore, Tomer Filiba, CTO of Sweet Security, points out that eBPF requires high privileges (CAP_SYS_ADMIN or "root"), which can lead to accidental deletion of important system files or server configuration changes. eBPF can also write to user memory, causing program crashes.
I know, it's not very Java-like, but I must admit I got pretty deep into the topic, so I wanted to share my findings.
2. Apache Kafka with GraalVM support
Now something unusual, about... Apache Kafka. A few years ago, it was the most fashionable architectural component; today it’s gaining some bad press (albeit in good faith, like the recent viral Kafka is Costing You Years of Engineering Time). Despite criticism and emerging alternatives, Kafka remains a strong player, evolving in interesting directions. For example, everyone was recently focused on the replacement of ZooKeeper with KRaft (ZooKeeper mode is already deprecated and is slated for removal in Kafka 4.0).
So, why am I writing about this in a Java newsletter? As many probably know, Kafka and Java are closely linked since Apache Kafka is written in Java. This means that platform optimizations have a tangible impact on its operation, which, however, is usually barely noticeable with each release. This time is different, which is why I decided to delve deeper into the topic.
The latest release, Kafka 3.8, includes KIP-974: Docker Image for GraalVM based Native Kafka Broker. Traditional JVM-based Apache Kafka brokers often take several seconds to start. Although this delay doesn't significantly affect long-running production environments, it can be inconvenient for developers who need to start multiple brokers during local tests, or in situations where emergency startups are necessary. To address this, the KIP proposed an experimental Docker image that uses GraalVM to build a native Kafka broker binary. This native image aims to reduce startup times to fractions of a second and to minimize memory usage, and it runs in KRaft mode – the whole idea was inspired by Ozan Gunalp's work on the kafka-native project.
The GraalVM-based Kafka broker shows dramatic improvements in performance metrics compared to its JVM counterpart. Performance tests indicate that the GraalVM broker starts up about nine times faster and uses significantly less memory, with peak memory usage dropping from around 1 GB in the JVM to about 500 MB, or even 250 MB depending on the garbage collector used. The new Docker image will be built, tested, and released alongside the standard JVM-based Kafka image. At the same time, the GraalVM-based image is recommended for development and testing, not for production, due to its experimental nature and the lack of historical community experience with GraalVM-based runtimes.
But that's not all – if you feel like refreshing your knowledge or just want to learn more about Kafka, Baeldung has recently refreshed its Apache Kafka course. For those who don’t know, Baeldung is a popular educational portal specializing in creating courses and technical articles, mainly concerning Java technologies and related tools. Baeldung is valued for its detailed guides and tutorials, offering a rich knowledge base on various frameworks and tools such as Spring, Hibernate, and of course Apache Kafka. I’ve often referred to Baeldung’s tutorials myself.
The course is free and covers a wide range of topics, from basic to more advanced aspects of this technology. It starts with an introduction to Apache Kafka and its integration with Spring, as well as an introduction to Kafka Connectors. For those interested in security, the course also offers SSL configuration in Kafka using Spring Boot. Additionally, the course includes practical guides on configuring Kafka using Docker (which ties in with the main topic of the section). The course also covers managing topics and partitions in Kafka, including handling them with the Java client.
Overall, I recommend it: it's a great source of knowledge if you encounter Kafka in a new project.
3. Quarkus has found a new house... the Commonhaus
A few editions ago, I informed you that Quarkus was looking for a new home, preferably under the wings of some foundation. Despite its huge growth and community support (700+ extensions), there were still concerns about the project's dependence on Red Hat, which historically had made some controversial decisions.
Quarkus still aims to become the new standard for Java frameworks, which requires addressing these concerns. So, last week, Max Rydahl Andersen, the project coordinator, announced that the project would move to the Commonhaus Foundation.
Commonhaus Foundation is a non-profit organization dedicated to supporting and developing open-source projects by providing a neutral and independent management model. The foundation offers infrastructure, resources, and administrative support, enabling projects to focus on innovation and technological development. Commonhaus operates on principles of autonomy, where each project is managed by a dedicated technical committee composed of community members. This model creates a space where all stakeholders – from individual developers to large organizations – can collaborate on equal terms without the dominance of any single company or interest group.
And Quarkus is in good company. Commonhaus is home to many significant projects in the Java ecosystem, such as Hibernate and Jackson (which I assume everyone knows), EasyMock (the name says it all), Feign (reducing HTTP client binding complexity), JBang (allowing Java applications to be run as scripts), JReleaser (automating the release of Java projects), Morphia (facilitating work with MongoDB documents), Objenesis (creating objects without invoking constructors), OpenRewrite (automating code refactoring), and SDKMAN! (managing SDK versions on Unix systems).
It seems that joining the Commonhaus Foundation will bring many benefits, strengthening Quarkus' position. Thanks to the neutral environment, Quarkus will now be able to attract a broader community of developers and organizations that might have previously feared overall dependence on Red Hat.
Bonus: Most watched Java talks of the first half of the year
To conclude, here is the latest special edition of Tech Talks Weekly, TTW Extra #6 🔥: All Java talks of 2024 so far..., which summarizes the most important Java presentations given at various conferences in the first half of 2024. Among the highlighted events are Spring I/O 2024, various editions of Devoxx, Voxxed Days, JChampions, Devnexus, JCon Europe, DevBcn, JNation, JAX London, and London Java Community meetups. The article lists presentations sorted by the number of views, helping readers see which topics drew the most interest.
Among the most popular presentations at Java conferences in the first half of 2024 were talks covering a wide range of topics, from seemingly basic to more advanced programming concepts. Interestingly, Chris Simon attracted the most attention with his practical approach in TDD & DDD from the Ground Up Live Coding, where he showed how to apply Test-Driven Development and Domain-Driven Design in real projects.
Maciej Walkowiak also gained attention at Spring I/O 2024 with his presentation Implementing Domain Driven Design with Spring. Evidently, DDD is still alive and people want to hear about it: knowledge on how to integrate these techniques with Spring attracted over 25,000 viewers. Continuing with Spring, Josh Long's presentation Bootiful Spring Boot 3.x garnered a similar audience, focusing on the latest features and optimizations in the newest version of Spring Boot.
As we go further, it gets less obvious and more interesting. Next on the list is Roy van Rijn, who in his presentation titled Pushing Java to the Limits: Processing a Billion Rows in under 2 Seconds talked about the #1BRC challenge from the beginning of the year and demonstrated how to process large amounts of data efficiently in Java. Marcus Hellberg from Vaadin, in Java meets TypeScript: full-stack web app development with Spring Boot and React, discussed how to combine Java and TypeScript to create full-stack applications.
Oracle presentations were also a must. Nicolai Parlog discussed Data Oriented Programming in Java 21, introducing participants to the new data-oriented programming paradigm. Brian Goetz, at Devoxx Greece 2024, in Java Language Update -- a look at where the language is going provided insight into future directions for the Java language, which was crucial for the community following changes in the Java ecosystem.
As always, I encourage you to follow Tech Talks Weekly – you can catch a lot of great things there.
PS: Yes, I am aware that the JVM Language Summit is currently taking place, and I hope to dedicate an entire edition next week to the JVM, its new features, and upcoming JEPs. I already have a substantial collection of articles published by the community testing these innovations. In the coming week, I plan to take a closer look at them to provide detailed analyses and discuss how they might impact the future of programming on the JVM.