Best of Foojay.io May 2024 Edition – JVM Weekly vol. 83
I promised that this month we would have some unconventional editions, and I am keeping that promise 😉
For some time now, I've been mulling over the idea that this newsletter is a bit too much on the bleeding edge. Don't get me wrong—I love pushing boundaries, but I feel like it's also worth focusing on the more practical side of our work. That's why I've decided that, at least occasionally, I'll bring things down to earth. I promised you some experiments and one of them is joining the Foojay.io family.
Foojay.io is a vibrant, community-driven platform created for OpenJDK users, mainly Java and Kotlin enthusiasts. It offers a wealth of resources, information, and updates for developers at all levels. As a hub for "Friends of OpenJDK," Foojay.io hosts a rich collection of articles written by industry experts and active community members, providing valuable insights into the latest trends, tools, and practices in the OpenJDK ecosystem. There are lots of great authors and plenty of interesting content.
So, instead of picking specific articles and dedicating entire sections to them in the weekly newsletter, once a month I'll devote a whole edition to a few interesting ones that might be useful to you or at least broaden your horizons by presenting some cool practices or tools.
Let's Start with a Podcast
The Foojay podcast has been running since 2021 and is (mostly) hosted by Frank Delporte. In addition to the usual talks with interesting people from the community, Frank has recently been delving into the history of significant Java User Groups and the people who create them. In this episode, we learn about the JUG from Oberpfalz in Germany. The uniqueness of this episode lies in the fact that the organizers of this local group are also responsible for the JCON conference in Cologne, which is happening as this newsletter is published. The podcast offers insights into the German Java community, their initiatives, and their contributions to the development of this technology, highlighting the importance of local user groups in a global context.
All the previous episodes of the podcast are available here.
Spring AI: How to Write GenAI Applications with Java
This week is definitely dominated by AI: OpenAI presented GPT-4o, Google I/O in 2024 is essentially a consumer AI conference, Claude 3 was released in Europe, and Microsoft announced a 4 billion euro investment in "cloud and AI enablement" in France. Just when everyone thought this train was slowing down, it picks up speed again... so we'll start with a tutorial on GenAI.
The article Spring AI: How to Write GenAI Applications with Java by Jennifer Reif is a pretty efficient introduction to Generative AI and the Spring AI framework. It goes through the basic steps of setting up a GenAI project with Spring AI: the necessary dependencies (OpenAI API and Neo4j Vector Database), configuration, and an implementation that stays brief, as LLMs are extremely complex. It also introduces the Retrieval-Augmented Generation (RAG) technique, which improves the responses of AI models by generating content based on provided input data and serves as a more user-friendly alternative to fine-tuning.
There have been a few introductory texts of this type lately (such as the rather enjoyable 5 steps to develop an AI-powered Telegram bot with Langchain4j in Java, focusing on an alternative to Spring AI), but I think Jennifer Reif's text is worth noting if only for its focus on using RAG.
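To make the RAG idea a bit more tangible, here is a minimal plain-Java sketch of the retrieval step. It is not taken from Jennifer's article and does not use the Spring AI API: the class RagSketch, the toy word-count "embedding", and the prompt wording are all my assumptions for illustration. A real setup would use an embedding model and a vector store (Neo4j in the article) and would send the final prompt to an LLM.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of RAG retrieval; not the Spring AI API.
final class RagSketch {

    // Toy "embedding": word -> occurrence count. A real system would call an embedding model.
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    static double norm(Map<String, Integer> vector) {
        double sum = 0;
        for (int count : vector.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity between two sparse word-count vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> entry : a.entrySet()) {
            dot += entry.getValue() * b.getOrDefault(entry.getKey(), 0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0 || normB == 0) ? 0 : dot / (normA * normB);
    }

    // Returns the k documents most similar to the question (stand-in for a vector store query).
    static List<String> retrieve(String question, List<String> documents, int k) {
        Map<String, Integer> queryVector = embed(question);
        List<String> ranked = new ArrayList<>(documents);
        ranked.sort(Comparator.comparingDouble((String doc) -> cosine(queryVector, embed(doc))).reversed());
        return ranked.subList(0, Math.min(k, ranked.size()));
    }

    // Augments the question with the retrieved context; this string would be sent to the model.
    static String buildPrompt(String question, List<String> context) {
        return "Answer using only the context below.\n\nContext:\n- "
                + String.join("\n- ", context)
                + "\n\nQuestion: " + question;
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
                "Spring AI integrates with OpenAI",
                "Neo4j stores vectors for similarity search",
                "Cats sleep most of the day");
        String question = "Where are vectors stored for search?";
        System.out.println(buildPrompt(question, retrieve(question, docs, 1)));
    }
}
```

The point of the design is that the model never needs retraining: the knowledge lives in the documents, and only the prompt changes per question.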
Continuing on the topic of using LLMs and LangChain4j...
Building a Simple Home Assistant using LangChain4j and Raspberry Pi
...I have another interesting text for you, this time combining LangChain4j with an interesting domain, IoT.
The article Building a Simple Home Assistant using LangChain4j and Raspberry Pi by Jansen Ang shows how these tools can be combined into an intelligent home assistant capable of conversing, providing the latest news, analyzing the environment with a camera, controlling smart devices, and answering questions based on user data.
The author uses LangChain4j to integrate various AI components such as language models, conversational memory, and tools for executing external actions. The Raspberry Pi serves as the hardware platform, and additional features are implemented through integration with external APIs, such as Amazon Polly and Amazon Transcribe for speech processing, Google’s Gemini Pro Vision for video processing, and Home Assistant for smart device control. The result is a functional home assistant capable of conducting conversations, providing current news, analyzing the environment, and controlling smart home devices. This demonstrates the ease of implementing modern AI technologies into everyday use by a skilled tinkerer and the possibility of integrating APIs from various providers using an abstraction like LangChain4j.
It will be interesting to see what Jansen conjures up with the new APIs from OpenAI and Google. Not that I'm encouraging or anything, but the current feature set is already impressive!
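For readers who have not met the "memory plus tools" pattern yet, here is a rough plain-Java sketch of the idea. It deliberately does not use the LangChain4j API and is not Jansen's code: AssistantLoopSketch, the tool names, and the sliding-window size are hypothetical. In a real assistant the language model decides which tool to call and with what argument; here the caller passes that decision in directly.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of tool dispatch with bounded conversational memory.
final class AssistantLoopSketch {

    private final Map<String, Function<String, String>> tools = new LinkedHashMap<>();
    private final Deque<String> memory = new ArrayDeque<>();
    private final int maxMessages;

    AssistantLoopSketch(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    // A tool is just a named action, e.g. a call to a smart-home or news API.
    void registerTool(String name, Function<String, String> action) {
        tools.put(name, action);
    }

    // Stand-in for the model's decision: a real LLM would choose toolName and argument itself.
    String handle(String toolName, String argument) {
        remember("user: " + toolName + "(" + argument + ")");
        Function<String, String> tool = tools.get(toolName);
        String result = (tool == null) ? "unknown tool: " + toolName : tool.apply(argument);
        remember("tool: " + result);
        return result;
    }

    // Sliding window: the oldest messages are dropped once the limit is reached.
    private void remember(String message) {
        memory.addLast(message);
        while (memory.size() > maxMessages) {
            memory.removeFirst();
        }
    }

    List<String> memory() {
        return List.copyOf(memory);
    }

    public static void main(String[] args) {
        AssistantLoopSketch assistant = new AssistantLoopSketch(4);
        assistant.registerTool("light", state -> "light turned " + state);
        System.out.println(assistant.handle("light", "on"));
        System.out.println(assistant.memory());
    }
}
```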
Duplicate Finder for Documentation
We've talked about GenAI, so now let's come down to earth and look at some more mundane problems... documentation.
Duplicate content in documentation is a common problem, especially in large projects. Despite advanced tools and practices, it's hard to avoid duplication, which becomes increasingly apparent as a project grows. Traditional approaches, like the DRY (Don't Repeat Yourself) principle, help limit duplication, but once it occurs, detecting repeated content becomes a more challenging task. IDEs can automatically detect duplicates in program code because code has a tree-like structure (an AST) that facilitates automatic comparison. Similar functionality is not easily available for documentation, which is usually unstructured "plain text" content.
The article Duplicate Finder for Documentation introduces a tool for detecting such duplicates. Igor Kulakov is working on its prototype, which will be able to quickly find not only exact but also fuzzy matches. The prototype can already analyze a project with about 6,000 source files in less than 30 seconds and will eventually highlight duplicates in real-time while writing. Once completed, the tool will be available in Writerside from JetBrains, a documentation creation tool currently available for free in the EAP program. According to the FAQ, even after release, the tool will remain free if you choose to use the preview version. Igor has already announced more posts that will include details about the algorithm and benchmark results, so soon you'll be able to get acquainted with the specifics.
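Until Igor publishes the details of his algorithm, here is one common textbook way to think about fuzzy matching on plain text: split both passages into overlapping word triples (shingles) and compare the sets with Jaccard similarity. This is my own illustrative sketch, not the approach used in the prototype; the class name FuzzyDuplicateSketch, the shingle size of 3, and the 0.5 threshold are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of fuzzy duplicate detection via word shingles and Jaccard similarity.
final class FuzzyDuplicateSketch {

    // Normalizes the text and returns all runs of `size` consecutive words.
    static Set<String> shingles(String text, int size) {
        Set<String> result = new HashSet<>();
        String normalized = text.toLowerCase().replaceAll("[^\\p{L}\\p{Nd}\\s]", " ").trim();
        if (normalized.isEmpty()) {
            return result;
        }
        String[] words = normalized.split("\\s+");
        for (int i = 0; i + size <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + size)));
        }
        return result;
    }

    // Jaccard similarity: size of the intersection divided by size of the union.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    static boolean looksDuplicated(String first, String second, double threshold) {
        return jaccard(shingles(first, 3), shingles(second, 3)) >= threshold;
    }

    public static void main(String[] args) {
        String a = "Click the Save button to store your changes on disk.";
        String b = "Click the Save button to store all your changes on disk!";
        System.out.println(looksDuplicated(a, b, 0.5));
    }
}
```

Comparing every pair of passages this way gets slow on large projects, which is presumably where the interesting engineering in a real tool lies.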
Is your Java application ready for the next generation of server CPUs?
The Arm architecture, known mainly from mobile devices, has gained prominence in server and cloud environments thanks to processors like Amazon Graviton, Microsoft Cobalt, and Google Axion. Every Amazon, Azure, or Google Cloud conference brings new developments in this area, and it's no surprise—due to operational costs, it seems like a highly profitable solution for them.
However, I don't see similar enthusiasm in the broader "industry." Migrating Java applications to new generations of server CPUs, particularly to the Arm architecture, presents several challenges for developers, or at least Rumsfeld's "Unknown Unknowns." Application code usually runs without modifications, so the main issues are adapting the surrounding software to the new architecture and optimizing performance, considering differences in processor instructions (e.g., Neon, SVE). Therefore, optimizations to the JVM and tools are necessary to fully exploit the new processors' capabilities.
The article Is your Java application ready for the next generation of server CPUs? by Michael Hall describes how to prepare Java applications for new generations of server Arm CPUs using the tools and optimizations available in the Java ecosystem. We learn about the details of AArch64 architecture support available in various JDK distributions, such as OpenJDK, GraalVM, Corretto, and Liberica. The article also discusses tools for CI/CD in this context and highlights the availability of tools like Buildah, Podman, CRI-O, and Containerd, which support building and running containers in Arm environments. Michael presents practical steps for configuring development and deployment environments and using solutions like Arm Virtual Hardware, which allow for testing and optimizing applications on virtual Arm hardware, and encourages using resources available on the Arm Developer Hub, where you can find tutorials, videos, and a community supporting developers in migrating applications to the Arm architecture.
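One small, practical first step before any migration is simply knowing which architecture your JVM is running on, for example to guard the loading of architecture-specific native libraries. The sketch below is mine, not Michael's: it reads the standard os.arch system property, and the class name ArchCheckSketch is hypothetical. JDKs on 64-bit Arm typically report aarch64; I also accept arm64 as a defensive assumption.

```java
import java.util.Locale;

// Hypothetical illustration: detect a 64-bit Arm JVM via the standard os.arch property.
final class ArchCheckSketch {

    // "aarch64" is what OpenJDK builds typically report; "arm64" is accepted defensively.
    static boolean isArm64(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        return arch.equals("aarch64") || arch.equals("arm64");
    }

    public static void main(String[] args) {
        String arch = System.getProperty("os.arch");
        System.out.println("os.arch = " + arch);
        System.out.println("java.vm.name = " + System.getProperty("java.vm.name"));
        System.out.println("running on 64-bit Arm: " + isArm64(arch));
    }
}
```

Running it inside the same container image you deploy is a quick way to confirm that your CI pipeline really produced an Arm build rather than an emulated x86 one.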
Remotely Recording a JFR Log from a Container (Without Using the Command Line)
And now something much more focused on one particular goal.
As many of you probably know, Java Flight Recorder (JFR) is a tool for recording and analyzing metrics from both the JVM and the system. JFR allows recording detailed logs that provide information about application performance, JVM health, and system stability. Unfortunately, obtaining JFR logs typically requires executing several commands on the command line, which can be problematic for systems running in containers where direct terminal access may be limited.
Therefore, the tutorial Remotely Recording a JFR Log from a Container (Without Using the Command Line) by Matt Van Order explains how to remotely record JFR logs from a JVM running in a container by configuring a JMX connector on the JVM using JDK Mission Control, without using the command line. This tutorial describes how to do this using Azul Mission Control, a distribution of JDK Mission Control from Azul. JDK Mission Control itself was released as open source by Oracle and is managed as a project under the aegis of OpenJDK.
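If you are curious what such a remote recording looks like without a GUI, the JDK exposes JFR over JMX through the FlightRecorderMXBean in the jdk.management.jfr module. The sketch below is my own rough programmatic counterpart, not something taken from Matt's tutorial. The class name RemoteJfrSketch, the host, and port 9010 are hypothetical, and the target JVM is assumed to have been started with the standard remote JMX properties shown in the comment.

```java
import java.io.OutputStream;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import jdk.management.jfr.FlightRecorderMXBean;

// Hypothetical illustration. Assumes the target JVM was started with something like:
//   -Dcom.sun.management.jmxremote.port=9010
//   -Dcom.sun.management.jmxremote.rmi.port=9010
//   -Dcom.sun.management.jmxremote.authenticate=false   (local experiments only)
//   -Dcom.sun.management.jmxremote.ssl=false            (local experiments only)
final class RemoteJfrSketch {

    // Standard JMX-over-RMI service URL for the given host and port.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    // Records for the given duration on the remote JVM and downloads the result to `target`.
    static void record(String host, int port, long millis, Path target) throws Exception {
        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            FlightRecorderMXBean jfr =
                    ManagementFactory.getPlatformMXBean(connection, FlightRecorderMXBean.class);

            long recordingId = jfr.newRecording();
            jfr.setPredefinedConfiguration(recordingId, "default");
            jfr.startRecording(recordingId);
            Thread.sleep(millis);
            jfr.stopRecording(recordingId);

            // Stream the recording back over JMX instead of writing inside the container.
            long streamId = jfr.openStream(recordingId, new HashMap<>());
            try (OutputStream out = Files.newOutputStream(target)) {
                byte[] chunk;
                while ((chunk = jfr.readStream(streamId)) != null) {
                    out.write(chunk);
                }
            } finally {
                jfr.closeStream(streamId);
            }
            jfr.closeRecording(recordingId);
        }
    }

    public static void main(String[] args) throws Exception {
        record("localhost", 9010, 10_000, Path.of("remote.jfr"));
    }
}
```

The resulting file can then be opened in JDK Mission Control like any other recording.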
SQL Query Optimization: How to Identify and Optimize Slow SQL Queries
And finally, since it's not all about Java in life... I have an interesting text on SQL optimization.
If things are slow, the first suspect is usually (rightly!) the database. However, even though the culprit is known, identifying and optimizing slow SQL queries that actually impact the user is never straightforward. These difficulties arise from the large number of parallel queries, unusual data sets, and complex queries involving many joins, subqueries, and aggregations. As a result, developers often have to manually analyze query performance, which is time-consuming and complicated, and catching the culprit involves adding logs and waiting for a yeti. An additional challenge is the lack of monitoring tools that automatically detect problematic queries—once we had specialized DBAs with their magical toolboxes, today in the era of *Ops, our dear developer from the scrum team handles everything, but often superficially.
That's why I'm sharing the article SQL Query Optimization: How to Identify and Optimize Slow SQL Queries by Oleksandr Hrebeniuk, which discusses techniques for identifying and optimizing slow SQL queries in relational databases like PostgreSQL, MySQL, or Oracle. It is based on the monitoring tool he is developing called Digma, which automatically identifies problematic queries, allowing developers to quickly optimize code before deploying to production. While the optimization tips are not very deep, Digma looks interesting and might help someone catch problems with slow queries.
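If you want to get a feel for the "identify" half of the problem before reaching for a dedicated tool, the crudest possible approach is to time each query and remember the ones that cross a threshold. The sketch below is only that, a sketch: it is not how Digma works, and SlowQueryLogSketch, the 100 ms threshold, and the injected clock are my assumptions. The query itself is passed in as a Supplier, so in real code it would wrap your JDBC or repository call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical illustration: record queries whose execution time exceeds a threshold.
final class SlowQueryLogSketch {

    private final long thresholdNanos;
    private final LongSupplier nanoClock;
    private final List<String> slowQueries = new ArrayList<>();

    // The clock is injected so the behavior can be tested; production code would pass System::nanoTime.
    SlowQueryLogSketch(long thresholdMillis, LongSupplier nanoClock) {
        this.thresholdNanos = thresholdMillis * 1_000_000L;
        this.nanoClock = nanoClock;
    }

    // Runs the query, measures it, and notes it down if it was slower than the threshold.
    <T> T timed(String sql, Supplier<T> query) {
        long start = nanoClock.getAsLong();
        try {
            return query.get();
        } finally {
            long elapsed = nanoClock.getAsLong() - start;
            if (elapsed > thresholdNanos) {
                slowQueries.add(sql + " took " + (elapsed / 1_000_000L) + " ms");
            }
        }
    }

    List<String> slowQueries() {
        return List.copyOf(slowQueries);
    }

    public static void main(String[] args) {
        SlowQueryLogSketch log = new SlowQueryLogSketch(100, System::nanoTime);
        log.timed("SELECT 1", () -> 1);
        System.out.println(log.slowQueries());
    }
}
```

This catches only individually slow statements; patterns such as many fast queries in a loop need aggregation across a whole request, which is where proper observability tooling earns its keep.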
I hope you enjoyed this somewhat unconventional edition because I plan to continue such highly targeted issues once a month.