
Password Security: Why Hashing is Essential

Password security is an often underestimated but critical topic in software development. Databases containing millions of user logins are repeatedly compromised – and, shockingly often, it turns out that the passwords were stored in plain text. This gives attackers direct access to sensitive account data and opens the door to identity theft, account takeovers, and other attacks.

In this blog post, we discuss why passwords should always be stored hashed, the attack methods available, and how you can implement an initial secure implementation with Java in your application. We also examine the differences between PBKDF2, BCrypt, and Argon2 and explain best practices for handling passwords in software development.

Passwords and hashing

Passwords should never be stored in plain text; they should always be hashed to ensure the security of user data and prevent misuse. If a compromised database contains passwords in plain text, attackers have direct access to users’ sensitive credentials. This can have serious consequences, as many people use the same password for multiple services. Hashing passwords significantly reduces this risk because attackers only see the hash values, not the actual passwords.

A key advantage of hashing is that it is a one-way function: a hash value can be generated from a password, but inferring the original password from the hash is virtually impossible. This makes it extremely difficult to misuse credentials exposed in a data leak. This protection mechanism applies not only to external attacks but also to internal security risks: if passwords are stored in plain text, employees with access to the database could view and misuse them. Hashed passwords largely eliminate such insider threats.

In addition, compliance with legal requirements and security standards such as the General Data Protection Regulation (GDPR) or the Payment Card Industry Data Security Standard (PCI-DSS) requires the protection of passwords. Hashing passwords is important to meet these standards and avoid legal consequences.

What are brute force and rainbow table attacks on passwords?

Brute force attacks and rainbow table attacks are two methods attackers use to recover passwords or other secrets. A brute-force attack systematically tries every possible password combination until the right one is found, starting with the simplest combinations, such as “aaa” or “1234”, and working through every possible variant. Although brute-force attacks can theoretically always succeed, their effectiveness depends heavily on the complexity of the password and the available computing power. A short password can be cracked in a few seconds, while a long and complex password increases the effort significantly. Measures such as longer passwords, limiting the number of login attempts per unit of time (e.g. blocking after several failed attempts), and the use of hashing algorithms with many iterations make brute-force attacks significantly more difficult and time-consuming.

Rainbow table attacks, in contrast, use pre-built tables that map a large number of hash values to their associated plaintext passwords. Attackers compare the stored hash of a password with the hashes in the table to find the original password. This method is significantly faster than brute force because the passwords do not have to be re-hashed every time. However, rainbow tables only work if the hashing method does not use a salt value. A salt is a random value added to each password before hashing, so even identical passwords produce different hashes. Without salts, attackers could reuse the same table across many systems; with salts, this is no longer possible.

Using modern hashing algorithms such as PBKDF2, BCrypt, or Argon2 is crucial to protecting yourself from both attack methods. These algorithms combine salts and multiple iterations to significantly increase attacker effort. Additionally, long and complex passwords make brute-force attacks virtually impossible as the number of possible combinations increases exponentially. In summary, combining strong passwords, salts, and secure hashing algorithms effectively defends against brute force and rainbow table attacks.

And how do I do this now in my application?

To securely hash a password in Java, it is recommended to use a specialised hash function like PBKDF2, BCrypt, or Argon2. These algorithms are specifically designed to make attacks such as brute force or rainbow table attacks more difficult. One way to implement this with the Java standard library is to use PBKDF2.

First, a salt is generated: a random sequence of bytes created individually for each password to ensure that two identical passwords produce different hashes. The class SecureRandom can be used to create a 16-byte salt. Then, with the class PBEKeySpec, a key specification is built from the password, the salt, the desired number of iterations (e.g. 65,536), and the key length (e.g. 256 bits). With the help of SecretKeyFactory and the algorithm identifier “PBKDF2WithHmacSHA256”, the password hash is created. The resulting hash is finally encoded in Base64 to store it in a readable form.
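Put together, these steps might look like this (a sketch using the parameter values mentioned above; the classes and method names are the standard JDK API):

```java
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class Pbkdf2Example {

    public static String hashPassword(char[] password) throws Exception {
        // 16-byte random salt, unique per password
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);

        // 65,536 iterations, 256-bit key, HMAC-SHA256 as the underlying PRF
        PBEKeySpec spec = new PBEKeySpec(password, salt, 65_536, 256);
        SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
        byte[] hash = factory.generateSecret(spec).getEncoded();
        spec.clearPassword(); // wipe the password copy held by the spec

        // Base64-encode the hash for readable storage
        return Base64.getEncoder().encodeToString(hash);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hashPassword("correct horse battery staple".toCharArray()));
    }
}
```

Note that each call generates a fresh salt, so hashing the same password twice yields two different results.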

The hashed password is often stored along with the salt, separated by a colon (:). This makes later verification easier: the salt is extracted again and used for the hash calculation. In addition, there are external libraries such as Spring Security or Bouncy Castle, which make it easier to integrate algorithms like BCrypt or Argon2 and often offer even more security and flexibility.
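A sketch of this storage format and the matching verification step could look as follows (the constant-time comparison via MessageDigest.isEqual is an addition not spelled out in the text, but good practice):

```java
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class PasswordStorage {

    private static final int ITERATIONS = 65_536;
    private static final int KEY_LENGTH = 256;

    private static byte[] pbkdf2(char[] password, byte[] salt) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_LENGTH);
        return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                               .generateSecret(spec).getEncoded();
    }

    /** Returns "base64(salt):base64(hash)" for storage in the database. */
    public static String store(char[] password) throws Exception {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        byte[] hash = pbkdf2(password, salt);
        return Base64.getEncoder().encodeToString(salt) + ":"
             + Base64.getEncoder().encodeToString(hash);
    }

    /** Re-extracts the salt, recomputes the hash and compares in constant time. */
    public static boolean verify(char[] password, String stored) throws Exception {
        String[] parts = stored.split(":");
        byte[] salt = Base64.getDecoder().decode(parts[0]);
        byte[] expected = Base64.getDecoder().decode(parts[1]);
        return MessageDigest.isEqual(expected, pbkdf2(password, salt));
    }
}
```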

However, there are a few points here that you should take a closer look at.

How safe is PBKDF2?

PBKDF2 is a proven and secure password hashing algorithm, but its security depends heavily on the correct implementation and the parameters used. Various factors influence the security of PBKDF2. A key aspect is the number of iterations, which represents the work factor: the higher the number of iterations, the more computing power is required for hashing. Experts recommend at least 100,000 iterations, and even higher values are often chosen for modern applications to make brute-force attacks more difficult. Another crucial factor is the salt, a random and unique value that is regenerated for each password. In addition, PBKDF2 can be configured with a variable key length, for example, 256 bits, which increases security. Modern applications often use hash functions such as SHA-256 or SHA-512 as the underlying pseudo-random function.

PBKDF2 has several strengths. The algorithm is proven and has been used for years in security-critical applications such as WPA2 and encrypted storage systems. It is easy to implement, enjoys extensive support in many programming languages and libraries, and is based on a recognised standard defined in RFC 8018. Nevertheless, PBKDF2 also has weaknesses. The algorithm is not explicitly hardened against GPU- or ASIC-based attacks, allowing attackers with specialised hardware to carry out brute-force attacks efficiently. Compared to modern algorithms such as Argon2, PBKDF2 requires a higher number of iterations to achieve similar levels of security. In addition, the further development of PBKDF2 is progressing more slowly, which means it is considered less adapted to current threats than modern alternatives such as Argon2.

However, PBKDF2 can be used safely if a few key conditions are met: a unique salt per password, an iteration count of at least 100,000 (ideally more), and a modern hash function such as SHA-256. In addition, strong password guidelines, such as a sufficient minimum length, should complement these measures.

Argon2 is increasingly recommended as an alternative to PBKDF2. Argon2, the Password Hashing Competition (PHC) winner in 2015, is more modern and better adapted to current threats. Its memory intensity provides better protection against GPU and ASIC attacks and offers flexible configuration options regarding work factor, memory requirements and parallelism.

In summary, PBKDF2 is secure when implemented correctly but has weaknesses compared to specialised hardware. Argon2 is preferable for new applications because it is better suited to modern security requirements.

Using stronger hashing algorithms like Argon2

Argon2 is a modern algorithm for secure password hashing and was developed in 2015 as part of the Password Hashing Competition (PHC). It is considered one of the most secure approaches to storing passwords today and, as already mentioned, offers adequate protection against brute-force attacks as well as attacks using GPUs or specialised hardware such as ASICs. This is achieved because Argon2 is both memory- and compute-intensive, forcing attackers with parallel hardware to expend significant resources.

Argon2 exists in three variants, each optimised for different use cases. The first variant, Argon2i, uses data-independent memory access: it performs its memory-intensive operations independently of the input data and is therefore resistant to side-channel attacks such as timing attacks. This makes Argon2i well suited for applications where protection against such leaks is a top priority.

The second variant, Argon2d, is optimised for maximum attack resistance. Its memory access is data-dependent, which makes it particularly robust against GPU-based attacks. However, Argon2d is more vulnerable to side-channel attacks and is therefore less suitable where such attacks are a concern.

The third variant, Argon2id, offers a balanced combination of both approaches. It starts with a data-independent pass, like Argon2i, and then switches to data-dependent processing, which ensures attack resistance. This mix makes Argon2id the preferred choice for most use cases, as it combines side-channel protection and attack resilience.

One of Argon2’s key strengths is its customizability. The algorithm allows developers to configure three primary parameters: memory consumption, computational cost and parallelism. Memory consumption defines how much memory is used during the hashing operation and makes parallel hardware attacks expensive. Computational cost indicates the number of iterations the algorithm goes through, while parallelism determines the number of threads working simultaneously. By adjusting these parameters, the algorithm can be tailored to the specific requirements of an application or the available hardware.

Another advantage of Argon2 is its resistance to modern attacks. The combination of memory and computing-intensive processes makes it difficult for attackers to crack passwords using brute force or specialised hardware. This makes Argon2 ideal for use in safety-critical applications.

Argon2 is used in many modern cryptography and password management libraries. Developers can implement the algorithm in various programming languages, such as Java, Python, or C.

In summary, Argon2 is one of the most secure password-hashing algorithms available. The variant Argon2id is especially recommended as a standard for new projects because it combines protection against side-channel attacks with high attack resistance.

Application of Argon2 in Java

To use Argon2 in Java, you can rely on libraries like Jargon2 or Bouncy Castle, because the JDK does not support Argon2 natively. With Jargon2, Argon2 is very easy to integrate and use. To do this, first add the following Maven dependency:
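Jargon2 is published on Maven Central under the com.kosprov.jargon2 group; a dependency declaration could look like this (the version numbers are examples – check Maven Central for the current release):

```xml
<dependency>
    <groupId>com.kosprov.jargon2</groupId>
    <artifactId>jargon2-api</artifactId>
    <version>1.1.1</version>
</dependency>
<!-- a backend implementation is needed at runtime -->
<dependency>
    <groupId>com.kosprov.jargon2</groupId>
    <artifactId>jargon2-native-ri-backend</artifactId>
    <version>1.1.1</version>
    <scope>runtime</scope>
</dependency>
```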

The code to create a hash and verify a password might look like this:
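A sketch using Jargon2’s fluent builder API (the cost parameters are example values, not recommendations from the original text):

```java
import static com.kosprov.jargon2.api.Jargon2.*;

public class Jargon2Example {
    public static void main(String[] args) {
        byte[] password = "s3cret".getBytes();

        // Argon2id with example cost parameters: 64 MiB memory, 3 passes, 4 lanes
        Hasher hasher = jargon2Hasher()
                .type(Type.ARGON2id)
                .memoryCost(65536)   // in KiB
                .timeCost(3)
                .parallelism(4)
                .saltLength(16)
                .hashLength(32);

        // The encoded hash embeds type, parameters and salt
        String encodedHash = hasher.password(password).encodedHash();

        boolean matches = jargon2Verifier()
                .hash(encodedHash)
                .password(password)
                .verifyEncoded();

        System.out.println(matches);
    }
}
```

Because the encoded hash already contains the type, the parameters, and the salt, a single string is sufficient for storage and later verification.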

If you want to use a more comprehensive cryptography library, you can use Bouncy Castle. The corresponding Maven dependency is:
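For example (the version is an example – check Maven Central for the current release):

```xml
<dependency>
    <groupId>org.bouncycastle</groupId>
    <artifactId>bcprov-jdk18on</artifactId>
    <version>1.78</version>
</dependency>
```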

An example of using Argon2 with Bouncy Castle looks like this:
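A sketch based on Bouncy Castle’s low-level Argon2BytesGenerator API (the parameter values are examples):

```java
import java.security.SecureRandom;
import java.util.Base64;
import org.bouncycastle.crypto.generators.Argon2BytesGenerator;
import org.bouncycastle.crypto.params.Argon2Parameters;

public class BcArgon2Example {
    public static void main(String[] args) {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);

        // Argon2id, version 1.3, 64 MiB memory, 3 iterations, 4 lanes
        Argon2Parameters params = new Argon2Parameters.Builder(Argon2Parameters.ARGON2_id)
                .withVersion(Argon2Parameters.ARGON2_VERSION_13)
                .withSalt(salt)
                .withMemoryAsKB(65536)
                .withIterations(3)
                .withParallelism(4)
                .build();

        Argon2BytesGenerator generator = new Argon2BytesGenerator();
        generator.init(params);

        byte[] hash = new byte[32];
        generator.generateBytes("s3cret".toCharArray(), hash);

        System.out.println(Base64.getEncoder().encodeToString(hash));
    }
}
```

Unlike Jargon2’s encoded hash, this produces only the raw hash bytes, so the salt and parameters must be stored separately.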

Jargon2 was developed specifically for Argon2 and is therefore easier to use, while Bouncy Castle is the better choice for more complex cryptographic requirements. For the pure use of Argon2, Jargon2 is recommended in most cases.

Generating the salt value

So far, it has been mentioned repeatedly that producing a good salt value is important. But how can you do that?

We will examine the topic in detail in the next blog post. Unfortunately, at this point, it would go beyond the scope of this article. 

Here, we confine ourselves to a basic initial implementation. You will quickly be confronted with an implementation that could look like this.
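Since the listing itself is not shown here, the following sketch illustrates the kind of implementation meant – a fresh SecureRandom is created on every call:

```java
import java.security.SecureRandom;

public class NaiveSaltGenerator {
    // Problematic pattern: a new SecureRandom is instantiated per call
    public static byte[] generateSalt() {
        SecureRandom random = new SecureRandom();
        byte[] salt = new byte[16];
        random.nextBytes(salt);
        return salt;
    }
}
```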

However, a few comments are in order here.

Creating a salt with a newly instantiated SecureRandom in every method call is not fundamentally wrong, but it increases the chance that the same seeds are used at very short intervals or under certain circumstances. In practice, that is rarely a problem; still, it is good practice to use a single (static) SecureRandom instance per application (or per class).

Why?

  • SecureRandom gets its seed (among other things) from the operating system (e.g. /dev/urandom on Linux).
  • Each re-creation of a SecureRandom causes unnecessary system load and, theoretically, a minimally increased risk of repetitions.
  • With a single SecureRandom instance, the internal pseudo-random number generator simply keeps running, making duplicates very unlikely.

This way, (a) SecureRandom is not re-seeded on every call, and (b) the risk of randomly generating identical salts is reduced. Of course, statistically speaking, collisions can theoretically still occur, but the probability is negligible if the salt is long enough.
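Applied to the salt generator, this recommendation boils down to a single shared instance (the class name is illustrative):

```java
import java.security.SecureRandom;

public class SaltGenerator {
    // One SecureRandom per class: seeded once, then the internal
    // pseudo-random number generator is simply continued
    private static final SecureRandom RANDOM = new SecureRandom();

    public static byte[] generateSalt() {
        byte[] salt = new byte[16];
        RANDOM.nextBytes(salt);
        return salt;
    }
}
```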

However, this is only the beginning of the discussion; more on that in the second part.

What is the procedure for checking the username-password combination during the login process? 

A standardised procedure is followed to check the combination of user name and password during a login process, ensuring that the data is processed securely. First, the user enters their login details via a form transmitted over HTTPS. The server then looks up the username in the database to retrieve relevant information such as password hash and salt. Care should be taken not to reveal whether a user exists.

The stored password hash is then compared with a recalculated hash of the entered password using the stored salt. If the values match, the login is considered successful; otherwise, a generic error message is returned so as not to reveal additional information.
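Assuming the colon-separated salt:hash format described earlier, this check might be sketched like this (the user lookup is simulated with a map; helper names are illustrative):

```java
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Map;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class LoginCheck {

    static byte[] pbkdf2(char[] password, byte[] salt) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, 65_536, 256);
        return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                               .generateSecret(spec).getEncoded();
    }

    /**
     * Returns true only for a valid username/password pair. The same
     * generic result is produced for "unknown user" and "wrong password",
     * so no information about existing accounts leaks out.
     */
    public static boolean login(Map<String, String> users,
                                String username, char[] password) throws Exception {
        String stored = users.get(username);
        if (stored == null) {
            return false; // do not reveal that the user does not exist
        }
        String[] parts = stored.split(":"); // format: base64(salt):base64(hash)
        byte[] salt = Base64.getDecoder().decode(parts[0]);
        byte[] expected = Base64.getDecoder().decode(parts[1]);
        // constant-time comparison to avoid timing side channels
        return MessageDigest.isEqual(expected, pbkdf2(password, salt));
    }
}
```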

After successful verification, a session or JSON Web Token (JWT) is created to maintain the user’s authentication. The token does not contain any sensitive information, such as passwords. 

A few more words about strings in Java

If we imagine that a password is in a text field of a (web) application, we will receive this password as a string. Yes, some will say, but what could be bad about that?

To do this, we need to examine briefly how strings are handled in the JVM and where attack vectors can lie.

In Java, strings are immutable, which means that once created, a String object can no longer be changed. When a string is manipulated, for example through concatenation, a new string object is created in memory while the old one continues to exist until it is removed by the garbage collector (GC). This behaviour can be problematic when sensitive data, such as passwords or cryptographic keys, is stored in strings. Because strings cannot be overwritten directly, they may remain in memory longer than necessary and are potentially visible in a memory dump or readable by an attacker. Another problem is that the developer has no control over when the garbage collector removes the sensitive data. This can cause such data to remain in memory for an extended period and possibly become visible in logs or debugging tools.

So what can you do about it now?

A safer approach is to use char[] instead of String, because arrays are mutable and their memory can be explicitly overwritten. This helps minimise the retention time of sensitive data, especially in security-critical applications where memory dumps or debugging tools could provide access to memory.

The advantage of char[] lies in direct memory management: while the garbage collector decides on its own when to remove objects, a char[] array can be explicitly overwritten, destroying its contents immediately. This significantly reduces the risk of unauthorised access.
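A minimal sketch of the pattern – the array is overwritten as soon as the password has been used:

```java
import java.util.Arrays;

public class PasswordBuffer {
    public static void process(char[] password) {
        try {
            // ... hash the password, authenticate the user, etc. ...
        } finally {
            // overwrite the sensitive data as soon as it is no longer needed
            Arrays.fill(password, '\0');
        }
    }
}
```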

Beyond char[], additional security measures should be taken, such as encrypting sensitive data and using SecretKeySpec from the package javax.crypto.spec for cryptographic keys. This class allows cryptographic keys to be handled as byte arrays that can be overwritten after use, but I will write a separate blog post about that…

Conclusion

Secure storage of passwords is an essential part of every application to protect user data from misuse and unauthorised access. Plain text passwords pose a significant risk and should never be saved directly. Instead, modern hashing algorithms such as PBKDF2, BCrypt or Argon2 must be used to ensure security.

Brute force and rainbow table attacks illustrate the importance of salts and sufficiently complex hashing mechanisms. The risk of such attacks can be significantly reduced by using random salts and iterative hashing methods. Argon2id, in particular, offers high resistance to modern attack methods due to its storage and computing intensity and is recommended as the preferred solution.

In addition, passwords within the application should also be processed carefully. Using char[] instead of String to store sensitive data can help prevent unwanted memory leaks. It is also important not to store passwords unsecured in memory or in logs.

Secure password hashing procedures are a technical best practice and a legal requirement in many areas.

Short links, clear architecture – A URL shortener in Core Java

A URL shortener seems harmless – but if implemented incorrectly, it opens the door to phishing, enumeration, and data leakage. In this first part, I’ll explore the theoretical and security-relevant fundamentals of a URL shortener in Java – without any frameworks, but with a focus on entropy, collision tolerance, rate limiting, validity logic, and digital responsibility. The second part covers the complete implementation: modular, transparent, and as secure as possible.

1.1 Motivation and use cases

In an increasingly fragmented and mobile information world, URLs are not just technical addressing mechanisms; they are central building blocks of digital communication. Long and hard-to-remember URLs are a hindrance in social media, emails, or QR codes, as they are not only aesthetically unappealing but also prone to errors when manually entered. URL shorteners address this problem by generating compact representations that point to the original target address. In addition to improved readability, aspects such as statistical analysis, access control, and campaign tracking also play a key role.

Initially popularised by services like TinyURL or bit.ly, URL shorteners have now become integrated into many technical infrastructures – from marketing platforms and messaging systems to IoT applications, where storage and bandwidth restrictions play a significant role. A shortened representation of URLs is also a clear advantage in the context of QR codes or limited character sets (e.g., in SMS or NFC data sets).

A URL shortener is not a classic forwarding platform and is conceptually different from proxy systems, link resolvers, or load balancers. While the latter often operate at the transport or application layer (Layer 4 or Layer 7 in the OSI model) and optimise transparency, availability, or performance, a shortener primarily pursues the goal of simplifying the display and management of URLs. Nevertheless, there are overlaps, particularly in the analysis of access patterns and the configuration of redirect policies.

In this work, a minimalist URL shortener is designed and implemented. It deliberately avoids external frameworks to implement the central concepts in a comprehensible and transparent manner in Core Java. The choice of Java 24 enables the integration of modern language features, such as records, sealed types, and virtual threads, into a secure and robust architecture.

1.3 Objective of the paper

This paper serves a dual purpose: on the one hand, it aims to provide a deep technical understanding of the functionality and challenges associated with a URL shortener. On the other hand, it serves as a practical guide for implementing such a service using pure Java—that is, without Spring, Jakarta EE, or external libraries.

To this end, a comprehensive architecture will be developed, implemented, and continually enhanced with key aspects such as security, performance, and extensibility. The focus is deliberately on a system-level analysis of the processes to provide developers with a deeper understanding of the interaction between the network layer, coding strategies, and persistent storage. The goal is to develop a viable model that can be utilised in both educational contexts and as a basis for productive services.

2. Technical background

2.1 URI, URL and URN – conceptual basics

In everyday language, terms such as “URL” and “link” are often used synonymously, although in a technical sense, they describe different concepts. A URI (Uniform Resource Identifier) refers to any character string that can uniquely name or locate a resource. A URL (Uniform Resource Locator) is a special form of a URI that not only identifies a resource but also describes the access path, for example, through a protocol such as https, ftp, or mailto. A URN (Uniform Resource Name), on the other hand, names a resource persistently without referring to its physical address, such as urn:isbn:978-3-16-148410-0.

In the context of URL shorteners, we are exclusively concerned with URLs, i.e. accessible paths, typically via HTTP or HTTPS. The challenge is to transform these access paths in a way that preserves their semantics while reducing their representation.

2.2 Principles of address shortening

The core idea of ​​a URL shortener is to replace a long URL string with a shorter key that points to the original address via a mapping. This mapping is done either directly in a lookup store (e.g., hash map, database table) or indirectly via a computational method (e.g., a hash function with collision management).

The goal is to use the redundancy of long URLs to map their entropy to a significantly shorter string. This poses a trade-off between collision-freeness, brevity, and readability. Conventional methods are based on encoding unique keys in a Base62 alphabet ([0-9a-zA-Z]), which offers 62 states per character. Just six characters can represent over 56 billion unique URLs—sufficient for many productive applications.
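The figure is easy to verify: with 62 possible characters per position, a six-character code yields 62^6 combinations:

```java
import java.math.BigInteger;

public class CodeSpace {
    public static void main(String[] args) {
        // 62^6 = number of distinct six-character Base62 codes
        BigInteger combinations = BigInteger.valueOf(62).pow(6);
        System.out.println(combinations); // 56800235584, i.e. over 56 billion
    }
}
```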

The shortcode acts as the primary key for address resolution. It is crucial that it is stable, efficiently generated, and as challenging to guess as possible to prevent misuse (e.g., brute-force enumeration).

2.3 Entropy, collisions and permutation spaces

A key aspect of URL shortening is the question of how many different short addresses a system can actually generate. This consideration directly depends on the length of the generated shortcuts and their character set. Many URL shorteners use a so-called Base62 alphabet. This includes the ten digits from zero to nine, the 26 lowercase letters, and the 26 uppercase letters, for a total of 62 different characters.

For example, if you generate abbreviations with a fixed length of six characters, you get a combinatorial space in which over 56 billion different character strings are possible. Even with this relatively short number of characters, billions of unique URLs can be represented, which is more than sufficient for many real-world applications. For longer abbreviations, the address space grows exponentially.

But the sheer number of possible combinations is only one aspect. How these shortcuts are generated is equally important. If the generation is random, it is essential to ensure that no duplicate codes are created – so-called collisions. These can be managed either by checking for their existence beforehand or by deterministic methods such as hash functions. However, hash methods are not without risks, especially under heavy load: The more entries there are, the higher the probability that two different URLs will receive the same short code, especially if the hash function has not been optimised for this use case.

Another criterion is the distribution of the generated shortcuts. A uniform distribution in the address space is desirable because, on the one hand, it reduces the risk of collisions, and on the other hand, it increases the efficiency of storage and retrieval mechanisms – for example, in sharding for distributed systems or caching in high-traffic environments. Cryptographically secure random numbers or specially designed generators play a crucial role here.

Overall, it can be said that the choice of alphabet, the length of the abbreviations and the way they are generated are not just technical parameters, but fundamental design decisions that significantly influence the security, efficiency and scalability of a URL shortener.

3. Architecture of a URL shortener

The architecture of a URL shortener is surprisingly compact at its core, but by no means trivial. Although its basic function is simply to link a long URL with a short alias, numerous technical and conceptual decisions arise in the details. These include data storage, the structure of API access, concurrency behaviour, and security against misuse. This chapter explains the central components and their interaction, deliberately avoiding external frameworks. Instead, the focus is on a modular, transparent structure in pure Java.

At the heart of the system is a mapping table – typically in the form of a map or a persistent key-value database – that uniquely assigns each generated short code to its corresponding original URL. This structure forms the backbone of the shortener. Crucially, this mapping must be both efficiently readable and consistently modifiable, especially under load or when accessed concurrently by multiple clients.

A typical URL shortener consists of three logically separate units: an input endpoint for registering a new URL, a redirection endpoint for evaluating a short link, and a management unit that provides metadata such as expiration times or access counters. In a purely Java-based solution without frameworks, network access is provided via the JDK’s built-in com.sun.net.httpserver package (the same foundation on which the Simple Web Server introduced in Java 18 is built). This allows you to define REST-like endpoints with minimal overhead and to communicate via HttpExchange objects.

There are various options for storing mappings. In-memory structures, such as ConcurrentHashMap, offer maximum speed but are volatile and unsuitable for productive applications without a backup mechanism. Alternatively, file-based formats, relational databases, or object-oriented stores such as EclipseStore can be used. This paper will initially work with volatile storage to illustrate the basic logic. Persistence will be added modularly later.

Another key aspect concerns concurrency behaviour. Since URL shorteners are typically burdened by a large number of read accesses, for example, when calling short links, the architecture must be designed to allow concurrent access to the lookup table without locking conflicts. The same applies to the generation of new shortcuts, which must be atomic and collision-free. Java 24 introduces modern language tools, including virtual threads and structured concurrency, which can be utilised to manage server load in a more deterministic and scalable manner.

Last but not least, horizontal extensibility plays a role. A cleanly decoupled design allows the shortener to be easily transferred to distributed systems later. For example, the actual URL resolver can be operated as a stateless service, while data storage is outsourced to a shared backend. Caching strategies and load balancing can also be integrated much more easily in such a setup.

In summary, a URL shortener is much more than a simple string replacement. Its architecture must be efficient, robust, and extensible—properties that can be achieved through a modular structure in pure Java.

4. Implementation with Java 24

4.1 Project structure and module overview

The implementation of the URL shortener follows a modular structure that supports both clarity in the source code and testability, as well as extensibility. The project is structured as a Java module and leverages the capabilities of the Java Platform Module System (JPMS). The goal is to separate the core functionality—that is, the management of URL mappings—from the network layer and persistence. This keeps the business logic independent of specific storage or transport mechanisms.

At the centre is a module called shortener.core, which contains all domain-specific classes: for example, the ShortUrlMapping, the UrlEncoder, as well as the central UrlMappingStore interface with a simple in-memory implementation. A second module, shortener.http, builds on Java’s internal HTTP server; it implements the REST endpoints and utilises the core module’s components for the actual processing. Additional optional modules, such as those for persistence or analysis, can be added later.

To organise the code, a directory structure that clearly reflects the module and layer boundaries is recommended. Within the modules, a distinction should be made between api, impl, util and, if necessary, service.

4.2 URL Encoding: Hashing, Base62 and Alternatives

A central element of the shortener is the mechanism for generating short, unique codes. This implementation uses a hybrid method that generates a consecutive, atomic sequence number and converts it into a human-readable format using a Base62 encoder.

This choice has two advantages: First, it is deterministic and avoids collisions without the need for complex hash functions. Second, generated codes can be efficiently serialised and are easy to read, which is particularly relevant in marketing or print contexts. Alternatively, cryptographic hashes such as SHA-256 can be used when unpredictability and integrity protection are essential, for example, for signed links or zero-knowledge schemes.

The Base62 encoder is implemented as a pure utility class that encodes integer values into a character string, where the alphabet consists of digits and letters. Inverse decoding is also provided in case bidirectional analysis is required in the future.
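Such a utility class could be sketched as follows (the class name Base62 is illustrative):

```java
public final class Base62 {
    private static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    private Base62() {}

    /** Encodes a non-negative number into its Base62 representation. */
    public static String encode(long value) {
        if (value == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            sb.append(ALPHABET.charAt((int) (value % 62)));
            value /= 62;
        }
        return sb.reverse().toString();
    }

    /** Inverse operation: decodes a Base62 string back into a number. */
    public static long decode(String code) {
        long value = 0;
        for (char c : code.toCharArray()) {
            value = value * 62 + ALPHABET.indexOf(c);
        }
        return value;
    }
}
```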

4.3 Mapping Store: Interface, Implementation, Synchronisation

For managing URL mappings, a clearly defined interface called UrlMappingStore provides methods for inserting new mappings, resolving short links, and optionally managing metadata. The default implementation, InMemoryUrlMappingStore, is based on a ConcurrentHashMap and utilises AtomicLong for sequence number generation.

This simple architecture is completely thread-safe and allows parallel access without external synchronisation mechanisms. The implementation can be replaced at any time with a persistent variant, for example, based on flat file storage or through integration with an object-oriented storage system such as EclipseStore.
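A sketch of the interface and its in-memory implementation (method names are assumptions based on the description; the sequence-to-code step is simplified to base-36 here to keep the sketch self-contained):

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

interface UrlMappingStore {
    /** Stores the URL and returns the newly assigned short code. */
    String insert(String longUrl);

    /** Resolves a short code back to the original URL, if present. */
    Optional<String> resolve(String code);
}

class InMemoryUrlMappingStore implements UrlMappingStore {
    private final Map<String, String> mappings = new ConcurrentHashMap<>();
    private final AtomicLong sequence = new AtomicLong();

    @Override
    public String insert(String longUrl) {
        // atomic sequence number -> collision-free code
        // (Base62 in the full design; base-36 keeps this sketch dependency-free)
        String code = Long.toString(sequence.incrementAndGet(), 36);
        mappings.put(code, longUrl);
        return code;
    }

    @Override
    public Optional<String> resolve(String code) {
        return Optional.ofNullable(mappings.get(code));
    }
}
```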

This separation keeps the application core stable while treating storage as a replaceable detail—a classic example of the dependency inversion principle in the spirit of Clean Architecture.

4.4 REST API with pure Java (HTTP server, handler, routing)

The REST interface is implemented exclusively with the built-in tools of the JDK. Java provides the package com.sun.net.httpserver, which offers a minimalistic yet powerful HTTP server ideal for lean services. For the implementation of the API, a separate HttpHandler is defined that responds to specific routes, such as /shorten for POST requests and /{code} for forwarding.

The implementation is based on a clear separation between parsing, processing, and response generation. Incoming JSON messages are parsed manually or with the help of simple helper classes, without the need for external libraries. HTTP responses also follow a minimalist format, characterised by structured status codes, simple header management, and UTF-8-encoded bodies.

Routing is handled by a dispatcher class, which selects the appropriate handler based on the request path and HTTP method. Later extensions, such as CORS, OPTIONS handling, or versioning, are easily possible.
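Based on the description above, a minimal sketch using the JDK's com.sun.net.httpserver package could look like this. The hard-coded responses are placeholders; a real implementation would delegate to the mapping store:

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Minimal routing sketch on top of the JDK's built-in HTTP server.
public final class ShortenerServer {

    public static HttpServer create(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", ShortenerServer::dispatch);
        return server;
    }

    // Dispatcher: selects behaviour based on request path and HTTP method.
    private static void dispatch(HttpExchange exchange) throws IOException {
        String path = exchange.getRequestURI().getPath();
        String method = exchange.getRequestMethod();
        if ("POST".equals(method) && "/shorten".equals(path)) {
            respond(exchange, 201, "{\"code\":\"abc123\"}"); // placeholder response
        } else if ("GET".equals(method) && path.length() > 1) {
            // In the real service the code would be resolved via the mapping store.
            exchange.getResponseHeaders().add("Location", "https://example.org/");
            exchange.sendResponseHeaders(302, -1);
            exchange.close();
        } else {
            respond(exchange, 404, "not found");
        }
    }

    private static void respond(HttpExchange exchange, int status, String body)
            throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        exchange.sendResponseHeaders(status, bytes.length);
        try (OutputStream os = exchange.getResponseBody()) {
            os.write(bytes);
        }
    }

    public static void main(String[] args) throws IOException {
        create(8080).start();
    }
}
```

Passing port 0 to `create` lets the OS pick a free port, which is convenient for tests.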

4.5 Error handling, logging and monitoring

In a productive environment, robust error handling is essential. The implementation distinguishes between systematic errors (such as invalid inputs or missing short codes) and unexpected runtime errors (such as IO problems or race conditions). The former are reported with clear HTTP status codes, such as 400 (Bad Request) or 404 (Not Found). The latter lead to a generic 500 Internal Server Error, with the causes being logged internally.

For logging, the JDK’s own java.util.logging package is used. This allows for platform-independent logging and can be replaced with SLF4J-compatible systems if needed. Monitoring metrics such as access counts, response times, or error statistics can be made accessible via a separate endpoint or JMX.

5. Security aspects

5.1 Abuse opportunities and protection mechanisms

A URL shortener can easily be used to obscure content. Attackers deliberately exploit the shortening to redirect recipients to phishing sites, malware hosts, or dubious content without the target address being immediately visible. This can pose significant risks, especially for automated distributions via social networks, chatbots, or email campaigns.

An adequate protection mechanism consists of automatically validating all target addresses upon insertion, for example, through syntactical URL checks, DNS resolution, and optionally through a background query (head request or proxy scan) that ensures that the target page is accessible and non-suspicious. Such checks should be modular so that they can be activated or deactivated depending on the environment (e.g., offline operation). Additionally, logging should be performed every time a short link is accessed, making it easier to identify patterns of abuse.

5.2 Rate limiting and IP-based throttling

Another risk lies in excessive use of the service, be it through botnets, targeted enumeration, or simple DoS behaviour. A robust URL shortener should therefore have rate limiting that restricts requests within a given time slot. This can be global, IP-based, or per-user, depending on the context.

In a Java implementation without frameworks, this can be achieved, for example, via a ConcurrentHashMap that maintains a timestamp or counter buffer for each IP address. If a threshold is exceeded, the request is rejected with status code 429 Too Many Requests. This simple throttling can be supplemented with leaky bucket or token bucket algorithms if necessary to achieve a fairer distribution over time. For productive use, logging of critical threshold violations is also recommended.
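The IP-based throttling described above can be sketched as a simple fixed-window counter. The class and method names are illustrative assumptions; leaky bucket or token bucket variants would refine the distribution over time:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of fixed-window rate limiting per client IP.
public final class RateLimiter {

    private final int maxRequestsPerWindow;
    private final long windowMillis;
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    public RateLimiter(int maxRequestsPerWindow, long windowMillis) {
        this.maxRequestsPerWindow = maxRequestsPerWindow;
        this.windowMillis = windowMillis;
    }

    // Returns true if the request is allowed; false means "answer with 429".
    public boolean allow(String ip) {
        long now = System.currentTimeMillis();
        // Start a fresh window when the old one has elapsed; otherwise reuse it.
        Window w = windows.compute(ip, (key, old) ->
                (old == null || now - old.start >= windowMillis) ? new Window(now) : old);
        return w.counter.incrementAndGet() <= maxRequestsPerWindow;
    }

    private static final class Window {
        final long start;
        final AtomicInteger counter = new AtomicInteger();
        Window(long start) { this.start = start; }
    }
}
```

The HTTP handler would consult `allow(clientIp)` before dispatching and send 429 Too Many Requests when it returns false.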

5.3 Validity period and deletion concepts

Not every short link should remain valid forever. A configurable validity period is essential, especially for security-critical applications, such as temporary document sharing or one-time authentication. A URL shortener should therefore offer the option of defining expiration times for each mapping.

On a technical level, it is sufficient to assign an expiration date to each mapping, which is checked during the lookup. When accessing expired short links, either an error status, such as 410 Gone, is returned, or the user is redirected to a defined information page. Additionally, there should be periodic cleanup mechanisms that remove expired or unused entries from memory, such as through a time-controlled cleanup process or lazy deletion upon access.
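A sketch of such lazy expiry handling, assuming an Instant-based expiration timestamp per mapping (the class and method names are illustrative):

```java
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of expiry handling: each mapping carries an optional expiration
// timestamp that is checked on lookup (lazy deletion).
public final class ExpiringMappingStore {

    // A record keeps the entry compact; expiresAt == null means "never expires".
    record Mapping(String targetUrl, Instant expiresAt) {
        boolean isExpired(Instant now) {
            return expiresAt != null && now.isAfter(expiresAt);
        }
    }

    private final Map<String, Mapping> mappings = new ConcurrentHashMap<>();

    public void put(String code, String targetUrl, Instant expiresAt) {
        mappings.put(code, new Mapping(targetUrl, expiresAt));
    }

    // Returns empty for unknown AND expired codes; the HTTP layer translates
    // that into 404 or 410 as appropriate. Expired entries are removed lazily.
    public Optional<String> resolve(String code) {
        Mapping m = mappings.get(code);
        if (m == null) {
            return Optional.empty();
        }
        if (m.isExpired(Instant.now())) {
            mappings.remove(code, m); // lazy deletion on access
            return Optional.empty();
        }
        return Optional.of(m.targetUrl());
    }
}
```

A periodic sweep over `mappings` could complement the lazy deletion for entries that are never accessed again.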

5.4 Protection against enumeration and information leakage

An often overlooked attack vector is the systematic scanning of the abbreviation space – for example, by automated retrieval of everything from /aaaaaa to /zzzzzz. If a URL shortener delivers valid links without any protection mechanisms, potentially confidential information about the existence and use of links can be leaked.

Adequate protection consists of making the short codes themselves non-deterministic – for example, by using cryptographically generated, unpredictable tokens instead of continuous sequences. Additionally, access restrictions can be introduced, allowing only authenticated clients to access certain short links or excluding specific IP ranges. The targeted obfuscation of error responses – for example, by consistently issuing 404 Not Found even for blocked or expired codes – makes analysis more difficult for attackers.

A further risk arises when metadata such as creation time, number of accesses, or request origin is exposed unprotected via the API. Such information should only be accessible to authorised users or administrative interfaces and should never be part of the public API output.

6. Performance and optimisation

6.1 Access times and hash lookups

The most common operation in a URL shortener is resolving a shortcode into its corresponding original URL. Since this is a classic lookup operation, the choice of the underlying data structure is crucial. The standard implementation uses a ConcurrentHashMap, which in Java 24 is optimised with fine-grained locking. This offers nearly constant access times – even under high concurrency – and is therefore ideal for read-intensive workloads, such as those typical of a shortener.

The latency of such an operation is in the range of a few microseconds, provided the lookup table is stored in main memory and no additional network or IO layers are involved. However, if data storage is outsourced to persistent systems, such as a relational database or a disk-based key-value store, the access time increases accordingly. Therefore, it is recommended to cache frequently accessed entries – either directly in memory or via a dedicated cache layer.

Performance also plays a role in the creation of new abbreviations. This is where sequence number generation using AtomicLong is used, providing a thread-safe, low-contention solution for linear ID assignment. Combined with Base62 encoding, this creates a fast, predictable, and collision-free process.

6.2 Memory usage and garbage collection

Since a URL shortener must manage a growing number of entries over a longer period, it is worthwhile to examine its storage behaviour. The in-memory variant keeps all mappings in a ConcurrentHashMap. While this results in fast access times, it also means that all active mappings remain permanently in memory—unless cleanup is implemented. A simple mapping structure consisting of a shortcode, original URL, and an optional timestamp requires several hundred bytes per entry, depending on the JVM configuration and string length.

With several million entries, heap usage can reach several gigabytes. To improve efficiency, care should be taken to use objects sparingly. For example, common URL prefixes (e.g. https://) can be replaced with symbolic constants. Using records instead of classic POJOs also helps reduce object size and minimise GC load.

In the long term, it is recommended to introduce an active or passive cleanup mechanism, such as TTL-based eviction or access counters, to specifically remove rarely used entries. WeakReference or soft caching should be considered with caution, since the semantics of such structures do not always lead to expected behaviour in the server context.

6.3 Benchmarking: Local tests and load simulation

Systematic benchmarking is essential for objectively evaluating the performance of a URL shortener. At a local level, this can be achieved with simple Java benchmarks that measure sequence number generation, lookup time, and code distribution quality. Tools such as JMH (Java Microbenchmark Harness) can also be used. Although external tools are not used in this paper, a manual microbenchmarking approach using System.nanoTime and a targeted warm-up can provide valuable insights.

For more realistic tests, a load simulation with HTTP clients is suitable, for example, using simple JDK-based multi-thread scripts or tools such as curl. In particular, behaviour under high concurrent access load should be observed, both in terms of response times and resource consumption. Behaviour in the event of failed requests, rapid-fire access, or expired links should also be explicitly tested.

The goal of such benchmarks is not only to validate the maximum transaction rate, but also to verify stability under continuous load. A robust implementation should not only be high-performance but also deterministic in its response behaviour and resistant to out-of-memory errors. Optional profiling—for example, using JDK Flight Recorder—can reveal further optimisation potential.

7. Expansion options and variants

7.1 Custom aliases

A frequently expressed wish in practice is the ability to not only use automatically generated short links, but also to assign custom aliases – for example, for marketing campaigns, internal documents, or individual redirects. A custom alias, such as /travel2025, is much easier to remember than a random Base62 token and can be integrated explicitly into communication and branding.

Technically speaking, this expands the mapping store’s responsibility. Instead of only accepting numerically generated keys, the API must verify that a user-defined alias is syntactically valid, not already in use, and not reserved. A simple regex check, supplemented by a negative list for reserved terms (e.g. /admin, /api), is sufficient to get started. This alias must then be treated equally to the automatically generated codes when stored.
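The syntactic check and the negative list can be sketched as follows; the pattern and the set of reserved terms are assumptions for illustration:

```java
import java.util.Set;
import java.util.regex.Pattern;

// Sketch of the alias check: syntactic validation plus a negative list
// of reserved terms.
public final class AliasValidator {

    // Illustrative pattern: 3–32 characters, letters, digits, underscore, hyphen.
    private static final Pattern VALID = Pattern.compile("[a-zA-Z0-9_-]{3,32}");

    // Illustrative negative list of paths the service itself needs.
    private static final Set<String> RESERVED = Set.of("admin", "api", "stats", "qr");

    private AliasValidator() { }

    public static boolean isAcceptable(String alias) {
        return alias != null
                && VALID.matcher(alias).matches()
                && !RESERVED.contains(alias.toLowerCase());
    }
}
```

The uniqueness check ("not already in use") is then a simple lookup against the mapping store before the alias is accepted.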

This creates new failure modes, for example, when a user requests an alias that already exists. Such cases should be handled consistently with a 409 Conflict. The API can optionally suggest alternative names—a small convenience feature with a significant impact on the user experience (UX).

7.2 Access counting and analytics

A functional URL shortener is more than just a redirection tool—it’s also an analytics tool. Tracking how often, when, and from where a short link was accessed is particularly relevant in the context of campaigns, product pages, or documented distribution.

To implement this functionality, each successful resolution of a short link must be recorded as an event, either by simply incrementing a counter or by fully logging it with a timestamp, IP address, and user agent. For the in-memory variant, an additional AtomicLong per mapping or a metric structure aggregated via a map is sufficient. Alternatively, detailed access data can be persisted in a dedicated log file or an external analytics module.

The evaluation can be performed either synchronously via API endpoints (e.g., /stats/{alias}) or asynchronously via export formats such as JSON, CSV, or Prometheus metrics. Integration with existing logging systems (e.g. via java.util.logging or Logstash) is easily possible.

7.3 QR-Code-Integration

For physical media, such as posters, packaging, or invitations, displaying a short link as a QR code is a useful extension. Integrating QR code generation into the URL shortener enables the direct generation of a visually encoded image of the link from the API.

Since no external libraries are used, QR code generation can be performed using a compact Java-based algorithm, such as one based on bit matrix generation and SVG output. Alternatively, a Base64-encoded PNG file can be delivered via an endpoint URL such as /qr/{alias}. The underlying data structure remains unchanged – only the representation is extended.

This feature not only enhances practical utility but also expands the service’s reach across multiple media channels.

7.4 Integration into messaging or tracking systems

In production architectures, a URL shortener rarely operates in isolation. Instead, it is part of larger pipelines – for example, in email delivery, chatbots, content management systems, or user interaction tracking. Flexible integration with messaging systems such as Kafka, RabbitMQ, or simple webhooks allows every link creation or access to be transmitted as an event to external systems.

In a pure Java environment, this can be done via simple HTTP requests, log files, or asynchronous event queues. Scenarios are conceivable in which a notification is automatically sent to a third-party system for each new short link, for example, to generate personalised campaigns or for auditing purposes. Access to short links can also be mapped via events, which are subsequently statistically evaluated or visualised in dashboards.

Depending on the level of integration, it is recommended to implement a dedicated event dispatcher that encapsulates incoming or outgoing events and forwards them in a loosely coupled manner. This keeps the shortener itself lean and responsibilities clearly distributed.

8. Legal and ethical aspects

8.1 Data protection and GDPR

A URL shortener that logs visits automatically operates within the scope of data protection law. As soon as data such as IP addresses, timestamps, or user agents are stored, it is considered personal information in the legal sense, at least potentially. In the European Union, such data falls under the General Data Protection Regulation (GDPR), which entails specific obligations for operators.

The technical capability for analytics—for example, through access counting or geo-IP analysis—should therefore not be enabled implicitly. Instead, a URL shortener should be designed so that tracking mechanisms must be explicitly enabled, ideally with clear labelling for the end user. A differentiated configuration that distinguishes between anonymised and personal data collection is strongly recommended in professional environments.

Additionally, when storing personal data, a record of processing activities must be maintained, a legal basis (e.g., legitimate interest or consent) must be specified, and a defined retention period must be established. For publicly accessible shorteners, this may mean that tracking remains deactivated by default or is controlled via consent mechanisms. The implementation of such control structures is not part of the core functionality, but is an integral part of data protection-compliant operations.

8.2 Responsibility for forwarding

Another key point is the service provider’s responsibility for the content to which the link is redirected. Even if a shortener technically only implements a redirect, legal responsibility arises as soon as the impression arises that the operator endorses or controls the target content. This is especially true for public or embedded shorteners, such as those found in corporate portals or social platforms.

The challenge lies in distinguishing between technical neutrality and de facto mediation. It is therefore advisable to integrate legal protection mechanisms into the architecture, for example, through a policy that excludes the upload of specific domains, regular URL revalidation, or the use of abuse detection systems. In the event of misuse or complaints, immediate deactivation of individual mappings should be possible, ideally via a separate administration interface.

This responsibility is not only legally relevant but also has a reputational impact: Shorteners used to spread harmful content quickly lose their credibility – and possibly also their access to platforms or search engines.

8.3 Transparency and disclosure of the destination address

A common criticism of URL shorteners is that the destination address is no longer visible to the user. This limits the ability to evaluate whether a link is trustworthy before clicking on it. From an ethical perspective, this raises the question of whether a shortener should offer a pre-check option.

Technically, this can be achieved through a special preview mode, for example by appending a character to the short link – a link like https://short.ly/abc123+ – or by explicitly calling an API or HTML preview page that transparently resolves the mapping. Instead of redirecting immediately, the user is first shown an information page that displays the original URL and forwards to the target page if desired. This function can be supplemented with information about validity, access statistics, or trustworthiness.

A transparent approach to redirects not only increases user acceptance but also reduces the potential for abuse, especially among security-conscious target groups. In sensitive environments, a mandatory preview page – for example, for all non-authenticated users – can be a helpful measure.

9. Conclusion and outlook

9.1 Lessons Learned

The development of a URL shortener in pure Java, without frameworks or external libraries, has demonstrated how even seemingly trivial web services, upon closer inspection, reveal themselves to be complex systems with diverse requirements. From the basic function of address shortening to security aspects and operational and legal implications, the result is a system that must be architecturally well-structured, yet flexible and extensible.

A clear separation of responsibilities proved particularly important: A stable mapping store, a deterministic encoder, a secure yet straightforward REST API, and understandable error handling form the backbone of a robust service. Modern language tools from Java 24, such as records, sealed types, and virtual threads, enable a remarkably compact, type-safe, and concurrency-capable implementation.

The conscious decision against frameworks not only maximised the learning effect but also contributed to a deeper understanding of HTTP, data storage, thread safety, and API design – a valuable perspective for developers who want to operate in a technology-independent environment.

9.2 Possible further developments (e.g. blockchain, DNSSEC)

Despite their apparent simplicity, URL shorteners represent a fascinating field for technological innovation. There are efforts to move away from centralised management of the mapping between short code and target URL, instead using decentralised technologies such as blockchain. In this case, each link is stored as a transaction, providing resistance to manipulation and historical traceability. In practice, however, this places high demands on latency and infrastructure, which is why such approaches have so far rarely been used in production.

Another development strand lies in integration with DNSSEC-based procedures. This not only signs the shortcode itself, but also cryptographically verifies the authenticity of the resolved host. This could combine trust and verification, especially in security-critical areas such as government services, banks, or certificate authorities.

AI-supported heuristics, such as those for misuse detection or memory cleanup prioritisation, also offer potential. However, the integration of such mechanisms requires a data-efficient, explainable design that is compatible with applicable data protection regimes.

9.3 Importance of URL shorteners in the context of digital sovereignty

In today’s digital landscape, URL shorteners are more than just a convenience feature; they are a valuable tool. They influence the visibility, accessibility, and traceability of content. The question of whether and how a link is modified or redirected has a direct impact on information sovereignty and transparency, and thus on digital sovereignty.

Especially in the public sector, educational institutions, or organisations with strict compliance requirements, URL shorteners should not be operated as outsourced cloud services; instead, they should be developed in-house or at least integrated in a controlled manner. A self-hosted solution not only allows complete control over data flows and access histories but also protects against censorship-like outages or data-driven tracking by third parties.

This makes the URL shortener, as inconspicuous as its function may seem, a strategic component of a trustworthy IT infrastructure. It exemplifies the question: Who controls the path of information? In this respect, a custom shortener is not just a tool, but also a statement of identity.

The next part will be about the implementation itself.

Happy Coding

Creating a simple file upload/download application with Vaadin Flow

Vaadin Flow is a robust framework for building modern web applications in Java, where all UI logic is implemented on the server side. In this blog post, we’ll build a simple file management application step by step that allows users to upload files, save them on the server, and download them again when needed. It is also a great way to demonstrate how to build protection against CWE-22, CWE-377, and CWE-778.

In this example, we’re focusing exclusively on functionality rather than on graphic design. The latter has been intentionally kept very simple to focus on the technical aspects.

The source code for this article can be found at: https://github.com/Java-Publications/Blog—Secure-Coding-Practices—CWE-022–377–778—A-practical-Demo

Basic project structure

To begin, we will create a new Vaadin project. The easiest way to do this is via the project starter, which you can find at https://start.vaadin.com/ or by using an existing Maven template. The file structure of our project essentially looks like this:

Screenshot

The file `MainView.java` will be the application’s central entry point. Here, we will implement the user interface and the logic for file uploads and downloads.

Add file upload functionality

First, we will create a simple user interface that allows users to upload files. In `MainView.java`, our basic setup looks like this:
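A minimal sketch of such a view, assuming Vaadin Flow’s standard `Upload` component with a `MemoryBuffer` receiver (class and field names are illustrative), could look like this:

```java
import com.vaadin.flow.component.notification.Notification;
import com.vaadin.flow.component.orderedlayout.VerticalLayout;
import com.vaadin.flow.component.upload.Upload;
import com.vaadin.flow.component.upload.receivers.MemoryBuffer;
import com.vaadin.flow.router.Route;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Initial, deliberately simple version — the CWE protections are added later.
@Route("")
public class MainView extends VerticalLayout {

    private static final String UPLOAD_DIR = "uploads/";

    public MainView() {
        MemoryBuffer buffer = new MemoryBuffer();
        Upload upload = new Upload(buffer);

        upload.addSucceededListener(event -> {
            File dir = new File(UPLOAD_DIR);
            if (!dir.exists()) {
                dir.mkdirs(); // create the target directory on first upload
            }
            File target = new File(dir, event.getFileName());
            try (InputStream in = buffer.getInputStream();
                 FileOutputStream out = new FileOutputStream(target)) {
                in.transferTo(out);
                Notification.show("Uploaded: " + event.getFileName());
            } catch (IOException e) {
                Notification.show("Upload failed: " + e.getMessage());
            }
        });

        add(upload);
    }
}
```

Note that this first version still trusts the client-supplied file name; the CWE-22 section below tightens exactly this point.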

In this code, we use `MemoryBuffer` to temporarily hold the uploaded file and then write it to the `uploads/` directory. If the target directory doesn’t already exist, it will be created automatically. Using `MemoryBuffer` allows easy and secure handling of the file before it is written to the hard disk.

Add download functionality

To list the downloadable files, we’ll add a button that allows users to download any file from the directory. This improves the user experience and ensures users can access their uploaded files easily. Here we’ll extend the user interface:
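A sketch of this extension, assuming a `VerticalLayout` as the file list container and Vaadin’s `StreamResource` for serving the file content (the field name is an assumption, the method name `updateFileList()` matches the text):

```java
import com.vaadin.flow.component.html.Anchor;
import com.vaadin.flow.component.orderedlayout.VerticalLayout;
import com.vaadin.flow.server.StreamResource;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;

// Excerpt from MainView: listing the uploads/ directory as download links.
public class FileListSection extends VerticalLayout {

    private final VerticalLayout fileList = new VerticalLayout();

    public FileListSection() {
        add(fileList);
        updateFileList();
    }

    private void updateFileList() {
        fileList.removeAll();
        File[] files = new File("uploads/").listFiles();
        if (files == null) {
            return; // directory missing or not readable
        }
        for (File file : files) {
            // StreamResource lazily opens the file when the link is clicked.
            StreamResource resource = new StreamResource(file.getName(), () -> {
                try {
                    return new FileInputStream(file);
                } catch (FileNotFoundException e) {
                    throw new RuntimeException(e);
                }
            });
            Anchor link = new Anchor(resource, file.getName());
            link.getElement().setAttribute("download", true);
            fileList.add(link);
        }
    }
}
```

Calling `updateFileList()` again after each successful upload keeps the list in sync with the directory contents.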

The method `updateFileList()` displays the files stored in the `uploads/` directory as a list and creates an `Anchor` element for each file, which serves as a download link. This makes the interface more intuitive and allows users to manage uploaded files easily.

Best practice: using Java NIO

We can use Java NIO (New Input/Output) instead of traditional IO streams to improve efficiency and security when handling files. Java NIO provides non-blocking IO operations, enabling better performance and scalability. It also supports more flexible and secure file system operations.

We adapt our code to use Java NIO classes like `Files` and `Path` to save the uploaded files. Here’s an improved version of the upload code that uses Java NIO:
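A sketch of the NIO-based storage step, extracted into a framework-neutral helper so the Vaadin listener only has to hand over the buffer’s `InputStream` and the file name (the class and method names are assumptions):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// NIO-based replacement for the manual stream copying shown earlier.
public final class FileStorage {

    private static final Path UPLOAD_DIR = Paths.get("uploads");

    private FileStorage() { }

    public static Path save(InputStream content, String fileName) throws IOException {
        // createDirectories is a no-op if the directory already exists.
        Files.createDirectories(UPLOAD_DIR);
        Path target = UPLOAD_DIR.resolve(fileName);
        // Files.copy streams the content directly to the target path.
        Files.copy(content, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }
}
```

In the upload listener, the body of the succeeded handler then shrinks to a single `FileStorage.save(buffer.getInputStream(), event.getFileName())` call.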

This version uses the `Files` and `Paths` classes to create directories and store files. This improves code readability and maintainability and leverages the advantages of the NIO classes, such as better exception handling and more flexible path operations. Using Java NIO makes file operations more efficient and secure, especially when working with multiple threads concurrently.

CWE-22: Path Traversal and Protection Measures

CWE-22, also known as “Path Traversal,” is a vulnerability that occurs when users can access unauthorised files and directories in the file system via insecure path names. This is typically done by users inserting special character strings such as `../` into filenames to extend beyond the boundaries of the permitted directory. If this is not controlled correctly, an attacker could gain access to critical system files, potentially leading to a serious security compromise.

In our file management application, path traversal is risky if the filename is used directly and without verification to save or retrieve a file in the file system. Attackers could attempt to manipulate path information and overwrite or read files outside the intended directory.

To avoid vulnerabilities like CWE-22 (Path Traversal) in your application, you should be especially careful with user-supplied filenames. Attackers could attempt to access files outside their intended location using manipulated path names—for example, by entering something like `../../etc/passwd`. Therefore, checking and cleaning all paths and file names is essential before using them. Here are a few best practices you should implement:

  • Path cleanup: Use methods like `Path.normalize()` from the Java NIO package to clean paths. This automatically removes dangerous constructs such as double slashes or relative elements (`..`) that attackers could use to traverse directories.
  • Directory restriction: Make sure that the path you use to save files is within a secure, predefined home directory – e.g. `uploads/`. You can verify this by comparing the cleaned destination path with the expected base directory. You should allow the upload only if the destination path is truly within the allowed range.
  • File name validation: It’s not enough to check the path alone. The filename itself can also contain dangerous characters. It’s best to allow only simple, non-critical characters, such as letters, numbers, hyphens, and periods. Using a regular expression like `[a-zA-Z0-9._-]`, you can specifically remove or replace everything else.

Considering these points makes it significantly more difficult for attackers to manipulate paths, and helps protect your server structure and your users’ sensitive data.

Here is a customised version of our application that implements protections against CWE-22:
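A sketch of these protections, here extracted into a small helper class for clarity (the class name `PathGuard` is an assumption; `sanitizeFileName()` matches the method name used in the text):

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

// CWE-22 checks: file-name sanitisation plus a normalised path comparison
// against the allowed base directory.
public final class PathGuard {

    private static final Path BASE_DIR =
            Paths.get("uploads").toAbsolutePath().normalize();

    private PathGuard() { }

    // Keep only harmless characters; everything else becomes '_'.
    public static String sanitizeFileName(String fileName) {
        return fileName.replaceAll("[^a-zA-Z0-9._-]", "_");
    }

    // Resolves the cleaned name and verifies the result stays inside BASE_DIR.
    public static Path resolveSafely(String fileName) throws IOException {
        Path target = BASE_DIR.resolve(sanitizeFileName(fileName)).normalize();
        if (!target.startsWith(BASE_DIR)) {
            throw new IOException("Path traversal attempt rejected: " + fileName);
        }
        return target;
    }
}
```

The upload listener calls `resolveSafely(event.getFileName())` and aborts the upload with an error notification when the exception is thrown.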

In this updated version of the application, we’ve added a `sanitizeFileName()` method to ensure the filename doesn’t contain any dangerous characters. We also normalise the path with `Path.normalize()` and verify that the final path is within the desired `uploads` directory. If the path is outside this directory, the upload is aborted and an appropriate error message is displayed.

These changes ensure that attackers cannot gain unauthorised access to the file system by manipulating file names. This keeps the application secure and protects both the server and the data stored on it from misuse.

CWE-377: Insecure temporary files

CWE-377, also known as “Insecure Temporary File,” describes a security vulnerability that occurs when you create temporary files in a way that can be exploited by attackers. Such files are often used to cache content, either during processing or for temporary storage in general. However, if you create them insecurely, attackers could access, tamper with, or even overwrite them.

A typical attack scenario: An attacker creates a symbolic link (symlink) pointing to a critical system file – your application then unknowingly writes data there. Or they manipulate the file while it’s still being processed, thereby introducing unwanted content or malicious code into the system. The consequences can range from data loss and integrity violations to a complete system compromise.

Temporary files are also created in your file management application—for example, when processing uploads. Therefore, you must create these files securely. Here are a few best practices to effectively avoid CWE-377:

  • Safe methods for creating temporary files: In Java, use the Files.createTempFile() method. It automatically creates a temporary file with a unique, random name, reducing the risk of attackers accessing it.
  • Restrict access rights: Ensure that only your process can access the temporary file. Set the file permissions so that no other users or services have access—this is especially important in shared environments.
  • Use unpredictable file names: Avoid using fixed or easily guessed names for temporary files. Otherwise, attackers could create a file with the same name beforehand or overwrite existing ones.

By considering these points, you can handle temporary files securely and protect your application from an often underestimated attack vector.
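A sketch of this flow using `Files.createTempFile()`, assuming the upload content arrives as an `InputStream` (restricting file permissions via `PosixFilePermissions` is platform-dependent and omitted here):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Secure temporary-file flow: write to a uniquely named temp file first,
// then move it to its final location only after the content is complete.
public final class SafeTempUpload {

    private SafeTempUpload() { }

    public static Path store(InputStream content, Path finalTarget) throws IOException {
        // createTempFile picks an unpredictable name in the system temp directory.
        Path temp = Files.createTempFile("upload-", ".tmp");
        try {
            Files.copy(content, temp, StandardCopyOption.REPLACE_EXISTING);
            Path parent = finalTarget.toAbsolutePath().getParent();
            if (parent != null) {
                Files.createDirectories(parent);
            }
            // Move only after the content has been written completely.
            return Files.move(temp, finalTarget, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(temp); // clean up if the move did not happen
        }
    }
}
```

Because the move happens as the last step, half-written files never appear under their final name in the `uploads/` directory.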

In this revised version, you create a secure temporary file with Files.createTempFile(). You use this to temporarily store the uploaded content before moving it to the final destination directory, ensuring that third parties cannot tamper with the temporary file.

Only after the content has been successfully saved do you move the file to the correct location in the file system. This way, you can upload the files in a controlled and secure manner and minimise the risk of temporary files becoming a gateway for attackers.

CWE-778: Insufficient logging

CWE-778, also known as “Insufficient Logging,” describes a security vulnerability that occurs when your application doesn’t log enough to detect and track security-relevant events. If you don’t maintain detailed logging, attempted attacks, unauthorised access, or system errors may go unnoticed for a long time, or you may not have enough information to respond appropriately in an emergency.

Especially in safety-critical applications, you should document all critical actions and errors so that you can later understand clearly what happened. This is the only way to allow yourself or your team to respond quickly to incidents and learn from them.

If you log too little or nothing at all, you may miss signs of attacks, such as suspicious file names (path traversal), repeated failed uploads, or unauthorised access. To prevent this, you should keep a few basic things in mind:

  • Log security-relevant events: Log all critical actions, such as file uploads, file accesses, path checks, or rejected requests, along with timestamps and context.
  • Record errors and exceptions: Make sure that all errors and exceptions produce a clear message and that the exact cause is recorded in the log. This allows you to search for specific sources of errors later.
  • Use a central logging solution: Use proven logging frameworks like SLF4J in combination with Logback or Log4j2. This ensures that all log messages are recorded consistently, structured, and configurable for your environment.

Below is a customised version of the application, which allows you to insert log messages at security-critical points. This gives you essential insights during operation and will enable you to react quickly to problems.
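An excerpt of what such instrumentation could look like, assuming SLF4J with a Logback binding on the classpath; the handler class and its parameters are illustrative:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Excerpt of the upload handling with SLF4J log statements at the
// security-critical points discussed above.
public class SecureUploadHandler {

    private static final Logger log = LoggerFactory.getLogger(SecureUploadHandler.class);

    public void handleUpload(String fileName, String clientIp) {
        String cleaned = fileName.replaceAll("[^a-zA-Z0-9._-]", "_");
        if (!cleaned.equals(fileName)) {
            // Possible CWE-22 attempt: log original name and origin for correlation.
            log.warn("Suspicious file name '{}' from {} sanitised to '{}'",
                    fileName, clientIp, cleaned);
        }
        try {
            // ... store the file (see the NIO-based save shown earlier) ...
            log.info("Upload completed: file='{}', client={}", cleaned, clientIp);
        } catch (RuntimeException e) {
            // Full stack trace plus context for efficient error diagnosis.
            log.error("Upload failed: file='{}', client={}", cleaned, clientIp, e);
            throw e;
        }
    }
}
```

Passing the exception as the last argument to `log.error` makes SLF4J append the full stack trace to the message.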

The revised implementation uses the SLF4J (Simple Logging Facade for Java) logging API in combination with a specific implementation such as Logback. This separation of API and logging engine offers flexibility in choosing the underlying technology and facilitates integration into different runtime environments. The goal is to systematically record security-relevant events to ensure both transparency and traceability of security-critical actions during operation.

In the context of security-conscious web applications—especially those that work with user-generated content such as file uploads—comprehensive and structured logging is essential. You should explicitly record the following event types:

  • Path traversal tests (CWE-22): If a user attempts to access paths outside the intended memory area through manipulated input, you should report this with a WARN or higher log level. The logged information should include both the compromised path and the associated session or IP address to enable later correlation with other events.
  • Successful file uploads: Every completed upload process should be documented in the log, including the cleaned-up file name, the target path in the file system, and optional context information such as the user ID or upload time. This allows for complete traceability and also serves as an audit trail.
  • Upload error: If an exception occurs while saving the file (e.g., due to file system errors, access violations, or corrupted streams), you should log the full stack trace and all relevant metadata (file name, user context, time). This is essential for efficient error diagnosis and allows you to distinguish potential attack attempts from legitimate error cases.
  • Dynamic update of the file view: Logging can also be helpful for internal system actions, such as refreshing the list of available files to track when a file becomes visible or whether problems occurred during processing (e.g., access errors in locked directories).

Implementing these measures consistently allows you to detect and evaluate security-relevant incidents promptly during ongoing operations. Furthermore, a transparent logging strategy ensures that administrators and incident response teams have a solid data basis to rely on during an investigation. This strengthens your application’s ability to respond to attacks and fulfils key requirements for traceability, auditing, and compliance in security-critical IT systems.

Summary

We developed a functional file management application in Vaadin Flow in just a few steps. Users can upload files securely stored in the server directory and download them again. We used Java NIO to make file handling more efficient and secure. Additionally, we implemented safeguards against CWE-22 (Path Traversal), CWE-377 (Insecure Temporary File), and CWE-778 (Insufficient Logging) to ensure that attackers cannot gain unauthorised access to the file system, that temporary files are created securely, and that security-relevant events are comprehensively logged. This example demonstrates how easy it is to create a powerful user interface and implement server-side logic in Java with Vaadin Flow.

The next steps could be to extend the application, for example, by adding user permissions, customising the file overview, or integrating a search function for uploaded files. Vaadin offers numerous visual components to further improve the usability of your applications, but it is up to the developer to secure these features. Furthermore, we could extend the application with features such as drag-and-drop for file uploads, user role management, or a connection to a database for indexing files and storing metadata. However, with each additional feature, there are also additional attack vectors. It’s always important to find the right balance.

By enhancing security measures such as implementing robust file validation, sufficient logging mechanisms, and using Java NIO, we ensure that the application remains secure and efficient for both the developer and end users.

We’ll explore further aspects in the following parts, so stay tuned.

Happy Coding

Sven

DNS Attacks – Explained

1. Getting started – trust in everyday internet life

Anyone who enters a web address like “www.example.de” into the browser expects a familiar website to appear within seconds. Whether in the home office, at the university, or in the data center, access to online services is now a given. The underlying technical processes are invisible to most users; even in IT practice, they are often taken for granted. One of these invisible processes is name resolution by the Domain Name System (DNS).

DNS is the phone book of the Internet. It ensures that human-readable domain names are translated into numeric IP addresses so that computers worldwide can communicate with each other. Without DNS, there would be no URLs, emails, or APIs. Nevertheless, DNS has had a shadowy existence in the security discourse for a long time. It is not a purely technical detail but rather a central link in the trust relationship between the user, application, and infrastructure.

Because DNS works in the background, it is a popular target for attackers. DNS attacks aim to disrupt, manipulate, or redirect this process for your own purposes, with potentially serious consequences. Anyone who relies on a domain name always leading to the correct IP address risks falling into a well-disguised trap. The affected applications appear to continue to work perfectly—they just communicate with the wrong counterpart.

This text highlights the diverse forms of attacks on DNS, how they work, and their implications for the operation and development of secure systems. Particular attention is paid to the question of how DNS must be considered from the perspective of modern applications, especially in Java, so as not to become the Achilles heel of an otherwise solid security architecture.

2. The Domain Name System – backbone of the digital world

The Domain Name System has become integral to today’s Internet communication. It was developed in the early 1980s to replace the increasingly unmanageable flat lists of hostnames with a scalable, hierarchical structure. Instead of remembering IP addresses, people can now use memorable domain names and rely on them pointing to the correct servers.

The basic job of DNS is to translate a string like “www.uni-heidelberg.de” into an IP address like “129.206.100.168”. This process occurs in several steps and is handled by different types of servers. The process usually begins with the so-called resolver – a system service or external service provider that accepts requests from end devices. This resolver in turn queries root name servers, top-level domain servers (e.g. for .de, .com, .org) and finally authoritative name servers until it finds the IP address it is looking for.

The concept of caching is essential to DNS’s efficiency. Each resolver stores already resolved queries for a defined period of time to speed up re-queries. Operating systems and applications can also maintain their own DNS caches. This behavior contributes significantly to reducing global DNS traffic, but it is also one of the biggest weak points, as later chapters will show.
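The JVM participates in this caching as well: InetAddress results are held according to the standard security properties networkaddress.cache.ttl and networkaddress.cache.negative.ttl. A minimal sketch of bounding that cache lifetime (the values are illustrative; the properties must be set before the first lookup is performed):

```java
import java.security.Security;

public class JvmDnsCacheConfig {
    public static void main(String[] args) {
        // Cache successful lookups for at most 30 seconds,
        // so stale or poisoned entries expire quickly.
        Security.setProperty("networkaddress.cache.ttl", "30");
        // Do not cache failed lookups at all.
        Security.setProperty("networkaddress.cache.negative.ttl", "0");

        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

A shorter TTL trades a few extra lookups for a smaller window in which a bad cache entry keeps misdirecting the application.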

DNS is based on the User Datagram Protocol (UDP) and is stateless and unencrypted by design. Although extensions such as DNSSEC or DNS-over-HTTPS exist, they are far from being used widely in everyday life. This means that DNS remains an open system, efficient but vulnerable. The assumption that DNS “just works” quickly becomes a dangerous illusion from a security perspective.

3. Invisible Attack Surface – Why DNS is vulnerable

The architecture of the Domain Name System follows the paradigm of openness. This means that requests are standardised, unencrypted, and mostly stateless, which is precisely why they are particularly high-performance. But the very properties that make DNS a robust part of the Internet infrastructure also pose serious risks. The vulnerability of the DNS is not an accidental by-product of modern forms of attack but is rooted in the original system design.

DNS was designed when the Internet was a trusted space characterised by academic collaboration and open standards. Security mechanisms were not part of the design goal at the time. Accordingly, DNS still lacks built-in integrity, authenticity, and confidentiality. Queries are not verified, answers are not signed, and the source is not checked. This creates attack vectors that are still being exploited today by actors of all stripes.

Additionally, DNS requests are typically transmitted over UDP, which makes them easy to forge or intercept. Transaction IDs and ports offer minimal protection against spoofing, but these hurdles are technically easy to overcome with today’s means. Even with TCP-based fallbacks, absolute protection is lacking without additional measures such as DNSSEC.

A particular vulnerability lies in the caching behavior of resolvers and applications. Once saved, answers are considered valid for a certain period, regardless of whether they were correct or manipulated. Attackers can specifically exploit this behavior to inject false answers into the cache and thus redirect many subsequent requests. This so-called cache poisoning is one of the oldest but still most effective attacks on the DNS.

DNS is therefore not a secure anchor of trust, at least not without additional measures. Anyone using DNS in a security-critical architecture, such as modern Java applications, cloud APIs or container infrastructures, should know the inherent weaknesses. DNS is robust, but it can also be manipulated. This makes it an attractive target – and an often underestimated attack surface.

4. DNS cache poisoning – The classic attack

Among the numerous attack methods on the domain name system, DNS cache poisoning—also known as DNS spoofing—is considered particularly perfidious. The aim of this attack is to manipulate the caching of DNS responses. The attacker injects a fake DNS response into a resolver’s cache, with the effect that all subsequent queries to a specific domain name point to a false IP address. The affected users usually do not notice the attack: the website looks as usual, except that the attacker controls it.

The process of such an attack is as technically interesting as it is security-relevant. The attacker does not wait for a resolver to resolve a specific domain but proactively tries to outsmart the resolver with fake answers. To do this, the attacker must guess the transaction ID of the DNS request – a short, 16-bit random number – and respond faster than the legitimate DNS server. If this succeeds, the resolver accepts the forged answer as valid and stores it in the cache.
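To see why a 16-bit transaction ID alone offers little protection, a back-of-the-envelope calculation (purely illustrative, not attack tooling) gives the probability that a blind spoofer hits the right ID within a burst of forged responses:

```java
public class SpoofOdds {
    // Probability that at least one of n forged responses carries the correct
    // 16-bit transaction ID, assuming uniformly random IDs (2^16 = 65536 values).
    static double hitProbability(int forgedResponses) {
        return 1.0 - Math.pow(1.0 - 1.0 / 65536.0, forgedResponses);
    }

    public static void main(String[] args) {
        System.out.printf("1 response:      %.6f%n", hitProbability(1));
        System.out.printf("10000 responses: %.3f%n", hitProbability(10_000));
    }
}
```

A flood of forged packets quickly makes a hit realistic, which is why modern resolvers add source-port randomisation on top of the transaction ID.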

What is particularly critical is that this manipulation persists: as long as the cache entry remains valid (TTL), every new request will return the same fake answer. In this way, individual users and entire organisations can be systematically redirected – to phishing sites, fake authentication services or malware sources. In Java applications, this can mean that a REST API call appears to succeed even though a completely different infrastructure is responding in the background.

Modern DNS implementations now rely on additional protection mechanisms such as random source ports, rate limiting, or DNSSEC. But many resolvers—especially on local networks or outdated systems—remain vulnerable. Even browsers or libraries with built-in DNS caching can become targets.

For developers, this means that the trust assumption that InetAddress.getByName() always returns the “correct” address is dangerous. Explicit protective measures are required when DNS is used in security-critical contexts, for example, through the use of cryptographically signed DNS responses (DNSSEC), deliberate cache invalidation, or explicit validation of the remote site, for example, via TLS certificates.
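One way to make that validation explicit is to compare every resolved address against a known-good set before using it. A sketch under the assumption that the endpoint's legitimate addresses are known in advance (the allowlist contents here are hypothetical):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;

public class ResolutionGuard {
    // Hypothetical allowlist of addresses the application expects for a critical endpoint.
    private static final Set<String> ALLOWED = Set.of("127.0.0.1", "0:0:0:0:0:0:0:1");

    // Resolves a host and rejects the result if it is not on the allowlist.
    public static InetAddress resolveChecked(String host) throws UnknownHostException {
        InetAddress address = InetAddress.getByName(host);
        if (!ALLOWED.contains(address.getHostAddress())) {
            throw new SecurityException(
                "Unexpected address for " + host + ": " + address.getHostAddress());
        }
        return address;
    }
}
```

Such a check does not replace DNSSEC or TLS validation, but it turns a silent redirection into a loud, loggable failure.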

DNS cache poisoning is a lesson in how a helpful performance optimization—caching—can turn into the exact opposite. Anyone who wants to operate or use DNS securely must understand how the system works and how easily it can be manipulated at its core.

5. Amplification and Deception – DNS as a vehicle for DDoS and tunneling

In addition to the targeted manipulation of DNS caches, there are other attack scenarios in which DNS serves as a technical vehicle for higher-level goals, such as denial-of-service attacks or secret data transmission. Two particularly relevant variants in this context are DNS amplification and DNS tunneling.

A DNS amplification attack is a form of reflected denial of service (DoS) that exploits the asymmetric structure of the DNS protocol. The attacker sends a small DNS query – often of type ANY – to an open resolver, but spoofs the source address to point to the victim. The resolver responds with a significantly larger amount of data than the request, which is sent directly to the victim. Multiplied across many resolvers, this creates a massive data stream aimed at overloading the victim’s servers. DNS amplification is so effective because the attacker can cause disproportionate damage with minimal resources.

Another gateway is DNS tunneling. This technique uses DNS packets to pass any other data through firewalls or security measures. Since DNS traffic is hardly restricted in many networks, the protocol is ideal for bypassing control authorities. The payload—such as commands to malware or stolen data—is hidden in seemingly legitimate DNS queries or responses. On the receiving end, this information is extracted and evaluated again. DNS tunneling is considered particularly insidious because it is difficult to detect using traditional security tools.

A simple Java example demonstrating how a DNS tunnel could be hidden in a seemingly legitimate HTTP communication can be implemented using Java SE’s on-board tools. In the following scenario, we implement a minimalist REST server with com.sun.net.httpserver.HttpServer, which receives DNS-like encoded data, representative of a possible C2 channel (Command & Control).

import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;

HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
server.createContext("/api/lookup", exchange -> {
    if ("GET".equals(exchange.getRequestMethod())) {
        URI requestURI = exchange.getRequestURI();
        String query = requestURI.getQuery();
        if (query != null && query.contains("q=")) {
            String domainPayload = query.split("q=")[1];
            String decoded = domainPayload.replace(".", " "); // simplified payload decoding
            System.out.println("[DEBUG] Received DNS tunnel payload: " + decoded);
        }
        byte[] response = "OK".getBytes();
        exchange.sendResponseHeaders(200, response.length);
        exchange.getResponseBody().write(response);
        exchange.close();
    }
});
server.setExecutor(null);
server.start();

In this simplified simulation, a DNS-like query parameter is received via a REST interface – e.g. http://localhost:8080/api/lookup?q=cmd.run.download. The server “interprets” the dot as a separator of an encoded DNS structure, as is common in DNS tunneling. In a real attack, this mechanism would be used by malware or scripts for bidirectional data exchange. What is critical here is that the transport takes place over seemingly unsuspicious channels – HTTP, DNS, ICMP – while the actual payload is hidden deep in protocol structures.

Both types of attacks have in common that DNS is not a direct target, but rather a tool – a transport system for superimposed attack targets. This results in the need for administrators and developers to monitor DNS traffic functionally and from a security perspective. In Java-based architectures, DNS dependencies should be consciously modeled, logged and – where possible – checked for validation and integrity. A simple forwarding of data via DNS may seem harmless, but it may be the start of a widespread attack.

6. DNS Hijacking – When attackers dictate the path

While DNS cache poisoning focuses on manipulating cached responses, DNS hijacking targets the source itself: the DNS resolver or the infrastructure that controls it. The attack begins when users resolve their domain names – either on the local device, on the home router, on corporate networks, or via an ISP’s resolver.

With DNS hijacking, DNS requests are systematically redirected – for example, through manipulation of network settings, compromised firmware, malicious DHCP servers or malware that specifically changes a system’s configuration. The resolver then no longer returns the IP addresses expected by the user but forwards requests to servers controlled by the attacker. In practice, this means: even if a user enters “www.bank.de” in their browser, they do not end up on the real online banking portal but on a phishing page that looks genuine.

This attack is hazardous because it is often difficult to detect. Unlike classic phishing, which relies on fake links or typos, DNS hijacking makes the actual URL appear completely legitimate. The certificate can also appear valid under certain circumstances—for example, with local man-in-the-middle proxies or if the attacker manages to obtain a certificate via compromised certificate authorities.

Java applications are also potentially affected, primarily if they communicate dynamically with external services. If the host machine’s DNS configuration is compromised, all DNS lookups within the JVM are affected, from REST clients to LDAP connections to email components. The application often does not recognize the problem because the connection is technically established correctly, just to the wrong destination.

The only solution is to take measures that secure DNS on several levels: through fixed resolver configurations, validation of the other side via TLS, use of secure DNS protocols such as DNS-over-HTTPS (DoH) or DNS-over-TLS (DoT), and by monitoring for suspicious name resolutions. In security-critical Java applications, it is recommended to compare the resolved IP addresses with allowlists or even to forego DNS queries if the target systems are known.

DNS hijacking is a remarkably sophisticated form of deception because it attacks trust where it seems most evident: in the integrity of the technical infrastructure itself.

7. Attack detected – but how?

Detecting DNS attacks is one of the most challenging tasks in network security. This is primarily because DNS is a widespread, dynamic and often invisible protocol. Name resolution occurs in the background, usually without direct user contact, and is often only incompletely recorded in logs and monitoring systems. Accordingly, many DNS-based attacks remain undetected for a long time, mainly when they are carried out precisely and subtly.

A central problem: Neither the browser nor the application registers whether a DNS response was legitimate, as long as the response is formally correct. A manipulated result usually does not differ from a real one, unless it is actively checked. This is where anomaly detection methods come into play. They try to identify unusual behavior, such as sudden changes in the IP assignment of known domains, an accumulation of DNS responses with short TTL or unusual destination addresses outside of usual networks.

Intrusion detection systems (IDS) and DNS-specific monitoring solutions such as Zeek, OpenDNS or SecurityOnion can help detect such anomalies. However, the prerequisite is that DNS traffic is completely recorded and evaluated, which brings additional complexity with encrypted DNS (DoH/DoT). In controlled networks, it is therefore recommended to route DNS via dedicated resolvers that can be logged and monitored.

In Java applications, suspicious name resolutions can be detected by logging IP mappings, comparing against expected hostnames, or active whitelisting. Metrics such as the number of different IP addresses per domain over time or the reuse of certain, unexpected IP ranges are also valuable information.
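Such a metric can be kept with very little code. The sketch below (class and method names are illustrative) simply counts the distinct addresses seen per domain, so a sudden jump can be flagged:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class DnsDriftMonitor {
    private final Map<String, Set<String>> seenAddresses = new ConcurrentHashMap<>();

    // Records a resolution result; returns true if this address is new for the
    // domain, which is the moment worth logging or alerting on.
    public boolean record(String domain, String ip) {
        return seenAddresses
                .computeIfAbsent(domain, d -> ConcurrentHashMap.newKeySet())
                .add(ip);
    }

    public int distinctAddresses(String domain) {
        return seenAddresses.getOrDefault(domain, Set.of()).size();
    }
}
```

Wired into the application's resolution path, a `true` return value is a natural trigger for a WARN-level log entry and, above a threshold, an alert.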

In the long term, comprehensive validation of DNS responses via DNSSEC is essential in ensuring integrity. However, DNSSEC is not retroactive – it only protects when domain owners and resolvers actively work together. Accordingly, awareness of DNS attacks and their detection is a crucial component of any security strategy: Anyone who only sees DNS as a technical infrastructure risks being compromised exactly where trust begins – with name resolution.

8. Protective measures – think and implement DNS securely

8.1 General Network Protection Measures

At the network level, the detection techniques described in section 7 double as protective measures. DNS traffic should be recorded completely and evaluated, ideally by routing it through dedicated resolvers that can be logged and monitored; this also keeps encrypted DNS (DoH/DoT) under control, since inspection otherwise becomes considerably harder. Intrusion detection systems (IDS) and DNS-specific monitoring solutions such as Zeek, OpenDNS or SecurityOnion then help to flag anomalies early, for example sudden changes in the IP assignment of known domains, clusters of responses with short TTLs, or unusual destination addresses outside the usual networks.

8.2 Security within Java applications

Within Java applications, these measures take the concrete form already outlined: log IP mappings, compare resolved addresses against expected hostnames, and maintain active allowlists. Metrics such as the number of distinct IP addresses per domain over time, or the reuse of unexpected IP ranges, provide valuable signals.

In the long term, validation of DNS responses via DNSSEC remains essential for integrity. However, DNSSEC is not retroactive – it only protects where domain owners and resolvers actively cooperate – so it complements rather than replaces the application-level checks above.

8.3 Advanced protection measures at protocol and application level

Various protection measures can be implemented for Java applications to minimize the risks of DNS-based attacks. A key approach is securing the name resolution process within the JVM. The class InetAddress, for example, uses resolvers close to the operating system, which makes the JVM fundamentally vulnerable to manipulation at the system level. In security-critical applications, it can, therefore, make sense to implement your own resolvers with DNSSEC validation or to use external, verified services.

Additionally, it is recommended to combine this with transport encryption and hostname validation at the application level. The HttpClient from the java.net.http module in Java 11+ enables, for example, direct control over TLS certificate checking. By securing connections via public key pinning, certificate transparency logs or targeted whitelists, manipulated DNS responses can be detected and blocked.

Another security option is to use alternative DNS protocols such as DoH (DNS over HTTPS) or DoT (DNS over TLS) via upstream proxies. These can also be integrated into containerized environments such as Kubernetes as sidecar resolvers. A locally running DNS proxy (e.g., CoreDNS with DNSSEC and DoH support) can be used as a trustworthy source within the application.

A practical example of a hardening measure in Java is using your own X509TrustManager so as not to rely solely on the system-provided chain of trust for the TLS connection. This is particularly useful if you want to counter DNS manipulations that do not break the TLS channel but smuggle in a false certificate chain – for example, through a compromised CA certificate or a manipulated trust store.

import java.net.http.HttpClient;
import java.security.SecureRandom;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

TrustManager[] trustManagers = new TrustManager[] {
    new X509TrustManager() {
        public X509Certificate[] getAcceptedIssuers() {
            return new X509Certificate[0];
        }

        public void checkClientTrusted(X509Certificate[] certs, String authType) {
            // application-specific checks, e.g. fingerprinting or pinning
        }

        public void checkServerTrusted(X509Certificate[] certs, String authType)
                throws CertificateException {
            for (X509Certificate cert : certs) {
                cert.checkValidity();
                if (!"CN=trusted.example.com".equals(cert.getSubjectDN().getName())) {
                    throw new CertificateException("Untrusted CN: " + cert.getSubjectDN());
                }
            }
        }
    }
};

SSLContext sslContext = SSLContext.getInstance("TLS");
sslContext.init(null, trustManagers, new SecureRandom());

HttpClient client = HttpClient.newBuilder()
    .sslContext(sslContext)
    .build();

This example deliberately checks the subject of the presented server certificate. This makes it possible to ensure that the remote site actually has the expected identity, even if a DNS hijacker redirects traffic to another target system. In a production system, a more thorough check would be necessary, for example via public key pinning, dedicated trust stores or full certificate chain validation.

Last but not least: DNS requests should be logged, checked for plausibility and ideally compared with an allowlist. This allows a simple but effective protective layer to be implemented, especially in applications where external communication is security-critical.

9. Conclusion – trust needs protection

Given the central role of the domain name system in virtually every form of modern network communication, it is remarkable how long DNS security played a secondary role in the architecture of many applications. The forms of attack highlighted here – from cache poisoning to DNS hijacking to tunneling and DDoS – clearly show that DNS is much more than a harmless auxiliary component. It is one of the most critical infrastructures of all, and simultaneously one of the most vulnerable.

Anyone who develops applications today that rely on Internet communication must think about DNS functionally and in terms of security. This applies to web portals, microservices, container environments, and IoT devices. In the Java ecosystem, in particular, the platform from version 11 provides numerous APIs and mechanisms to specifically secure DNS dependencies, be it through explicit TLS configurations, validation of target addresses, logging of suspicious resolutions, or the use of alternative resolvers.

DNS is not a static problem. The attack vectors continue to develop, and with them, the requirements for protective mechanisms. At a time when trust is becoming a currency of digital interactions, DNS can no longer remain a blind spot. It is time for DNS security to be established as an integral part of every security-conscious software and infrastructure architecture – in the code, in operations, and in all stakeholders’ minds.

Happy Coding

Sven

Java Cryptography Architecture (JCA) – An Overview

The Java Cryptography Architecture (JCA) is an essential framework within the Java platform that provides developers with a flexible and extensible interface for cryptographic operations. It is a central component of the Java Security API and enables platform-independent implementation of security-critical functions.

At its core, the JCA provides mechanisms for various cryptographic applications, including the calculation of hashes to ensure data integrity, the generation and verification of digital signatures, and methods for encrypting and decrypting sensitive information. Supporting both symmetrical and asymmetrical encryption methods, it ensures a high level of security when processing data. Cryptographic key management is another key aspect that includes secure storage and exchange of keys.
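For the first of these tasks, computing a hash to check data integrity, the MessageDigest class is the JCA entry point. A minimal sketch:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashDemo {
    // Computes the SHA-256 digest of the input and renders it as lowercase hex.
    static String sha256Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(data)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sha256Hex("abc".getBytes()));
    }
}
```

The algorithm name is just a string, so swapping in "SHA-512" or a provider-specific digest requires no code changes beyond that parameter.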

A significant feature of the JCA is the ability to integrate different security providers (so-called providers) that implement different cryptographic algorithms. In addition to the providers that come standard with Java, such as SunJCE, there are alternative implementations, including open source libraries such as Bouncy Castle and hardware-based solutions, that can be integrated via PKCS#11 interfaces.

The JCA’s architecture is characterised by a modular design that allows a separation between the API and the concrete implementations. New cryptographic algorithms can be added via the so-called Service Provider Interface (SPI) without having to adapt existing application code, contributing to the long-term maintainability and security of Java applications.

In addition to classic cryptography, the JCA also supports mechanisms for the secure generation of random numbers, the calculation of Message Authentication Codes (MACs) to ensure data integrity, and the management of certificates and trust stores. In particular, integration with the Java KeyStore API offers a comprehensive basis for the secure handling of X.509 certificates and cryptographic keys.
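The Mac class mentioned here follows the same provider pattern as MessageDigest. A short HMAC-SHA256 sketch; the key material is illustrative and would in practice come from a KeyStore or a key derivation function:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class MacDemo {
    // Computes an HMAC-SHA256 authentication tag over the message with the given key.
    static byte[] hmacSha256(byte[] key, byte[] message) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        return mac.doFinal(message);
    }

    public static void main(String[] args) throws Exception {
        byte[] tag = hmacSha256("demo-key".getBytes(), "message".getBytes());
        System.out.println(tag.length); // HMAC-SHA256 tags are 32 bytes
    }
}
```

Unlike a plain hash, the tag can only be verified by someone holding the key, which is what makes it an integrity and authenticity check in one.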

How is the JCA integrated into the Security API?

The Java Security API is a comprehensive framework for implementing security-related mechanisms in Java applications. It provides various functions that cover various aspects of IT security, including cryptography, authentication, authorization, network security, and cryptographic key management. A central component of this framework is the Java Cryptography Architecture (JCA).

While the JCA is primarily responsible for basic cryptographic functions, it is extended by the Java Cryptography Extension (JCE), which adds algorithms for symmetric encryption, key agreement and block cipher modes. In addition to cryptography, the Java Security API includes other security-relevant modules. The Java Secure Socket Extension (JSSE) API enables secure communication via network protocols such as TLS and SSL, allowing encrypted connections for web applications and other network services. The Java Authentication and Authorization Service (JAAS) API provides authentication and authorisation mechanisms to verify user identities and control access rights. This architecture allows integration with various authentication sources such as user-password combinations, Kerberos or biometric methods.

Another essential component of the framework is the Java Public Key Infrastructure (PKI) API, which deals with the management of digital certificates and public keys. This API plays a central role in the realization of TLS encryption and the implementation of digital signatures. Certificate management is carried out via so-called KeyStores, which enable secure storage of cryptographic keys. In addition, the framework supports digital signatures for XML documents through the Java XML Digital Signature API (XMLSig), which is particularly used in web services and SAML authentication scenarios.
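The KeyStore handling described here can be sketched entirely in memory. The PKCS12 type and empty initialisation below avoid depending on a file; in practice the store would be loaded from disk or backed by an HSM:

```java
import java.security.KeyStore;

public class KeyStoreDemo {
    public static void main(String[] args) throws Exception {
        // Create an empty in-memory PKCS12 keystore; load(null, null)
        // initialises it without reading a file.
        KeyStore ks = KeyStore.getInstance("PKCS12");
        ks.load(null, null);

        // Entries would be added via setCertificateEntry / setKeyEntry and the
        // store persisted with ks.store(outputStream, password).
        System.out.println(ks.size()); // prints 0 for the freshly created store
    }
}
```

Loading an existing store works the same way, with an InputStream and the store password passed to load().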

The distinction between the Java Security API and the JCA lies in their respective focus. While the Java Security API provides a comprehensive collection of security-related mechanisms and covers both cryptographic and non-cryptographic security aspects such as authentication, permission management and network security, the JCA focuses exclusively on providing cryptographic operations. In this context, JCA offers basic mechanisms for encryption and digital signatures, but JCE is required for modern symmetric encryption methods. Likewise, although JCA includes basic functions for managing cryptographic keys, more comprehensive management of certificates and public key infrastructure requires the use of the PKI API.

In practice, this means that developers access the JCA and JCE directly to implement cryptographic security mechanisms, while specific modules such as JAAS or JSSE are used for higher security-related functions such as authentication, authorisation or the management of secure network connections. The Java Security API thus forms a coherent but modular security architecture that can be used specifically depending on the application.

In this article we will only deal with the JCA. The other functional units will be gradually added in later articles.

Symmetric encryption

Symmetric encryption within the Java Cryptography Architecture (JCA) offers a powerful and flexible interface for the secure processing of confidential data. It is based on the use of a single secret key for both encryption and decryption, which makes it more efficient and faster than asymmetric methods. The JCA provides the Cipher class, which acts as the central mechanism for symmetric cryptography and enables the implementation of various block and stream ciphers.

A key feature of symmetric encryption in the JCA is its support for modern block ciphers such as AES (Advanced Encryption Standard), which is considered the current industry standard, as well as older methods such as DES (Data Encryption Standard) and its improved version, Triple DES (3DES). The API allows the use of various operating modes, including ECB (Electronic Codebook), which is considered unsafe, as well as safer modes such as CBC (Cipher Block Chaining), CFB (Cipher Feedback Mode) and GCM (Galois/Counter Mode). GCM in particular is preferred because, in addition to confidentiality, it also offers integrity protection through Authenticated Encryption (AEAD).

A central concept of the JCA is the separation between the abstract API and the concrete implementations provided by various cryptography providers. This keeps the actual implementation interchangeable without tying the application to a specific algorithm or library. A Cipher object is initialised by specifying the algorithm, the desired operating mode and an optional padding scheme that ensures block-by-block processing of the data. Common padding schemes are PKCS5Padding and NoPadding, the latter requiring manual adjustment of the input data.

The JCA also provides mechanisms for the secure generation and management of keys. Using the KeyGenerator class, symmetric keys can be generated with a defined key length; key lengths of 128, 192 and 256 bits are particularly common for AES. To keep keys secure, they can be stored in a Java KeyStore (JKS) or in external hardware security modules (HSMs). Additionally, the API allows keys to be imported and exported in standardised formats to ensure interoperability with other cryptographic systems.

Another central element of symmetric encryption in the JCA is the treatment of initialisation vectors (IVs), which are crucial for the security of the encryption, especially in modes such as CBC or GCM. IVs must be unique for each encryption operation to prevent attacks such as replay or chosen-plaintext attacks. In practice, IVs are often generated with SecureRandom to ensure high entropy, and are stored or transmitted along with the encrypted data.

The architecture of the JCA allows developers to use symmetric encryption not only for straightforward file and message encryption, but also in more complex scenarios such as Transport Layer Security (TLS), database encryption and hardware-assisted encryption. By combining it with other JCA components, such as the Mac class for Message Authentication Codes (e.g. HMAC-SHA256) or key derivation functions (e.g. PBKDF2), additional security features such as authentication and key derivation can be implemented.

The first practical steps

Let’s now move on to a simple example of using symmetric encryption within the Java Cryptography Architecture (JCA) and use the AES algorithm family in combination with a secure operating mode and padding. The implementation relies on the Cipher class, a central interface in Java for processing cryptographic operations. Secure key management is crucial, which is why the key is created using the KeyGenerator class. SecureRandom is used to ensure high entropy for the initialisation data and thus the security of the encryption.

Within the method, a character array (char[]) is used instead of a string to ensure that sensitive data does not remain in the string pool and can be specifically overwritten after processing. (Here, I refer to the previous article, in which I described this in more detail.) After the key and IV initialisation, encryption is carried out using a Cipher object in AES-CBC mode with PKCS5Padding. The encrypted data is then stored in a byte array. Decryption is done in reverse order by reinitialising the Cipher object with the same key and IV, which allows the original characters to be restored from the encrypted data stream.

The following source code demonstrates the implementation of this concept:
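A minimal sketch of such an implementation could look like this. The class and method names are my own for illustration; the JCA classes and the transformation string `AES/CBC/PKCS5Padding` are standard:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

public class SymmetricExample {

    public static SecretKey generateKey() throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256); // 256-bit AES key
        return keyGen.generateKey();
    }

    public static byte[] generateIv() {
        byte[] iv = new byte[16]; // AES block size; fresh random IV per operation
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    public static byte[] encrypt(char[] plaintext, SecretKey key, byte[] iv) throws Exception {
        // Convert char[] to byte[] without going through a String
        ByteBuffer byteBuffer = StandardCharsets.UTF_8.encode(CharBuffer.wrap(plaintext));
        byte[] plainBytes = new byte[byteBuffer.remaining()];
        byteBuffer.get(plainBytes);

        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal(plainBytes);

        Arrays.fill(plainBytes, (byte) 0); // wipe the sensitive intermediate copy
        return ciphertext;
    }

    public static char[] decrypt(byte[] ciphertext, SecretKey key, byte[] iv) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] plainBytes = cipher.doFinal(ciphertext);

        CharBuffer charBuffer = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(plainBytes));
        char[] result = new char[charBuffer.remaining()];
        charBuffer.get(result);

        Arrays.fill(plainBytes, (byte) 0);
        return result;
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = generateKey();
        byte[] iv = generateIv();
        byte[] encrypted = encrypt("my secret data".toCharArray(), key, iv);
        char[] decrypted = decrypt(encrypted, key, iv);
        System.out.println(new String(decrypted));
    }
}
```

Note that the IV is not secret and is typically stored or sent alongside the ciphertext; only the key must be protected.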

Asymmetric encryption

Asymmetric encryption within the Java Cryptography Architecture (JCA) provides a flexible and secure way to encrypt and decrypt data using two different keys: a public key for encryption and a private key for decryption. This method is based on mathematical one-way functions, which make it practically impossible to calculate the private key from the public key. This makes asymmetric cryptography particularly suitable for applications that require secure key distribution and digital signatures, including Transport Layer Security (TLS), Public Key Infrastructure (PKI), and various authentication mechanisms.

The JCA provides, with the Cipher class, a central API that enables the implementation of asymmetric encryption methods. Particularly widespread algorithms are RSA (Rivest-Shamir-Adleman) and Elliptic Curve Cryptography (ECC), as well as Diffie-Hellman for key exchange. RSA is based on the difficulty of factoring large numbers and is widely used as a classic public-key method. In contrast, ECC uses elliptic curves over finite fields and offers comparable security with smaller key lengths, making it particularly preferred for resource-constrained environments such as mobile devices or smart cards.

Keys are generated via the KeyPairGenerator class, which allows asymmetric key pairs of different lengths to be created. The choice of key length affects security and computing power: RSA typically operates at 2048 or 4096 bits, while elliptic curves use more efficient 256- or 384-bit keys. The generated public key can be freely distributed, while the private key must be kept secure, for example by storing it in a Java KeyStore (JKS) or a hardware security solution such as a Hardware Security Module (HSM).

A Cipher object is initialised with the desired algorithm and mode for encryption. The JCA supports various padding mechanisms necessary to adapt input data to the block size of the encryption function used. For RSA, PKCS1Padding and OAEP (Optimal Asymmetric Encryption Padding) are the standard methods, with OAEP being preferred due to its increased security, as it protects against adaptive chosen-ciphertext attacks. Due to its high computational complexity, asymmetric encryption is primarily suitable for encrypting small amounts of data, such as key material or hash values, rather than large files or streaming data.

Another central application area of asymmetric cryptography in the JCA is the digital signature, which is handled via the Signature class. The private key is used to create a signature for a message or file, which can later be verified using the public key. Digital signatures ensure both the authenticity and integrity of the transmitted data by ensuring that the message comes from the specified sender and has not been subsequently altered. In particular, algorithms like RSA with SHA-256, ECDSA (Elliptic Curve Digital Signature Algorithm) or EdDSA (Edwards-Curve Digital Signature Algorithm) are used in modern applications to ensure authenticity and data integrity.

In addition to classic asymmetric encryption, the JCA also supports hybrid methods in which asymmetric cryptography is used to securely transmit a symmetric key, which is then used for data encryption. This procedure is used, for example, in TLS (Transport Layer Security) and PGP (Pretty Good Privacy) to combine the advantages of both types of cryptography: the security of asymmetric key distribution and the efficiency of symmetric encryption.

The JCA’s architecture allows developers to flexibly use asymmetric encryption schemes by supporting a variety of security vendors that provide different implementations. By default, the Java platform offers support for RSA, DSA and elliptic curves, while alternative providers such as Bouncy Castle enable additional algorithms and optimisations. 

A practical example

This example uses the RSA algorithm to implement asymmetric encryption, encrypting data with a public key and decrypting it with the corresponding private key. The implementation is done using the Cipher class. The KeyPairGenerator class is used to generate the key pair, which makes it possible to generate an RSA key with a length of 2048 bits. The resulting key pair consists of a public key used for encryption and a private key required for decryption.

The encryption method takes a char[] as input to ensure that no sensitive data remains in memory as a String. The characters are explicitly converted to a byte array before encryption so that they can be wiped directly afterwards. The Cipher object is then initialised in encryption mode with the public key, and the conversion is performed. Once encryption is complete, the original byte array is securely overwritten to prevent potential reconstruction in memory.

Decryption is carried out using an analogous procedure. The Cipher object is initialised with the private key to return the previously encrypted byte array to its original state. After decryption, the byte array is converted back into a char[] without creating an intermediate String. To protect the decrypted data, the original and decrypted byte arrays are overwritten with zeros after processing.
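The steps above can be sketched as follows. The class and method names are illustrative; the JCA classes are standard, and OAEP padding is used as recommended earlier:

```java
import javax.crypto.Cipher;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.util.Arrays;

public class AsymmetricExample {

    public static KeyPair generateKeyPair() throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048); // 2048-bit RSA key pair
        return generator.generateKeyPair();
    }

    public static byte[] encrypt(char[] plaintext, PublicKey publicKey) throws Exception {
        // char[] -> byte[] without an intermediate String
        ByteBuffer buf = StandardCharsets.UTF_8.encode(CharBuffer.wrap(plaintext));
        byte[] plainBytes = new byte[buf.remaining()];
        buf.get(plainBytes);

        // OAEP is preferred over PKCS1Padding for new code
        Cipher cipher = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
        cipher.init(Cipher.ENCRYPT_MODE, publicKey);
        byte[] ciphertext = cipher.doFinal(plainBytes);

        Arrays.fill(plainBytes, (byte) 0); // wipe the sensitive intermediate copy
        return ciphertext;
    }

    public static char[] decrypt(byte[] ciphertext, PrivateKey privateKey) throws Exception {
        Cipher cipher = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
        cipher.init(Cipher.DECRYPT_MODE, privateKey);
        byte[] plainBytes = cipher.doFinal(ciphertext);

        CharBuffer charBuffer = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(plainBytes));
        char[] result = new char[charBuffer.remaining()];
        charBuffer.get(result);

        Arrays.fill(plainBytes, (byte) 0);
        return result;
    }

    public static void main(String[] args) throws Exception {
        KeyPair pair = generateKeyPair();
        byte[] encrypted = encrypt("session-key-material".toCharArray(), pair.getPublic());
        char[] decrypted = decrypt(encrypted, pair.getPrivate());
        System.out.println(new String(decrypted));
    }
}
```

Keep in mind that a 2048-bit RSA key with OAEP-SHA256 can only encrypt roughly 190 bytes per operation, which is why RSA is typically used for key material rather than bulk data.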

Digital signatures

Digital signatures are cryptographic mechanisms that ensure the authenticity, integrity and non-repudiation of digital messages or documents. They are based on asymmetric cryptography and use a key pair consisting of a private key, used to generate the signature, and a public key, used for verification. The central idea behind digital signatures is to confirm the sender’s identity and ensure that the transmitted data has not been tampered with during transmission or storage.

Creating a digital signature begins with calculating a hash value of the message to be signed using a cryptographic hash function. This hash value represents a unique, compact representation of the original message, with even a minimal change to the message resulting in a completely different hash. This hash value is then encrypted with the sender’s private key, creating the digital signature. This signature is transmitted or stored along with the original message to ensure its authenticity later.

The recipient verifies the digital signature using the sender’s public key. To do this, the received message is first hashed again to calculate a new hash value. In parallel, the digital signature is decrypted using the public key, restoring the original hash value generated by the sender. If both hash values match, the recipient can assume with a high degree of certainty that the message is authentic and has not been changed. If the hash values are different, it means either that the message has been tampered with or that the signature was created with a different key, indicating an unauthorised modification or a forged identity.

The security of digital signatures depends largely on the strength of the underlying mathematical procedures and the secure handling of the private keys. Classic signature algorithms like RSA or DSA (Digital Signature Algorithm) are increasingly being replaced by more modern schemes such as ECDSA (Elliptic Curve Digital Signature Algorithm) or EdDSA (Edwards-Curve Digital Signature Algorithm), as they offer comparable security with shorter keys and higher efficiency. The continuous development of cryptographic standards is essential to keep digital signatures secure in the long term against future threats, such as those from quantum computers.

A practical example 

This digital signature example demonstrates the creation and verification of a signature using the RSA signature algorithm with SHA-256. The implementation uses the Signature class. To obtain a valid key pair, the KeyPairGenerator class is used to generate an RSA key pair with a length of 2048 bits. The resulting key pair consists of a private key, which is used to create the signature, and a public key, which is used for later verification.

Signature generation begins with an input that is present as a char[]. Before signing, the characters are converted directly into a byte array without creating any Strings. The Signature instance is initialised with the private key, after which the data is processed and the signature is generated. The signature is stored in a separate byte array, while the original input array is overwritten with zeros after processing to minimise security risks.

The signature is verified by initialising the Signature instance again, this time with the public key. The received data is passed to signature verification in byte form, ensuring that it matches the originally signed data. If the verification is successful, it means that the data has remained unchanged and comes from the specified source. On the other hand, a failure of verification indicates that either the data or the signature was manipulated or that an incorrect public key was used for verification.

The following source code shows an example usage:
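A minimal sketch of signing and verifying with `SHA256withRSA` could look like this (class and method names are illustrative; the Signature and KeyPairGenerator classes are standard JCA):

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Arrays;

public class SignatureExample {

    public static KeyPair generateKeyPair() throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048); // 2048-bit RSA key pair
        return generator.generateKeyPair();
    }

    public static byte[] sign(char[] message, PrivateKey privateKey) throws Exception {
        // char[] -> byte[] without an intermediate String
        ByteBuffer buf = StandardCharsets.UTF_8.encode(CharBuffer.wrap(message));
        byte[] data = new byte[buf.remaining()];
        buf.get(data);

        Signature signature = Signature.getInstance("SHA256withRSA");
        signature.initSign(privateKey);
        signature.update(data);
        byte[] result = signature.sign();

        Arrays.fill(data, (byte) 0); // wipe the sensitive intermediate copy
        return result;
    }

    public static boolean verify(byte[] data, byte[] signatureBytes, PublicKey publicKey)
            throws Exception {
        Signature signature = Signature.getInstance("SHA256withRSA");
        signature.initVerify(publicKey);
        signature.update(data);
        return signature.verify(signatureBytes);
    }

    public static void main(String[] args) throws Exception {
        KeyPair pair = generateKeyPair();
        byte[] sig = sign("important document".toCharArray(), pair.getPrivate());
        boolean valid = verify("important document".getBytes(StandardCharsets.UTF_8),
                sig, pair.getPublic());
        System.out.println("Signature valid: " + valid);
    }
}
```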

Now let’s combine symmetric and asymmetric encryption

Secure communication between a client and a server can be achieved by combining asymmetric cryptography for the key exchange with symmetric encryption for the actual message transmission. In this simulation, a hybrid encryption system is implemented that initially uses RSA to enable secure transmission of a session key, which is then used for encrypted communication using AES. This architecture is comparable to the handshake process in TLS (Transport Layer Security), whereby asymmetric cryptography is only used for the initial key exchange, while the more efficient symmetric encryption secures the messages.

First, the server generates an RSA key pair, which is used to encrypt and decrypt the session key. The client receives the public key, while the private key remains on the server. The client then creates a random AES session key, which is encrypted with the server’s RSA public key and sent to the server. Once received, the server decrypts the session key using its private key. Both parties now have the same symmetric key, which is used for subsequent encrypted communication.

The message transmission takes place using the AES algorithm in Galois/Counter Mode (GCM), which, in addition to confidentiality, also ensures integrity protection through integrated authentication. Each message packet sent is given a random initialisation vector (IV) before encryption to ensure that identical messages do not result in identical ciphertexts.
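The handshake described above can be sketched as a simplified, single-process simulation. The class and method names are my own; `Cipher.wrap`/`unwrap` is used here to transport the session key, which is one of several possible ways to implement the RSA step:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;

public class HybridExample {

    public static String simulate(String message) throws Exception {
        // 1. Server generates an RSA key pair; the client receives the public key.
        KeyPairGenerator rsaGen = KeyPairGenerator.getInstance("RSA");
        rsaGen.initialize(2048);
        KeyPair serverKeys = rsaGen.generateKeyPair();

        // 2. Client creates a random AES session key ...
        KeyGenerator aesGen = KeyGenerator.getInstance("AES");
        aesGen.init(256);
        SecretKey sessionKey = aesGen.generateKey();

        // ... and wraps (encrypts) it with the server's RSA public key.
        Cipher rsa = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
        rsa.init(Cipher.WRAP_MODE, serverKeys.getPublic());
        byte[] wrappedKey = rsa.wrap(sessionKey);

        // 3. Server unwraps the session key with its private key.
        rsa.init(Cipher.UNWRAP_MODE, serverKeys.getPrivate());
        SecretKey serverSessionKey = (SecretKey) rsa.unwrap(wrappedKey, "AES", Cipher.SECRET_KEY);

        // 4. Client encrypts the message with AES-GCM and a fresh random IV.
        byte[] iv = new byte[12]; // 96-bit IV, the recommended size for GCM
        new SecureRandom().nextBytes(iv);
        Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, sessionKey, new GCMParameterSpec(128, iv));
        byte[] ciphertext = aes.doFinal(message.getBytes(StandardCharsets.UTF_8));

        // 5. Server decrypts (and thereby authenticates) the message.
        aes.init(Cipher.DECRYPT_MODE, serverSessionKey, new GCMParameterSpec(128, iv));
        byte[] plaintext = aes.doFinal(ciphertext);
        return new String(plaintext, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(simulate("hello server"));
    }
}
```

Because GCM is an AEAD mode, any tampering with the ciphertext or IV causes `doFinal` to throw an exception on the server side instead of returning garbage.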

If you remember the login process, you can use this procedure to secure the components transmitted over the network, such as hash values and salt values. At this point, I must, of course, point out again that this is a presentation of the basic principles. For productive use, it is recommended to use existing implementations and protocols.

Happy Coding

Sven

Cache Poisoning Attacks on Dependency Management Systems like Maven

Cache poisoning on Maven caches is a specific attack that targets how Maven manages packages and dependencies in a software development process. It’s essential to understand how Maven works before we look at the details of cache poisoning.

Overview of Maven and its caches

Apache Maven is a widely used build management tool in Java projects. It automates the dependency management, build process, and application deployment. However, some fundamental mechanisms make it necessary to consider the security of repositories and dependencies when using Maven.

Maven uses repositories to manage libraries and dependencies. There are two types of Maven repositories:

Local repository: A copy of all downloaded libraries and dependencies is saved on the local machine.

Remote Repositories: Maven can access various remote repositories, such as the central Maven repository or a company’s custom repositories. 

After downloading them from a remote repository, Maven stores all dependencies in the local repository (cache). This allows dependencies that are needed multiple times to load more quickly because they do not have to be downloaded from a remote repository each time.

What is cache poisoning?

Cache poisoning is a class of attacks in which an attacker fills a system’s cache (in this case, Maven’s local caches) with manipulated or malicious content. As a result, legitimate requests that should receive the original data instead receive the data injected by the attacker. In Maven terms, cache poisoning refers to an attacker injecting malicious artefacts into a developer’s or build server’s cache by exploiting a vulnerability in the Maven build process or in repository servers.

The attack aims to deliver malicious dependencies that are then integrated into software projects. These poisoned dependencies could contain malicious code to steal sensitive data, take control of the system, or sabotage the project.

Types of cache poisoning on Maven caches

There are several scenarios in which cache poisoning attacks can be carried out on Maven repositories:

Man-in-the-Middle (MITM) Cache Poisoning

A man-in-the-middle attack allows an attacker to intercept and manipulate network traffic between the developer and the remote Maven repository. If communication is not encrypted, an attacker can inject crafted artefacts and introduce them into the local Maven cache. As a result, the developer believes that the dependencies come from a trusted repository, when in fact, they have been tampered with.

Such an attack could be successful if Maven communicates with repositories over unsecured HTTP connections. The central Maven repository (Maven Central) now exclusively uses HTTPS to prevent such attacks, but some private or legacy repositories use HTTP.

Exploit repository vulnerabilities

If an attacker gains access to the remote repository, they can upload arbitrary artefacts or replace existing versions. This happens, for example, if the repository is poorly secured or a vulnerability in the repository management tool (like Nexus or Artifactory) is exploited. In this case, the attacker can inject malware directly into the repository, causing developers worldwide to download the compromised artefact and store it in their Maven cache.

Dependency Confusion

A particularly dangerous attack vector that has received much attention in recent years is the so-called “dependency confusion” attack. This attack is possible because many modern software projects draw dependencies both from internal, private repositories and from public repositories such as Maven Central. The main goal of a dependency confusion attack is to inject malicious packages via publicly accessible repositories into a company or project that believes it is using internal or private dependencies.

Basics of Dependency Confusion

Many companies and projects maintain internal Maven repositories where they store their own libraries and dependencies that are not publicly accessible. These internal libraries can implement specific functionalities or make adaptations to public libraries. Developers often define the name and version of dependencies in the Maven configuration (`pom.xml`) without realising that Maven prioritises dependencies, favouring public repositories like Maven Central over internal ones unless explicitly configured otherwise.

A dependency confusion attack exploits exactly this priority order. The attacker publishes a package with the same name as an internal library to the public Maven repository, often with a higher version number than the one used internally. When Maven then looks for that dependency, it usually prefers the publicly available package rather than the private internal version. This downloads the malicious package and stores it in the developer’s Maven cache, from where it will be used in future builds.

How Dependency Confusion Was Discovered

A security researcher named Alex Birsan popularised this attack in 2021 when he demonstrated how easy it was to poison dependencies in projects at major tech companies. By releasing packages with the same names as internal libraries of large companies such as Apple, Microsoft, and Tesla, he successfully launched dependency confusion attacks against these companies.

Birsan did not use malicious content in his attacks but harmless code to prove that the system was vulnerable. He was able to show that in many cases, the companies’ build systems had downloaded and used the malicious (in his case harmless) package instead of the real internal library. This disclosure led to massive awareness in the security community about the risks of Dependency Confusion.

Why does Dependency Confusion work so effectively?

The success of a Dependency Confusion attack lies in the default configuration of many build systems and the way Maven resolves dependencies. There are several reasons why this attack vector is so effective:

  • Automatic prioritisation of public repositories
  • Trust in version numbers
  • Missing signature verification
  • Reliance on external code

Typosquatting

Typosquatting is an attack technique that exploits user oversight by targeting common typos that can occur while typing package names in software development, such as in Maven. Attackers release packages with similar or slightly misspelt names that closely resemble legitimate libraries. When developers accidentally enter the wrong package name in their dependency definitions or automated tools resolve these packages, they download the malicious package. Typosquatting is one of the most well-known attack methods for manipulating package managers, such as Maven, npm, PyPI, and others that host publicly available libraries.

Basics of typosquatting

Typosquatting is based on the idea that users often make typos when entering commands or package names. Attackers exploit this by creating packages or artifacts with names that are very similar to well-known and widely used libraries but differ in small details. These details may include minor variations such as missing letters, additional characters, or alternative spellings.

Typical typosquatting techniques

Misspelled package names:

One of the most straightforward techniques is to change or add a letter in the name of a well-known library. An example would be the package `com.google.common`, which is often used. An attacker could use a package named `com.gooogle.common` (with an extra “o”) that is easily overlooked.

Different spellings:

Attackers can also use alternative spellings of well-known libraries or names. For example, an attacker could publish a package named `com.apache.loggin`, which looks similar to the popular `com.apache.logging` but, because of the missing “g” at the end of “logging”, is easily overlooked.

Use of prefixes or suffixes:

Another option is to add prefixes or suffixes that increase the similarity to legitimate packages. For example, an attacker could publish a package named `com.google.common-utils` or `com.google.commonx` that resembles the legitimate package `com.google.common`.

Similarity in naming:

Attackers can also take advantage of naming conventions in the open-source community by publishing packages containing common terms or abbreviations often used in combination with other libraries. An example would be releasing a package like `common-lang3-utils`, which is reminiscent of the popular Apache Commons library `commons-lang3`.

Dangers of typosquatting

The threat of typosquatting is grave because it is difficult to detect. Developers often rely on build tools like Maven to reliably download and integrate packages into their projects. If an incorrect package name is entered, they may not immediately realise that they have included a malicious dependency. Typosquatting is a form of social engineering because it exploits people’s susceptibility to errors.

A successful typosquatting attack can lead to severe consequences:

  • Data loss
  • Malware injection
  • Loss of trust

Maven typosquatting cases

There have also been incidents of typosquatting in the Maven community. In one case, a package named `commons-loggin` was published, mimicking the legitimate Apache Commons logging package `commons-logging`. Developers who entered the package name incorrectly downloaded and integrated the malicious package into their projects, creating potential security risks.

Typosquatting is a sophisticated and difficult-to-detect attack method that targets human error. Attackers take advantage of the widespread use of package managers such as Maven, npm, and PyPI by publishing slightly misspelt or similar-sounding packages that contain malicious code. Developers and organisations must be aware of this threat and take appropriate protective measures to ensure that only legitimate and trustworthy packages are included in their projects.

Process of a cache poisoning attack on Maven

A typical sequence of a cache poisoning attack on Maven could look like this:

Identification of a target repository: The attacker is looking for a Maven repository used by developers but may have vulnerabilities. This can happen, for example, through outdated versions of publicly available repository management tools.

Manipulation of artefacts: The attacker manipulates the artefact, e.g. a JAR file, by adding malicious code. This can range from simple backdoors to complex Trojans.

Provision of the poisoned artefact: The manipulated artefact is either uploaded to the public repository (e.g., in the form of a typosquatting package) or injected directly into a compromised target repository.

Download by developer: The developer uses Maven to update or reload the dependencies for his project. Maven downloads the poisoned artefact, which is stored in the local cache.

Compromising the project: Maven will use the poisoned artefact from the cache in future builds. This artefact can then execute malicious code in the application’s context, resulting in system compromise.

Security mechanisms to protect against cache poisoning

Various measures should be implemented on both the developer and repository provider sides to protect against cache poisoning attacks.

Regular updates and patch management

Make sure Maven, its plugins, and all repository management tools are always up to date. Security updates should be applied immediately to address known vulnerabilities.

Using HTTPS

The use of encrypted connections (HTTPS) between Maven and the repository is crucial to ensure that no man-in-the-middle attacks can be performed on the artefacts in transit. Maven Central enforces HTTPS connections, but private repositories should also adhere to this standard.
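For illustration, a repository declaration in the `pom.xml` should always use an `https://` URL (the repository id and URL below are placeholders):

```xml
<repositories>
  <repository>
    <id>company-releases</id>
    <!-- Always HTTPS; plain HTTP allows MITM tampering with artefacts -->
    <url>https://repo.example.com/releases</url>
  </repository>
</repositories>
```

Note that Maven 3.8.1 and later block external `http://` repositories by default, which removes one of the classic MITM entry points for older setups.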

Signature verification

Another protective measure is the use of cryptographic signatures for artefacts. Maven supports the use of PGP signatures to ensure the integrity of artefacts. Developers should ensure that the signatures of downloaded artefacts are verified to ensure that they have not been tampered with.
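As an illustration, signature checking can also be automated as part of the build. One community option is the `pgpverify-maven-plugin`; the coordinates below are shown as I recall them, so verify them against the plugin's documentation and pin a current release before use:

```xml
<plugin>
  <groupId>org.simplify4u.plugins</groupId>
  <artifactId>pgpverify-maven-plugin</artifactId>
  <!-- pin a current release version here -->
  <executions>
    <execution>
      <goals>
        <!-- verifies the PGP signatures of project dependencies during the build -->
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```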

Improve repository security

Repository providers should ensure that their repositories are well protected by implementing robust authentication mechanisms, regular patches and updates to repository management tools such as Nexus or Artifactory.

Dependency Scanning and Monitoring

Tools like OWASP Dependency Check or Snyk can scan known dependency vulnerabilities. These tools can help identify malicious or stale dependencies and prevent them from entering the Maven cache.

Version Pinning

“Version pinning” means setting specific versions of dependencies in the `pom.xml` file instead of using dynamic version ranges (`[1.0,)`). This helps prevent unexpected updates and ensures that only explicitly defined versions of the artefacts are used.
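For illustration (the coordinates and version are just examples), a pinned dependency versus a dynamic range looks like this in the `pom.xml`:

```xml
<dependencies>
  <!-- Pinned: exactly this version is resolved, builds stay reproducible -->
  <dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.14.0</version>
  </dependency>

  <!-- Avoid: a dynamic range silently picks up the newest matching version -->
  <!--
  <version>[3.0,)</version>
  -->
</dependencies>
```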

Private Maven-Repositories

One approach to maintaining control over dependencies is maintaining a private Maven repository within the organisation. This ensures that only checked artifacts end up in the internal cache, reducing the risk of introducing malicious dependencies into the build process.

Implementation of code reviews and security checks

Conduct regular code reviews to ensure only trusted dependencies are used in projects. Automated security checks can provide additional security.

Understanding CVEs in the context of Maven and cache poisoning

CVE (Common Vulnerabilities and Exposures) is a standardised system for identifying and cataloguing security vulnerabilities in software. Each CVE number refers to a specific vulnerability discovered and documented by security experts.

In the context of Maven, there are no specific CVEs that exclusively target cache poisoning. Instead, various vulnerabilities in Maven itself, its plugins, or the underlying repository management tools (such as Sonatype Nexus or JFrog Artifactory) can be indirectly exploited for cache poisoning attacks. These vulnerabilities could allow attackers to manipulate dependencies, compromise the integrity of downloads, or bypass Maven’s security mechanisms.

Although no CVEs are explicitly classified as cache poisoning, there are several vulnerabilities in Maven and related tools that could potentially be exploited for cache poisoning attacks:

CVE-2020-13949: Remote Code Execution in Apache Maven

  • Description: This vulnerability affected Maven Surefire plugin versions prior to 2.22.2, which could enable remote code execution (RCE). An attacker could use specially crafted POM files to execute malicious code on the build system.
  • Relevance to cache poisoning: By running RCE, an attacker could inject crafted dependencies into the local Maven cache, which could compromise future builds.
  • Reference: CVE-2020-13949

CVE-2021-25329: Denial of Service in Maven Artifact Resolver

  • Description: This vulnerability affected the Apache Maven Artifact Resolver component, which is responsible for resolving and downloading artefacts. A specially crafted POM file could lead to a denial of service (DoS).
  • Relevance to cache poisoning: A DoS attack could impact the availability of Maven repositories, forcing developers to use alternative (possibly insecure) repositories, increasing the risk of cache poisoning.
  • Reference: CVE-2021-25329

CVE-2019-0221: Directory Traversal in Sonatype Nexus Repository Manager

  • Description: This vulnerability allows attackers to access files within the Nexus Repository Manager through a directory traversal attack.
  • Relevance to cache poisoning: By accessing critical files or configurations, attackers could compromise the repository manager and insert malicious artefacts, which are then downloaded by developers and stored in the local Maven cache.
  • Reference: CVE-2019-0221

CVE-2022-26134: Arbitrary File Upload in JFrog Artifactory

  • Description: This vulnerability allowed attackers to upload arbitrary files to a JFrog Artifactory server, which could result in a complete compromise of the server.
  • Relevance to cache poisoning: By uploading arbitrary files, attackers could inject malicious Maven artefacts into the repository, which developers then download and cache.
  • Reference: CVE-2022-26134

CVE-2021-44228: Log4Shell (Log4j)

  • Description: This widespread vulnerability affected the Log4j library and allowed remote code execution by exploiting JNDI injections.
  • Relevance to cache poisoning: Many Maven projects use Log4j as a dependency. A manipulated version of this dependency could enable RCE through cache poisoning.
  • Reference: CVE-2021-44228

Analysis and impact of the mentioned CVEs

The above CVEs illustrate how vulnerabilities in Maven and related tools can potentially be exploited for cache poisoning and other types of attacks:

  • Remote Code Execution (RCE): Vulnerabilities such as CVE-2020-13949 and CVE-2021-44228 allow attackers to execute malicious code on the build system. This can be used to inject manipulated dependencies into the local cache.
  • Denial of Service (DoS): CVE-2021-25329 demonstrates how attackers can impact the availability of Maven repositories. This may result in developers being forced to resort to alternative sources that may be unsafe.
  • Directory Traversal and Arbitrary File Upload: CVE-2019-0221 and CVE-2022-26134 demonstrate how attackers can compromise repository management tools to upload malicious artefacts that developers then unknowingly use in their projects.

Future challenges and continuous security improvements

As the use of open-source libraries in modern software development continues to increase, the threat of attacks such as cache poisoning also increases. Automating the build process and relying on external repositories creates an expanded attack surface that attackers seek to exploit.

It is becoming increasingly important for companies and developers to cultivate a security culture prioritising the safe handling of dependencies and artefacts. This requires not only the implementation of technical protection measures but also the training of developers to raise awareness of the risks of cache poisoning and other dependency attacks.

Conclusion

Cache poisoning attacks on Maven caches are a severe risk, especially at a time when open-source components play an essential role in software development. The attacks exploit vulnerabilities in how dependencies are resolved, downloaded and cached.

Developers and companies must implement best security practices to prevent such attacks. This includes using HTTPS, signing and verifying artefacts, securing repositories, regular vulnerability scanning, and controlling the dependencies introduced into their projects.

Awareness of the risks and implementing appropriate security measures are key to preventing cache poisoning attacks and ensuring the integrity of software development processes.

Happy Coding

Sven

CWE-778: Lack of control over error reporting in Java

Learn how inadequate control over error reporting leads to security vulnerabilities and how to prevent them in Java applications.

Safely handling error reports is a central aspect of software development, especially in security-critical applications. CWE-778 describes a weakness caused by inadequate control over error reporting. This post analyses the risks associated with CWE-778 and shows how developers can implement safe error-handling practices to avoid such vulnerabilities in Java programs.

What is CWE-778?

The Common Weakness Enumeration (CWE) defines CWE-778 as a vulnerability where bug reporting is inadequately controlled. Bug reports often contain valuable information about an application’s internal state, including system paths, configuration details, and other sensitive information that attackers can use to identify and exploit vulnerabilities. Improper handling of error reports can result in unauthorised users gaining valuable insight into the application’s system structure and logic.

Exposing such information in a security-sensitive application could have potentially serious consequences, such as the abuse of SQL injection or cross-site scripting (XSS) vulnerabilities. Therefore, it is critical that bug reports are carefully controlled and only accessible to authorised individuals.

Examples of CWE-778 in Java

The following example considers a simple Java application used to authenticate users:
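A minimal sketch of such an insecure authenticator could look like this (the class name, credential check, and messages are invented for illustration):

```java
// Deliberately insecure sketch (hypothetical names and credentials):
// the error message leaks the username, and the full stack trace is
// printed straight to the user (CWE-778).
class InsecureLoginService {
    boolean login(String username, String password) {
        try {
            if (!"s3cret".equals(password)) { // placeholder credential check
                throw new SecurityException("Wrong password for user '" + username + "'");
            }
            return true;
        } catch (SecurityException e) {
            System.out.println("Login failed: " + e.getMessage()); // leaks the username
            e.printStackTrace(); // exposes internal implementation details
            return false;
        }
    }
}
```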

This example displays an error message if the user enters an incorrect password. However, this approach has serious security flaws:

1. The error message contains specific information about the username.

2. The full stack trace is output, allowing an attacker to obtain details about the application’s implementation.

This information can help an attacker understand the application’s internal structure and make it easier for them to search specifically for additional vulnerabilities.

Secure error handling

To minimise the risks described above, secure error handling should be implemented. Instead of providing detailed information about the error, the user should only be shown a general error message:
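A sketch of such a hardened variant, assuming the same invented login scenario and using the JDK's `java.util.logging` for illustration, could look like this:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hardened sketch (hypothetical names and credentials): the technical
// details stay in an internal log, the user only ever sees a generic message.
class SecureLoginService {
    private static final Logger LOG = Logger.getLogger(SecureLoginService.class.getName());

    boolean login(String username, String password) {
        try {
            if (!"s3cret".equals(password)) { // placeholder credential check
                throw new SecurityException("invalid credentials");
            }
            return true;
        } catch (SecurityException e) {
            LOG.log(Level.WARNING, "Authentication failure", e); // internal log only
            System.out.println("Login failed. Please check your credentials."); // generic
            return false;
        }
    }
}
```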

In this improved version, only a general error message is displayed to the user while the error is logged internally. This prevents sensitive information from being shared with unauthorised users.

Such errors should be logged in a log file accessible only to authorised persons. A logging framework such as Log4j or SLF4J provides additional mechanisms to ensure logging security and store only necessary information.

Example with Vaadin Flow

Vaadin Flow is a Java framework for building modern web applications, and CWE-778 can also be a problem if error reports are mishandled. A safe example of error handling in a Vaadin application could look like this:
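A framework-neutral sketch of such a `logError` method is shown below (names invented); in an actual Vaadin Flow view, the returned generic text would typically be shown to the user via `Notification.show(...)`:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch: logError keeps the technical details in the server-side log and
// returns only a generic message that is safe to display in the UI.
class ErrorReportingView {
    private static final Logger LOG = Logger.getLogger(ErrorReportingView.class.getName());

    String logError(Exception e) {
        LOG.log(Level.SEVERE, "Unexpected error", e); // full details, internal only
        return "An error has occurred. Please try again later."; // safe for the user
    }
}
```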

The `logError` method ensures that errors are logged securely without sensitive information being visible to the end user. Vaadin Flow enables the integration of such secure practices to ensure that bug reports are not leaked uncontrollably.

Using design patterns to reuse logging and error handling

To promote the reuse of error handling and logging, design patterns can be used that enable the modularisation and unification of such tasks. Three suitable patterns are the Decorator Pattern, the Proxy Pattern, and the Template Method Pattern.

Decorator Pattern

The Decorator Pattern is a structural design pattern that allows an object’s functionality to be dynamically extended without changing the underlying class. This is particularly useful when adding additional responsibilities, such as logging, security checks, or error handling, without modifying the original class’s code.

The Decorator Pattern works by using so-called “wrappers”. Instead of modifying the class directly, the object is wrapped in another class that implements the same interface and adds additional functionality. In this way, different decorators can be combined to create a flexible and expandable structure.

A vital feature of the Decorator Pattern is its adherence to the open-closed principle, one of the fundamental principles of object-oriented design. The open-closed principle states that a software component should be open to extensions but closed to modifications. The Decorator Pattern does just that by allowing classes to gain new functionality without changing their source code.

In the context of error handling and logging, developers can write a basic authentication class, while separate decorators take over error logging and the handling of specific errors. This leads to a clear separation of responsibilities, significantly improving code maintainability.

The following example shows the implementation of the Decorator pattern to reuse error handling and logging:
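A sketch of this setup, using the class names from the surrounding text and a placeholder credential check, might look as follows:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

interface Authenticator {
    boolean authenticate(String username, String password);
}

// Core logic: a plain authenticator with a placeholder credential check.
class BasicAuthenticator implements Authenticator {
    @Override
    public boolean authenticate(String username, String password) {
        return "s3cret".equals(password);
    }
}

// Decorator: wraps any Authenticator and adds error logging without
// touching the wrapped class.
class LoggingAuthenticatorDecorator implements Authenticator {
    private static final Logger LOG = Logger.getLogger("auth");
    private final Authenticator delegate;

    LoggingAuthenticatorDecorator(Authenticator delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean authenticate(String username, String password) {
        try {
            boolean ok = delegate.authenticate(username, password);
            if (!ok) {
                LOG.warning("Failed login attempt"); // internal log, no user details
            }
            return ok;
        } catch (RuntimeException e) {
            LOG.log(Level.SEVERE, "Authentication error", e);
            return false;
        }
    }
}
```

Decorators can then be stacked, e.g. `new LoggingAuthenticatorDecorator(new BasicAuthenticator())`, and further decorators added in any order.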

In this example, `BasicAuthenticator` is used as the primary authentication class, while the `LoggingAuthenticatorDecorator` adds additional functionality, namely error logging. This decorator wraps the original authentication class and extends its behaviour. The logic can be flexibly extended by adding more decorators, such as a `SecurityCheckDecorator` that performs additional security checks before authentication.

An advantage of this approach is that decorators can be combined in any order to achieve tailored functionality. For example, one could first add a security decorator and then error logging without changing the original authentication logic. This results in a flexible and reusable structure that is particularly useful in large projects where aspects such as logging, security checks, and error handling are required in various combinations.

The Decorator Pattern is, therefore, a powerful tool for increasing software modularity and extensibility. It avoids code duplication, promotes reusability, and enables a clean separation of core logic and additional functionalities. This makes it particularly useful in secure error handling and implementing cross-cutting concerns such as logging in safety-critical applications.

Proxy Pattern

The Proxy Pattern is a structural design pattern used to control access to an object. It is often used to add functionality such as caching, access control, or logging. In contrast to the Decorator Pattern, which is primarily used to extend functionality, the proxy serves as an intermediary that takes control of access to the actual object.

The Proxy Pattern ensures that all access to the original object occurs via the proxy, meaning specific actions can be carried out automatically. For example, a proxy could ensure that only authorised users can access a particular resource while logging all accesses.

A typical example of the proxy pattern for encapsulating logging and error handling looks like this:
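The following sketch uses the class names from the text; the credential check and log messages are placeholders:

```java
import java.util.logging.Logger;

interface Authenticator {
    boolean authenticate(String username, String password);
}

class BasicAuthenticator implements Authenticator {
    @Override
    public boolean authenticate(String username, String password) {
        return "s3cret".equals(password); // placeholder credential check
    }
}

// Proxy: every call goes through here, so access control and logging
// happen before the real authenticator is reached.
class ProxyAuthenticator implements Authenticator {
    private static final Logger LOG = Logger.getLogger("auth-proxy");
    private final Authenticator target = new BasicAuthenticator();

    @Override
    public boolean authenticate(String username, String password) {
        LOG.info("Authentication requested"); // access logging
        if (username == null || username.isEmpty()) {
            LOG.warning("Rejected request without username"); // access control
            return false;
        }
        boolean ok = target.authenticate(username, password);
        LOG.info(ok ? "Authentication succeeded" : "Authentication failed");
        return ok;
    }
}
```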

In this example, `BasicAuthenticator` is wrapped by a `ProxyAuthenticator`, which controls all calls to the `authenticate` method. The proxy adds additional functionality, such as access and error logging, ensuring that all access goes through the proxy before the authentication object is called.

A key difference between the Proxy Pattern and the Decorator Pattern is that the Proxy primarily controls access to the object and its use. The proxy can check access rights, add caching, or manage an object’s lifetime. The Decorator Pattern, on the other hand, is designed to extend an object’s behaviour by adding additional responsibilities without changing the access logic.

In other words, the Proxy Pattern acts as a protection or control mechanism, while the Decorator Pattern adds additional functionality to extend the behaviour. Both patterns are very useful when integrating cross-cutting concerns such as logging or security checks into the application, but they differ in their focus and application.

Template Method Pattern

The Template Method Pattern allows for defining the general flow of a process while implementing specific steps in subclasses. This ensures that error handling remains consistent:
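A sketch of this pattern, with an invented `DatabaseAuthenticator` subclass and placeholder credential logic, could look like this:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Template method: authenticate() fixes the overall flow, including the
// centralised error handling; subclasses only supply the credential check.
abstract class AbstractAuthenticator {
    private static final Logger LOG = Logger.getLogger("auth-template");

    public final boolean authenticate(String username, String password) {
        try {
            return checkCredentials(username, password);
        } catch (RuntimeException e) {
            LOG.log(Level.SEVERE, "Authentication error", e); // consistent handling
            return false;
        }
    }

    protected abstract boolean checkCredentials(String username, String password);
}

class DatabaseAuthenticator extends AbstractAuthenticator {
    @Override
    protected boolean checkCredentials(String username, String password) {
        if (username == null) {
            throw new IllegalArgumentException("username missing");
        }
        return "s3cret".equals(password); // placeholder for a database lookup
    }
}
```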

The Template Method Pattern centralises error handling in the `AbstractAuthenticator` class so that all subclasses use the same consistent error handling strategy.

Evaluate log messages for attack detection in real time

Another aspect of secure error handling is using log messages to detect attacks in real-time. Analysing the log data can identify potential attacks early, and appropriate measures can be taken. The following approaches are helpful:

Centralised logging: Use a central logging platform like the ELK Stack (Elasticsearch, Logstash, Kibana) or Splunk to collect all log data in one place. This enables comprehensive analysis and monitoring of security-related incidents.

Pattern recognition: Create rules and patterns that identify potentially malicious activity, such as multiple failed login attempts in a short period. Such rules can trigger automated alerts when suspicious activity is detected.

Anomaly detection: Machine learning techniques can detect anomalous activity in log data. A sudden increase in certain error messages or unusual access patterns could indicate an ongoing attack.

Real-time alerts: Configure the system so that certain security-related events trigger alerts in real-time. This allows administrators to respond immediately to potential threats.

Analyse threat intelligence: Use log messages to collect and analyse threat intelligence. For example, IP addresses that repeatedly engage in suspicious activity can be identified, and appropriate action can be taken, such as blocking the address.

Integration into SIEM systems: Use security information and event management (SIEM) systems to correlate log data from different sources and gain deeper insights into potential threats. SIEM systems often also provide tools to automate responses to specific events.

By combining these approaches, attacks can be detected early, and the necessary steps can be taken to limit the damage.

Best practices for avoiding CWE-778

To avoid CWE-778 in your applications, the following best practices should be followed:

Generic error messages: Avoid sharing detailed information about errors with end users. Error messages should be worded as generally as possible to avoid providing clues about the internal implementation.

Error logging: Use logging frameworks like Log4j or SLF4J to log errors securely. This allows bugs to be tracked internally without exposing sensitive information.

No stack traces to users: Make sure stack traces are only visible in the log and are not output to the user. Instead, generic error messages that do not contain technical details should be used.

Access control: Ensure that only authorized users have access to detailed error reports. Error logs should be well-secured and viewable only by administrators or developers.

Regular error testing and security analysis: Run regular tests to ensure that error handling works correctly. Static code analysis tools help detect vulnerabilities like CWE-778 early.

Avoiding sensitive information: To prevent sensitive information such as usernames, passwords, file paths, or server details from being included in error messages, such information should only be stored in secured log files.

Using secure libraries: Rely on proven libraries and frameworks for error handling and logging that have already undergone security checks. This reduces the likelihood of implementation errors compromising security.

Conclusion

CWE-778 poses a severe security threat if bug reports are not adequately controlled. Developers must know the importance of handling errors securely to prevent unwanted information leaks. Applying secure programming practices, such as using design patterns to reuse error-handling logic and implementing centralised logging to detect attacks in real-time, can significantly increase the security and robustness of Java applications.

Secure error handling improves an application’s robustness and user experience by providing clear and useful instructions without overwhelming the user with technical details. The combination of security and usability is essential for the success and security of modern applications.

Ultimately, control over bug reports is integral to a software project’s overall security strategy. Bug reports can either be a valuable resource for developers or, if handled poorly, become a vulnerability for attackers to exploit. Disciplined error handling, modern design patterns, and attack detection technologies are critical to ensuring that error reports are used as a tool for improvement rather than a vulnerability.

Code security through unit testing: The role of secure coding practices in the development cycle

Unit testing is an essential software development concept that improves code quality by ensuring that individual units or components of a software function correctly. Unit testing is crucial in Java, one of the most commonly used programming languages. This article will discuss what unit testing is, how it has evolved, and what tools and best practices have been established over the years.

Definition of Unit Testing

Unit Testing refers to testing a program’s smallest functional units, usually individual functions or methods. A unit test ensures that a specific piece of code works as expected and typically checks whether a method or class returns the correct output for particular inputs.

In Java, unit testing typically means testing individual methods of a class, checking whether the code responds as expected to both typical and exceptional input.

Example in Java:
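The unit under test could be a minimal calculator class (the `Calculator` name and `add` method are assumed for illustration):

```java
// Hypothetical unit under test: a minimal Calculator with an add method.
class Calculator {
    int add(int a, int b) {
        return a + b;
    }
}
```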

A unit test for the `add` method could look like this:
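Assuming a hypothetical `Calculator` class with an `add(int, int)` method, a JUnit 5 test might read as follows (the class under test is included so the sketch is self-contained):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

// Hypothetical unit under test, repeated here for a self-contained sketch.
class Calculator {
    int add(int a, int b) {
        return a + b;
    }
}

class CalculatorTest {
    @Test
    void addReturnsSumOfBothArguments() {
        Calculator calculator = new Calculator();
        assertEquals(5, calculator.add(2, 3));  // typical case
        assertEquals(0, calculator.add(-2, 2)); // boundary around zero
    }
}
```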

This example tests whether the `add` method correctly adds two numbers. If the test passes, the method works as expected.

The history of unit testing

The roots of unit testing lie in the early days of software development. As early as the 1960s, programmers recognised that it was essential to ensure that individual parts of a program were working correctly before integrating the entire system. However, unit testing only became widely standardised in the 1990s with the spread of object-oriented programming languages such as Java and C++.

The 1990s and the emergence of JUnit

One of the most influential events in the history of unit testing was the introduction of the framework JUnit for the Java programming language in 1999. Developed by Kent Beck and Erich Gamma, JUnit revolutionised how Java developers test their software.

JUnit enabled developers to write simple and easy-to-maintain unit tests for their Java programs. Before JUnit, no widely used, standardised tools facilitated unit testing in Java. Programmers had to create their test suites manually, which was tedious and error-prone.

JUnit allowed developers to automate tests, create test suites, and integrate testing into their development workflow. Its introduction contributed significantly to raising awareness about the importance of testing and promoting the adoption of unit testing in the industry.

The 2000s: Agile Methods and Test-Driven Development (TDD)

In the early 2000s, agile development methodologies gained popularity, particularly Extreme Programming (XP) and Scrum. These methods emphasised short development cycles, continuous integration, and, most importantly, testing. One of the core ideas of XP is Test-Driven Development (TDD), a methodology in which tests are written before the actual code.

In TDD, the development process is controlled by tests:

1. First, the developer writes a unit test that fails because the functionality to be tested has not yet been implemented.

2. The minimal code required to pass the test is then written.

3. Finally, the code is refined and optimised, with testing ensuring that existing functionality is not affected.
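The cycle can be sketched with invented names: the test method is written first and fails while `greet` does not yet exist (red); the minimal implementation then makes it pass (green); refactoring follows under the protection of the test:

```java
// Step 2 of the TDD cycle: the minimal implementation, written only after
// the test below already existed and failed (hypothetical example).
class Greeter {
    String greet(String name) {
        return "Hello, " + name + "!"; // just enough code to satisfy the test
    }
}

// Step 1: the test, written first. In a real project this would be a JUnit
// @Test method; a plain assertion keeps the sketch dependency-free.
class GreeterTest {
    static void greetProducesFullGreeting() {
        assert new Greeter().greet("Ada").equals("Hello, Ada!");
    }
}
```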

TDD requires developers to engage intensively with unit testing, which drives the entire development process. Unit tests are not just a way to ensure quality but an integral part of the design.

JUnit played a central role during this time, making implementing TDD in Java much more accessible. Developers could write and run tests quickly and easily, making the workflow more efficient.

The 2010s: Integration into CI/CD pipelines and the importance of automation

With the advent of Continuous Integration (CI) and Continuous Delivery (CD) in the 2010s, unit testing became even more critical. CI/CD pipelines allow developers to continuously integrate changes to the code base into the central repository and automatically deploy new versions of their software.

In a CI/CD environment, automated testing, including unit testing, is critical to ensure that new code changes do not break existing functionality. Unit tests are typically the first stage of testing executed in a CI pipeline because they are fast and focused.

For this purpose, numerous tools have been created that integrate unit tests into the CI/CD workflow. Besides JUnit, various build tools, like Maven and Gradle, can run unit tests automatically. Test reports are generated, and the developer is notified if a test fails.

Another significant advancement during this time was integrating code coverage tools like JaCoCo. These tools measure the percentage of code covered by unit tests and help developers ensure that their tests cover all relevant code paths.

Tools and frameworks for unit testing in Java

JUnit: Probably the best-known and most widely used testing framework for Java. Since its introduction in 1999, JUnit has evolved and offers numerous features that make testing easier. With the introduction of JUnit 5, the framework became more modular and flexible.

TestNG: Another popular testing framework that offers similar functionality to JUnit but supports additional features such as dependency management and parallel test execution. TestNG is often used in larger projects that require complex testing scenarios.

Mockito: A mocking framework used to simulate dependencies of classes in unit tests. Mockito allows developers to “mock” objects to control the behaviour of dependencies without actually instantiating them.

JaCoCo: A tool for measuring code coverage. It shows what percentage of the code is covered by tests and helps developers identify untested areas in the code.

Gradle and Maven: These build tools provide native support for unit testing. They allow developers to run tests and generate reports during the build process automatically.
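The idea behind mocking, as provided by Mockito, can be illustrated with a hand-written stub (all names invented); Mockito generates such stand-ins automatically via `mock()` and `when()`:

```java
// Hypothetical dependency that would normally hit a database.
interface UserRepository {
    String findEmail(String username);
}

// Unit under test: depends on the repository, but should be tested in isolation.
class WelcomeMailService {
    private final UserRepository repository;

    WelcomeMailService(UserRepository repository) {
        this.repository = repository;
    }

    String buildWelcomeLine(String username) {
        return "Mail to: " + repository.findEmail(username);
    }
}

// The test replaces the real repository with a stub that returns a canned
// answer, so the unit test stays isolated, fast, and deterministic.
class StubUserRepository implements UserRepository {
    @Override
    public String findEmail(String username) {
        return username + "@example.com"; // canned answer, no database involved
    }
}
```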

Best practices for unit testing in Java

Isolated testing: Each unit test should test a single method or function in isolation. This means that external dependencies such as databases, networks or file systems should be “mocked” to ensure that the test focuses only on the code under test.

Write simple tests: Unit tests should be simple and easy to understand. You should only test a single functionality and not have any complex logic or dependencies.

Regular testing: Tests should be conducted regularly, especially before every commit or build. This ensures that errors are detected early and that the code remains stable.

Test cases for boundaries: It is essential to test not only typical use cases but also boundary values (e.g., minimum and maximum input values) and error cases.

Ensure test coverage: Code coverage tools like JaCoCo help developers ensure that most of the code is covered by tests. However, high test coverage should be only one goal among many; the quality of the tests is just as important as their quantity.

The future of unit testing

Unit testing has constantly evolved over the past few decades, and this development is expected to continue in the future. The advent of technologies like Artificial Intelligence (AI) and Machine Learning could open up new ways of generating and executing unit tests.

AI-powered tools could automatically generate unit tests by analysing the code and suggesting tests based on typical patterns. This could further automate the testing process and give developers more time for actual development.

Another trend is the increasing integration of Mutation Testing. This technique involves making minor changes to the code to check whether the existing tests detect the mutated code paths and fail accordingly. This allows developers to ensure that their tests not only execute the code but actually detect errors.

Unit testing is an essential part of modern software development and has evolved from an optional practice to a standard. In Java, JUnit has contributed significantly to the proliferation and standardisation of unit testing. At the same time, agile methodologies such as TDD have integrated the role of testing into the development process.

With increasing automation through CI/CD pipelines and the availability of powerful tools, unit testing will continue to play a critical role in ensuring code quality in the future. Developers integrating unit testing into their workflow early on benefit from stable, maintainable and error-free code.

What are the disadvantages of unit testing?

Unit testing has many advantages, especially ensuring code quality and finding errors early. However, there are also some disadvantages and limitations that must be taken into account in practice:

Time expenditure and costs

Creation: Writing unit tests requires additional time, especially in the early stages of a project. This effort increases development costs and can seem disproportionate for small projects.

Maintenance: When the code changes, the associated tests often also need to be adapted. This can be very time-consuming for larger code bases. Changes to the design or architecture can lead to numerous test changes.

Limited test coverage

Only tests isolated units: Unit tests test individual functions or methods in isolation. They do not cover the interaction of different modules or layers of the system, which means that they cannot reveal integration problems.

Not suitable for all types of errors: Unit tests are good at finding logical errors in individual methods, but they cannot detect other types of errors, such as errors in interaction with databases, networks, or the user interface.

False sense of security

High test coverage ≠ freedom from errors: Developers may develop a false sense of security when achieving high test coverage. Just because many tests cover the code doesn’t mean it’s bug-free. Unit tests only cover the specific code they were written for and may not test all edge or exception cases.

Blind trust in testing: Sometimes, developers rely too heavily on unit testing and neglect other types of testing, such as integration testing, system testing, or manual testing.

Excessive mocking

Mocks can distort reality: When testing classes or methods that depend on external dependencies (e.g. databases, APIs, file systems), mock objects are often used to simulate these dependencies. However, excessive mocking, even with frameworks such as Mockito, can lead to unrealistic tests that behave differently from the system in a real environment.

Complex dependencies: When a class has many dependencies, creating mocks can become very complicated, making the tests difficult to understand and maintain.

Difficulties in testability of legacy code

Legacy code without tests: Writing unit tests can be challenging and time-consuming in existing projects with older code (legacy code). Such systems may have been designed without testability in mind, making unit tests difficult to write.

Refactoring necessary: Legacy code often needs to be refactored to enable unit testing, which can introduce additional risks and costs.

Not suitable for complex test cases

Not suitable for end-to-end testing: Unit tests are designed to test individual units and are not intended to cover end-to-end test cases or user interactions. Such testing requires other types of testing, such as integration, system, or acceptance testing.

Limited perspective: Unit tests often only consider individual system components and not the behaviour of the entire system in real usage scenarios.

Test-driven development (TDD) can lead to excessive focus on details

Design of code influenced by testing: In TDD, the emphasis is on writing tests before writing the code. This can sometimes lead to developers designing code to pass tests rather than developing a more general, robust solution.

Excessive focus on detailed testing: TDD can cause developers to focus too much on small details and isolated components instead of considering the overall system architecture and user needs.

Test maintenance for rapid changes

Frequent changes lead to outdated tests: In fast-paced development projects where requirements and code change frequently, tests can quickly become obsolete. Maintaining these tests can become a significant burden without providing clear added value.

Tests as ballast: If code is constantly evolving and tests are not updated, outdated or irrelevant tests can burden the development process.

Lack of testing strategies in complex systems

Complexity of test structure: It can be difficult to develop a meaningful unit testing strategy that covers all aspects of very complex systems. This often results in fragmented and incomplete tests or inadequate testing of critical areas of the system.

Testing complexity in object-oriented designs: For highly object-oriented programs, it can be difficult to identify precise units for unit testing, especially if the classes are highly interconnected. In such cases, writing unit tests can become cumbersome and inefficient.

Additional effort without immediate benefit

Cost-benefit analysis for small projects: In small projects or prototypes where the code is short-lived, the effort spent writing unit tests may outweigh the benefits. In such cases, other testing methods, such as manual or simple end-to-end testing, may be more efficient.

Conclusion

Although unit testing has numerous advantages, there are also clear disadvantages that must be considered. The additional time required, limited test coverage, and potential maintenance challenges must be weighed when planning a testing process. Unit testing is not a panacea but should be viewed as one tool among many in software development. However, combined with other types of testing and a well-thought-out testing strategy, unit testing can significantly improve code quality.

Does Secure Payload Testing belong to the area of unit testing?

Secure Payload Testing usually belongs to a different discipline than traditional unit testing, namely security testing and, in part, integration testing. Let’s take a closer look to better understand the boundaries.

What is Secure Payload Testing?

Secure Payload Testing refers to testing security-related data or messages exchanged between system components. This particularly applies to scenarios where sensitive data (such as passwords, API keys, encrypted data, etc.) must be protected and handled correctly in communication between systems. It tests whether data is appropriately encrypted, decrypted and authenticated during transmission and whether there are any potential security gaps in handling this data.

Examples of typical questions in secure payload testing are:

  • Is sensitive data encrypted and decrypted correctly?
  • Does data remain secure during transmission?
  • Is it ensured that the payload data does not contain security holes, such as SQL injections or cross-site scripting (XSS)?
  • Is the integrity of the payload guaranteed to prevent manipulation?

Difference between Unit Testing and Secure Payload Testing

Unit Testing focuses on the functionality of individual program components or methods in isolation. It usually checks whether a method delivers the expected outputs for certain inputs. The focus is on the correctness and stability of the program’s logical units, not directly on the security or protection of data.

An example of a unit test in Java would be testing a simple mathematical function. The unit test would check whether the method works correctly. Security aspects, such as handling confidential data or ensuring encryption, are usually outside of such tests.

In contrast, Secure Payload Testing involves the secure handling and processing of data during transmission or storage. This is often part of security testing, which aims to ensure that data is properly protected and not vulnerable to attacks or data leaks.

Where does Secure Payload Testing fit in?

Integration tests: Secure Payload Testing could be part of integration testing, which tests the interaction between different components of a system, e.g., between a client and a server. Here, one would ensure that the payload is properly encrypted and the transmission over the network is secure. 

Security testing: In more complex systems, secure payload testing belongs more to security testing, which simulates attacks on the security, integrity, and confidentiality of payloads. These tests often go beyond the functionality of individual code units and require special testing strategies, such as penetration testing or testing for known security vulnerabilities.

End-to-End Tests: Since Secure Payload Testing is often related to data transfer, it can also be part of End-to-End tests. The entire system is tested here, from input to processing to output. These tests check whether the data is encrypted correctly at the beginning and decrypted and processed correctly at the end.

Can Secure Payload Testing be part of Unit Testing?

In special cases, an aspect of secure payload testing can be part of unit tests, especially if the security logic is very closely linked to the functionality of the unit under test (e.g., an encryption or decryption method).

An example could be testing an encryption method in isolation:
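A minimal sketch of such an encryption utility is shown below. The `PayloadEncryptor` name is invented, and the JDK default transformation for "AES" (ECB mode) is used only because it is short and deterministic; production code should prefer an authenticated mode such as AES/GCM with a random IV:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical unit under test: encrypts and decrypts string payloads.
class PayloadEncryptor {
    private final SecretKey key;

    PayloadEncryptor() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            this.key = generator.generateKey();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    String encrypt(String plaintext) {
        try {
            Cipher cipher = Cipher.getInstance("AES"); // provider default: ECB, sketch only
            cipher.init(Cipher.ENCRYPT_MODE, key);
            byte[] bytes = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(bytes);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    String decrypt(String ciphertext) {
        try {
            Cipher cipher = Cipher.getInstance("AES");
            cipher.init(Cipher.DECRYPT_MODE, key);
            byte[] bytes = cipher.doFinal(Base64.getDecoder().decode(ciphertext));
            return new String(bytes, StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```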

Here, you could write unit tests that check whether:

  • The text is correctly encrypted and decrypted.
  • For the same input data, the same output is always produced (in the case of deterministic encryption).
  • The method responds correctly to invalid inputs (e.g. wrong key, invalid data format).

Despite these specific test cases, secure payload testing generally focuses not only on isolated functionality but also on security and integrity in the context of other system components.

Secure Payload Testing does not belong to classic Unit Testing. It typically concerns security, integrity, and correct data processing in a broader context, often involving the interaction of multiple system components. It falls more into the realm of security and integration tests.

Unit tests can, however, cover parts of security logic to some extent, especially when encryption or security functions are to be tested in isolation. The overall picture of security requirements and protection of payloads is nevertheless usually established by more comprehensive testing, such as integration tests, security testing, or end-to-end tests.

Which secure coding practices play a role in connection with unit testing?

Secure Coding Practices are essential to ensure the code is secure against potential attacks and vulnerabilities, and they are closely related to unit testing. While unit testing primarily aims to verify the code’s functionality, secure coding practices help ensure that the code is also robust and secure. Here are some of the key Secure Coding Practices related to unit testing:

Input validation and sanitisation

Safe practice: Always ensure that input from external sources (user input, API calls, file input) is validated and “sanitised” to avoid dangerous content such as SQL injection or cross-site scripting (XSS).

Connection to unit testing: Unit tests should ensure that methods and functions respond correctly to invalid or potentially dangerous input. Tests should contain inputs such as unexpected special characters, entries that are too long or short, empty fields, or formatting errors.

Example unit test in Java:
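The `InputValidator` and its whitelist rules below are assumptions for illustration; in a real project this would typically be a JUnit test rather than a `main` method:

```java
import java.util.regex.Pattern;

// Hypothetical validator under test: accepts only a short whitelist of characters.
class InputValidator {
    private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9_.-]{1,64}");

    static boolean isValid(String input) {
        return input != null && SAFE.matcher(input).matches();
    }
}

public class InputValidatorTest {
    public static void main(String[] args) {
        // Dangerous or malformed inputs must be rejected...
        check(!InputValidator.isValid("<script>alert(1)</script>"));
        check(!InputValidator.isValid("' OR '1'='1"));
        check(!InputValidator.isValid(null));
        check(!InputValidator.isValid(""));
        check(!InputValidator.isValid("x".repeat(65))); // too long

        // ...while ordinary input passes.
        check(InputValidator.isValid("alice_01"));
        System.out.println("all validation tests passed");
    }

    private static void check(boolean condition) {
        if (!condition) throw new AssertionError("validation test failed");
    }
}
```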

This test checks whether the `isValid` method correctly rejects unsafe input as invalid.

Boundary value analysis (boundary testing)

Safe practice: Inputs should be tested against their maximum and minimum limits to ensure the code does not crash or become vulnerable to buffer overflows.

Connection to unit testing: Unit tests should ensure the application safely responds to input at the top and bottom of its allowed ranges. This helps prevent typical attacks such as buffer overflows.

Example: If a function only accepts a certain number of characters, the unit test should check how the function responds to inputs that are precisely at or above that limit.
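A minimal sketch of such a boundary test, assuming a hypothetical `UsernameField` that accepts at most 20 characters:

```java
// Hypothetical unit under test: a field that accepts at most MAX_LENGTH characters.
class UsernameField {
    static final int MAX_LENGTH = 20;

    static boolean accepts(String value) {
        return value != null && !value.isEmpty() && value.length() <= MAX_LENGTH;
    }
}

public class UsernameFieldBoundaryTest {
    public static void main(String[] args) {
        String atLimit = "a".repeat(UsernameField.MAX_LENGTH);       // exactly 20
        String overLimit = "a".repeat(UsernameField.MAX_LENGTH + 1); // 21

        check(UsernameField.accepts(atLimit), "value at the limit must be accepted");
        check(!UsernameField.accepts(overLimit), "value above the limit must be rejected");
        check(!UsernameField.accepts(""), "empty value must be rejected");
        check(UsernameField.accepts("a"), "minimal valid value must be accepted");
        System.out.println("all boundary tests passed");
    }

    private static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }
}
```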

Secure error handling

Safe practice: Error handling should not reveal sensitive information, such as stack traces or details about the application’s internal structure, as attackers can exploit such information.

Connection to unit testing: Unit tests should ensure errors and exceptions are handled correctly without exposing sensitive information. Unit tests can trigger targeted exception situations and check whether only safe and user-friendly error messages are returned.

Example:
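A sketch of such a test, using a hypothetical `LoginService` whose internal database failure is simulated here; the test asserts that only a generic message reaches the caller:

```java
// Hypothetical service: internal failures are wrapped in a generic,
// user-facing message; the technical cause is kept for logging only.
class LoginService {
    String login(String user, String password) {
        try {
            // Simulated internal failure with sensitive details.
            throw new java.sql.SQLException("table USERS not found on db-host-17");
        } catch (java.sql.SQLException e) {
            // Log e internally; never expose its details to the caller.
            throw new IllegalStateException("Login failed. Please try again later.");
        }
    }
}

public class ErrorHandlingTest {
    public static void main(String[] args) {
        String message = null;
        try {
            new LoginService().login("alice", "wrong");
        } catch (IllegalStateException e) {
            message = e.getMessage();
        }
        // The user-facing message must be generic...
        check("Login failed. Please try again later.".equals(message));
        // ...and must not leak internals such as exception names or host names.
        check(!message.contains("SQLException") && !message.contains("db-host"));
        System.out.println("error handling tests passed");
    }

    private static void check(boolean condition) {
        if (!condition) throw new AssertionError("error handling test failed");
    }
}
```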

Avoiding hard-coded secrets

Safe practice: Never hard-code sensitive information such as passwords, API keys, or tokens in code. Instead, such data should be stored in secure environment variables or configuration files.

Connection to unit testing: Unit tests should ensure that sensitive data is handled securely. They should also check that the code loads external configuration sources correctly and does not accidentally use hard-coded secrets.

Example:
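A sketch of such a test. The `Config` class and the `app.api.key` property name are assumptions for illustration; a system property stands in for an environment variable because tests can set it in-process:

```java
// Hypothetical configuration reader: the API key must come from external
// configuration (here a system property), never from a hard-coded default.
class Config {
    static String apiKey() {
        String key = System.getProperty("app.api.key");
        if (key == null || key.isEmpty()) {
            throw new IllegalStateException("app.api.key is not configured");
        }
        return key;
    }
}

public class ConfigTest {
    public static void main(String[] args) {
        // With the property set, the configured value is returned as-is.
        System.setProperty("app.api.key", "test-key-123");
        check("test-key-123".equals(Config.apiKey()));

        // Without it, the code must fail fast instead of silently
        // falling back to a baked-in secret.
        System.clearProperty("app.api.key");
        boolean failed = false;
        try {
            Config.apiKey();
        } catch (IllegalStateException e) {
            failed = true;
        }
        check(failed);
        System.out.println("configuration tests passed");
    }

    private static void check(boolean condition) {
        if (!condition) throw new AssertionError("config test failed");
    }
}
```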

Use of safe libraries and dependencies

Safe practice: Users should take care to use secure libraries and frameworks and update them regularly to avoid known security vulnerabilities.

Connection to unit testing: Unit tests should ensure the libraries are correctly integrated and updated. Testing functionality that depends on external libraries is also essential to ensure that security mechanisms in those libraries are used correctly.

Ensure encryption

Safe practice: Sensitive data should be stored and transmitted in encrypted form to prevent unauthorised access or data leaks.

Connection to unit testing: Unit tests should verify that data is encrypted and decrypted correctly. For example, a test could ensure that the encryption and decryption methods work consistently and without errors.

Example:
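A sketch of such a round-trip test using authenticated encryption (AES/GCM) from `javax.crypto`; the helper methods are assumptions for illustration, not a vetted API:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

public class GcmRoundTripTest {
    static final int IV_LENGTH = 12; // recommended GCM nonce size in bytes

    static byte[] encrypt(SecretKey key, String plaintext) throws Exception {
        byte[] iv = new byte[IV_LENGTH];
        new SecureRandom().nextBytes(iv);
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ct = c.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        // Prepend the IV so decrypt() can recover it.
        byte[] out = new byte[IV_LENGTH + ct.length];
        System.arraycopy(iv, 0, out, 0, IV_LENGTH);
        System.arraycopy(ct, 0, out, IV_LENGTH, ct.length);
        return out;
    }

    static String decrypt(SecretKey key, byte[] message) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(128, Arrays.copyOfRange(message, 0, IV_LENGTH)));
        byte[] pt = c.doFinal(message, IV_LENGTH, message.length - IV_LENGTH);
        return new String(pt, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey key = kg.generateKey();

        // Encryption and decryption must work consistently and without errors.
        byte[] ct = encrypt(key, "sensitive data");
        if (!"sensitive data".equals(decrypt(key, ct))) {
            throw new AssertionError("round trip failed");
        }
        // With a random IV, two encryptions of the same plaintext must differ.
        if (Arrays.equals(ct, encrypt(key, "sensitive data"))) {
            throw new AssertionError("ciphertexts should not repeat");
        }
        System.out.println("encryption tests passed");
    }
}
```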

Least Privilege Principle

Safe practice: Methods and functions should only be executed with the minimum rights and access required.

Connection to unit testing: Unit tests should ensure that methods only work with the minimum required data and that no unauthorised access to resources is possible. For example, tests could check whether protected resources are only accessed after successful authentication.

Example:
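A sketch of such a test, assuming a hypothetical `DocumentStore` that requires an authenticated user with exactly the minimal `READER` role:

```java
import java.util.Set;

// Hypothetical user and protected resource for illustration.
class User {
    final boolean authenticated;
    final Set<String> roles;

    User(boolean authenticated, Set<String> roles) {
        this.authenticated = authenticated;
        this.roles = roles;
    }
}

class DocumentStore {
    String read(User user) {
        if (user == null || !user.authenticated) {
            throw new SecurityException("authentication required");
        }
        if (!user.roles.contains("READER")) {
            throw new SecurityException("READER role required");
        }
        return "document contents";
    }
}

public class LeastPrivilegeTest {
    public static void main(String[] args) {
        DocumentStore store = new DocumentStore();

        // Unauthenticated access must be refused.
        check(denied(store, new User(false, Set.of())));
        // Authenticated but unauthorised access must also be refused.
        check(denied(store, new User(true, Set.of("AUDITOR"))));
        // Only the minimal required role grants access.
        check("document contents".equals(store.read(new User(true, Set.of("READER")))));
        System.out.println("least-privilege tests passed");
    }

    static boolean denied(DocumentStore store, User user) {
        try {
            store.read(user);
            return false;
        } catch (SecurityException e) {
            return true;
        }
    }

    private static void check(boolean condition) {
        if (!condition) throw new AssertionError("least-privilege test failed");
    }
}
```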

Avoiding race conditions

Safe practice: Race conditions can occur when multiple threads or processes access shared resources simultaneously. They should be avoided to prevent security issues such as unpredictable behaviour or data corruption.

Connection to unit testing: Unit tests should ensure that the code is thread-safe and that no race conditions occur. This can be verified by testing code under multiple access or by using mock threads.

Example:
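A sketch of such a test: a counter made thread-safe with `AtomicInteger` is exercised by several threads at once, and the final count is asserted. With a plain `int count` and `count++`, this test would fail intermittently due to lost updates:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical unit under test: a counter that must be safe under concurrency.
class SafeCounter {
    private final AtomicInteger count = new AtomicInteger();

    void increment() { count.incrementAndGet(); }
    int value() { return count.get(); }
}

public class RaceConditionTest {
    static final int THREADS = 8;
    static final int INCREMENTS = 10_000;

    static int runConcurrently() throws InterruptedException {
        SafeCounter counter = new SafeCounter();
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        for (int t = 0; t < THREADS; t++) {
            pool.submit(() -> {
                for (int i = 0; i < INCREMENTS; i++) counter.increment();
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return counter.value();
    }

    public static void main(String[] args) throws InterruptedException {
        int result = runConcurrently();
        if (result != THREADS * INCREMENTS) {
            throw new AssertionError("lost updates: expected "
                    + (THREADS * INCREMENTS) + " but got " + result);
        }
        System.out.println("thread-safety test passed: " + result);
    }
}
```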

Avoidance of buffer overflows

Safe practice: Buffer overflows occur when a program writes more data into a memory area than it can hold. Although Java, thanks to automatic memory management and array bounds checking, is less prone to buffer overflows than C or C++, care should still be taken to ensure that arrays and memory structures are used safely.

Connection to unit testing: Unit tests should test edge cases and maximum input values ​​to ensure that overflows do not occur.

Safe use of third-party libraries

Safe practice: Third-party libraries should be used safely and regularly checked for known vulnerabilities.

Connection to unit testing: Tests can ensure that functions and classes from third-party libraries are implemented correctly and used safely. Mocking can be used to simulate external dependencies safely.

Secure Coding Practices play an essential role in unit testing, as they ensure that the code is not only functionally correct but also secure against potential attacks. Unit tests should aim to check security-relevant aspects such as input validation, secure error handling, encryption, and rights assignment. By incorporating these practices into the unit testing process, developers can ensure their applications are robust, secure, and protected against many common attack patterns.

CWE-1123: Excessive Use of Self-Modifying Code for Java Developers

Self-modifying code refers to a type of code that alters its own instructions while it is executing. While this practice can offer certain advantages, such as optimisation and adaptability, it is generally discouraged due to the significant risks and challenges it introduces. For Java developers, using self-modifying code is particularly problematic because it undermines the codebase’s predictability, readability, and maintainability, and Java as a language does not natively support self-modification of its code.

Risks

Unpredictable Behaviour: Self-modifying code can lead to unexpected program behaviour, making diagnosing and fixing bugs difficult.

Security Vulnerabilities: Code that modifies itself can be a vector for various security attacks, including injection attacks and malware.

Maintenance Difficulty: Such code is difficult to read and understand, making it more difficult to maintain and update.

Performance Issues: Self-modifying code can cause performance degradation due to the additional overhead of modifying and interpreting the changes at runtime.

Examples of Risky Practices

Dynamic Class Loading: Java allows classes to be loaded at runtime using mechanisms such as reflection or custom class loaders. While dynamic class loading itself is not inherently wrong, using it excessively or without apparent necessity can lead to self-modifying behaviour.

Bytecode Manipulation: Using libraries like ASM or Javassist to modify Java bytecode at runtime can lead to self-modifying code. This practice is highly discouraged unless essential.

Reflection: While reflection is a powerful feature, it can be misused to modify private fields, methods, or classes, leading to behaviour that is hard to trace and debug.

Example

An example of risky self-modifying behaviour in Java using bytecode manipulation:
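A sketch of what such a modification might look like with the Javassist library (a third-party dependency assumed to be on the classpath; note that the exact API differs between Javassist versions, e.g. newer releases require extra arguments to `toClass()`). `TargetClass` is assumed to declare a `targetMethod`:

```java
import javassist.ClassPool;
import javassist.CtClass;
import javassist.CtMethod;

public class BytecodeModificationExample {
    public static void main(String[] args) throws Exception {
        ClassPool pool = ClassPool.getDefault();
        CtClass cc = pool.get("TargetClass");
        CtMethod method = cc.getDeclaredMethod("targetMethod");

        // Inject an extra statement into the method body at runtime --
        // the running program now differs from the code that was reviewed.
        method.insertAfter("System.out.println(\"injected at runtime\");");

        cc.toClass(); // redefine the class for this class loader
        // Any subsequent call to targetMethod() executes the injected code.
    }
}
```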

In this example, the targetMethod of TargetClass is modified at runtime to include an additional print statement. This kind of modification can lead to the aforementioned risks.

Mitigation Strategies

Avoid Runtime Code Modifications: Design your system in a way that minimises or eliminates the need for runtime code modifications.

Use Design Patterns: Employ design patterns such as Strategy or State patterns that allow behaviour changes without altering the code at runtime.

Proper Use of Reflection: Use reflection sparingly and only when no other viable solution exists. Document its usage thoroughly.

Static Code Analysis: Use static code analysis tools to detect and prevent the introduction of self-modifying code.

Excessive use of self-modifying code in Java is fraught with risks that can compromise your applications’ security, maintainability, and performance. By adhering to best practices and using design patterns that promote flexibility and adaptability without modifying code at runtime, you can avoid the pitfalls associated with CWE-1123.

A Reflection Example

Reflection in Java allows for introspection and manipulation of classes, fields, methods, and constructors at runtime. While powerful, excessive or improper use of reflection can lead to self-modifying behaviours, which aligns with CWE-1123. This can result in unpredictable behaviour, security vulnerabilities, and maintenance challenges.

Example

Below is an example demonstrating the excessive use of reflection to modify a class’s behaviour at runtime, which can be considered a form of self-modifying code.
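The following self-contained sketch uses only the standard library (`java.lang.reflect`); the class and member names follow the description below:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// The class whose encapsulation is about to be broken.
class MyClass {
    private String message = "original message";

    private void setMessage(String m) { this.message = m; }

    public String getMessage() { return message; }
}

public class ReflectionExample {
    public static void main(String[] args) throws Exception {
        MyClass obj = new MyClass();
        System.out.println(obj.getMessage()); // prints "original message"

        // Break encapsulation: rewrite the private field directly.
        Field field = MyClass.class.getDeclaredField("message");
        field.setAccessible(true);
        field.set(obj, "modified via field access");
        System.out.println(obj.getMessage()); // prints "modified via field access"

        // Invoke the private method as well.
        Method setter = MyClass.class.getDeclaredMethod("setMessage", String.class);
        setter.setAccessible(true);
        setter.invoke(obj, "modified via private method");
        System.out.println(obj.getMessage()); // prints "modified via private method"
    }
}
```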

In this example, the ReflectionExample class:

  • Creates an instance of MyClass and prints the original message.
  • Uses reflection to access the private field message and the private method setMessage of MyClass.
  • Changes the value of the message field and prints the modified message.

This example showcases how reflection can alter an object’s behaviour and state at runtime, leading to the issues outlined in CWE-1123.

Mitigation Strategies

Minimise Reflection Use: Avoid using reflection unless absolutely necessary. Prefer alternative design patterns that allow for flexibility without modifying the code at runtime.

Access Control: Ensure that fields and methods that should not be modified are kept private and final where possible to prevent unintended access.

Static Analysis Tools: Use static analysis tools to detect excessive use of reflection and other risky practices in the codebase.

Code Reviews: Conduct thorough code reviews to identify and mitigate the use of self-modifying code through reflection.

Reflection is a powerful tool in Java, but misuse can lead to the risks associated with CWE-1123. By adhering to best practices and minimising the use of reflection to modify code at runtime, developers can maintain the security, predictability, and maintainability of their applications.

Example of Dynamic Class Loading

Dynamic class loading in Java refers to the ability to load and unload classes at runtime. While this can be useful in specific scenarios, excessive or improper use can lead to self-modifying code behaviours, which align with CWE-1123. This can introduce risks such as unpredictable behaviour, security vulnerabilities, and maintenance challenges.

Below is an example demonstrating the excessive use of dynamic class loading to modify a class’s behaviour at runtime, which can be considered a form of self-modifying code.
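The sketch below keeps both classes in one compilation unit so it is self-contained; in a real attack, the substituted class would arrive as bytes supplied at runtime, not from the same source file. The custom loader here simply delegates to its parent, which is enough to show that the class being loaded is chosen by a string at runtime:

```java
// Stand-in for the original implementation.
class MyClass {
    public void printMessage() {
        System.out.println("Hello from the original class");
    }
}

// Stand-in for the dynamically substituted implementation.
class ModifiedClass {
    public void printMessage() {
        System.out.println("Hello from the dynamically loaded class");
    }
}

public class DynamicClassLoadingExample {
    public static void main(String[] args) throws Exception {
        new MyClass().printMessage();

        // A custom class loader decides which implementation runs. The class
        // name is just a string, so it can be switched at runtime -- e.g.
        // from configuration or attacker-controlled input.
        ClassLoader loader =
                new ClassLoader(DynamicClassLoadingExample.class.getClassLoader()) {};
        Class<?> cls = loader.loadClass("ModifiedClass");
        Object instance = cls.getDeclaredConstructor().newInstance();
        cls.getMethod("printMessage").invoke(instance);
    }
}
```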

In this example, the DynamicClassLoadingExample class:

  • Loads an original class MyClass and invokes its printMessage method.
  • Dynamically loads a modified version of the class, ModifiedClass, using a custom class loader.
  • Creates an instance of the modified class and invokes its printMessage method, which prints a different message.

This example showcases how dynamic class loading can alter a program’s behaviour at runtime, leading to the issues outlined in CWE-1123.

Mitigation Strategies

Avoid Unnecessary Dynamic Loading: Use dynamic class loading only when it is indispensable and cannot be avoided through other design patterns.

Secure Class Loaders: Ensure custom class loaders are secure and do not load untrusted or malicious classes.

Static Analysis Tools: Use static analysis tools to detect excessive use of dynamic class loading and other risky practices in the codebase.

Code Reviews: Conduct thorough code reviews to identify and mitigate the use of self-modifying code through dynamic class loading.

Java-based CVEs based on CWE-1123

While there might not be specific CVEs (Common Vulnerabilities and Exposures) explicitly labelled as being caused by CWE-1123 (Excessive Use of Self-Modifying Code), several Java-related vulnerabilities can arise from practices associated with self-modifying code. These typically involve dynamic class loading, reflection, and bytecode manipulation issues. Here are some examples of Java-based CVEs that relate to these practices:

CVE-2014-0114

Description: Apache Commons Collections Remote Code Execution Vulnerability

Issue: This vulnerability involves using reflection to manipulate serialised data, leading to arbitrary code execution. It was found in the Apache Commons Collections library, where certain classes could be used to execute arbitrary code when deserialised. This is a form of self-modifying behaviour, as the serialised data could alter the program’s execution flow.

Impact: Attackers could exploit this vulnerability to execute arbitrary commands on the server running the vulnerable application.

CVE-2013-2423

Description: Oracle Java SE Remote Code Execution Vulnerability

Issue: This vulnerability arises from improper handling of certain methods in Java, leading to the execution of arbitrary code. It leverages reflection and class-loading mechanisms to inject and execute malicious code.

Impact: Exploiting this vulnerability allows remote attackers to execute arbitrary code on the affected system, potentially leading to total system compromise.

CVE-2015-1832

Description: Android Remote Code Execution Vulnerability in Apache Cordova

Issue: This vulnerability involves dynamic class loading and improper validation of inputs. It allowed attackers to inject malicious code into an Android application built with Apache Cordova by exploiting the WebView component.

Impact: Successful exploitation could result in arbitrary code execution within the context of the affected application, leading to potential data leakage or further exploitation.

CVE-2012-0507

Description: Oracle Java SE Remote Code Execution Vulnerability

Issue: This vulnerability involves using reflection and dynamic class loading to exploit a flaw in the Java Runtime Environment (JRE). The vulnerability allows an untrusted Java applet to break out of the Java sandbox and execute arbitrary code.

Impact: Exploiting this vulnerability could allow an attacker to execute arbitrary code on the host system with the privileges of the user running the Java applet.

CVE-2019-12384

Description: FasterXML jackson-databind Deserialization Vulnerability

Issue: This vulnerability involves the unsafe handling of deserialisation using the jackson-databind library. By exploiting polymorphic type handling, attackers could inject malicious code that gets executed during deserialisation.

Impact: Successful exploitation could result in arbitrary code execution, leading to potential data breaches and system compromise.

Mitigation Strategies

Avoid Self-Modifying Code Practices: Do not use dynamic class loading, reflection, or bytecode manipulation unless absolutely necessary. When required, ensure proper validation and security measures are in place.

Use Safe Deserialisation: Avoid deserialisation of untrusted data. If deserialisation is necessary, libraries and techniques that enforce strict type checking and validation should be used.

Apply Security Patches: Regularly update and patch libraries and frameworks to protect against known vulnerabilities.

Code Reviews and Static Analysis: Conduct thorough code reviews and use static analysis tools to detect and mitigate the use of risky code practices.

Security Best Practices: To reduce the attack surface, follow security best practices, such as least privilege, input validation, and secure coding guidelines.

What kind of attacks or infection methods are based on CWE-1123?

CWE-1123 (Excessive Use of Self-Modifying Code) can lead to several types of attacks and infection methods due to the unpredictable and dynamic nature of such code. Here are some common attack vectors and infection methods associated with this vulnerability:

Code Injection Attacks

Attackers exploit self-modifying code to inject malicious code into a program. This can occur through various means, such as manipulating input data that gets executed or modifying code at runtime to include harmful payloads.

Example:

SQL Injection: If an application dynamically constructs SQL queries using user input and modifies these queries at runtime, an attacker can inject malicious SQL commands to alter the behaviour of the database operations.
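As a pure-string illustration (no database required), the sketch below shows how concatenated input rewrites the WHERE clause, and notes in comments how `java.sql.PreparedStatement` keeps the same input as data:

```java
public class SqlInjectionDemo {
    // Vulnerable: user input is concatenated straight into the query text.
    static String naiveQuery(String userName) {
        return "SELECT * FROM users WHERE name = '" + userName + "'";
    }

    public static void main(String[] args) {
        String malicious = "' OR '1'='1";
        // The input terminates the string literal and rewrites the WHERE clause:
        System.out.println(naiveQuery(malicious));
        // -> SELECT * FROM users WHERE name = '' OR '1'='1'

        // With a parameterised query the same input stays data, e.g.:
        //   PreparedStatement ps = conn.prepareStatement(
        //           "SELECT * FROM users WHERE name = ?");
        //   ps.setString(1, userName);
    }
}
```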

Remote Code Execution (RCE)

Self-modifying code can enable Remote Code Execution by allowing attackers to modify or load classes and methods at runtime. This makes it easier to introduce and execute arbitrary code.

Example:

Deserialization Vulnerabilities: When an application deserialises data without proper validation, an attacker can inject objects that modify the code flow, leading to the execution of arbitrary code.

Privilege Escalation

Attackers can exploit self-modifying code to escalate their privileges within a system. They can bypass security checks and gain higher-level access by dynamically altering the code.

Example:

Reflection Attacks: Using reflection, attackers can modify private fields and methods to escalate privileges, accessing parts of the system that would otherwise be restricted.

Dynamic Code Loading Attacks

Self-modifying code often involves dynamic loading of classes or bytecode manipulation, which can be exploited to load malicious code at runtime.

Example:

Dynamic Class Loading: Attackers can trick the application into loading a malicious class that performs unwanted actions, such as exfiltrating data or modifying system files.

Polymorphic Malware

Self-modifying code is commonly used in polymorphic malware, where the malware changes its code to evade detection by security software.

Example:

Polymorphic Virus: A virus that encrypts its payload and changes its decryption routine with each infection, making it difficult for antivirus programs to detect the malware’s signature.

Evasion of Security Mechanisms

Self-modifying code can be used to evade security mechanisms such as firewalls, intrusion detection systems (IDS), and antivirus software by altering its code structure dynamically.

Example:

Metamorphic Malware: Similar to polymorphic malware, metamorphic malware reprograms itself completely with each infection, ensuring that no two copies of the malware are identical, thus evading signature-based detection.

Backdoors and Rootkits

Attackers can use self-modifying code to install backdoors or rootkits that alter the behaviour of the operating system or application to provide persistent unauthorised access.

Example:

Rootkits: A rootkit can use self-modifying code to hide its presence by altering kernel or application code to prevent detection by security tools.

Tampering with Security Features

Self-modifying code can be used to tamper with security features such as authentication mechanisms, encryption routines, and access controls.

Example:

Tampering with Authentication: By dynamically modifying authentication checks, an attacker can bypass login mechanisms and gain unauthorised access to the system.

By understanding these attack vectors and implementing mitigation strategies, developers and security professionals can reduce the risks associated with self-modifying code and improve the overall security of their applications.

Happy Coding

Sven

Securing Apache Maven: Understanding Cache-Related Risks

What is a Package Manager – Bird’s-Eye View

A package manager is a tool or system in software development designed to simplify the process of installing, updating, configuring, and removing software packages on a computer system. It automates managing dependencies and resolving conflicts between different software components, making it easier for developers to work with various libraries, frameworks, and tools within their projects.

Package managers typically provide a centralised repository or repositories where software packages are hosted. Users can then use the package manager to search for, download, and install the desired packages and any necessary dependencies directly from these repositories.

Some popular package managers include:

1. APT (Advanced Package Tool): Used primarily in Debian-based Linux distributions such as Ubuntu, APT simplifies installing and managing software packages.

2. YUM (Yellowdog Updater Modified) and DNF (Dandified YUM): Package managers commonly used in Red Hat-based Linux distributions like Fedora and CentOS.

3. Homebrew: A package manager for macOS and Linux, Homebrew simplifies the installation of software packages and libraries.

4. npm (Node Package Manager) and Yarn: Package managers for JavaScript and Node.js that manage dependencies in web development projects.

5. pip: The package installer for Python, allowing developers to easily install Python libraries and packages from the Python Package Index (PyPI) repository.

6. Composer: A dependency manager for PHP, used to manage libraries and dependencies in PHP projects.

Package managers greatly simplify software development by automating the process of installing and managing software dependencies. This reduces errors, improves efficiency, and facilitates collaboration among developers.

What are the pros and cons of Package Managers?

Package managers offer numerous benefits to software developers and users alike, but they also have drawbacks. Here are the pros and cons of using a package manager:

Pros:

Simplified Installation:

Package managers automate the process of installing software packages, making it straightforward for users to add new software to their systems.

Dependency Management: 

Package managers handle dependencies automatically, ensuring all required libraries and components are installed correctly. This reduces the likelihood of dependency conflicts and makes it easier to manage complex software ecosystems.

Version Control: 

Package managers keep track of software versions and updates, allowing users to quickly upgrade to newer versions when they become available. This helps ensure that software remains up-to-date and secure.

Centralised Repository: 

Package managers typically provide access to a centralised repository of software packages, making it easy for users to discover new software and libraries.

Consistency: 

Package managers enforce consistency across different environments by ensuring that all installations are performed in a standardised manner. This reduces the likelihood of configuration errors and compatibility issues.

Cons:

Limited Control: 

Package managers abstract away many of the details of software installation and configuration, which can sometimes limit users’ control over the installation process. Advanced users may prefer more manual control over their software installations.

Security Risks: 

Because package managers rely on centralised repositories, malicious actors could compromise them and distribute malicious software packages. Users must trust the integrity of the repository and the software packages hosted within it.

Versioning Issues: 

Dependency management can sometimes lead to versioning issues, primarily when multiple software packages depend on conflicting versions of the same library. Resolving these conflicts can be challenging and may require manual intervention.

Performance Overhead: 

Package managers introduce a performance overhead, as they must download and install software packages and their dependencies. In some cases, this overhead may be negligible, but it can become a concern for large projects or systems with strict performance requirements.

Dependency Bloat: 

Dependency management can sometimes lead to “dependency bloat,” where software projects rely on a large number of external libraries and components. This can increase the size of software installations and introduce additional maintenance overhead.

While package managers offer significant benefits in simplifying software installation and dependency management, users must be aware of the potential drawbacks and trade-offs involved.

What is Maven?

Apache Maven is a powerful build automation and dependency management tool primarily used for Java projects. However, it can also manage projects in other languages like C#, Ruby, and Scala. It provides a consistent and standardised way to build, test, and deploy software projects. Here are some critical details about Maven:

Project Object Model (POM): 

Maven uses a Project Object Model (POM) to describe a project’s structure and configuration. The POM is an XML file that contains information about the project’s dependencies, build settings, plugins, and other metadata. It serves as the central configuration file for the project and is used by Maven to automate various tasks.

Dependency Management: 

One of Maven’s key features is its robust dependency management system. Dependencies are specified in the POM file, and Maven automatically downloads the required libraries from remote repositories such as Maven Central. Maven also handles transitive dependencies, automatically resolving and downloading dependencies required by other dependencies.

Convention over Configuration: 

Maven follows the “convention over configuration” principle, which encourages standard project structures and naming conventions. By adhering to these conventions, Maven can automatically infer settings and reduce the configuration required.

Build Lifecycle: 

Maven defines a standard build lifecycle consisting of phases such as validate, compile, test, package, verify, install, and deploy. Each phase represents a specific stage in the build process, and Maven plugins can be bound to these phases to perform various tasks such as compiling source code, running tests, creating JAR or WAR files, and deploying artefacts.

Plugins: 

Maven is highly extensible through plugins, which provide additional functionality for tasks such as compiling code, generating documentation, and deploying artefacts. Maven plugins can be either built-in or custom-developed, and they can be configured in the POM file to customise the build process.

Central Repository: 

Maven Central is the default repository for hosting Java libraries and artefacts. It contains a vast collection of open-source and third-party libraries that can be easily referenced in Maven projects. Additionally, organisations and individuals can set up their own repositories to host proprietary or custom libraries.

Integration with IDEs: 

Maven integrates seamlessly with popular Integrated Development Environments (IDEs) such as Eclipse, IntelliJ IDEA, and NetBeans. IDEs typically provide Maven support through plugins, allowing developers to import Maven projects, manage dependencies, and run Maven goals directly from the IDE.

Transparency and Repeatability: 

Maven uses declarative configuration and standardised project structures to promote transparency and repeatability in the build process. This makes it easier for developers to understand and reproduce builds across different environments.

Overall, Apache Maven is a versatile and widely used tool in the Java ecosystem. It offers powerful features for automating build processes, managing dependencies, and promoting best practices in software development. Its adoption by numerous open-source projects and enterprises underscores its importance in modern software development workflows.

What are typical Security Risks when using Maven?

While Apache Maven is a widely used and generally secure tool for managing dependencies and building Java projects, there are some potential security risks associated with its usage:

Dependency vulnerabilities: 

One of the leading security risks with Maven (as with any dependency management system) is the possibility of including dependencies with known vulnerabilities. If a project relies on a library with a security flaw, it could expose the application to various security risks, including remote code execution, data breaches, and denial-of-service attacks. It’s essential to update dependencies to patched versions regularly and use tools like OWASP Dependency-Check to identify and mitigate vulnerabilities.

Malicious dependencies: 

While Maven Central and other reputable repositories have strict guidelines for publishing artefacts, there is still a risk of including malicious or compromised dependencies in a project. Attackers could compromise a legitimate library or create a fake library with malicious code and upload it to a repository. Developers should only use trusted repositories, verify the integrity of dependencies, and review code changes carefully.

Repository compromise: 

Maven relies on remote repositories to download dependencies, and if a repository is compromised, it could serve malicious or tampered artefacts to unsuspecting users. While Maven Central has robust security measures, smaller or custom repositories may have weaker security controls. Organisations should implement secure repository management practices, such as using HTTPS for communication, signing artefacts with GPG, and restricting access to trusted users.

Man-in-the-middle attacks: 

Maven communicates with remote repositories over HTTP or HTTPS, and if an attacker intercepts the communication, they could tamper with the downloaded artefacts or inject malicious code into the project. To mitigate this risk, developers should use HTTPS for all repository communications, verify SSL certificates, and consider using tools like Maven’s own repository manager or Nexus Repository Manager, which support repository proxies and caching to reduce the reliance on external repositories.

Build script injection: 

Maven’s build scripts (POM files) are XML files that define project configurations, dependencies, and build settings. If an attacker gains unauthorised access to a project’s source code repository or CI/CD pipeline, they could modify the build script to execute arbitrary commands or introduce vulnerabilities into the build process. Organisations should implement proper access controls, code review processes, and security scanning tools to detect and prevent unauthorised changes to build scripts.

By understanding these security risks and implementing best practices for secure software development and dependency management, developers can mitigate potential threats and ensure the integrity and security of their Maven-based projects. Regular security audits, vulnerability scanning, and staying informed about security updates and patches are also essential for maintaining a secure development environment.

How does the dependency-resolving mechanism work in Maven?

Maven’s dependency resolution mechanism is a crucial aspect of its functionality, ensuring that projects have access to the necessary libraries and components while managing conflicts and versioning issues effectively. Here’s how the dependency-resolving mechanism works in Maven:

Dependency Declaration:

Developers specify dependencies for their projects in the project’s POM (Project Object Model) file. Dependencies are declared within the `<dependencies>` element, where each dependency includes details such as group ID, artifact ID, and version.
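As an illustration, a typical declaration inside the `<dependencies>` element might look like this (the coordinates below are an arbitrary example):

```xml
<dependencies>
  <dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.14.0</version>
  </dependency>
</dependencies>
```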

Dependency Tree:

Maven constructs a dependency tree based on the declared dependencies in the POM file. This tree represents the hierarchical structure of dependencies, including direct dependencies (specified in the POM file) and transitive dependencies (dependencies required by other dependencies).

Repository Resolution:

When a build is initiated, Maven attempts to resolve dependencies by searching for them in configured repositories. By default, Maven Central is the primary repository, but developers can specify additional repositories in the POM file or through Maven settings.
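An additional repository can be declared directly in the POM. In this sketch, the `id` and URL refer to a hypothetical in-house repository (`repo.example.com` is a placeholder, not a real host):

```xml
<repositories>
  <repository>
    <id>company-releases</id>
    <url>https://repo.example.com/maven2</url>
  </repository>
</repositories>
```

Maven consults such repositories in addition to Maven Central when resolving artefacts.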

Dependency Download:

If a dependency is not already present in the local Maven repository, Maven downloads it from the remote repository where it’s hosted. Maven stores downloaded dependencies in the local repository (`~/.m2/repository` by default) for future reuse.

Version Conflict Resolution:

Maven employs a strategy for resolving version conflicts when multiple dependencies require different versions of the same library. By default, Maven uses the “nearest-wins” strategy, where the version closest to the project in the dependency tree takes precedence. Developers can explicitly specify versions for dependencies to override the default resolution behaviour. Maven also provides options such as dependency management and exclusions to manage version conflicts better.
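The two mechanisms mentioned above can be sketched in the POM as follows. The `dependencyManagement` block pins a version centrally so it takes precedence over versions pulled in transitively, and an `exclusions` block cuts an unwanted transitive dependency out of the tree (the `org.example:some-library` coordinates are hypothetical; `jackson-databind` is a real library used only as an illustration):

```xml
<!-- Pin a version centrally; it wins over transitively resolved versions -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>2.17.1</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <!-- Exclude an unwanted transitive dependency of a (hypothetical) library -->
  <dependency>
    <groupId>org.example</groupId>
    <artifactId>some-library</artifactId>
    <version>1.0.0</version>
    <exclusions>
      <exclusion>
        <groupId>commons-logging</groupId>
        <artifactId>commons-logging</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
</dependencies>
```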

Transitive Dependency Resolution:

Maven automatically resolves transitive dependencies, ensuring all required libraries and components are included in the project’s classpath. Maven traverses the dependency tree recursively, downloading and including transitive dependencies as needed.

Dependency Caching:

Maven caches downloaded dependencies in the local repository to improve build performance and reduce network traffic. Subsequent builds reuse cached dependencies whenever possible, avoiding redundant downloads.

Dependency Scope:

Maven supports different dependency scopes (e.g., compile, test, runtime) to control the visibility and usage of dependencies during various phases of the build lifecycle. Scopes help prevent unnecessary dependencies in production builds and improve build efficiency.
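A short sketch of the common scopes in a POM (the coordinates are real Maven Central artefacts, chosen only to illustrate each scope):

```xml
<dependencies>
  <!-- compile (the default): on the classpath at compile time and runtime -->
  <dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.14.0</version>
  </dependency>
  <!-- test: only visible when compiling and running tests, kept out of production builds -->
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.13.2</version>
    <scope>test</scope>
  </dependency>
  <!-- provided: needed to compile, but supplied by the runtime (e.g. a servlet container) -->
  <dependency>
    <groupId>jakarta.servlet</groupId>
    <artifactId>jakarta.servlet-api</artifactId>
    <version>6.0.0</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```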

Overall, Maven’s dependency resolution mechanism simplifies the management of project dependencies by automating the process of downloading, organising, and resolving dependencies. This allows developers to focus on writing code rather than manually managing library dependencies.


What attacks target Maven’s cache structure?

Attacks targeting the cache structure of Maven, particularly the local repository cache (`~/.m2/repository`), are relatively rare but can potentially exploit vulnerabilities in the caching mechanism to compromise the integrity or security of Maven-based projects. Here are some potential attacks that could target Maven’s cache structure:

Cache Poisoning:

Description: Cache poisoning attacks involve manipulating the contents of the local repository cache to introduce malicious artefacts or modified versions of legitimate artefacts. Once poisoned, subsequent builds may unwittingly use the compromised artefacts, leading to security vulnerabilities or system compromise.

Attack Vector: Attackers may exploit vulnerabilities in Maven’s caching mechanism, such as improper input validation or insecure handling of cached artefacts, to inject malicious artefacts into the cache.

Mitigation: To mitigate cache poisoning attacks, Maven users should regularly verify the integrity of cached artefacts using checksums or digital signatures. Employing secure repository management practices, such as signing artefacts with GPG and enabling repository managers with artefact validation, can also enhance security.
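As a minimal, self-contained sketch of the checksum verification idea, the following Java class computes the SHA-1 digest of an artefact's bytes and compares it with an expected checksum, mirroring the format of the `.sha1` files Maven stores alongside artefacts in the local repository. The class name and the sample bytes are purely illustrative; in practice the expected value would be read from the artefact's `.sha1` file or a trusted source:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ChecksumVerifier {

    // Compute the lowercase hex SHA-1 digest of the given bytes,
    // matching the format of Maven's .sha1 checksum files.
    public static String sha1Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Return true if the artefact bytes match the expected checksum.
    public static boolean verify(byte[] artefactBytes, String expectedSha1) {
        return sha1Hex(artefactBytes).equalsIgnoreCase(expectedSha1.trim());
    }

    public static void main(String[] args) {
        // Illustrative artefact content; real code would read the cached JAR's bytes.
        byte[] artefact = "example artefact content".getBytes(StandardCharsets.UTF_8);
        String expected = sha1Hex(artefact); // in practice, read from the .sha1 file
        System.out.println("checksum matches: " + verify(artefact, expected));
    }
}
```

Note that `.sha1` files downloaded from the same compromised source offer no protection by themselves; checksums should be compared against values obtained over a trusted channel, or artefact signatures (GPG) should be verified instead.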

Cache Exfiltration:

Description: Cache exfiltration attacks involve an attacker extracting sensitive information from the local repository cache. This could include credentials, private keys, or other confidential data inadvertently stored within cached artefacts or metadata.

Attack Vector: Attackers may exploit vulnerabilities in Maven or its plugins to access and extract sensitive information stored in the local repository cache. For example, a compromised plugin could inadvertently leak credentials stored in Maven settings files.

Mitigation: To mitigate cache exfiltration attacks, developers should avoid storing sensitive information in Maven configuration files or cached artefacts. Instead, they should use secure credential management practices, such as environment variables or encrypted credential stores, to prevent the inadvertent exposure of sensitive data.

Cache Manipulation:

Description: Cache manipulation attacks involve modifying the contents of the local repository cache to alter the behaviour of Maven builds or compromise the integrity of projects. To achieve their objectives, attackers may tamper with cached artefacts, metadata, or configuration files.

Attack Vector: Attackers may exploit vulnerabilities in Maven or its dependencies to manipulate the local repository cache. For example, an attacker could modify cached artefacts to include backdoors or malware.

Mitigation: To mitigate cache manipulation attacks, developers should ensure that the local repository cache is stored in a secure location with restricted access permissions. They should also regularly verify the integrity of cached artefacts and configuration files using checksums or digital signatures. Implementing secure software development practices, such as code reviews and vulnerability scanning, can also help detect and prevent cache manipulation attacks.

While attacks targeting Maven’s cache structure are relatively uncommon, developers should remain vigilant and implement security best practices to safeguard against potential vulnerabilities and threats. Regularly updating Maven and its dependencies to the latest versions and maintaining awareness of security advisories and patches is essential for mitigating the risk of cache-related attacks.

Conclusion:

In conclusion, while attacks explicitly targeting Maven’s cache structure are relatively rare, they pose potential risks to the integrity and security of Maven-based projects. Cache poisoning, cache exfiltration, and cache manipulation are possible attack vectors that attackers may exploit to compromise Maven’s local repository cache (`~/.m2/repository`).

To mitigate these risks, developers and organisations should adopt robust security measures and best practices:

Regularly Verify Artefact Integrity: Employ mechanisms such as checksums or digital signatures to verify the integrity of cached artefacts and ensure they haven’t been tampered with.

Secure Credential Management: Avoid storing sensitive information, such as credentials or private keys, in Maven configuration files or cached artefacts. Instead, use secure credential management practices, such as environment variables or encrypted stores.

Access Control and Permissions: Ensure the local repository cache is stored securely with restricted access permissions to prevent unauthorised access or manipulation.

Update and Patch: Regularly update Maven and its dependencies to the latest versions to mitigate potential vulnerabilities. Stay informed about security advisories and apply patches promptly.

Secure Repository Management: Implement secure repository management practices, including signing artefacts with GPG, enabling repository managers with artefact validation, and using HTTPS for repository communication.

By implementing these security measures and remaining vigilant, developers can reduce the likelihood of cache-related attacks and enhance the overall security posture of Maven-based projects. Additionally, fostering a culture of security awareness and education within development teams can help mitigate risks and respond effectively to emerging threats in the software development lifecycle.
