
EclipseStore High-Performance-Serializer

I will introduce you to the serializer from the EclipseStore project and show you how to use it to take advantage of a new type of serialization.

Ever since I learned Java over 20 years ago, I have wanted a simple way to serialize Java object graphs without the security and performance issues that Java’s built-in serialization brought us. It should be doable in just a few lines of code.

If you want to know how this can be done with the new open-source project Eclipse Serializer, you are in the right place.

Before we look at the open-source project Eclipse Serializer, I want to briefly recap the challenges that come with Java Serialization itself. This background will help you see how powerful the project is.

Java Serialization in a nutshell

Java Serialization is a mechanism provided by the Java programming language that allows you to convert the state of an object into a byte stream. This byte stream can be easily stored in a file, sent over a network, or otherwise persisted. Later, you can deserialize the byte stream to reconstruct the original object, effectively saving and restoring the object’s state.

Here are some key points about Java Serialization:

Serializable Interface: To make a Java class serializable, it needs to implement the `Serializable` interface. This interface doesn’t have any methods; it acts as a marker interface to indicate that the objects of this class can be serialized.

Serialization Process: To serialize an object, you typically use `ObjectOutputStream`. You create an instance of this class and write your object to it.

Deserialization Process: To deserialize an object, you use `ObjectInputStream`. You read the byte stream from a file or network and then use `ObjectInputStream` to recreate the object.

Versioning Considerations: If you change a serialized class’s structure (e.g., by adding or removing fields or changing their types), deserializing old serialized objects may fail. Java provides mechanisms like `serialVersionUID` to help with versioning and compatibility.

Security Considerations: Serialization can be a security risk, especially if you deserialize data from untrusted sources. Malicious code can be executed during deserialization. To mitigate this risk, you should carefully validate and sanitize any data you deserialize or consider alternative serialization mechanisms like JSON or XML.

Custom Serialization: In your class, you can customize the serialization and deserialization process by providing `writeObject` and `readObject` methods. These methods allow you to control how the object’s state is written to and read from the byte stream.
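The key points above can be sketched as a small round trip using only the JDK. The `Person` class and the helper names are made up for illustration and do not come from any library; the checked exceptions are wrapped only to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical example class; name and fields are assumptions for this sketch.
class Person implements Serializable {
    // Explicit version id, used by Java to check stream/class compatibility.
    private static final long serialVersionUID = 1L;

    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

class JavaSerializationDemo {

    // Converts the object into a byte array using ObjectOutputStream.
    static byte[] serialize(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuilds the object from the byte array using ObjectInputStream.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person copy = (Person) deserialize(serialize(new Person("Ada", 36)));
        System.out.println(copy.name + " / " + copy.age); // Ada / 36
    }
}
```

Note that `deserialize` happily reads whatever class the stream names, which is exactly where the security concerns discussed in this article come from.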

In summary, Java Serialization is a valuable feature for persisting objects and sending them across a network. However, it comes with some challenges related to versioning and security, so it should be used cautiously, especially when dealing with untrusted data sources.

What are the security issues

Java Serialization can introduce several security issues, particularly when deserializing data from untrusted sources. Here are some of the security concerns associated with Java Serialization:

Remote Code Execution: One of the most significant security risks with Java Serialization is the potential for remote code execution. When you deserialize an object, the Java runtime system can execute arbitrary code contained within the serialized data. Attackers can exploit this to execute malicious code on the target system. This vulnerability can lead to serious security breaches.

Denial of Service (DoS): An attacker can create a serialized object with a large size, causing excessive memory consumption and potentially leading to a denial of service attack. Deserializing large objects can consume significant CPU and memory resources, slowing down or crashing the application.

Data Tampering: Serialized data can be tampered with during transmission or storage. Attackers can modify the serialized byte stream to alter the state of the deserialized object or introduce vulnerabilities.

Insecure Deserialization: Deserializing untrusted data without proper validation can lead to security issues. For example, if a class that performs sensitive operations is deserialized from untrusted input, an attacker can manipulate the object’s state to perform unauthorized actions.

Information Disclosure: When objects are serialized, sensitive information may be included in the serialized form. If this data is not adequately protected or encrypted, an attacker may gain access to sensitive information.

How To Mitigate Serialization Issues

To mitigate these security issues, consider the following best practices:

Avoid Deserializing Untrusted Data: If possible, avoid deserializing data from untrusted sources altogether. Instead, use safer data interchange formats like JSON or XML for untrusted data. (Or use the EclipseSerializer 😉 )

Implement Input Validation: When deserializing data, validate and sanitize the input to ensure it adheres to expected data structures and doesn’t contain unexpected or malicious data.

Use Security Managers: Java’s Security Manager can be used to restrict the permissions and actions of deserialized code. However, the Security Manager has been deprecated for removal since Java 17, so it is not a long-term option.

Whitelist Classes: Limit the classes that can be deserialized to a predefined set of trusted classes. This can help prevent the deserialization of arbitrary and potentially malicious classes.

Versioning and Compatibility: Be cautious when making changes to serialized classes. Use `serialVersionUID` to manage versioning and compatibility between different versions of serialized objects.

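Security Libraries: Consider using third-party helpers for look-ahead deserialization, such as `ValidatingObjectInputStream` from Apache Commons IO or the SerialKiller library, to restrict which classes may be deserialized. Also keep libraries that are known sources of gadget chains, such as older versions of Apache Commons Collections, up to date.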
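The allow-list idea from the practices above can be tried out with the JDK’s own `ObjectInputFilter` (available since Java 9). This is only a sketch: the filter pattern, the limits and the class name `AllowListDeserialization` are choices made for this example, and a real application needs a pattern that fits its own classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class AllowListDeserialization {

    // Allow java.lang classes and ArrayList, limit depth and stream size, reject everything else.
    static final ObjectInputFilter FILTER = ObjectInputFilter.Config.createFilter(
            "java.lang.*;java.util.ArrayList;maxdepth=20;maxbytes=65536;!*");

    static byte[] serialize(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if the stream could be read with the filter in place,
    // false if the filter rejected one of the classes in the stream.
    static boolean passesFilter(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(FILTER);
            in.readObject();
            return true;
        } catch (InvalidClassException rejected) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(passesFilter(serialize(new ArrayList<>(List.of("a", "b"))))); // true
        System.out.println(passesFilter(serialize(new HashMap<String, String>())));      // false
    }
}
```

The filter is set per stream here; the JDK also supports a process-wide filter, which may be the better fit when deserialization happens in code you do not control.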

In summary, Java Serialization can introduce serious security risks, especially when dealing with untrusted data. It’s essential to take precautions, validate inputs, and consider alternative serialization methods or libraries to enhance security. Additionally, keeping your Java runtime environment up to date is crucial, as newer versions of Java may include security improvements and fixes for known vulnerabilities.

Why is JSON or XML not the perfect solution for the JVM?

Many papers and talks recommend circumventing the security risks of serialization by using XML or JSON, that is, a structured text representation of the data to be transferred. These formats have security problems of their own, but I will address those in a separate article. Two things should be addressed here, however. First, the data must be converted into a text representation, which usually requires more data volume than a pure binary format. In addition, binary data such as images must be re-encoded (for example, as Base64) so that only printable characters are transmitted. Converting to XML and back into the original format costs a lot of time and usually a lot of memory.
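To get a feeling for the size overhead of a text representation, here is a tiny calculation using Base64, a common way to embed binary data in JSON or XML. The 3,000-byte array is just a stand-in for image data; the class name is made up for this sketch.

```java
import java.util.Base64;

class TextEncodingOverhead {

    // Number of characters needed to embed the payload as Base64 text.
    static int base64Length(byte[] payload) {
        return Base64.getEncoder().encodeToString(payload).length();
    }

    public static void main(String[] args) {
        byte[] image = new byte[3_000]; // stand-in for binary image data
        // Base64 maps every 3 bytes to 4 characters, so the payload grows by one third.
        System.out.println(image.length + " bytes -> " + base64Length(image) + " characters");
        // prints: 3000 bytes -> 4000 characters
    }
}
```

This only covers the encoding of the binary payload itself; the surrounding XML or JSON markup adds further volume on top.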

The second point that causes problems in most cases is the data structure. In XML and JSON, object references can only be represented in a cumbersome way, for example, through artificial IDs. This makes processing many times more complicated, slower and more resource-intensive. Even though many solid solutions exist for converting Java objects into XML or JSON, I recommend looking for new approaches from time to time.

EclipseStore – Serializer – Practical Part

Now, let’s get to the practical part of this article. First, we need the dependency, which we add to the pom.xml. At the time of writing, the first release was still being prepared, and a SNAPSHOT version (1.0.0-SNAPSHOT) was available from the repositories. In this case, you still have to add the SNAPSHOT repository (https://oss.sonatype.org/content/repositories/snapshots).

Definition inside the pom.xml.

The rest will happen quickly once we’re ready and have fetched the dependency. For the first test, we created a class called Node. Each Node can have a right and a left child. With this, we can create a tree.

As an example, I created the following construct and then serialized and de-serialized it once using the serializer.

Now, let’s see whether this also works with a general Java object graph. To do this, the class representing the node is changed so that a parent node can also be defined. Cycles can now be set up.

We take the graph listed here as an example.

This graph is also processed without any problems, without the cycles causing any issues.
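Since the original listings are not reproduced here, the following is a hypothetical reconstruction of such a node class with a parent reference, together with a small graph that contains cycles. The serializer calls are shown only as comments because they need the Eclipse Serializer dependency; the API names in them reflect my understanding of the project’s documentation and should be checked against the version you use.

```java
// Hypothetical reconstruction of the node class described in the text.
class Node {
    private final String id;
    private Node parent;
    private Node left;
    private Node right;

    Node(String id) {
        this.id = id;
    }

    // Attaching a child also sets the back reference, which creates a cycle.
    void setLeft(Node child) {
        this.left = child;
        child.parent = this;
    }

    void setRight(Node child) {
        this.right = child;
        child.parent = this;
    }

    String getId() { return id; }
    Node getParent() { return parent; }
    Node getLeft() { return left; }
    Node getRight() { return right; }
}

class CyclicGraphDemo {

    static Node buildGraph() {
        Node root = new Node("root");
        root.setLeft(new Node("left"));
        root.setRight(new Node("right"));
        return root;
    }

    public static void main(String[] args) {
        Node root = buildGraph();
        // root -> left -> parent leads back to root: a cycle in the object graph.
        System.out.println(root.getLeft().getParent() == root); // true

        // Assumed usage with Eclipse Serializer on the classpath (not compiled here):
        //   Serializer<byte[]> serializer = Serializer.Bytes();
        //   byte[] data = serializer.serialize(root);
        //   Node copy = serializer.deserialize(data);
    }
}
```

Note that the class does not implement `Serializable`; according to the article, the serializer does not need it.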

You can make the examples even more complex and try out the subtleties of inheritance. New data types from JDK 17 are also supported. This gives me a powerful tool to handle various tasks. One use can be found in another Eclipse project called EclipseStore, which provides a persistence mechanism based on this serialization. But your own small projects can also benefit from it. I will show how quickly it can be integrated into a REST service.

Building A Simple REST Service

If you want to create a simple REST service for transferring byte streams in Java without using Spring Boot, you can use the Java SE API and the HttpServer class from the com.sun.net.httpserver package, which allows you to create an HTTP server. 

  • We create an HTTP server on port 8080 and define a context for handling requests to “/api/bytestream.”
  • The ByteStreamHandler class handles both POST requests for uploading byte streams and GET requests for downloading byte streams.
  • For POST requests, it reads the incoming byte stream, processes it as needed, and sends a response.
  • For GET requests, it sends a predefined byte stream as a response.
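The points above can be sketched with the JDK’s built-in `HttpServer`. The payload, the reply text and the class name `ByteStreamServer` are placeholders chosen for this example; in the scenario of this article, the bytes would come from and go to the serializer.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

class ByteStreamHandler implements HttpHandler {

    // Predefined payload returned for GET requests (placeholder content).
    static final byte[] PAYLOAD = "hello bytes".getBytes(StandardCharsets.UTF_8);

    @Override
    public void handle(HttpExchange exchange) throws IOException {
        try {
            String method = exchange.getRequestMethod();
            if ("POST".equals(method)) {
                // Read the uploaded byte stream; a real service would deserialize it here.
                byte[] received = exchange.getRequestBody().readAllBytes();
                String reply = "received " + received.length + " bytes";
                send(exchange, 200, reply.getBytes(StandardCharsets.UTF_8));
            } else if ("GET".equals(method)) {
                exchange.getResponseHeaders().add("Content-Type", "application/octet-stream");
                send(exchange, 200, PAYLOAD);
            } else {
                exchange.sendResponseHeaders(405, -1); // method not allowed, no body
            }
        } finally {
            exchange.close();
        }
    }

    private static void send(HttpExchange exchange, int status, byte[] body) throws IOException {
        exchange.sendResponseHeaders(status, body.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(body);
        }
    }
}

class ByteStreamServer {

    // Starts the server on the given port; port 0 lets the OS pick a free one.
    static HttpServer start(int port) {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
            server.createContext("/api/bytestream", new ByteStreamHandler());
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        start(8080);
        System.out.println("Listening on http://localhost:8080/api/bytestream");
    }
}
```

With the server running, something like `curl --data-binary @file http://localhost:8080/api/bytestream` should exercise the POST path, and a plain GET returns the placeholder payload.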

Remember that this is a simple example, and you can expand upon it to handle more complex use cases and error handling as needed for your specific application. Also, note that the com.sun.net.httpserver package is part of the JDK, but it may not be available in all Java distributions.

Conclusion:

We’ve looked at the typical problems with Java’s original serialization and how cumbersome it is to use. The detour via JSON and XML is unnecessary when communicating from JVM to JVM using the open-source Eclipse Serializer project. There are no restrictions when it comes to modelling a graph: the new data types up to and including JDK 17 are already supported, and cycles within the graph are no problem either.

Using the Serializable interface is also unnecessary and does not influence processing. The easy handling allows it to be used even in tiny projects such as the REST service shown here using on-board JDK resources. A larger project that uses the serializer is the open-source project EclipseStore. A high-performance persistence mechanism for the JVM is offered here.

Happy Coding

Sven

EclipseStore – Storing more complex data structures

How to store complex data structures using EclipseStore? Are there any restrictions? How can you work more efficiently on such structures? We will now get to the bottom of these questions here.

In the first part of my series, I showed how to prepare EclipseStore for use in a project. We also initialized the StorageManager and saved, modified and deleted the first data. But what about more complex structures? Can you use inheritance? To do this, we will now create a small class model.

First, let’s look at what inheritance looks like. To do this, we take an interface called BaseInterfaceA, an implementation BaseClassA and a derivative LevelOneA. We will now try to save this and see how it behaves depending on the input when saving.

Here is the listing of the classes and interfaces involved. Since we already saw in the first part that the implementation variant has no influence, I will use a standard implementation regarding getters and setters or the constructors.
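The original listing is not reproduced here, so the following is a hypothetical reconstruction of the class model. The `toString` formats follow the console output shown in this article; the interface method, the constructors and the helper class `InheritanceDemo` are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

interface BaseInterfaceA {
    String getValueBaseA(); // assumed method; the original interface is not shown
}

class BaseClassA implements BaseInterfaceA {
    private String valueBaseA;

    BaseClassA(String valueBaseA) {
        this.valueBaseA = valueBaseA;
    }

    @Override
    public String getValueBaseA() { return valueBaseA; }

    public void setValueBaseA(String valueBaseA) { this.valueBaseA = valueBaseA; }

    @Override
    public String toString() {
        return "BaseClassA{valueBaseA='" + valueBaseA + "'}";
    }
}

class LevelOneA extends BaseClassA {
    private String valueOneA;

    LevelOneA(String valueOneA) {
        super(valueOneA); // assumption: reuse the value for the base attribute
        this.valueOneA = valueOneA;
    }

    public String getValueOneA() { return valueOneA; }

    public void setValueOneA(String valueOneA) { this.valueOneA = valueOneA; }

    @Override
    public String toString() {
        return "LevelOneA{valueOneA='" + valueOneA + "'}";
    }
}

class InheritanceDemo {

    // A list typed by the interface that holds different implementations.
    static List<BaseInterfaceA> mixedList() {
        List<BaseInterfaceA> elements = new ArrayList<>();
        elements.add(new LevelOneA("levelOneA - 01"));
        elements.add(new BaseClassA("BaseClassA - 02"));
        elements.add(new LevelOneA("levelOneA - 03"));
        return elements;
    }

    static void printElements(List<? extends BaseInterfaceA> elements) {
        elements.forEach(System.out::println);
        System.out.println(" ==========");
    }

    public static void main(String[] args) {
        // In the article, a list like this is registered as the storage root and stored.
        printElements(mixedList());
    }
}
```

The cases discussed in this article differ only in the declared element type of the root list (`LevelOneA`, `BaseClassA` or `BaseInterfaceA`); the instances themselves keep their concrete type.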

Case I: Direct saving of the respective classes

In the first case, the entities are put directly into a list and saved. As we can see, there is nothing special here; this is the same case as in the first article. The output of the printElements method is exactly what we expected.

console output:

LevelOneA{valueOneA='levelOneA - 01'}
LevelOneA{valueOneA='levelOneA - 02'}
LevelOneA{valueOneA='levelOneA - 03'}
 ==========

Case II: Save as a base class

Now, let’s consider the case where a class is passed to the StorageManager for storage as one of its base classes. In this case, the LevelOneA class is passed as the base class BaseClassA. It should be noted that this base type was also used when defining the root list.

The output again meets our expectations.

LevelOneA{valueOneA='levelOneA - 01'}
LevelOneA{valueOneA='levelOneA - 02'}
LevelOneA{valueOneA='levelOneA - 03'}
 ==========

Case III: Saving an implementation as an interface

Now let’s move on to the case in which we proceed via the interface. There are no problems here either. Everything behaves as expected.

The output on the console is still unchanged.

LevelOneA{valueOneA='levelOneA - 01'}
LevelOneA{valueOneA='levelOneA - 02'}
LevelOneA{valueOneA='levelOneA - 03'}
 ==========

Case IV: Saving a mixed list via a common interface

What happens if we define a list of type BaseInterfaceA as root and then add instances of different implementations? How does EclipseStore behave during the save process? To make a long story short, it works as intended. All instances are stored correctly, so we have no loss of information.

The console output will then look like this.

LevelOneA{valueOneA='levelOneA - 01'}
BaseClassA{valueBaseA='BaseClassA - 02'}
LevelOneA{valueOneA='levelOneA - 03'}
 ==========

Storing lists, trees and graphs

Now that we’ve looked at whether inheritance is supported as much as we hoped, we’re getting to the point where we can start thinking about the data structures themselves. We have seen that simple entities and lists of entities are relatively easy for EclipseStore. Now, let’s go one step further and look at trees and graphs.

Storing of trees

In computer science, a tree is a data structure that can have branches but does not contain any cycles. This makes trees a special form of graph, just as lists are a special form of tree. So, let’s look at a tree implementation in which a node can have two child nodes. Of course, you can also build this structure with n child nodes, but this adds no value in our case. If we create such a tree and pass the root node to the StorageManager, we can save this tree without any further action. Modifying individual elements also works as usual: you can change the desired node and then store either this node or any node above it.
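A hypothetical reconstruction of such a binary tree node, built so that its `toString` matches the console output shown in this article. The setter names and the helper class `TreeDemo` are assumptions; handing the root node to the StorageManager is only indicated as a comment because it needs the EclipseStore dependency.

```java
// Hypothetical reconstruction of the tree node described in the text.
class Node {
    private final String id;
    private Node leftNode;
    private Node rightNode;

    Node(String id) {
        this.id = id;
    }

    void setLeftNode(Node leftNode) { this.leftNode = leftNode; }
    void setRightNode(Node rightNode) { this.rightNode = rightNode; }
    Node getLeftNode() { return leftNode; }
    Node getRightNode() { return rightNode; }

    // Recursion is safe here because a tree contains no cycles.
    @Override
    public String toString() {
        return "Node{id='" + id + "', leftNode=" + leftNode + ", rightNode=" + rightNode + "}";
    }
}

class TreeDemo {

    static Node buildTree() {
        Node left = new Node("Root-L");
        left.setLeftNode(new Node("Root-L-L"));

        Node right = new Node("Root-R");
        right.setLeftNode(new Node("Root-R-L"));
        right.setRightNode(new Node("Root-R-R"));

        Node root = new Node("rootNode");
        root.setLeftNode(left);
        root.setRightNode(right);
        return root;
    }

    public static void main(String[] args) {
        Node root = buildTree();
        // Assumed usage with EclipseStore on the classpath (not compiled here):
        //   storageManager.setRoot(root);
        //   storageManager.storeRoot();
        System.out.println(root);
    }
}
```

The output is printed on a single line; the formatted version in the article is easier to read.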

Output to the console (subsequently formatted for readability):

Node{id='rootNode',
  leftNode=Node{id='Root-L',
    leftNode=Node{id='Root-L-L',
      leftNode=null,
      rightNode=null
    },
    rightNode=null
  },
  rightNode=Node{id='Root-R',
    leftNode=Node{id='Root-R-L',
      leftNode=null,
      rightNode=null
    },
    rightNode=Node{id='Root-R-R',
      leftNode=null,
      rightNode=null
    }
  }
}

Storing graphs

Unfortunately, you don’t only have to deal with lists and trees. Complex data models often have cycles. These can come about, for example, through bidirectional relationships. Unwanted cycles can also arise in a model that grows over time and is repeatedly extended. Whether these cycles are tightly or loosely coupled plays a minor role. The following example creates a graph that contains multiple cycles. Can the graph then be saved, modified and loaded? Does EclipseStore recognize these structures, and can it resolve or break the loops?

In this example, the graph consists of nodes of the GraphNode class. This class contains an attribute of type String to store an ID. There is also a reference to the parent node and a list of child nodes. This means you can now create any nested graph you want.
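A minimal sketch of what such a GraphNode class could look like, assuming the three attributes described above. The method names, the example graph and the `countReachable` helper are made up for illustration. The helper uses an identity-based visited set to walk the graph without running in circles; it says nothing about how EclipseStore handles cycles internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical reconstruction of the graph node described in the text.
class GraphNode {
    private final String id;
    private GraphNode parent;
    private final List<GraphNode> children = new ArrayList<>();

    GraphNode(String id) {
        this.id = id;
    }

    // Bidirectional link: the child knows its parent and the parent its child.
    void addChild(GraphNode child) {
        children.add(child);
        child.parent = this;
    }

    String getId() { return id; }
    GraphNode getParent() { return parent; }
    List<GraphNode> getChildren() { return children; }
}

class GraphDemo {

    // Builds root -> a -> c -> root (a cycle) plus a second child b.
    static GraphNode buildGraph() {
        GraphNode root = new GraphNode("rootNode");
        GraphNode a = new GraphNode("a");
        GraphNode b = new GraphNode("b");
        GraphNode c = new GraphNode("c");
        root.addChild(a);
        root.addChild(b);
        a.addChild(c);
        c.addChild(root); // closes the cycle and makes c the parent of root
        return root;
    }

    // Counts all nodes reachable from start; the visited set stops endless loops.
    static int countReachable(GraphNode start) {
        Set<GraphNode> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        collect(start, visited);
        return visited.size();
    }

    private static void collect(GraphNode node, Set<GraphNode> visited) {
        if (node == null || !visited.add(node)) {
            return; // nothing to do, or node already seen
        }
        collect(node.getParent(), visited);
        node.getChildren().forEach(child -> collect(child, visited));
    }

    public static void main(String[] args) {
        System.out.println(countReachable(buildGraph())); // 4
    }
}
```

A recursive `toString` like the one used for the tree would never terminate on this structure, which is why the sketch leaves it out.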

In this case, the graph looks like this.

The node rootNode is passed to the StorageManager as root and saved. A subsequent loading of the root results in the previously created graph.

As we can see from this example, EclipseStore can store graphs without any problems. Cycles are not an issue here.

Conclusion:

We have now seen that EclipseStore can store any data structure in its entirety; lists, trees and graphs are not a limitation. This allows us to create the data model independently of the persistence layer. This is a possibility that I have always been looking for in the last 20 years of my Java activities.

We will now deal with the intricacies of EclipseStore in the following parts. It remains exciting.

Happy coding

Sven

How to start with EclipseStore – 01

We will take the first steps with EclipseStore here. I will show what EclipseStore is, how you can integrate it into your project and what the first steps look like from a developer’s perspective. All in all, it is a practical introduction.

What is EclipseStore?

First, I would like to briefly explain what EclipseStore actually is. Simply put, it is a mechanism for storing Java object trees. This initially sounds very theoretical, but it is much simpler than you might think.

When I started developing Java applications in 1996, I dreamed of being able to easily store the objects I created in my application. I found the idea that serialization could be used to write everything to the hard drive as a byte stream very appealing. Unfortunately, reality looked different and still looks different today. You were quickly confronted with various technologies, all with their own characteristics. I still remember very clearly the first steps in which the application’s data had to be persisted via a JDBC interface. Who doesn’t know the JDBC-ODBC bridge from back then? A typical application now had JDBC connections, mapping layers that translate everything into SQL and, of course, caches in various places so that the application’s responsiveness remained tolerable for the user. None of this is easy when dealing with a more complex data model.

With EclipseStore, we now have a tool that eliminates most of these technology layers. So we have finally achieved what was promised to us in 1996.

Let’s now turn to the implementation in our own project. For this, we need the Maven dependencies. For this example, I used the first final version. It is based on MicroStream version 8, which forms the basis for the Eclipse project. As far as I know, MicroStream itself is no longer being actively developed.

Now that we have added the necessary dependencies to the project, we can start with the first lines of source code.

Basic first Steps:

The Storage Manager:

We will now look at the essential elements of the architecture. The linchpin of the persistence solution is the “Storage Manager”. This is the interface through which the behaviour can be configured and through which all entities are handed over to persistence, retrieved and deleted. The storage manager itself is available in various forms. We will focus on the embedded storage here for now. This implementation is an in-memory solution that accesses the local hard drive directly. Don’t let the name confuse you: unlike the embedded variants that are common with RDBMS systems, it is not a tiny solution that only allows a little testing. The embedded storage implementation is absolutely suitable for production.

So, an instance of this implementation is needed. The easiest way to get one is to call `EmbeddedStorage.start()`.

The chosen implementation depends on which dependencies have been defined in the pom.xml. In our case, it’s the embedded implementation. We will discuss various other implementations in the following parts. We will also take a closer look at how the behaviour can be configured later. These parameters are irrelevant for the first steps and can safely be ignored.

Storage Root:

The storage manager now needs to know the root of the graphs to be stored. You can imagine this as described below.

If we only have one element that needs to be stored, then no further wrapper is necessary. However, if we have two parts that we want to save, we need a container to hold both elements. This container is then the root through which initial access to all elements can occur. This can be a list or a class specifically tailored to your needs. This container must now be made available for persistence.

Let’s look at an example. Two different strings should be stored here. A list is used as a container. This means that the instance of the list is the StorageRoot element. This must now be declared as an anchor point.

The “storeRoot()” call causes this modification to be saved at the root. We will see this method call more often. It persists all changes in the graph, starting from the root. In our case, the root itself is saved.

How can another element be added to this container? There is no need to keep a reference to this container. Instead, we can ask the Storage Manager for this reference. One thing must not be forgotten at this point: if you overwrite the root, all previously saved elements will be deleted, because they are then no longer under the supervision of the persistence layer.

One downside at this point: unfortunately, the “root()” method is not typed, so you have to know exactly what type the container is. We’ll look at how to deal with this better later. I ask for your patience here.

Any number of elements can now be added to or removed from this list. To save the changes, the StorageManager is informed of them. Let’s look at this in a little more detail. We will expand and modify the previous source code for this. Instead of simple strings, a separate class with two attributes is now used. The first attribute is again a string and represents the data value. The second attribute is an instance of the type LocalDateTime and is an automatically set timestamp. We can already see here that it is no longer a simple attribute. The constructor takes the data value, which can be modified later. The timestamp cannot be set explicitly; there is only a getter for it. Such constructs can also be processed by EclipseStore without any further action.
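A hypothetical version of such a class could look like the sketch below. The class name DataElement is taken from the text that follows; the field and method names are assumptions. The EclipseStore calls appear only as comments because they need the dependency on the classpath, and they reflect my understanding of the API rather than a verified listing.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reconstruction of the class described in the text.
class DataElement {
    private String value;
    // Set automatically on creation; there is deliberately no setter.
    private final LocalDateTime timestamp = LocalDateTime.now();

    DataElement(String value) {
        this.value = value;
    }

    String getValue() { return value; }
    void setValue(String value) { this.value = value; }
    LocalDateTime getTimestamp() { return timestamp; }

    @Override
    public String toString() {
        return "DataElement{value='" + value + "', timestamp=" + timestamp + "}";
    }
}

class DataElementDemo {

    public static void main(String[] args) {
        // The list plays the role of the storage root.
        List<DataElement> root = new ArrayList<>();
        root.add(new DataElement("first"));
        root.add(new DataElement("second"));

        // Assumed usage with EclipseStore on the classpath (not compiled here):
        //   EmbeddedStorageManager storageManager = EmbeddedStorage.start();
        //   storageManager.setRoot(root);
        //   storageManager.storeRoot();

        root.get(0).setValue("first (modified)");
        //   storageManager.store(root.get(0)); // store the modified element itself
        //   storageManager.shutdown();

        root.forEach(System.out::println);
    }
}
```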

Case I – We add elements to the container:

In our case, the simplest case is to add one or more elements to the root element, a list of type DataElement. This is done by adding the desired elements to the list and calling the “storeRoot()” method at the root level or saving the elements yourself.

And here is the method in which the elements are specified directly.

Case II – We remove elements from the container:

The opposite of adding elements is deleting them. Here, too, the process is very straightforward. However, the different options may look unusual at first. Let’s start with the most intuitive version: the element is removed from the holding object, here the root itself. This reduced instance is then passed to the StorageManager with the request to save the modification.

However, you can also use the removed element itself. For this purpose, the element previously removed from the holding instance is passed to the StorageManager for storage. This approach works but is anything but intuitive. For the sake of comprehensibility, you should not write it this way. Your colleagues will be grateful.

Case III – We modify the elements within the container:

Since elements can now be added and removed, the only thing missing is the modification. Here, too, the procedure is analogous to adding. The instance is modified and then passed to the StorageManager for storage. The saving process can, of course, also take place via a holding element. For readability reasons, however, saving the modified elements themselves is strongly advisable. An exception can, of course, be made when you change many elements within a holding instance.

Conclusion:

We have now seen how easy it is to get started with EclipseStore. Elements can now be saved without any further effort. This way, you can realize the first small applications. But that’s not the end of it. The following parts will address the intricacies and challenges of more complex applications. So it remains exciting.

Happy Coding

Sven