Introduction to Caching

Caching is a pivotal technique for improving the performance of many kinds of applications. Essentially, it is the process of storing data in a cache – a specialized hardware or software component designed for fast data retention and retrieval. The goal of caching is to serve future requests for that data more quickly than the original source can.

The Significance of Caching

The importance of caching in application development cannot be overstated. When an application needs to fetch data, it typically queries back-end databases or other systems. When request volumes are high, direct and repeated access to these back-end systems strains resources and lengthens response times. Caching addresses this by storing frequently requested data, diminishing the load on back-end systems and ensuring swifter responses.

Particularly in the context of HTTP GET requests, caching serves two purposes: speeding content delivery to users and reducing server workload. The latter is especially important for avoiding unnecessary computational demands on server resources.

The Mechanics of Caching

At its core, caching involves storing data in high-speed access components like RAM (Random-access Memory), often supplemented by software elements. The fundamental objective of a cache is to streamline data access by minimizing reliance on slower, underlying storage layers.

Upon the arrival of a new data request, the cache is checked first. A ‘cache hit’ occurs when the data is found in the cache, allowing rapid retrieval. Conversely, a ‘cache miss’ occurs when the data isn’t in the cache, requiring access to the slower primary data store. The more requests a caching system can serve from the cache, the faster the overall system performs.
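To make the hit-or-miss decision concrete, here is a sketch of the pattern as a Mule 4 flow (Mule specifics are covered later in this article). All names here – myStore, httpListener, originConfig, /data – are illustrative placeholders, not part of any standard configuration:

    <os:object-store name="myStore"/>

    <flow name="lookupFlow">
      <http:listener config-ref="httpListener" path="/lookup"/>
      <!-- Check the cache first -->
      <os:contains key="#[attributes.queryParams.id]" objectStore="myStore" target="isHit"/>
      <choice>
        <when expression="#[vars.isHit]">
          <!-- Cache hit: serve the stored value without touching the origin -->
          <os:retrieve key="#[attributes.queryParams.id]" objectStore="myStore"/>
        </when>
        <otherwise>
          <!-- Cache miss: fetch from the slower origin, then populate the cache -->
          <http:request method="GET" config-ref="originConfig" path="/data"/>
          <os:store key="#[attributes.queryParams.id]" objectStore="myStore">
            <os:value>#[payload]</os:value>
          </os:store>
        </otherwise>
      </choice>
    </flow>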

Types of Caching

There are two types of caching:

  1. Server-side caching, where responses are cached on or near the server (for example, in the API layer)
  2. Client-side caching, where responses are cached by the consumer (for example, in a browser)

Advantages of Implementing Caching

Network Efficiency

Caching significantly reduces network costs by storing content at various points within the network path between the content origin and the consumer. When cached closer to the user, it minimizes additional network activity, thereby optimizing bandwidth usage.

Enhanced Responsiveness

A major benefit of caching is improved response times. Since it negates the need for a complete network round trip, content retrieval becomes substantially faster. Browser caches, for example, can provide nearly instantaneous access to content.

Optimized Performance on Existing Hardware

Caching allows for more efficient use of the original server hardware. By enabling aggressive caching, the content origin server can deliver higher performance without additional hardware investment, leveraging the capacities of servers along the delivery path.

Content Availability During Network Disruptions

Caching policies can maintain content availability even during short-term disruptions from the origin servers, ensuring continuous access for end-users.

Appropriate Scenarios for Caching

Token Caching in REST API

In scenarios involving REST APIs that require JWT tokens, caching these tokens minimizes repeated calls to the authorization server and improves API response times, as in the sketch below.
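As a minimal sketch in Mule 4 using the Object Store connector (covered later in this article) – the names tokenStore and authApi are illustrative, and the TTL should match the token lifetime:

    <os:object-store name="tokenStore" entryTtl="30" entryTtlUnit="MINUTES"/>

    <flow name="getTokenFlow">
      <!-- Check whether a token is already cached -->
      <os:contains key="jwt" objectStore="tokenStore" target="hasToken"/>
      <choice>
        <when expression="#[not vars.hasToken]">
          <!-- Cache miss: request a fresh token and store it for reuse -->
          <http:request method="POST" config-ref="authApi" path="/oauth/token"/>
          <os:store key="jwt" objectStore="tokenStore">
            <os:value>#[payload.access_token]</os:value>
          </os:store>
        </when>
      </choice>
      <!-- Serve the (possibly just refreshed) token -->
      <os:retrieve key="jwt" objectStore="tokenStore" target="jwtToken"/>
    </flow>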

Static Data Caching

Caching is ideal for static data retrieval from backend databases, reducing the frequency of database queries.

Batch Data Changes

For back-end systems that update data only through daily or nightly batches, caching the existing data is safe and ensures consistent availability, since the data cannot become stale until the next batch load.

Caching in REST APIs

REST architecture inherently supports caching:

  • Responses to GET requests are cacheable by default.
  • Responses to POST requests can be made cacheable with appropriate headers, while responses to PUT and DELETE requests are generally not cacheable.
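For instance, a response can declare its own cacheability through standard HTTP headers (the values below are illustrative):

    HTTP/1.1 200 OK
    Content-Type: application/json
    Cache-Control: public, max-age=300

Here, Cache-Control tells any cache along the path that the response may be stored and reused for up to 300 seconds.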

Caching Strategies in Mule

Caching is an effective strategy for reducing load and latency and improving performance. Caching can be implemented in Mule using one of two strategies:

  • Using “HTTP Caching” policy in API Manager to cache the whole API response
  • Using “Cache Scope” or “Object Store” Connector to cache specific endpoints

Generally, it is advised not to cache the whole API response, as certain endpoints may need to remain uncached. In this article we will go through both strategies.

HTTP Caching Policy

This policy is typically used for static data and involves caching the entire API response. It relies on Mule’s object store for data retention and adheres to HTTP caching directives.

  • HTTP Caching is generally used for static lookup data.
  • With this policy, repeated calls to the backend system to fetch the same data are avoided, which improves API performance by eliminating expensive data processing.
  • The policy internally uses the Mule object store to hold the cache.
  • It follows HTTP caching directives, including invalidation, and it requires Object Store v2 to be enabled on the application when deploying to CloudHub.

Cache Scope and Object Store Connector

These tools allow for specific endpoint caching, providing flexibility in managing cacheable and non-cacheable endpoints. This approach is recommended over full API response caching to cater to varied endpoint requirements.

In short, caching is a strategic tool in MuleSoft that boosts performance, reduces server load, and ensures content availability, making it an essential component of efficient application design.

Steps to Implement HTTP Caching Policy

  1. Deploy your Mule application to CloudHub with Object Store v2 enabled.
  2. In API Manager, navigate to your active API.
  3. Open the API and click the Policies tab on the left panel.
  4. Click “Add Policy” and search for the “HTTP Caching” policy.
  5. Select the “HTTP Caching” policy and click Next.
You will be able to see the following fields which we need to configure:

    • HTTP Caching Key – Defines the unique key for an incoming request. Must be a DataWeave expression.
    • Maximum Cache Entries – Maximum number of entries to cache.
    • Distributed – Set to true when running in a cluster or on multiple workers.
    • Persistent – Set to true to persist cached data across runtime restarts.
    • Follow HTTP Caching directives – Adds the appropriate HTTP headers to the response to tell the client that caching is in use.
    • Invalidation Header – Specifies a header clients can send when they do not want a cached result.
    • Conditional Request Caching Expression – Evaluated against the incoming request; only when it is true is caching used, i.e., the result is stored in the cache and reused when the same request arrives again.
    • Conditional Response Caching Expression – Evaluated against the response; only when it is true is the response stored in the cache.
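For example, a caching key that distinguishes requests by path and query string could be written as the following DataWeave expression (illustrative, not the policy’s default):

    #[attributes.rawRequestPath ++ '?' ++ (attributes.queryString default '')]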

6. Update the policy configuration as needed and click Apply.

7. You can now see that the “HTTP Caching” policy is applied to your API.

8. Hit your deployed API. On the first hit you can see the logs (if you have added loggers in your application) in the CloudHub console, and the response contains no cache-related headers.

9. When you hit the API a second time, the CloudHub logs do not appear, and the response headers include a cache age header.
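For example, the second response might include a header like the following, where the value (illustrative here) is the number of seconds the entry has been in the cache:

    Age: 42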

In this way we can apply and configure the HTTP Caching policy on APIs.

Cache Scope (Without Object Store)

  • The Cache scope is for storing and reusing frequently called data. We can use a Cache scope to reduce the processing load on the Mule instance and to increase the speed of message processing within a flow.
  • On each request it calculates a key using the algorithm below and then stores the response payload of the flow as the value for that key. We can also set our own key through a custom caching strategy.
    key = SHA256KeyGenerator + SHA256Digest
  • Rather than invoking the backend APIs and performing many database queries for each customer query, you can use the Cache scope to send fewer requests – say, once every 10 minutes – and respond to the customer with a cached response, as in the sketch after this list.
  • The Cache scope stores values in memory (heap memory) by default, so when the application restarts or crashes, the cached data is cleared.
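As a minimal sketch (the names customerFlow, httpListener, and backendApi are illustrative), wrapping the expensive call in a Cache scope is all that is needed; without a cachingStrategy-ref it uses the default in-memory strategy described in the next section:

    <flow name="customerFlow">
      <http:listener config-ref="httpListener" path="/customers"/>
      <!-- Everything inside the scope runs only on a cache miss;
           on a hit the previously stored payload is returned directly -->
      <ee:cache>
        <http:request method="GET" config-ref="backendApi" path="/customers"/>
      </ee:cache>
    </flow>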

Note: If we want the cached data to persist, we need to use a persistent object store.

We’ll see more of this in practice while building our Mule application.

Caching Strategy

  • By default, the Cache scope uses a default caching strategy that stores data in an in-memory object store.
  • The disadvantage of an in-memory object store is that if your application crashes, your data is erased.
  • If you want the data to be persistent, create a new custom object store or reference an existing one, as in the sketch below.
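A minimal sketch of such a custom strategy backed by a persistent object store (the names persistentCachingStrategy and cacheStore are illustrative; the 10-minute TTL echoes the example above):

    <ee:object-store-caching-strategy name="persistentCachingStrategy"
        keyGenerationExpression="#[attributes.queryParams.id]">
      <!-- persistent="true" survives runtime restarts; entries expire after 10 minutes -->
      <os:private-object-store alias="cacheStore" persistent="true"
          entryTtl="10" entryTtlUnit="MINUTES"/>
    </ee:object-store-caching-strategy>

    <ee:cache cachingStrategy-ref="persistentCachingStrategy">
      <!-- the expensive processing to be cached goes here -->
      <logger level="INFO" message="computed on cache miss"/>
    </ee:cache>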
