pub.cache vs pub.storage

We have a small amount of state that needs to be stored persistently for some integrations, and we don’t have a place in the source or target applications to store it. These will not be high-volume integrations, so performance is not a major concern.

pub.storage:put and pub.storage:get seem like they would do the job, but the need to explicitly manage locking seems potentially troublesome.

pub.cache:put and pub.cache:get also seem viable, if we choose the “Eternal” and “Persist to Disk” options on the cache configuration.

Does anyone have experience and opinions on what the best option is for new code?

In general we would recommend the cache services over the storage services, simply because they are more modern. They will perform better, but they are NOT designed for large volumes. If you simply want to store context for a short period of time, then great, but ensure that the volume is a known quantity and not too large.

Otherwise use the storage services, but make sure you reconfigure the ISCore JDBC connection to use a proper database, not the default embedded database.

regards,
John.

Complete academic discussion side trip: I’m often bemused at the buzzwords we adopt from time to time. “Modern” is one of those recent en vogue terms, and it can have condescending undertones. What characteristics are applicable for “modern”? If something gets labelled as “more modern”, what judgements might we make? What capabilities or features can we assume? Will such assumptions be accurate?

May I humbly suggest avoiding labels that are not well-defined and instead explicitly noting the characteristics that are useful for decisions. :slight_smile:

Regarding the OP’s question about “what the best option is”, consider that “best” is always subjective. What works for one in a given situation may not be good for another, even if the situation is similar.

We’ve used the pub.storage services without issue, though our scenario may be a bit simpler than yours: we store the timestamp of the “last record seen”, which the next scheduled run uses as its “get records modified after time”. There is no possibility in our case of multiple threads working on the same keyed object, so the locking aspect was not a significant concern. We always unlock right after a get, and use the implicit locking provided by put.
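To make that get/unlock/put pattern concrete, here is a minimal sketch in plain Java. It does not call the real pub.storage services: `KeyedStore` is a hypothetical in-memory stand-in that only models the behavior described above (a get leaves the key locked until the caller unlocks it, and a put takes the lock just for the write). The class names, method names, and the `orders.lastSeen` key are all made up for illustration.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

class WatermarkSketch {

    /** Hypothetical in-memory stand-in for a keyed store with per-key locks. */
    static class KeyedStore {
        private final Map<String, String> values = new ConcurrentHashMap<>();
        private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

        private ReentrantLock lockFor(String key) {
            return locks.computeIfAbsent(key, k -> new ReentrantLock());
        }

        /** Reads a value and leaves the key locked by the calling thread. */
        String get(String key) {
            lockFor(key).lock();
            return values.get(key);
        }

        /** Releases every hold the calling thread has on the key. */
        void unlock(String key) {
            ReentrantLock lock = lockFor(key);
            while (lock.isHeldByCurrentThread()) {
                lock.unlock();
            }
        }

        /** Writes a value, holding the lock only for the duration of the write. */
        void put(String key, String value) {
            ReentrantLock lock = lockFor(key);
            lock.lock();
            try {
                values.put(key, value);
            } finally {
                lock.unlock();
            }
        }

        boolean isLocked(String key) {
            return lockFor(key).isLocked();
        }
    }

    /** Reads the stored timestamp and unlocks right away, even if parsing fails. */
    static Instant readWatermark(KeyedStore store, String key, Instant fallback) {
        try {
            String raw = store.get(key);
            return raw == null ? fallback : Instant.parse(raw);
        } finally {
            store.unlock(key);
        }
    }

    /** Saves the newest timestamp; relies on the lock that put takes itself. */
    static void saveWatermark(KeyedStore store, String key, Instant lastSeen) {
        store.put(key, lastSeen.toString());
    }

    public static void main(String[] args) {
        KeyedStore store = new KeyedStore();
        String key = "orders.lastSeen"; // hypothetical key name

        // First scheduled run: nothing stored yet, so fall back to the epoch.
        Instant since = readWatermark(store, key, Instant.EPOCH);
        System.out.println("fetch records modified after " + since);

        // ...fetch and process records, then remember the newest one seen.
        saveWatermark(store, key, Instant.parse("2024-01-01T00:00:00Z"));

        // Next scheduled run picks up where the previous one stopped.
        System.out.println("next run starts after " + readWatermark(store, key, Instant.EPOCH));
    }
}
```

Putting the unlock in a `finally` block is the one design choice worth noting: if the read or the parse fails, the key is not left locked for the next scheduled run.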

Choosing between the cache and storage facilities might simply be an arbitrary decision. They are both intended for the same purpose. From the docs:

“The pub.cache services are a tool for maintaining state information for the short term” – p. 80 in Services Reference

pub.storage – “These services are a tool for maintaining state information in the short-term store” – p. 928

The former is implemented using Terracotta Ehcache while the latter uses the “IS Internal” DB, which can be shared by IS instances with or without an IS cluster (we have it without – IS Internal defined to use the same DB and user in 2 instances).

Which to choose depends in part on what your environment currently has set up. Beyond that, it will depend on which you prefer based upon your assessment of the services. Both can do the job.

Thank you both. Our use case is also storing timestamps for ETL-type processing, and we have an external Oracle database configured for the pools, so we will try pub.storage.

Yes, I shouldn’t have used the term “modern”; after all, I’m not that modern myself anymore :smile:. What I wanted to say is that the storage services were written a long time ago, whereas, as @reamon pointed out, the cache services are based on Ehcache and are much more recent. The advantage is that you have more scope for deciding how and where the data gets persisted, and for tuning performance. So, all things being equal, I would go for the cache services, especially if I had the cash to fork out for Terracotta.
regards,
John.
