source link: https://www.percona.com/blog/2021/06/03/mongodb-message-cannot-add-session-into-the-cache-toomanylogicalsessions/
The (In)famous MongoDB Message: Cannot Add Session Into the Cache – TooManyLogicalSessions
This week, I had an interesting case in which one of our customers was hitting the "cannot add session into the cache" error, reported as TooManyLogicalSessions.
While studying this issue and discussing it with my colleagues, I had the chance to explore in detail how logical sessions are handled in MongoDB. First, a brief explanation of how the entire process works and what logical sessions are:
What are Logical Sessions
Logical sessions allow operations to be tracked as they are consumed throughout the system. This enables simple, precise cancellation of operations and distributed garbage collection. For example, a find() operation will create cursors in all the relevant shards in a cluster. Each cursor will start acquiring results for its first batch to return. Before logical sessions existed, canceling an operation like this meant traversing all the shards with administration privileges, figuring out which activity was associated with the operation, and then killing it.
With logical sessions, it is now possible to kill the sessions behind such an operation with the killSessions command.
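As a minimal sketch of what such a command looks like, the Python snippet below only builds the command document; it does not talk to a server. The helper name build_kill_sessions_command and the randomly generated session ids are assumptions made for illustration, and how the UUIDs are encoded on the wire depends on the driver in use.

```python
import uuid


def build_kill_sessions_command(session_ids):
    """Build a killSessions command document for the given session UUIDs.

    The shape {"killSessions": [{"id": <UUID>}, ...]} follows the command's
    documented form. Sending it is left to whichever driver is in use.
    """
    return {"killSessions": [{"id": sid} for sid in session_ids]}


# Hypothetical session ids; real ones come from the server or the driver.
example_ids = [uuid.uuid4(), uuid.uuid4()]
command = build_kill_sessions_command(example_ids)
print(command)
```

The resulting document would then be passed to the driver's generic "run command" call against the node or router that should kill the sessions.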
How Logical Sessions Work
Logical sessions have an in-memory cache, and the physical data is stored in the config.system.sessions collection. Each node (router, shard, config server) has its own in-memory cache. A cache entry contains:
- _id: The session’s logical session id
- user: The session’s logged-in username (if authentication is enabled)
- lastUse: The date and time that the session was last used
The in-memory cache periodically persists entries to the config.system.sessions collection, known as the “sessions collection.” The sessions collection has different placement behavior based on whether the user is running a standalone node, a replica set, or a sharded cluster.
How does the expiration process work?
When a node receives a request with attached session info, it will place that session into the logical session cache. If a request corresponds to a session that already exists in the cache, the cache will update the cache entry’s lastUse field to the current date and time.
At a regular interval of five (5) minutes (user-configurable), the logical session cache will sync with the sessions collection. Inside the class, this is known as the “refresh” function. There are four steps to this process:
- All sessions that have been used on this node since the last refresh will be upserted to the sessions collection. This means that sessions that already exist in the sessions collection will just have their lastUse fields updated.
- All sessions that have been ended in the cache on this node (via the endSessions command) will be removed from the sessions collection.
- Sessions that have expired from the sessions collection will be removed from the logical session cache on this node.
- All cursors registered on this node that match sessions that have been ended (step 2) or were expired (step 3) will be killed.
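The four refresh steps above can be sketched with a toy model. This is not MongoDB's implementation: the class ToyLogicalSessionCache, its method names, and the plain dict standing in for the sessions collection are all assumptions for illustration, and the TTL index is simulated by a simple age check inside refresh().

```python
from datetime import datetime, timedelta


class ToyLogicalSessionCache:
    """Toy model of one node's logical session cache (not MongoDB's code)."""

    def __init__(self):
        self.active = {}      # session id -> lastUse for sessions in the cache
        self.touched = set()  # session ids used since the last refresh
        self.ended = set()    # session ids ended since the last refresh
        self.cursors = {}     # cursor id -> session id

    def use(self, session_id, now):
        # New sessions are added; known sessions only get lastUse updated.
        self.active[session_id] = now
        self.touched.add(session_id)

    def end(self, session_id):
        self.active.pop(session_id, None)
        self.touched.discard(session_id)
        self.ended.add(session_id)

    def refresh(self, collection, now, timeout):
        # Step 1: upsert sessions used on this node since the last refresh.
        for sid in self.touched:
            collection[sid] = self.active[sid]
        # Step 2: remove ended sessions from the sessions collection.
        for sid in self.ended:
            collection.pop(sid, None)
        # Stand-in for the TTL index: drop documents older than the timeout.
        expired = {sid for sid, last in collection.items() if now - last > timeout}
        for sid in expired:
            del collection[sid]
            # Step 3: expired sessions leave the in-memory cache as well.
            self.active.pop(sid, None)
        # Step 4: kill cursors that belong to ended or expired sessions.
        dead = self.ended | expired
        killed = [cid for cid, sid in self.cursors.items() if sid in dead]
        for cid in killed:
            del self.cursors[cid]
        self.touched.clear()
        self.ended.clear()
        return killed


t0 = datetime(2021, 6, 3, 12, 0)
ttl = timedelta(minutes=30)
cache = ToyLogicalSessionCache()
sessions_collection = {}

cache.use("s1", t0)
cache.use("s2", t0)
cache.cursors["c1"] = "s1"
cache.end("s2")

killed_first = cache.refresh(sessions_collection, t0 + timedelta(minutes=5), ttl)
after_first = dict(sessions_collection)
killed_later = cache.refresh(sessions_collection, t0 + timedelta(minutes=40), ttl)
print(after_first, killed_first, killed_later)
```

In this run, the first refresh persists only s1 (s2 was ended before it was ever written), and the second refresh, 40 minutes after s1 was last used, expires s1 and kills its cursor.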
The following parameters are available to tune logical sessions:

| Parameter | Default Value | Description |
| --- | --- | --- |
| disableLogicalSessionCacheRefresh | false (boolean) | Disables the logical session cache’s periodic “refresh” and “reap” functions on this node. Recommended for testing only. |
| logicalSessionRefreshMillis | 300000 ms (integer) | Changes how often the logical session cache runs its periodic “refresh” and “reap” functions on this node. |
| localLogicalSessionTimeoutMinutes | 30 minutes (integer) | Changes the TTL index timeout for the sessions collection. In sharded clusters, this parameter is supported only on the config server. |

This is a brief introduction to logical sessions. Most of the explanation in this post is based on the source code documentation, which goes into great detail:
https://github.com/mongodb/mongo/blob/master/src/mongo/db/s/README.md#logical-sessions
Applying the Theory in Real Workloads
The Logical Session Cache Refresh process syncs the in-memory cache with the config.system.sessions collection at the frequency defined by logicalSessionRefreshMillis. However, syncing is not its only function. The other function is to reap unused sessions. Let me show a test:
And mongoS configured with:
Opening a shell in mongoS to verify the results, we can see 500 sessions created plus the one I’m currently using:
And if we count the number of sessions in the collection, we can see it increasing:
This makes sense because the cache is not instantly synchronized with the collection (we can have more sessions in memory than in the collection).
What is the maxSessions limit based on? If I start mongoS with --setParameter maxSessions=100 and re-run the test with 500 sessions, I get the following error:
So, maxSessions is checked against the number of sessions in the in-memory cache, not against the config.system.sessions collection. This can be confirmed in the MongoDB source code.
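As a rough illustration of that check (a Python sketch, not the actual MongoDB C++ source), the toy cache below refuses new sessions once its in-memory set reaches the limit and never looks at the collection. The class names CappedSessionCache and TooManyLogicalSessions, and the session id strings, are assumptions made for this example.

```python
class TooManyLogicalSessions(Exception):
    """Stand-in for the server's TooManyLogicalSessions error."""


class CappedSessionCache:
    """Toy cache that enforces a maxSessions-style limit in memory only."""

    def __init__(self, max_sessions):
        self.max_sessions = max_sessions
        self.active = set()  # in-memory only; the collection is never consulted

    def add(self, session_id):
        if session_id in self.active:
            return  # already cached, nothing to add
        if len(self.active) >= self.max_sessions:
            raise TooManyLogicalSessions("cannot add session into the cache")
        self.active.add(session_id)


cache = CappedSessionCache(max_sessions=100)
accepted = 0
error = None
for i in range(500):
    try:
        cache.add(f"session-{i}")
        accepted += 1
    except TooManyLogicalSessions as exc:
        error = str(exc)
        break

print(accepted, error)
```

With a limit of 100 and 500 attempted sessions, the sketch accepts the first 100 and fails on the next one, which mirrors the behavior seen in the test.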
Fixing the Issue
Sessions are created every time a connection to the database is opened or a startSession is executed. Apart from the mongo shell, it is not possible to disable sessions at the driver level. This is confirmed in this JIRA ticket:
https://jira.mongodb.org/browse/MONGOSH-458
Answer: No, the driver does not support that and is unlikely to do so. Original feature was implemented to debug or to be backwards compatible with earlier server versions, however we only support 4.0+ so we do not need either case.
Let’s say you have a throughput of 10k connections/second. If logicalSessionRefreshMillis is set to 10 minutes (600000 milliseconds, 600 seconds), up to 10,000 × 600 = 6,000,000 sessions can be opened before the Logical Session Refresh starts reaping unused sessions.
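The same estimate can be written as a small calculation. This is a sketch under the post's simplifying assumption that every new connection creates one session and none is reaped before the next refresh; the function names are hypothetical, and the 1,000,000 figure is MongoDB's documented default for maxSessions.

```python
DEFAULT_MAX_SESSIONS = 1_000_000  # documented default for maxSessions


def sessions_before_reap(connections_per_second, refresh_millis):
    """Sessions that can pile up in the cache within one refresh interval."""
    return connections_per_second * refresh_millis // 1000


def max_refresh_millis(connections_per_second, max_sessions):
    """Longest refresh interval that keeps the pile-up within max_sessions."""
    return max_sessions * 1000 // connections_per_second


peak = sessions_before_reap(10_000, 600_000)
print(peak)                                       # 6000000
print(peak <= DEFAULT_MAX_SESSIONS)               # False
print(max_refresh_millis(10_000, DEFAULT_MAX_SESSIONS))  # 100000
```

At 10k connections/second, a 10-minute interval overshoots the default limit six times over, while an interval of about 100 seconds would keep the estimate within it.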
To stabilize the use of sessions we have two alternatives: increase maxSessions or reduce logicalSessionRefreshMillis. I ran a test with logicalSessionRefreshMillis=10000 (10 seconds), re-ran the tests, and below is the maximum number of activeSessions I got:
This is because the reaping process of the Logical Session Refresh is much more aggressive.
And what happens if I increase maxSessions to a huge number? Let’s check. For 10k activeSessions:
There was an increase of 5 MB in memory usage. And after adding another 10k sessions:
So we can estimate memory usage of about 4 MB for each 10k sessions. For a maxSessions of 3 million, we can expect memory usage to increase by about 1.17 GB.
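The 1.17 GB figure follows from a linear extrapolation of the measurement above, which the short sketch below reproduces. The constant and function names are hypothetical, and the 4 MB per 10k sessions rate is the post's own rough estimate, so real usage will vary.

```python
MB_PER_10K_SESSIONS = 4  # rough per-10k cost estimated from the test above


def estimated_memory_mb(max_sessions):
    """Linear estimate of extra memory, in MB, for a given session count."""
    return max_sessions / 10_000 * MB_PER_10K_SESSIONS


mb = estimated_memory_mb(3_000_000)
gb = mb / 1024
print(mb, round(gb, 2))  # 1200.0 1.17
```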
Conclusion
To conclude, based on the results above, there are two alternatives to circumvent this issue. I would first try reducing logicalSessionRefreshMillis, which seems to me to have less impact. It is also important to estimate whether the current maxSessions supports the application's throughput (you can check the formula mentioned previously).
Author
Vinicius Grippa is a Percona Senior Support Engineer and an Oracle Ace Associate. Vinicius has a Bachelor's degree in Computer Science and has been working with databases for 13 years. He has experience in designing databases for mission-critical applications and, in the last few years, has become a specialist in MySQL and MongoDB ecosystems. Working in the Support team, he has helped Percona customers with hundreds of different cases featuring a vast range of scenarios and complexities. Vinicius is also active in the OS community, participating in virtual rooms like Slack, speaking at MeetUps, and presenting at conferences in Europe, Asia, North and South America.