
Obsidian 8.0

Adds the W-TinyLFU client-side caching algorithm, persistent query support, and cache customization options for hit-ratio optimization; overhauls and debugs server-side caching; and rebuilds the developer tool for compatibility.

The Obsidian logo, depicting a smiling ox

Introduction

Obsidian is Deno’s original GraphQL caching solution, providing client-side caching for React components using an in-house LFU, LRU, or Window TinyLFU cache, and server-side caching for Oak Routers using Redis.

New Features

Our main areas of focus were on improving Obsidian’s client-side and server-side caching, and rebuilding Obsidian’s developer tool for a better development experience. To those ends, we focused on a few key features:

W-TinyLFU

The goal of W-TinyLFU as a dual admission/eviction policy is to provide a client cache with excellent hit-ratios while also minimizing the memory overhead using probabilistic admission logic.

Segmented Caching

A W-TinyLFU cache is actually a set of caches. New items are first stored in a very small window LRU cache. When the window fills, items it evicts move into the probationary segment of the main LRU cache, so long as there is room. Items in the probationary segment that receive repeat queries are promoted to the protected segment, while any item evicted from the protected segment moves back into the probationary segment. If an item is evicted from the window cache while the probationary segment is full, that item and the probationary segment’s own eviction candidate are compared by our TinyLFU logic, which estimates which of the two would have the bigger positive impact on hit ratio if cached. The winner goes into the probationary segment; the loser is put up for garbage collection.
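To make the flow concrete, here is a minimal TypeScript sketch of the segment logic described above. All names are ours, a plain Map stands in for the frequency sketch covered in the next section, and the segment proportions are typical W-TinyLFU defaults rather than Obsidian’s actual tuning.

// A minimal sketch of W-TinyLFU’s segments; illustrative, not Obsidian’s source.
// Map iteration order is insertion order, so the first key is least recently used.
class LRU<K, V> {
  private map = new Map<K, V>();
  constructor(readonly capacity: number) {}

  get(key: K): V | undefined {
    const value = this.map.get(key);
    if (value === undefined) return undefined;
    this.map.delete(key); // refresh recency on every hit
    this.map.set(key, value);
    return value;
  }

  // Insert, returning whatever entry this pushed out (if any).
  put(key: K, value: V): [K, V] | null {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size <= this.capacity) return null;
    const oldest = this.map.keys().next().value as K;
    const out: [K, V] = [oldest, this.map.get(oldest) as V];
    this.map.delete(oldest);
    return out;
  }

  victim(): K | undefined { return this.map.keys().next().value; }
  remove(key: K): void { this.map.delete(key); }
  get full(): boolean { return this.map.size >= this.capacity; }
}

class WTinyLFUSketch<K, V> {
  private window: LRU<K, V>;    // tiny window: admits every new item
  private probation: LRU<K, V>; // probationary segment of the main cache
  private guarded: LRU<K, V>;   // the "protected" segment (renamed: reserved word)
  private freq = new Map<K, number>(); // stand-in for the frequency sketch

  constructor(capacity: number) {
    const windowSize = Math.max(1, Math.round(capacity * 0.01)); // ~1% window
    const mainSize = capacity - windowSize;
    const probationSize = Math.max(1, Math.round(mainSize * 0.2)); // ~20% of main
    this.window = new LRU(windowSize);
    this.probation = new LRU(probationSize);
    this.guarded = new LRU(Math.max(1, mainSize - probationSize)); // ~80% of main
  }

  private touch(key: K): void {
    this.freq.set(key, (this.freq.get(key) ?? 0) + 1);
  }

  set(key: K, value: V): void {
    this.touch(key);
    const evicted = this.window.put(key, value); // new items land in the window
    if (evicted === null) return;
    const [candKey, candValue] = evicted;
    if (!this.probation.full) { // room in probation: admit directly
      this.probation.put(candKey, candValue);
      return;
    }
    // TinyLFU admission duel: the window’s victim challenges probation’s
    // victim; whichever is used more often stays, the loser is dropped.
    const incumbent = this.probation.victim() as K;
    if ((this.freq.get(candKey) ?? 0) > (this.freq.get(incumbent) ?? 0)) {
      this.probation.remove(incumbent);
      this.probation.put(candKey, candValue);
    }
  }

  get(key: K): V | undefined {
    this.touch(key);
    const fromWindow = this.window.get(key);
    if (fromWindow !== undefined) return fromWindow;
    const fromProbation = this.probation.get(key);
    if (fromProbation !== undefined) {
      // A repeat hit promotes the item to the protected segment; anything
      // that segment evicts is demoted back into probation.
      this.probation.remove(key);
      const demoted = this.guarded.put(key, fromProbation);
      if (demoted !== null) this.probation.put(demoted[0], demoted[1]);
      return fromProbation;
    }
    return this.guarded.get(key);
  }
}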


Frequency Sketch

Our frequency sketch handles the logic that lets TinyLFU assess the impact of any individual cache item. It is a 4-bit Count-Min sketch that estimates the hit frequency of each entry; by relying on bitwise operations, it needs only about 5% of the memory required by traditional LFU frequency tracking. The sketch uses a hashing function to assign each cache hit to four different counters, and retrieves a frequency by returning the minimum of those four counters, which minimizes the risk of overcounting due to collisions. Implementations commonly pack sixteen 4-bit counters into a single 64-bit integer, but Obsidian uses two 32-bit integers instead: JavaScript’s number type is a double-precision 64-bit binary format that cannot represent 64-bit integers (only 52 bits are used for significant digits), and its bitwise operators work on 32-bit values. JavaScript also lacks a built-in bit-count algorithm, so we use a bit-twiddling routine, which likewise supports up to 32-bit integers.
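The following is a simplified TypeScript sketch of a 4-bit Count-Min sketch along these lines. The widths, per-row hash mixing, and lack of aging are our choices for illustration, not Obsidian’s exact implementation; the bit-count helper at the end shows the kind of 32-bit bit twiddling the paragraph refers to.

// Simplified 4-bit Count-Min sketch; illustrative, not Obsidian’s source.
class FrequencySketch {
  private rows: Uint32Array[];
  private mask: number;

  constructor(width = 1024, depth = 4) {
    this.mask = width - 1; // width must be a power of two
    // Each 32-bit word packs eight 4-bit counters, so a conceptual 64-bit
    // slot becomes two 32-bit words that JavaScript’s bitwise ops can handle.
    this.rows = Array.from({ length: depth }, () => new Uint32Array(width / 8));
  }

  private locate(hash: number, row: number): [number, number] {
    const mixed = (hash ^ (row * 0x9e3779b9)) >>> 0; // re-mix the hash per row
    const slot = mixed & this.mask;                  // which 4-bit counter
    return [slot >>> 3, (slot & 7) * 4];             // [word index, bit offset]
  }

  increment(hash: number): void {
    for (let row = 0; row < this.rows.length; row++) {
      const [word, shift] = this.locate(hash, row);
      const counter = (this.rows[row][word] >>> shift) & 0xf;
      if (counter < 15) this.rows[row][word] += 1 << shift; // saturate at 15
    }
  }

  estimate(hash: number): number {
    // Count-Min: take the minimum across rows to limit collision overcounting.
    let min = 15;
    for (let row = 0; row < this.rows.length; row++) {
      const [word, shift] = this.locate(hash, row);
      min = Math.min(min, (this.rows[row][word] >>> shift) & 0xf);
    }
    return min;
  }
}

// Classic 32-bit popcount via bit twiddling, since JavaScript has no built-in.
function bitCount(x: number): number {
  x -= (x >>> 1) & 0x55555555;
  x = (x & 0x33333333) + ((x >>> 2) & 0x33333333);
  return (((x + (x >>> 4)) & 0x0f0f0f0f) * 0x01010101) >>> 24;
}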

Persistent Queries

We have implemented the ability for developers to use persistent queries with Obsidian, allowing a small, 64-character SHA-256 hash of the query string to be sent to the server instead of the entire query string. As queries can be quite large in GraphQL, persisted queries have the potential to reduce the cost of sending queries to the server, particularly for applications that repeatedly send identical query requests.

When the server receives a hashed query, it checks whether it has a record of that query hash. If it does, it uses the corresponding full query string to check its server-side cache, or to query the database if needed. If not, it prompts the client to send the hash along with the full query string. Upon receipt, the server not only executes its normal query operations but also stores the hash and the full query to persist the pair. On subsequent identical requests the persisted hash will be found, decreasing the load on the server when processing query requests.
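In code, the round trip might look roughly like the client-side sketch below, using the Web Crypto API for hashing. The endpoint shape, payload fields, and the "PersistedQueryNotFound" signal are assumptions made for illustration, not Obsidian’s exact protocol.

// Hedged sketch of the persisted-query round trip.
async function sha256Hex(text: string): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", new TextEncoder().encode(text));
  return [...new Uint8Array(digest)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join(""); // 64 hex characters
}

async function fetchWithPersistedQuery(url: string, query: string) {
  const hash = await sha256Hex(query);

  // First attempt: send only the 64-character hash.
  let res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ hash }),
  });
  let payload = await res.json();

  // If the server has no record of this hash, retry with hash + full query
  // so the server can execute it and persist the pair for next time.
  if (payload?.errors?.[0]?.message === "PersistedQueryNotFound") {
    res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ hash, query }),
    });
    payload = await res.json();
  }
  return payload;
}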

Cache Assignment and Wrapper Updates

Obsidian Wrapper now accepts ‘W-TinyLFU’ as a value for its algo property, enabling our newest caching algorithm in the client-side cache. We have also revised cache assignment within the wrapper to improve stability and prevent crashes during cache initialization. Because our new persistent queries feature requires both the Obsidian Wrapper and Router to function, we added a useCache property that lets developers use persistent queries without also using Obsidian’s client-side caching. Finally, developers can now specify the search criteria their application uses to fetch data as an array of searchTerms, elaborated on below.
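Put together, a wrapper configuration might look like the sketch below. The algo, useCache, and searchTerms props come straight from the description above, while the import path and component shape are assumptions rather than confirmed API.

// Hedged example of configuring the client-side wrapper.
import React from "react";
import { ObsidianWrapper } from "obsidian"; // import path is an assumption

function App({ children }: { children: React.ReactNode }) {
  return (
    <ObsidianWrapper
      algo="W-TinyLFU"        // selects the new client-cache algorithm
      useCache={true}         // set to false to use persistent queries alone
      searchTerms={["name"]}  // the arguments your app queries by (see below)
    >
      {children}
    </ObsidianWrapper>
  );
}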

searchTerms

Obsidian is built with flexibility in mind, allowing it to be used by a wide variety of programs that issue GraphQL queries in myriad ways. That focus on flexibility, however, meant our cache was incapable of effectively normalizing data. The cache relies on the presence of unique keys in ROOT_QUERY to know that data is available in the cache. After a broad query, the data is already in the cache; Obsidian was just not able to transform the data to retrieve it. We didn’t want to give up Obsidian’s flexibility by forcing more constrained queries, but we needed a way to improve cache effectiveness.

To accomplish this, we added an option for developers using Obsidian to specify the kinds of search terms being used as the primary arguments in their queries as an array in both Obsidian’s client-side wrapper and server-side router. When a broad query is made, Obsidian will iterate over the results to store the cache keys in the ROOT_QUERY according to every developer-specified query argument. While this does come at a slight cost for initial, broad queries, it ensures that any future queries containing that data can be directly accessed from the cache. This feature can be enabled for either cache independently, or used for both.

Without searchTerms, ROOT_QUERY after a broad query for characters might look like this:

{
  allPeople: [
    "Person~1", "Person~2", "Person~3", "Person~4", "Person~5"
  ]
}

By specifying in searchTerms that your application queries by ‘name’, however, ROOT_QUERY would instead look like this:

{
  allPeople: [
    "Person~1", "Person~2", "Person~3", "Person~4", "Person~5"
  ],
  'onePerson(name:"LukeSkywalker")': [ "Person~1" ],
  'onePerson(name:"C-3PO")': [ "Person~2" ],
  'onePerson(name:"R2-D2")': [ "Person~3" ],
  'onePerson(name:"DarthVader")': [ "Person~4" ],
  'onePerson(name:"LeiaOrgana")': [ "Person~5" ]
}
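Under the hood, the effect is roughly the sketch below: for each developer-specified term, a narrow-query key is written alongside the broad result. The key format mirrors the example above, while the helper, types, and field names are ours for illustration, not Obsidian’s internals.

// Rough sketch of the searchTerms indexing step.
type Person = { id: string; name: string };

function indexBySearchTerms(
  rootQuery: Record<string, string[]>,
  results: Person[],
  searchTerms: (keyof Person)[],
): void {
  for (const person of results) {
    for (const term of searchTerms) {
      // Strip whitespace to match keys like 'onePerson(name:"LukeSkywalker")'.
      const value = String(person[term]).replace(/\s/g, "");
      rootQuery[`onePerson(${term}:"${value}")`] = [`Person~${person.id}`];
    }
  }
}

A broad allPeople query would run this once over its results, paying a small indexing cost up front so that later onePerson lookups hit the cache directly.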

Improved Server-side Caching

We have completely overhauled our server-side caching logic to reduce bloat, maintain stability, and greatly improve performance. Server-side caching still uses Redis, but now takes a far more streamlined approach, resulting in faster data retrieval and less delay before fetching from a database endpoint, if needed. As in the client-side wrapper, we have added the option to disable the server-side cache entirely if a developer would like to only use the Obsidian Router for persistent queries.
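The basic pattern is the familiar cache-aside flow sketched below with a Deno Redis client. The key construction and surrounding router plumbing in Obsidian are more involved, so treat this as an outline rather than the actual implementation.

// Hedged sketch of the cache-then-database pattern; names and flow are ours.
import { connect } from "https://deno.land/x/redis/mod.ts"; // client choice is an assumption

const redis = await connect({ hostname: "127.0.0.1", port: 6379 });

async function cachedQuery(queryKey: string, runQuery: () => Promise<unknown>) {
  const hit = await redis.get(queryKey);
  if (typeof hit === "string") return JSON.parse(hit); // fast path: straight from Redis

  const result = await runQuery();                     // miss: hit the database endpoint
  await redis.set(queryKey, JSON.stringify(result));   // then cache for next time
  return result;
}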

Developer Tool

While earlier versions of Obsidian had a developer tool, it had not been updated in quite some time, and its main dependencies had been deprecated. We set out to rebuild the tool from scratch to give developers real-time query analytics that help them optimize their client-side caching policy. With Obsidian’s new Chrome developer tool, developers can view a dashboard that displays the current caching policy, total cache capacity, the query time for each GraphQL query, and stats on the client-side cache’s current hit ratio. We have also implemented a query log that plots hits, misses, and mutations on a line graph alongside a detailed log of all executed queries. Clicking any node in the graph filters the log down to just that query for easy lookup, while clicking any query in the log highlights the related node in the graph. Log entries expand when clicked, giving developers useful information about any individual query.

Future Improvements

Development is an iterative process, and while we are proud of the strides we’ve made during this cycle, we have pinpointed a few areas that future project contributors could still improve, including:

  1. Hill climber optimization could be implemented to dynamically determine and adjust the optimal window size according to the nature of a particular cache trace, further improving the versatility of the W-TinyLFU algorithm. A recency-skewed cache trace needs a larger window for the best hit ratio, and a hill-climbing algorithm could adjust the window size in real time (a rough sketch follows this list).
  2. The effectiveness of our server-side caching has been improved, but there’s still room for refinement. The server-side caching algorithm could benefit from continued optimization efforts and additional mutation support.
  3. The revamped developer tool is off to a great start, but could benefit from additional features, such as the ability to directly examine and clear the cache, and, once hill climber optimization is implemented, an adaptivity graph offering insight into cache-hit performance relative to the ideal window size. The tool is also entirely directed toward client caching in its current form, but could be expanded to integrate with the server-side cache.
  4. Refactor from JavaScript/TypeScript to AssemblyScript. As a subset of TypeScript, AssemblyScript should make the caching algorithms relatively straightforward to port. At the same time, compiling to a low-level binary format like WebAssembly could yield significant performance gains for the application.
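As a purely hypothetical sketch of item 1, a hill-climbing step might nudge the window’s share of capacity after each sampling period and reverse direction when the hit ratio drops. Nothing like this ships in Obsidian today; every name below is invented.

// Hypothetical hill-climbing step for tuning the W-TinyLFU window size.
function climbWindowSize(
  state: { windowPct: number; stepPct: number; prevHitRatio: number },
  hitRatio: number, // hit ratio measured over the latest sample period
): void {
  // If the last adjustment hurt the hit ratio, reverse course and decay
  // the step so the window share converges instead of oscillating.
  if (hitRatio < state.prevHitRatio) state.stepPct = -state.stepPct * 0.98;
  state.windowPct = Math.min(0.8, Math.max(0.01, state.windowPct + state.stepPct));
  state.prevHitRatio = hitRatio;
}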

Other articles on Obsidian:

Obsidian 1.0
Obsidian 2.0
Obsidian 3.0
Obsidian 3.1
Obsidian 3.2
Obsidian 4.0
Obsidian 4.0 Technical Brief
Obsidian 6.0 Technical Brief
Obsidian 7.0 Technical Brief

Co-authored by:

David Kim | GitHub | LinkedIn

David Norman | GitHub | LinkedIn

Eileen Cho | GitHub | LinkedIn

Joan Manto | GitHub | LinkedIn

