Refactor save/restore functions and internals
whitfin committed Sep 14, 2024
1 parent 7b9cd09 commit f2da77d
Showing 17 changed files with 253 additions and 339 deletions.
18 changes: 8 additions & 10 deletions docs/general/local-persistence.md
@@ -1,29 +1,27 @@
# Local Persistence

Cachex ships with basic support for dumping a cache to a local file using the [External Term Format](https://www.erlang.org/doc/apps/erts/erl_ext_dist). These files can then be used to seed data into a new instance of a cache to persist values between cache instances.

As it stands all persistence must be handled manually via the Cachex API, although additional features may be added in future to add convenience around this. Note that the use of the term "dump" over "backup" is intentional, as these files are just extracted datasets from a cache, rather than a serialization of the cache itself.
Cachex ships with basic support for saving a cache to a local file using the [External Term Format](https://www.erlang.org/doc/apps/erts/erl_ext_dist). These files can then be used to seed data into a new instance of a cache to persist values between cache instances. As it stands all persistence must be handled manually via the Cachex API, although additional features may be added in future to add convenience around this.
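As an illustration of the External Term Format mentioned above (this is not part of the commit, and it does not reproduce the file layout Cachex writes, which the docs say should not be relied upon), the following minimal sketch round-trips a list of terms through a file using only the Erlang standard library. The entry tuple shape and the file name `etf_sketch.dat` are assumptions chosen for the example.

```elixir
# A hypothetical list of entries; the tuple shape is an assumption
# made for illustration, not a guaranteed Cachex structure.
entries = [
  {:entry, "key", "value", 1_538_714_590_095, nil}
]

# Encode the term as External Term Format. `compressed: 1` is a
# standard :erlang.term_to_binary/2 option (levels 0-9).
binary = :erlang.term_to_binary(entries, compressed: 1)

# Write the binary to a temporary file and read it back.
path = Path.join(System.tmp_dir!(), "etf_sketch.dat")
File.write!(path, binary)

# `:safe` is a standard :erlang.binary_to_term/2 option which refuses
# to create atoms that do not already exist in the VM.
decoded = path |> File.read!() |> :erlang.binary_to_term([:safe])

# Clean up the temporary file.
File.rm!(path)
```

Decoding with `:safe` is one way to treat a file as untrusted input; whether Cachex does this internally is not shown in this section.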

## Writing to Disk

To dump a cache to a file on disk, you can use the `Cachex.dump/3` function. This function supports an optional `:compression` option (between `0-9`) to help reduce the required disk space. By default this value is set to `1` to try and optimize the tradeoff between performance and disk usage. Another common approach is to dump with `compression: 0` and run compression from outside of the Erlang VM.
To save a cache to a file on disk, you can use the `Cachex.save/3` function. This function will handle compression automatically and populate the path on disk with a file you can import later. It should be noted that the internal format of this file should not be relied upon.

```elixir
{ :ok, true } = Cachex.dump(:my_cache, "/tmp/my_cache.dump")
{ :ok, true } = Cachex.save(:my_cache, "/tmp/my_cache.dat")
```

The above demonstrates how simple it is to dump your cache to a location on disk (in this case `/tmp/my_cache.dump`). Any options can be provided as a `Keyword` list as an optional third parameter.
The above demonstrates how simple it is to save your cache to a location on disk (in this case `/tmp/my_cache.dat`). Any options can be provided as a `Keyword` list as an optional third parameter.

## Loading from Disk

To seed a cache from an existing dump, you can use `Cachex.load/3`. This will *merge* the dump into your cache, overwriting any clashing keys and maintaining any keys which existed in the cache beforehand. If you want a direct match of the dump inside your cache, you should use `Cachex.clear/2` before loading your data.
To seed a cache from an existing file, you can use `Cachex.restore/3`. This will *merge* the file into your cache, overwriting any clashing keys and maintaining any keys which existed in the cache beforehand. If you want a direct match of the file inside your cache, you should use `Cachex.clear/2` before loading your data.

```elixir
# optionally clean your cache first
{ :ok, _amt } = Cachex.clear(:my_cache)

# then you can load the existing dump into your cache
{ :ok, true } = Cachex.load(:my_cache, "/tmp/my_cache.dump")
# then you can load the existing save into your cache
{ :ok, true } = Cachex.restore(:my_cache, "/tmp/my_cache.dat")
```

Please note that loading from an existing dump will maintain all existing expirations, and records which have already expired will *not* be added to the cache table. This should not be surprising, but it is worth calling out.
Please note that loading from an existing file will maintain all existing expirations, and records which have already expired will *not* be added to the cache table. This should not be surprising, but it is worth calling out.
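The merge and expiration behaviour described above can be sketched in plain Elixir. This is not Cachex code and not part of the commit: `RestoreSketch` is a hypothetical module, a map stands in for the cache table, and entries are assumed to be `{:entry, key, value, modified, expiration}` tuples, where `modified` is a millisecond timestamp and `expiration` is a TTL in milliseconds or `nil`.

```elixir
defmodule RestoreSketch do
  # Hypothetical helper, not part of Cachex. Entries are assumed to be
  # {:entry, key, value, modified_ms, ttl_ms_or_nil} tuples.

  # Merges loaded entries into an existing map keyed by cache key,
  # skipping any entries which have already expired at `now`.
  def merge(existing, loaded, now) do
    loaded
    |> Enum.reject(&expired?(&1, now))
    |> Enum.reduce(existing, fn {:entry, key, _, _, _} = entry, acc ->
      Map.put(acc, key, entry)
    end)
  end

  # An entry without a TTL never expires.
  defp expired?({:entry, _key, _value, _modified, nil}, _now), do: false

  # Otherwise an entry is expired once modified + ttl is in the past.
  defp expired?({:entry, _key, _value, modified, ttl}, now),
    do: modified + ttl < now
end

now = 10_000

existing = %{
  "a" => {:entry, "a", 1, 9_000, nil},
  "b" => {:entry, "b", 2, 9_000, nil}
}

loaded = [
  # clashes with "b", so the loaded entry replaces the existing one
  {:entry, "b", 20, 9_500, nil},
  # still alive at `now`, so it is added with its expiration intact
  {:entry, "c", 30, 9_500, 5_000},
  # expired at 1_500, so it is skipped
  {:entry, "d", 40, 1_000, 500}
]

merged = RestoreSketch.merge(existing, loaded, now)
```

Here `merged` keeps `"a"` untouched, takes the loaded value for `"b"`, adds `"c"`, and never sees `"d"`. Starting from an empty map instead of `existing` corresponds to calling `Cachex.clear/2` first.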
163 changes: 78 additions & 85 deletions lib/cachex.ex
@@ -66,7 +66,6 @@ defmodule Cachex do
count: [1, 2],
decr: [2, 3, 4],
del: [2, 3],
dump: [2, 3],
empty?: [1, 2],
execute: [2, 3],
exists?: [2, 3],
@@ -81,13 +80,14 @@ defmodule Cachex do
inspect: [2, 3],
invoke: [3, 4],
keys: [1, 2],
load: [2, 3],
persist: [2, 3],
purge: [1, 2],
put: [3, 4],
put_many: [2, 3],
refresh: [2, 3],
reset: [1, 2],
restore: [2, 3],
save: [2, 3],
size: [1, 2],
stats: [1, 2],
stream: [1, 2, 3],
@@ -489,40 +489,6 @@ defmodule Cachex do
def del(cache, key, options \\ []) when is_list(options),
do: Router.route(cache, {:del, [key, options]})

@doc """
Serializes a cache to a location on a filesystem.
This operation will write the current state of a cache to a provided
location on a filesystem. The written state can be used alongside the
`load/3` command to import back in the future.
It is the responsibility of the user to ensure that the location is
able to be written to, not the responsibility of Cachex.
## Options
* `:compression`
Specifies the level of compression to apply when serializing (0-9). This
will default to level 1 compression, which is appropriate for most dumps.
Using a compression level of 0 will disable compression completely. This
will result in a faster serialization but at the cost of higher space.
## Examples
iex> Cachex.dump(:my_cache, "/tmp/my_default_backup")
{ :ok, true }
iex> Cachex.dump(:my_cache, "/tmp/my_custom_backup", [ compression: 0 ])
{ :ok, true }
"""
@spec dump(Cachex.t(), binary, Keyword.t()) :: {status, any}
def dump(cache, path, options \\ [])
when is_binary(path) and is_list(options),
do: Router.route(cache, {:dump, [path, options]})

@doc """
Determines whether a cache contains any entries.
@@ -655,7 +621,7 @@ defmodule Cachex do
This function is very heavy, so it should typically only be used
when debugging and/or exporting of tables (although the latter case
should really use `dump/3`).
should really use `Cachex.save/3`).
## Examples
@@ -821,14 +787,13 @@ defmodule Cachex do
## Examples
iex> Cachex.put(:my_cache, "key", "value")
iex> Cachex.import(:my_cache, [ { :entry, "key", 1538714590095, nil, "value" } ])
iex> Cachex.import(:my_cache, [ { :entry, "key", "value", 1538714590095, nil } ])
{ :ok, true }
"""
@spec import(Cachex.t(), [Cachex.Spec.entry()], Keyword.t()) :: {status, any}
def import(cache, entries, options \\ [])
when is_list(entries) and is_list(options),
do: Router.route(cache, {:import, [entries, options]})
@spec import(Cachex.t(), Enumerable.t(), Keyword.t()) :: {status, any}
def import(cache, entries, options \\ []) when is_list(options),
do: Router.route(cache, {:import, [entries, options]})

@doc """
Increments an entry in the cache.
@@ -973,49 +938,6 @@ defmodule Cachex do
def invoke(cache, cmd, key, options \\ []) when is_list(options),
do: Router.route(cache, {:invoke, [cmd, key, options]})

@doc """
Deserializes a cache from a location on a filesystem.
This operation will read the current state of a cache from a provided
location on a filesystem. This function will only understand files
which have previously been created using `dump/3`.
It is the responsibility of the user to ensure that the location is
able to be read from, not the responsibility of Cachex.
## Options
* `:trusted`
Allow for loading from trusted or untrusted sources; trusted
sources can load atoms into the table, whereas untrusted sources
cannot. Defaults to `true`.
## Examples
iex> Cachex.put(:my_cache, "my_key", 10)
iex> Cachex.dump(:my_cache, "/tmp/my_backup")
{ :ok, true }
iex> Cachex.size(:my_cache)
{ :ok, 1 }
iex> Cachex.clear(:my_cache)
iex> Cachex.size(:my_cache)
{ :ok, 0 }
iex> Cachex.load(:my_cache, "/tmp/my_backup")
{ :ok, true }
iex> Cachex.size(:my_cache)
{ :ok, 1 }
"""
@spec load(Cachex.t(), binary, Keyword.t()) :: {status, any}
def load(cache, path, options \\ [])
when is_binary(path) and is_list(options),
do: Router.route(cache, {:load, [path, options]})

@doc """
Removes an expiration time from an entry in a cache.
@@ -1174,6 +1096,77 @@ defmodule Cachex do
def reset(cache, options \\ []) when is_list(options),
do: Router.route(cache, {:reset, [options]})

@doc """
Deserializes a cache from a location on a filesystem.
This operation will read the current state of a cache from a provided
location on a filesystem. This function will only understand files
which have previously been created using `Cachex.save/3`.
It is the responsibility of the user to ensure that the location is
able to be read from, not the responsibility of Cachex.
## Options
* `:trust`
Allow for loading from trusted or untrusted sources; trusted
sources can load atoms into the table, whereas untrusted sources
cannot. Defaults to `true`.
## Examples
iex> Cachex.put(:my_cache, "my_key", 10)
iex> Cachex.save(:my_cache, "/tmp/my_backup")
{ :ok, true }
iex> Cachex.size(:my_cache)
{ :ok, 1 }
iex> Cachex.clear(:my_cache)
iex> Cachex.size(:my_cache)
{ :ok, 0 }
iex> Cachex.restore(:my_cache, "/tmp/my_backup")
{ :ok, true }
iex> Cachex.size(:my_cache)
{ :ok, 1 }
"""
@spec restore(Cachex.t(), binary, Keyword.t()) :: {status, any}
def restore(cache, path, options \\ [])
when is_binary(path) and is_list(options),
do: Router.route(cache, {:restore, [path, options]})

@doc """
Serializes a cache to a location on a filesystem.
This operation will write the current state of a cache to a provided
location on a filesystem. The written state can be used alongside the
`Cachex.restore/3` command to import back in the future.
It is the responsibility of the user to ensure that the location is
able to be written to, not the responsibility of Cachex.
## Options
* `:batch_size`
Allows customization of the internal batching when paginating the cursor
coming back from ETS. It's unlikely this will ever need changing.
## Examples
iex> Cachex.save(:my_cache, "/tmp/my_default_backup")
{ :ok, true }
"""
@spec save(Cachex.t(), binary, Keyword.t()) :: {status, any}
def save(cache, path, options \\ [])
when is_binary(path) and is_list(options),
do: Router.route(cache, {:save, [path, options]})

@doc """
Retrieves the total size of a cache.
41 changes: 0 additions & 41 deletions lib/cachex/actions/dump.ex

This file was deleted.

12 changes: 3 additions & 9 deletions lib/cachex/actions/export.ex
@@ -4,7 +4,7 @@ defmodule Cachex.Actions.Export do
#
# This command is extremely expensive as it turns the entire cache table into
# a list, and so should be used sparingly. It's provided purely because it's
# the backing implementation of the `dump/3` command.
# the backing implementation of the `Cachex.save/3` command.
alias Cachex.Actions.Stream, as: CachexStream
alias Cachex.Query

@@ -18,17 +18,11 @@ defmodule Cachex.Actions.Export do
@doc """
Retrieves all cache entries as a list.
The returned list is a collection of cache entry records, which is a little
more optimized than doing the same via `stream/3`.
This action should only be used in the case of exports and/or debugging, due
to the memory overhead involved, as well as the large concatenations.
"""
def execute(cache() = cache, options) do
query = Query.create()
batch = Keyword.take(options, [:batch_size])

with {:ok, stream} <- CachexStream.execute(cache, query, batch) do
def execute(cache() = cache, _options) do
with {:ok, stream} <- CachexStream.execute(cache, Query.create(), []) do
{:ok, Enum.to_list(stream)}
end
end
2 changes: 1 addition & 1 deletion lib/cachex/actions/import.ex
@@ -5,7 +5,7 @@ defmodule Cachex.Actions.Import do
# This command should be considered expensive and should be used sparingly. Due
# to the requirement of being compatible with distributed caches, this cannot
# use a simple `put_many/4` call; rather it needs to walk the full list. It's
# provided because it's the backing implementation of the `load/3` command.
# provided as it's the backing implementation of the `Cachex.restore/3` command.
import Cachex.Spec

##############
38 changes: 0 additions & 38 deletions lib/cachex/actions/load.ex

This file was deleted.
