summaryrefslogtreecommitdiff
path: root/guix/store/deduplication.scm
AgeCommit message (Collapse)Author
2021-11-16daemon: Do not deduplicate files smaller than 8 KiB.Ludovic Courtès
Files smaller than 8 KiB typically represent ~70% of the entries in /gnu/store/.links but only contribute to ~4% of the space savings afforded by deduplication. Not considering these files for deduplication speeds up file insertion in the store and, more importantly, leaves 'removeUnusedLinks' with fewer entries to traverse, thereby speeding it up proportionally. Partly fixes <https://issues.guix.gnu.org/24937>. * config-daemon.ac: Remove symlink hard link check and CAN_LINK_SYMLINK definition. * guix/store/deduplication.scm (%deduplication-minimum-size): New variable. (deduplicate)[loop]: Do not recurse when FILE's size is below %DEDUPLICATION-MINIMUM-SIZE. (dump-port): New procedure. (dump-file/deduplicate)[hash]: Turn into... [dump-and-compute-hash]: ... this thunk. Call 'deduplicate' only when SIZE is greater than %DEDUPLICATION-MINIMUM-SIZE; otherwise call 'dump-port'. * nix/libstore/gc.cc (LocalStore::removeUnusedLinks): Drop files where st.st_size < deduplicationMinSize. * nix/libstore/local-store.hh (deduplicationMinSize): New declaration. * nix/libstore/optimise-store.cc (deduplicationMinSize): New variable. (LocalStore::optimisePath_): Return when PATH is a symlink or smaller than 'deduplicationMinSize'. * tests/derivations.scm ("identical files are deduplicated"): Produce files bigger than %DEDUPLICATION-MINIMUM-SIZE. * tests/nar.scm ("restore-file-set with directories (signed, valid)"): Likewise. * tests/store-deduplication.scm ("deduplicate, below %deduplication-minimum-size"): New test. ("deduplicate", "deduplicate, ENOSPC"): Produce files bigger than %DEDUPLICATION-MINIMUM-SIZE. * tests/store.scm ("substitute, deduplication"): Likewise.
2020-12-19maint: Require Guile >= 2.2.6.Ludovic Courtès
* configure.ac: For Guile 2.2, require 2.2.6 or later. * guix/gexp.scm (define-syntax-parameter-once): Remove. Use 'define-syntax-parameter' instead. * guix/mnoads.scm: Likewise. * guix/inferior.scm (proxy)[select*]: Remove. * guix/scripts/publish.scm <top level>: Remove replacement for (@@ (web http) read-header-line). * guix/store/deduplication.scm (counting-wrapper-port): Remove. (nar-sha256): Call 'port-position' on PORT to compute SIZE.
2020-12-15deduplicate: Create the '.links' directory lazily.Ludovic Courtès
This avoids repeated (mkdir-p "/gnu/store/.links") calls when deduplicating lots of files. * guix/store/deduplication.scm (deduplicate): Remove initial call to 'mkdir-p'. Add ENOENT case in 'link' exception handler. Reindent. * tests/store-deduplication.scm ("deduplicate, ENOSPC"): Check for (<= links 4) to account for the initial 'link' call.
2020-12-15store-copy: 'populate-store' can optionally deduplicate files.Ludovic Courtès
Until now deduplication was performed as an additional pass after copying files, which involve re-traversing all the files that had just been copied. * guix/store/deduplication.scm (copy-file/deduplicate): New procedure. * tests/store-deduplication.scm ("copy-file/deduplicate"): New test. * guix/build/store-copy.scm (populate-store): Add #:deduplicate? parameter and honor it. * tests/gexp.scm ("gexp->derivation, store copy"): Pass #:deduplicate? #f to 'populate-store'. * gnu/build/image.scm (initialize-root-partition): Pass #:deduplicate? to 'populate-store'. Pass #:deduplicate? #f to 'register-closure'. * gnu/build/vm.scm (root-partition-initializer): Likewise. * gnu/build/install.scm (populate-single-profile-directory): Pass #:deduplicate? #f to 'populate-store'. * gnu/build/linux-initrd.scm (build-initrd): Likewise. * guix/scripts/pack.scm (self-contained-tarball)[import-module?]: New procedure. [build]: Pass it as an argument to 'source-module-closure'. * guix/scripts/pack.scm (squashfs-image)[build]: Wrap in 'with-extensions'. * gnu/system/linux-initrd.scm (expression->initrd)[import-module?]: New procedure. [builder]: Pass it to 'source-module-closure'. * gnu/system/install.scm (cow-store-service-type)[import-module?]: New procedure. Pass it to 'source-module-closure'.
2020-12-15nar: Deduplicate files right as they are restored.Ludovic Courtès
This avoids having to traverse and re-read the files that we have just restored, thereby reducing I/O. * guix/serialization.scm (dump-file): New procedure. (restore-file): Add #:dump-file parameter and honor it. * guix/store/deduplication.scm (tee, dump-file/deduplicate): New procedures. * guix/nar.scm (restore-one-item): Pass #:dump-file to 'restore-file'. (finalize-store-file): Pass #:deduplicate? #f to 'register-items'. * tests/nar.scm <top level>: Call 'setenv' to set "NIX_STORE".
2020-09-14deduplication: pass store directory to replace-with-link.Caleb Ristvedt
This causes with-writable-file to take into consideration the actual store being used, as passed to 'deduplicate', rather than whatever (%store-directory) may return. * guix/store/deduplication.scm (replace-with-link): new keyword argument 'store'. Pass to with-writable-file. (with-writable-file, call-with-writable-file): new store argument. (deduplicate): pass store to replace-with-link. Signed-off-by: Ludovic Courtès <ludo@gnu.org>
2020-07-28store: deduplication: Handle fs without d_type support.Mathieu Othacehe
scandir* uses readdir, which means that the file type property can be 'unknown if the underlying file-system does not support d_type. Make sure to fallback to lstat in that case. Fixes: https://issues.guix.gnu.org/issue/42579. * guix/store/deduplication.scm (deduplicate): Handle the case where properties is 'unknown because the underlying file-system does not support d_type.
2020-06-25deduplication: Leave the store permissions unchanged.Ludovic Courtès
Suggested by Caleb Ristvedt <caleb.ristvedt@cune.org>. * guix/store/deduplication.scm (call-with-writable-file): Call THUNK directly when FILE is (%store-directory).
2020-06-25deduplication: Fix default value of #:store in 'deduplicate'.Ludovic Courtès
* guix/store/deduplication.scm (deduplicate): Change #:store default value to (%store-directory).
2020-06-25deduplication: Use 'dynamic-wind' when changing permissions of the parent.Ludovic Courtès
Suggested by Caleb Ristvedt <caleb.ristvedt@cune.org>. * guix/store/deduplication.scm (call-with-writable-file): New procedure. (with-writable-file): New macro. (replace-with-link): Use it.
2020-06-22deduplicate: Avoid traversing directories twice.Ludovic Courtès
Until now, we'd call (nar-sha256 file) unconditionally. Thus, if FILE was a directory, we would traverse it for no reason, and then call 'deduplicate' on FILE, which would again traverse it. This change also removes redundant (mkdir-p store) calls from the loop, and avoids 'lstat' calls by using 'scandir*'. * guix/store/deduplication.scm (deduplicate): Add named loop. Move 'mkdir-p' outside the loop. Use 'scandir*' instead of 'scandir'. Do not call 'nar-sha256' when FILE has type 'directory.
2020-02-22deduplication: Use nix-base32 encoding for link names.Ludovic Courtès
Fixes <https://bugs.gnu.org/39725>. * guix/store/deduplication.scm (deduplicate): Use 'bytevector->nix-base32-string' instead of 'bytevector->base16-string'.
2019-04-25gnu, guix: Yearly ritual purging of the filesystems.Tobias Geerinckx-Rice
* gnu/packages/android.scm (android-ext4-utils)[synopsis]: Fix ‘file system’ spelling. * gnu/packages/disk.scm (rmlint)[synopsis, description]: Likewise. * gnu/packages/golang.scm (go-github-com-kr-fs)[synopsis, description]: Likewise & edit for grammar. * gnu/packages/ipfs.scm (gx, go-ipfs)[description]: Likewise. * /gnu/packages/java.scm (java-commons-vfs)[synopsis]: Likewise. * gnu/packages/linux.scm (fuseiso)[description]: Likewise. (genext2fs)[synopsis, description]: Likewise. * gnu/packages/package-management.scm (libostree)[description]: Likewise. * gnu/packages/python-xyz.scm (python-requests-file)[description]: Likewise & mark up. * gnu/packages/rails.scm (ruby-with-advisory-lock)[description]: Likewise. * gnu/packages/ruby.scm (ruby-rerun)[description]: Likewise. * guix/build/go-build-system.scm (setup-go-environment)<docstring>: Likewise. * guix/store/deduplication.scm (get-temp-link)<docstring>: Likewise.
2019-01-23deduplication: Ignore EMLINK.Ludovic Courtès
Until now 'guix offload' would fail (transient failure) upon EMLINK. * guix/store/deduplication.scm (replace-with-link) (deduplicate): Ignore EMLINK.
2018-12-14deduplication: Gracefully handle ENOSPC raised by 'link' calls.Ludovic Courtès
Reported by Andreas Enge <andreas@enge.fr> in <https://bugs.gnu.org/33676>. * guix/store/deduplication.scm (replace-with-link): Catch ENOSPC around 'get-temp-link'. Do nothing when 'get-temp-link' throws ENOSPC. Move code to restore PARENT's permissions outside of 'catch'. * tests/store-deduplication.scm ("deduplicate, ENOSPC"): New test.
2018-11-13deduplication: Restore directory mtime and permissions after deduplication.Ludovic Courtès
Fixes <https://bugs.gnu.org/33361>. * guix/store/deduplication.scm (replace-with-link): Call 'set-file-time' and 'chmod' after 'rename-file'. * tests/nar.scm ("restore-file-set with directories (signed, valid)"): New test.
2018-09-04Switch to Guile-Gcrypt.Ludovic Courtès
This removes (guix hash) and (guix pk-crypto), which now live as part of Guile-Gcrypt (version 0.1.0.) * guix/gcrypt.scm, guix/hash.scm, guix/pk-crypto.scm, tests/hash.scm, tests/pk-crypto.scm: Remove. * configure.ac: Test for Guile-Gcrypt. Remove LIBGCRYPT and LIBGCRYPT_LIBDIR assignments. * m4/guix.m4 (GUIX_ASSERT_LIBGCRYPT_USABLE): Remove. * README: Add Guile-Gcrypt to the dependencies; move libgcrypt as "required unless --disable-daemon". * doc/guix.texi (Requirements): Likewise. * gnu/packages/bash.scm, guix/derivations.scm, guix/docker.scm, guix/git.scm, guix/http-client.scm, guix/import/cpan.scm, guix/import/cran.scm, guix/import/crate.scm, guix/import/elpa.scm, guix/import/gnu.scm, guix/import/hackage.scm, guix/import/texlive.scm, guix/import/utils.scm, guix/nar.scm, guix/pki.scm, guix/scripts/archive.scm, guix/scripts/authenticate.scm, guix/scripts/download.scm, guix/scripts/hash.scm, guix/scripts/pack.scm, guix/scripts/publish.scm, guix/scripts/refresh.scm, guix/scripts/substitute.scm, guix/store.scm, guix/store/deduplication.scm, guix/tests.scm, tests/base32.scm, tests/builders.scm, tests/challenge.scm, tests/cpan.scm, tests/crate.scm, tests/derivations.scm, tests/gem.scm, tests/nar.scm, tests/opam.scm, tests/pki.scm, tests/publish.scm, tests/pypi.scm, tests/store-deduplication.scm, tests/store.scm, tests/substitute.scm: Adjust imports. * gnu/system/vm.scm: Likewise. (guile-sqlite3&co): Rename to... (gcrypt-sqlite3&co): ... this. Add GUILE-GCRYPT. (expression->derivation-in-linux-vm)[config]: Remove. (iso9660-image)[config]: Remove. (qemu-image)[config]: Remove. (system-docker-image)[config]: Remove. * guix/scripts/pack.scm: Adjust imports. (guile-sqlite3&co): Rename to... (gcrypt-sqlite3&co): ... this. Add GUILE-GCRYPT. (self-contained-tarball)[build]: Call 'make-config.scm' without #:libgcrypt argument. (squashfs-image)[libgcrypt]: Remove. [build]: Call 'make-config.scm' without #:libgcrypt. (docker-image)[config, json]: Remove. [build]: Add GUILE-GCRYPT to the extensions Remove (guix config) from the imported modules. * guix/self.scm (specification->package): Remove "libgcrypt", add "guile-gcrypt". (compiled-guix): Remove #:libgcrypt. [guile-gcrypt]: New variable. [dependencies]: Add it. [*core-modules*]: Remove #:libgcrypt from 'make-config.scm' call. Add #:extensions. [*config*]: Remove #:libgcrypt from 'make-config.scm' call. (%dependency-variables): Remove %libgcrypt. (make-config.scm): Remove #:libgcrypt. * build-aux/build-self.scm (guile-gcrypt): New variable. (make-config.scm): Remove #:libgcrypt. (build-program)[fake-gcrypt-hash]: New variable. Add (gcrypt hash) to the imported modules. Adjust load path assignments. * gnu/packages/package-management.scm (guix)[propagated-inputs]: Add GUILE-GCRYPT. [arguments]: In 'wrap-program' phase, add GUILE-GCRYPT to the search path.
2018-07-20deduplication: Work around Guile bug in 'seek'.Ludovic Courtès
Fixes <https://bugs.gnu.org/32161>. Reported by Ricardo Wurmus <rekado@elephly.net>. This mostly reverts 83099892e0cf0d9c59f5e1a0774331026e48baa8. * guix/store/deduplication.scm (counting-wrapper-port): New procedure. (nar-sha256): Use it.
2018-07-19deduplication: Remove 'counting-wrapper-port'.Ludovic Courtès
* guix/store/deduplication.scm (counting-wrapper-port): Remove. (nar-sha256): Call 'port-position' directly on PORT.
2018-07-03deduplication: Remove 'false-if-system-error', now unused.Ludovic Courtès
* guix/store/deduplication.scm (false-if-system-error): Remove.
2018-07-03deduplication: Place link files under /gnu/store/.links.Ludovic Courtès
Previously they'd always be placed next to TO-REPLACE, which would lead to EPERM in some cases. * guix/store/deduplication.scm (replace-with-link): Add #:swap-directory parameter and honor it. Add call to 'make-file-writable'. Catch 'system-error' around 'rename-file'. (deduplicate): Pass #:swap-directory and remove uses of 'false-if-system-error'. * tests/store-deduplication.scm ("deduplicate"): Add 'chmod' call.
2018-07-03deduplication: Fix incorrect use of 'throw'.Ludovic Courtès
* guix/store/deduplication.scm (get-temp-link): In handler, fix call to 'throw'.
2018-06-14deduplicate: Fix a couple of thinkos.Ludovic Courtès
* guix/store/deduplication.scm (get-temp-link): Turn 'args' in the 'catch' handler into a rest argument. (deduplicate): Use 'lstat' instead of 'file-is-directory?' to properly handle symlinks. When iterating over the result of 'scandir', exclude the ".links" sub-directory. * tests/store-deduplication.scm ("deduplicate"): Create sub-directories and call 'deduplicate' directly on STORE.
2018-06-01Add (guix store deduplication).Caleb Ristvedt
* guix/store/database.scm (register-path): Add #:deduplicate? and call 'deduplicate' when it's true. (counting-wrapper-port, nar-sha256): Move to... * guix/store/deduplication.scm: ... here. New file. * tests/store-deduplication.scm: New file. * Makefile.am (STORE_MODULES): Add deduplication.scm. (SCM_TESTS) [HAVE_GUILE_SQLITE3]: Add store-deduplication.scm. Co-authored-by: Ludovic Courtès <ludo@gnu.org>