Siegfried development benchmarks

Sat, 11 Mar 2023 16:48:08 UTC

Environment

These benchmarks were run on an automatically provisioned m3.small.x86 machine.

Specs for the m3.small.x86: 8 cores @ 2.8 GHz, 64 GB RAM, 960 GB SSD.

You can inspect the commands that were run to generate these benchmarks here.

List of tools benchmarked
Tool | Version
master | siegfried 1.9.6 /root/siegfried/default.sig (2022-11-06T17:44:52+01:00) identifiers: - pronom: DROID_SignatureFile_V109.xml; container-signature-20221102.xml
develop | siegfried 1.9.6 /root/siegfried/dev.sig (2023-03-11T16:48:09Z) identifiers: - pronom: DROID_SignatureFile_V109.xml; container-signature-20221102.xml

iPRES Systems Showcase

A corpus created for the 2014 iPRES conference comprising 2,206 files (5 GB). It represents a range of formats, including AV and some uncommon types. Sourced from http://www.webarchive.org.uk/datasets/ipres.ds.1/.

Results
Tool | Description | Duration
master | Master branch of github.com/richardlehane/siegfried. Corresponds to latest production release. | 8.523974401s
develop | Develop branch of github.com/richardlehane/siegfried. Tip of development and potentially unstable. | 6.1491848s

The tools differed in output for 0 files in the corpus.
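The durations in these tables look like Go `time.Duration` strings, so the relative speed of the two branches can be worked out with `time.ParseDuration`. The following is a minimal sketch under that assumption; the `speedup` helper is hypothetical and is not part of siegfried or of the benchmark tooling.

```go
package main

import (
	"fmt"
	"time"
)

// speedup reports how many times faster the candidate run was than the
// baseline run. Values above 1 mean the candidate was faster.
// Hypothetical helper, written only for this illustration.
func speedup(baseline, candidate string) (float64, error) {
	b, err := time.ParseDuration(baseline)
	if err != nil {
		return 0, err
	}
	c, err := time.ParseDuration(candidate)
	if err != nil {
		return 0, err
	}
	if c <= 0 {
		return 0, fmt.Errorf("candidate duration must be positive, got %v", c)
	}
	return float64(b) / float64(c), nil
}

func main() {
	// Durations copied from the iPRES Systems Showcase results above
	// (master first, develop second).
	s, err := speedup("8.523974401s", "6.1491848s")
	if err != nil {
		panic(err)
	}
	fmt.Printf("develop vs master: %.2fx\n", s) // prints "develop vs master: 1.39x"
}
```

A single timed run per branch is a small sample, so a ratio like this is best read as indicative rather than precise.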

PRONOM files

A corpus created by Greg Lepore comprising 1,205 files (2.1 GB). It includes a single sample for each of as many PRONOM IDs (PUIDs) as Greg could find.

Results
Tool | Description | Duration
master | Master branch of github.com/richardlehane/siegfried. Corresponds to latest production release. | 2.4849831s
develop | Develop branch of github.com/richardlehane/siegfried. Tip of development and potentially unstable. | 2.81459307s

The tools differed in output for 0 files in the corpus.
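The page does not say how the "differed in output" figure is computed. One plausible reading is a per-file comparison of the identifications reported by each tool. The sketch below assumes each tool's results have already been reduced to a map from file path to PUID; the `countDiffs` helper and the sample data are hypothetical.

```go
package main

import "fmt"

// countDiffs returns the number of files whose identification differs
// between two result sets. A file present in only one set counts as a
// difference. Hypothetical helper, not taken from the benchmark tooling.
func countDiffs(a, b map[string]string) int {
	diffs := 0
	for path, idA := range a {
		if idB, ok := b[path]; !ok || idA != idB {
			diffs++
		}
	}
	for path := range b {
		if _, ok := a[path]; !ok {
			diffs++
		}
	}
	return diffs
}

func main() {
	// Made-up file names and PUIDs, for illustration only.
	master := map[string]string{"a.pdf": "fmt/18", "b.png": "fmt/11"}
	develop := map[string]string{"a.pdf": "fmt/18", "b.png": "fmt/11"}
	fmt.Printf("The tools differed in output for %d files.\n", countDiffs(master, develop))
}
```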

Govdocs (Selected)

A selection from the Govdocs1 corpus comprising 26,124 files (31.4 GB). It represents typical office formats, including approx. 15,000 PDFs. Originally sourced from http://openpreservation.org/blog/2012/07/26/1-million-21000-reducing-govdocs-significantly/.

Results
Tool | Description | Duration
master | Master branch of github.com/richardlehane/siegfried. Corresponds to latest production release. | 1m38.255584323s
develop | Develop branch of github.com/richardlehane/siegfried. Tip of development and potentially unstable. | 1m21.271394891s

The tools differed in output for 0 files in the corpus.

The Deluxe

This benchmark checks multi-ID identification using the deluxe.sig signature file, which contains four identifiers: PRONOM, LOC FDDs, freedesktop.org and tika-mimetypes. It is run against the PRONOM files corpus.

Results
Tool | Description | Duration
master | Master branch of github.com/richardlehane/siegfried. Corresponds to latest production release. | 8.164152783s
develop | Develop branch of github.com/richardlehane/siegfried. Tip of development and potentially unstable. | 4.291704763s

The tools differed in output for 0 files in the corpus.

Unzipping

This benchmark checks the `sf -z` command (which scans within zip files and other container formats) when run against the iPRES Systems Showcase corpus.

Results
Tool | Description | Duration
master | Master branch of github.com/richardlehane/siegfried. Corresponds to latest production release. | 24.56926464s
develop | Develop branch of github.com/richardlehane/siegfried. Tip of development and potentially unstable. | 19.338825617s

The tools differed in output for 0 files in the corpus.

Profile

Profiler information for the siegfried development branch.

History

2023-05-14 20:50:10 +0000 UTC

2023-05-07 19:25:16 +0000 UTC

2023-03-24 20:27:43 +0000 UTC

2023-03-11 16:48:08 +0000 UTC

2023-03-10 22:10:16 +0000 UTC

2022-11-06 16:53:56 +0000 UTC

2022-09-07 13:25:46 +0000 UTC

2022-09-07 12:59:56 +0000 UTC

2022-09-07 12:14:22 +0000 UTC

2022-07-17 19:26:42 +0000 UTC

2022-06-07 15:07:18 +0000 UTC

2022-06-07 14:38:50 +0000 UTC

2022-05-18 19:58:51 +0000 UTC

2022-05-18 13:58:06 +0000 UTC

2022-02-06 20:51:52 +0000 UTC

2022-02-06 20:48:37 +0000 UTC

2022-02-06 18:54:59 +0000 UTC

2022-02-06 18:36:53 +0000 UTC

2022-02-05 13:38:23 +0000 UTC

2022-02-04 10:02:54 +0000 UTC

2022-02-03 22:16:55 +0000 UTC

2022-02-01 21:33:38 +0000 UTC

2020-10-11 09:04:51 +0000 UTC

2020-10-07 17:26:52 +0000 UTC

2020-10-06 17:36:19 +0000 UTC

2020-10-05 17:50:39 +0000 UTC

2020-09-22 19:53:21 +0000 UTC

2020-09-21 23:57:24 +0000 UTC

2020-09-21 23:37:01 +0000 UTC

2020-09-13 06:20:48 +0000 UTC

2020-09-13 06:02:39 +0000 UTC

2020-09-10 04:30:15 +0000 UTC

2020-09-09 03:41:50 +0000 UTC

2020-09-09 03:10:37 +0000 UTC

2020-09-08 02:53:23 +0000 UTC

2020-09-01 07:40:03 +0000 UTC

2020-08-31 01:09:52 +0000 UTC

2020-06-22 03:26:07 +0000 UTC

2020-06-15 15:01:05 +0000 UTC

2020-06-15 05:06:03 +0000 UTC

2020-04-22 02:23:19 +0000 UTC

2020-04-05 02:51:15 +0000 UTC

2020-02-25 03:51:55 +0000 UTC

2020-02-25 03:33:51 +0000 UTC

2020-02-12 20:27:13 +0000 UTC

2020-02-11 21:12:06 +0000 UTC

2020-01-21 23:35:20 +0000 UTC

2020-01-14 22:18:52 +0000 UTC

2020-01-10 11:12:45 +0000 UTC

2020-01-01 21:40:32 +0000 UTC

2019-12-30 14:49:43 +0000 UTC

2019-12-30 14:36:24 +0000 UTC

2019-12-11 21:22:13 +0000 UTC

2019-12-05 21:52:59 +0000 UTC

2019-08-18 13:54:03 +0000 UTC

2019-08-15 17:23:20 +0000 UTC

2019-07-29 18:52:25 +0000 UTC

2019-06-29 18:08:46 +0000 UTC

2019-06-15 10:53:03 +0000 UTC

2019-06-06 18:45:34 +0000 UTC

2019-02-23 20:58:08 +0000 UTC

2019-02-16 10:33:59 +0000 UTC

2019-02-03 11:12:55 +0000 UTC

2018-09-19 01:50:01 +0000 UTC

2018-08-27 06:10:52 +0000 UTC

2018-07-27 05:54:19 +0000 UTC