Siegfried development benchmarks

Sat, 23 Feb 2019 20:58:08 UTC

Environment

These benchmarks were automatically run on a t1.small.x86 machine provisioned from https://www.packet.net/.

Specs for the t1.small.x86: 4 Physical Cores @ 2.4 GHz; 8 GB DDR3 RAM; 80 GB SSD.

You can inspect the commands that were run to generate these benchmarks here.

Tools

Tool | Version
master | siegfried 1.7.11 /root/siegfried/default.sig (2019-02-16T11:09:29+01:00) identifiers: - pronom: DROID_SignatureFile_V94.xml; container-signature-20180917.xml
develop | siegfried 1.7.11 /root/siegfried/default.sig (2019-02-16T11:09:29+01:00) identifiers: - pronom: DROID_SignatureFile_V94.xml; container-signature-20180917.xml

iPRES Systems Showcase

A corpus created for the 2014 iPRES conference comprising 2,206 files (5GB). Represents a range of formats, including AV and some uncommon types. Sourced from http://www.webarchive.org.uk/datasets/ipres.ds.1/

Results

Tool | Description | Duration
master | Master branch of github.com/richardlehane/siegfried. Corresponds to latest production release. | 39.446260352s
develop | Develop branch of github.com/richardlehane/siegfried. Tip of development and potentially unstable. | 39.306189165s

The tools differed in output for 0 files in the corpus.

Raw output

PRONOM files

A corpus created by Greg Lepore and comprising 1,205 files (2.1GB). Includes a single sample for as many of the PRONOM IDs (PUIDs) as Greg could find.

Results

Tool | Description | Duration
master | Master branch of github.com/richardlehane/siegfried. Corresponds to latest production release. | 7.734800343s
develop | Develop branch of github.com/richardlehane/siegfried. Tip of development and potentially unstable. | 7.734544346s

The tools differed in output for 0 files in the corpus.

Raw output

Govdocs (Selected)

A selection from the Govdocs1 corpus comprising 26,124 files (31.4GB). Represents typical office formats, including approx. 15,000 PDFs. Originally sourced from http://openpreservation.org/blog/2012/07/26/1-million-21000-reducing-govdocs-significantly/

Results

Tool | Description | Duration
master | Master branch of github.com/richardlehane/siegfried. Corresponds to latest production release. | 6m33.799594666s
develop | Develop branch of github.com/richardlehane/siegfried. Tip of development and potentially unstable. | 6m32.84614645s

The tools differed in output for 0 files in the corpus.

Raw output

The Deluxe

This benchmark checks multi-ID identification using the deluxe.sig signature file, which contains four identifiers: PRONOM, LOC FDDs, freedesktop.org and tika-mimetypes. It is run against the PRONOM files corpus.

Results

Tool | Description | Duration
master | Master branch of github.com/richardlehane/siegfried. Corresponds to latest production release. | 14.243684559s
develop | Develop branch of github.com/richardlehane/siegfried. Tip of development and potentially unstable. | 14.162408331s

The tools differed in output for 0 files in the corpus.

Raw output

Profile

Profiler information for siegfried development branch

History

2019-02-23 20:58:08 +0000 UTC

2019-02-18 20:08:35 +0000 UTC

2019-02-16 10:33:59 +0000 UTC

2019-02-14 20:35:53 +0000 UTC

2019-02-03 11:12:55 +0000 UTC

2018-10-10 11:24:35 +0000 UTC

2018-09-19 01:50:01 +0000 UTC

2018-08-30 07:36:32 +0000 UTC

2018-08-27 06:10:52 +0000 UTC

2018-08-21 06:00:13 +0000 UTC

2018-07-27 05:54:19 +0000 UTC