In my recent updates to this site I’ve added a new “Chart your results” tool on the siegfried page (in the right hand panel under “Try Siegfried”). This tool produces single page reports like this: https://www.itforarchivists.com/siegfried/results/ea1zaj.
Before covering this tool in detail let’s recap some of the existing ways you can already analyse your results.
I appreciate that not everyone is a command-line junkie, but the way I inspect results is just to use sf’s -log flag. If you do
sf -log chart (or
-log c) you can make simple format charts:
(In these examples I add “o” to my log options to direct logging output to STDOUT… otherwise you’ll see it in STDERR).
A chart can be a starting point for deeper analysis e.g. inspecting lists of files of a particular format:
You can also inspect lists of unknowns with
-log u and warnings with
Rather than re-run the format identification job with every step, you can pair these commands with the
-replay flag to run them against a pre-generated results file instead. I cover this workflow in detail in the siegfried wiki.
These tools both do a lot more than simple chart generation. E.g. DROID-SF can create a “Rogues Gallery” of all your problematic files. Brunnhilde has a GUI, does virus scanning, and can also run bulk_extractor against your files. I’d definitely encourage you to check both of these tools out!
If your needs are a little bit simpler, and you just want a chart, then my new “Chart your results” tool might be a good fit.
To try this tool, go to the siegfried page and upload a results file in the “Chart my results” form in the right-hand panel.
Let’s run through some of its features:
Probably the distinguishing feature of this tool is that you can easily share your analysis with colleagues, or with the digital preservation community broadly, by “publishing” your results. This gives you a permanent URL (like https://www.itforarchivists.com/siegfried/results/ea1zaj) and stores your results on the site. Prior to publication you can opt to “redact” your filenames if they contain sensitive information. I’ve added a privacy section to this site to address some of the privacy questions raised by this feature in a little more detail.
That’s it, please use it, and if you like it tweet your results!
Version 1.7.5 of siegfried is now available. Get it here.
The headline feature of this release is new functionality for the
sf -update command requested by Ross Spencer. You can now use the
-update flag to download or update non-PRONOM signatures with a choice of LOC FDD, two flavours of MIMEInfo (Apache Tika’s MIMEInfo and freedesktop.org), and archivematica (latest PRONOM + archivematica extensions) signatures. There are two combo options as well: PRONOM/Tika/LOC and the Ross Spencer “deluxe” (PRONOM/Tika/freedesktop.org/LOC).
PRONOM remains the default, so if you just do
sf -update it will work as before.
To go non-PRONOM, include one of “loc”, “tika”, “freedesktop”, “pronom-tika-loc”, “deluxe” or “archivematica” as an argument after the flags e.g.
sf -update freedesktop. This command will overwrite ‘default.sig’ (the default signature file that sf loads).
You can preserve your default signature file by providing an alternative
-sig target: e.g.
sf -sig notdefault.sig -update loc. If you use one of the signature options as a filename (with or without a .sig extension), you can omit the signature argument i.e.
sf -update -sig loc.sig is equivalent to
sf -sig loc.sig -update loc.
sf -updatenow does SHA-256 hash verification of updates and communication with the update server is via HTTPS
So I’ve updated this site. Apologies to those who liked the old look, which used the retro-Windows BOOTSTRA.386 theme. I liked it too. But it didn’t have great readability. The new CSS is basic, but it is super lightweight and I can tinker with it. I’m using Yahoo’s Pure CSS framework.
Under the hood, I’ve simplified the site a bit too. It used to be a monolithic golang app, served on appengine, but now most of the static content is generated by Hugo. The golang app parts now mostly just power a few small webservices (the update server, the try siegfried service, the sets tool, and the new chart tool). You can explore most of these services on the siegfried page.
The site’s code is up on github.