Thread

Posted on Fri Jul 13 16:09:20 2007 by mjc
daemon mode for exiftool

Hi all,

I want to call exiftool from a C image viewer program (gThumb) to get the metadata for hundreds of images (as part of the thumbnailing process).

exiftool works extremely well, except that it takes ~0.3s per file. I think most of that time is spent initializing the program and Perl rather than actually generating the metadata.

Phil, have you considered adding a "daemon mode" to exiftool, so that it could be launched once and run persistently? Has anyone experimented with that?

- Mike

Direct Responses: 5700 | 5701
Posted on Fri Jul 13 16:11:51 2007 by andyarmstrong in response to 5699
Re: daemon mode for exiftool
How about embedding Perl in gThumb? Failing that I'm sure an exif extraction server could be written in Perl.
Direct Responses: 5702 | 5705
Posted on Fri Jul 13 16:19:41 2007 by exiftool in response to 5699
Re: daemon mode for exiftool
Hi Mike,

The exiftool script does have powerful multi-file processing abilities. A daemon would only be useful if you want interactive processing of multiple files. Right now to do this you would need to write your own script using calls to the ExifTool functions. It wouldn't be too difficult to set this up if you know a bit of Perl, and would definitely avoid the start-up cost of loading Perl, ExifTool and all the associated libraries. But in the end I'm not sure how much this will speed things up. You could run some tests on your system by processing a large number of files in a single directory to see if the speed benefits will be worth it.

Exiftool used to be a lot quicker, but for each new piece of information that it extracts, it slows down just a little bit more. And now the amount of information extracted from some images is really crazy.

- Phil
Posted on Fri Jul 13 16:22:40 2007 by mjc in response to 5700
Re: daemon mode for exiftool

Sure, embedding Perl in gThumb is an option, but it seems a bit hackish and fraught with peril (i.e., packaging and maintenance problems).

A Perl server calling the exiftool libs would work, but it would duplicate a lot of code from the CLI tool, causing maintenance/tracking pain. At least the C and Perl would be cleanly separated.

Adding it into the standard exiftool CLI tool seems to be the most elegant (add a --daemon option). Maybe it introduces cross-platform issues, I don't know. It would let other similar programs call exiftool in a speedy manner, though.

-Mike

Direct Responses: 5703
Posted on Fri Jul 13 16:35:41 2007 by exiftool in response to 5702
Re: daemon mode for exiftool
Hi Mike,

Ah, so you are using the stand-alone Windows version. Right.

Try running the tests I suggested, and if you see enough speed benefit then I will look into adding a CLI option for you.

- Phil
Direct Responses: 5706
Posted on Fri Jul 13 17:00:13 2007 by mjc in response to 5703
Re: daemon mode for exiftool

Using a directory of mixed files (jpgs, tiffs, RAWs, other), I get these benchmarks:

ls -1 | xargs -n1 exiftool -S -a -e -G1

~ 35 seconds

ls -1 | xargs -n100 exiftool -S -a -e -G1

~ 13 seconds

Calling exiftool once per file takes roughly 2.7 times as long as passing it 100 files at a time (35 s vs. 13 s). So the initialization overhead is pretty severe.

-Mike
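
The comparison above can be reproduced with a small script. This is a minimal sketch, not the exact commands from the post: the function names `run_one_at_a_time` and `run_batch` and the `EXIFTOOL` override variable are hypothetical and introduced only for illustration, while the exiftool options are the ones quoted above. It also passes the whole file list in one call instead of batches of 100, and takes file names as arguments rather than through `ls -1 | xargs`, which avoids trouble with spaces in file names.

```shell
# Assumption: EXIFTOOL names the command to run; it defaults to "exiftool"
# and can be pointed at a stand-in command for a dry run.
EXIFTOOL="${EXIFTOOL:-exiftool}"

# Start one exiftool process per file, so Perl and ExifTool
# are initialized again for every file.
run_one_at_a_time() {
    for f in "$@"; do
        "$EXIFTOOL" -S -a -e -G1 "$f"
    done
}

# Start a single exiftool process for the whole list, so the
# start-up cost is paid only once.
run_batch() {
    "$EXIFTOOL" -S -a -e -G1 "$@"
}
```

In bash, where `time` is a shell keyword and can time functions, running `time run_one_at_a_time * >/dev/null` and then `time run_batch * >/dev/null` in the test directory should give the two figures to compare.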

Direct Responses: 5707
Posted on Fri Jul 13 17:32:23 2007 by exiftool in response to 5706
Re: daemon mode for exiftool
Hi Mike,

Thanks. I'll let you know what I come up with. (it may be a few days as I'm quite busy at the moment.)

- Phil
Direct Responses: 5708
Posted on Fri Jul 13 17:47:13 2007 by mjc in response to 5707
Re: daemon mode for exiftool

Hmm, I did another benchmark on a more typical folder of jpgs and avis from a digital camera (217 files, 2007-06 folder) and the benchmarks were:

one at a time: 1m 30s

batch mode: 1m 10s

which suggests there is not that much to be gained by a daemon mode. I'm a little puzzled why this folder shows much less improvement than the other one, which had a strange mix of raw/tiff/jpg/pdf/other files... I guess more research is needed before bothering with daemons.

-Mike

Direct Responses: 5709
Posted on Fri Jul 13 17:54:55 2007 by mjc in response to 5708
Re: daemon mode for exiftool

One last insight... the two benchmarks may differ because of the processing speed with unknown file types.

For instance, if I run exiftool one-at-a-time on 47 *.eml (Thunderbird email) files, it takes 50 seconds. In batch mode, it takes 8 seconds!

Food for thought...

-Mike

Direct Responses: 5710
Posted on Fri Jul 13 18:03:53 2007 by exiftool in response to 5709
Re: daemon mode for exiftool
I should have thought of this.

When exiftool finds an unknown file, it must load all the modules one-by-one until it discovers what the file format is. And if it isn't a supported format, it has to load ALL modules. This gives you the highest possible initial overhead.

- Phil
Direct Responses: 5711 | 5713
Posted on Fri Jul 13 18:13:04 2007 by exiftool in response to 5710
Re: daemon mode for exiftool
(of course, exiftool avoids this overhead by only processing recognized file extensions in a directory unless you specifically tell it to process another type of file.)
Posted on Fri Jul 13 18:26:47 2007 by mjc in response to 5710
Re: daemon mode for exiftool

Hmm. Interesting. gThumb already knows the mime type of the file when it wants metadata. Could the known mime type be supplied to exiftool as a parameter to speed up processing (ideally exiftool would quickly quit if it didn't recognize the specified mime type)?

I could add an array of mime types that exiftool is known to support, but then I'd have to manually update the program each time new file support was added.

We'd have to agree on mime type names for RAW files - I'm not sure if they are all standardized (some are).

- Mike

Direct Responses: 5714
Posted on Fri Jul 13 18:45:01 2007 by exiftool in response to 5713
Re: daemon mode for exiftool
You can get a list of supported extensions by typing

exiftool -listf

I don't provide a list of MIME types, and there would be problems with doing so:

ExifTool reports all RAW image formats as "image/x-raw" MIME type. If there are accepted MIME types that are more specific than this, perhaps I should be using them...

- Phil
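
One way to avoid handing exiftool unsupported files, without hard-coding a list in the calling program, might be to read the supported extensions from `exiftool -listf` at start-up and filter on them. A rough sketch follows. It assumes the output of `-listf` is a one-line heading followed by whitespace-separated extensions, which should be checked against the installed version; the function names and the `EXIFTOOL` variable are hypothetical.

```shell
# Assumption: EXIFTOOL names the command to run (defaults to "exiftool").
EXIFTOOL="${EXIFTOOL:-exiftool}"

# Print the supported extensions one per line, in upper case.
# Assumption: the first line of "exiftool -listf" is a heading and the
# remaining lines hold whitespace-separated extensions.
supported_extensions() {
    "$EXIFTOOL" -listf | tail -n +2 | tr -s ' \t' '\n' \
        | tr '[:lower:]' '[:upper:]' | grep -v '^$'
}

# Succeed if the extension of file $1 appears in the list $2
# (one extension per line, as printed by supported_extensions).
is_supported() {
    case "$1" in
        *.*) ext=${1##*.} ;;
        *)   return 1 ;;    # no extension at all
    esac
    ext=$(printf '%s' "$ext" | tr '[:lower:]' '[:upper:]')
    printf '%s\n' "$2" | grep -qxF "$ext"
}
```

A caller could run `supported_extensions` once, keep the result in a variable, and skip any file for which `is_supported "$file" "$list"` fails, so the list stays current whenever exiftool is upgraded.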
Direct Responses: 5717
Posted on Fri Jul 13 21:08:10 2007 by mjc in response to 5714
Re: daemon mode for exiftool

Thanks for all the comments, Phil!

I can gain speed by being more careful about not feeding exiftool unsupported formats. The only formats that gThumb supports that exiftool doesn't are the high dynamic range types (OpenEXR and Radiance rgbe - any plans for those? I don't even know if they carry metadata...).

The daemon idea probably wouldn't gain much speed after all.

- Mike

Direct Responses: 5718
Posted on Fri Jul 13 22:21:59 2007 by exiftool in response to 5717
Re: daemon mode for exiftool
Hi Mike,

Great. Glad this helps.

I've never heard of those formats, and haven't had any requests to support them. But unless they contain metadata, there isn't much reason to do it (except maybe to extract the image dimensions or something like that).

- Phil
Direct Responses: 5719 | 5720
Posted on Fri Jul 13 22:36:28 2007 by exiftool in response to 5718
Re: daemon mode for exiftool
You got me thinking so I ran a quick test on my system here. I timed the following two commands:

exiftool -listf
exiftool -listg

The way ExifTool is implemented, the first command doesn't need to load any of the format-specific modules, and it takes 0.100 sec on my system. The second command loads all modules to determine the full group list, and takes 0.677 sec here. The difference is mainly due to the time required to load all the modules, but there is a bit more CPU work done by -listg, so I hacked the code to remove this and the time dropped to 0.622 sec. So on my system (a 1.83 GHz Intel Core Duo), the time to load all modules is 0.522 seconds. That's pretty hefty. (...and you want me to add more?... hehe)

- Phil
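
The subtraction behind the 0.522-second figure can be written out as a small helper. This is only a sketch: the function name `module_load_time` is hypothetical, and the two arguments are the timings quoted in the post (the modified `-listg` run and the `-listf` run).

```shell
# Print the difference between a run that loads every module ($1)
# and a run that loads none of them ($2), in seconds.
module_load_time() {
    awk -v all="$1" -v none="$2" 'BEGIN { printf "%.3f\n", all - none }'
}

module_load_time 0.622 0.100   # prints 0.522
```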
Posted on Fri Jul 13 22:50:08 2007 by mjc in response to 5718
Re: daemon mode for exiftool

OpenEXR does have metadata, but I don't know anyone who uses the format. So it's just a curiosity...

- Mike
