|
Posted on Fri Jul 13 16:09:20 2007
by mjc
|
| daemon mode for exiftool |
Hi all,
I want to call exiftool from a C image viewer program (gThumb) to get the metadata for hundreds of images (as part of the thumbnailing process).
exiftool works extremely well, except that it takes ~0.3s per file. I think most of that time is spent initializing the program and Perl rather than generating the metadata.
Phil, have you considered adding a "daemon mode" to exiftool, so that it could be launched once and run persistently? Has anyone experimented with that?
- Mike
|
|
|
Posted on Fri Jul 13 16:11:51 2007
by andyarmstrong
in response to 5699
|
| Re: daemon mode for exiftool |
|
How about embedding Perl in gThumb?
Failing that, I'm sure an EXIF extraction server could be written in Perl.
|
|
|
Posted on Fri Jul 13 16:19:41 2007
by exiftool
in response to 5699
|
| Re: daemon mode for exiftool |
|
Hi Mike,
The exiftool script does have powerful multi-file processing abilities.
A daemon would only be useful if you want interactive processing
of multiple files. Right now to do this you would need to write your
own script using calls to the ExifTool functions. It wouldn't be too
difficult to set this up if you know a bit of Perl, and would definitely
avoid the start-up cost of loading Perl, ExifTool and all the associated
libraries. But in the end I'm not sure how much this will speed things
up. You could run some tests on your system by processing a large
number of files in a single directory to see if the speed benefits
will be worth it.
Exiftool used to be a lot quicker, but for each new piece of information
that it extracts, it slows down just a little bit more. And now the
amount of information extracted from some images is really crazy.
- Phil
|
|
|
Posted on Fri Jul 13 16:22:40 2007
by mjc
in response to 5700
|
| Re: daemon mode for exiftool |
Sure, embedding Perl in gThumb is an option, but it seems a bit hackish and fraught with peril (i.e., packaging and maintenance problems).
A Perl server calling the exiftool libs would work, but it would duplicate a lot of code in the CLI tool, causing maintenance/tracking pain. At least the C and Perl would be cleanly separated.
Adding it to the standard exiftool CLI tool (as a --daemon option) seems the most elegant. Maybe it introduces cross-platform issues, I don't know. It would let other similar programs call exiftool in a speedy manner, though.
-Mike
|
|
|
Posted on Fri Jul 13 16:35:41 2007
by exiftool
in response to 5702
|
| Re: daemon mode for exiftool |
|
Hi Mike,
Ah, so you are using the stand-alone Windows version. Right.
Try running the tests I suggested, and if you see enough speed benefit then
I will look into adding a CLI option for you.
- Phil
|
|
|
Posted on Fri Jul 13 17:00:13 2007
by mjc
in response to 5703
|
| Re: daemon mode for exiftool |
Using a directory of mixed files (jpgs, tiffs, RAWs, other), I get these benchmarks:
ls -1 | xargs -n1 exiftool -S -a -e -G1
~ 35 seconds
ls -1 | xargs -n100 exiftool -S -a -e -G1
~ 13 seconds
Calling exiftool once per file is about 2.7 times slower (35 s vs. 13 s) than passing it 100 files at a time. So the initialization overhead is pretty severe.
-Mike
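
For anyone who wants to repeat this comparison from a program rather than from the shell, here is a minimal sketch in Python. It only mimics what `xargs -n1` and `xargs -n100` do: it groups the file names into batches and launches exiftool once per batch. The flags are the ones from the commands above; the helper names (`chunked`, `build_commands`, `time_batches`) are made up for illustration, and the sketch assumes an `exiftool` executable is on the PATH.

```python
import subprocess
import time
from typing import Iterator, List, Sequence

# Flags copied from the benchmark commands in the post above.
EXIFTOOL_FLAGS = ["-S", "-a", "-e", "-G1"]


def chunked(paths: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield successive groups of at most `size` paths, like `xargs -n size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


def build_commands(paths: Sequence[str], batch_size: int,
                   exe: str = "exiftool") -> List[List[str]]:
    """Build one exiftool command line per batch of files."""
    return [[exe, *EXIFTOOL_FLAGS, *batch] for batch in chunked(paths, batch_size)]


def time_batches(paths: Sequence[str], batch_size: int,
                 exe: str = "exiftool") -> float:
    """Run every batch and return the total wall-clock time in seconds.

    Output is discarded; only the elapsed time is of interest here.
    """
    start = time.perf_counter()
    for cmd in build_commands(paths, batch_size, exe):
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start
```

Calling `time_batches(files, 1)` and then `time_batches(files, 100)` on the same list of files should reproduce the two cases measured above, although the absolute numbers will depend on the machine and on the mix of file types.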
|
|
|
Posted on Fri Jul 13 17:32:23 2007
by exiftool
in response to 5706
|
| Re: daemon mode for exiftool |
|
Hi Mike,
Thanks. I'll let you know what I come up with.
(it may be a few days as I'm quite busy at the
moment.)
- Phil
|
|
|
Posted on Fri Jul 13 17:47:13 2007
by mjc
in response to 5707
|
| Re: daemon mode for exiftool |
Hmm, I did another benchmark on a more typical folder of JPEGs and AVIs from a digital camera (217 files, 2007-06 folder) and the results were:
one at a time: 1m 30s
batch mode: 1m 10s
which suggests there is not that much to be gained by a daemon mode. I'm a little puzzled why this folder shows much less improvement than the other one, which had a strange mix of raw/tiff/jpg/pdf/other files... I guess more research is needed before bothering with daemons.
-Mike
|
|
|
Posted on Fri Jul 13 17:54:55 2007
by mjc
in response to 5708
|
| Re: daemon mode for exiftool |
One last insight... the two benchmarks may differ because of the processing speed with unknown file types.
For instance, if I run exiftool one-at-a-time on 47 *.eml (Thunderbird email) files, it takes 50 seconds. In batch mode, it takes 8 seconds!
Food for thought...
-Mike
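
A rough way to turn these timings into a per-launch start-up cost is sketched below. It assumes the per-file work is identical in both runs, so the whole difference comes from the extra launches, and it assumes the batch runs used groups of 100 files as in the earlier xargs benchmark (the posts do not say so explicitly). The function name is hypothetical.

```python
import math


def per_launch_overhead(n_files: int, one_at_a_time_s: float,
                        batch_s: float, batch_size: int = 100) -> float:
    """Estimate the start-up cost (seconds) of one extra exiftool launch.

    Assumption: both runs do the same per-file work, so the time
    difference is caused only by the additional launches.
    """
    batch_launches = math.ceil(n_files / batch_size)
    extra_launches = n_files - batch_launches
    if extra_launches < 1:
        raise ValueError("the batch run must use fewer launches than files")
    return (one_at_a_time_s - batch_s) / extra_launches


# 47 .eml files: 50 s one at a time, 8 s in batch mode.
print(round(per_launch_overhead(47, 50.0, 8.0), 2))     # 0.91

# 217 camera files: 1m 30s one at a time, 1m 10s in batch mode.
print(round(per_launch_overhead(217, 90.0, 70.0), 2))   # 0.09
```

Under these assumptions, each launch on an unrecognized file costs roughly ten times more than a launch on a typical camera file, which fits the idea that unknown file types are what make the start-up expensive.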
|
|
|
Posted on Fri Jul 13 18:03:53 2007
by exiftool
in response to 5709
|
| Re: daemon mode for exiftool |
|
I should have thought of this.
When exiftool finds an unknown file, it must load all the modules one-by-one
until it discovers what the file format is. And if it isn't a supported format,
it has to load ALL modules. This gives you the highest possible initial overhead.
- Phil
|
|
|
Posted on Fri Jul 13 18:13:04 2007
by exiftool
in response to 5710
|
| Re: daemon mode for exiftool |
|
(Of course, exiftool avoids this overhead when processing a directory: it only
reads files with recognized extensions unless you specifically tell it to
process another type of file.)
|
|
|
Posted on Fri Jul 13 18:26:47 2007
by mjc
in response to 5710
|
| Re: daemon mode for exiftool |
Hmm. Interesting. gThumb already knows the MIME type of the file when it wants metadata. Could the known MIME type be supplied to exiftool as a parameter to speed up processing (ideally exiftool would quit quickly if it didn't recognize the specified MIME type)?
I could add an array of MIME types that exiftool is known to support, but then I'd have to manually update the program each time new file support was added.
We'd have to agree on MIME type names for RAW files - I'm not sure if they are all standardized (some are).
- Mike
|
|
|
Posted on Fri Jul 13 18:45:01 2007
by exiftool
in response to 5713
|
| Re: daemon mode for exiftool |
|
You can get a list of supported extensions by typing
exiftool -listf
I don't give a list of MIME types, but there would be problems with
doing this:
ExifTool reports all RAW image formats as "image/x-raw" MIME type. If there
are accepted MIME types that are more specific than this, perhaps I
should be using them...
- Phil
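
One way a calling program could use the -listf output is sketched below: read the extension list once at start-up, then skip any file whose extension is not in it. This is only a sketch. It assumes the output is a heading line ending in a colon followed by whitespace-separated extensions, which should be checked against the installed version, and all function names are made up for illustration.

```python
import os
import subprocess
from typing import Iterable, List, Set


def parse_extensions(listf_output: str) -> Set[str]:
    """Collect upper-cased extensions from `exiftool -listf` style text.

    Assumption: heading lines end with ':' and every other non-empty
    line holds whitespace-separated extensions.
    """
    exts: Set[str] = set()
    for line in listf_output.splitlines():
        line = line.strip()
        if not line or line.endswith(":"):
            continue
        exts.update(token.upper() for token in line.split())
    return exts


def supported_extensions(exe: str = "exiftool") -> Set[str]:
    """Ask the installed exiftool for its list of supported extensions."""
    result = subprocess.run([exe, "-listf"], capture_output=True,
                            text=True, check=True)
    return parse_extensions(result.stdout)


def filter_supported(paths: Iterable[str], exts: Set[str]) -> List[str]:
    """Keep only the paths whose extension appears in `exts`."""
    kept = []
    for path in paths:
        ext = os.path.splitext(path)[1].lstrip(".").upper()
        if ext in exts:
            kept.append(path)
    return kept
```

Because the list comes from the installed exiftool at run time, the caller would not need updating by hand each time support for a new file type is added.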
|
|
|
Posted on Fri Jul 13 21:08:10 2007
by mjc
in response to 5714
|
| Re: daemon mode for exiftool |
Thanks for all the comments, Phil!
I can gain speed by being more careful about not feeding exiftool unsupported formats. The only formats that gThumb supports and exiftool doesn't are the high dynamic range types (OpenEXR and Radiance RGBE - any plans for those? I don't even know if they carry metadata...).
The daemon idea probably wouldn't gain much speed after all.
- Mike
|
|
|
Posted on Fri Jul 13 22:21:59 2007
by exiftool
in response to 5717
|
| Re: daemon mode for exiftool |
|
Hi Mike,
Great. Glad this helps.
I've never heard of those formats, and haven't had any requests to support
them. But unless they contain metadata, there isn't much reason to do
it (except maybe to extract the image dimensions or something like that).
- Phil
|
|
|
Posted on Fri Jul 13 22:36:28 2007
by exiftool
in response to 5718
|
| Re: daemon mode for exiftool |
|
You got me thinking so I ran a quick test on my system here.
I timed the following two commands:
exiftool -listf
exiftool -listg
The way ExifTool is implemented, the first command doesn't need to load
any of the format-specific modules, and it takes 0.100 sec on my
system. The second command loads all modules to determine
the full group list, and takes 0.677 sec here. The difference is mainly
due to the time required to load all the modules, but there is a bit
more CPU work done by -listg, so I hacked the code to remove this
and the time dropped to 0.622 sec. So on my system (a 1.83 GHz Intel
Core Duo), the time to load all modules is 0.522 seconds. That's pretty
hefty. (...and you want me to add more?... hehe)
- Phil
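
The arithmetic behind these figures, written out as a short Python snippet. The three timings are the ones quoted above; the variable names are just for illustration.

```python
# Timings quoted in the post above, in seconds.
LISTF_S = 0.100          # exiftool -listf: no format-specific modules loaded
LISTG_S = 0.677          # exiftool -listg: all modules loaded
LISTG_HACKED_S = 0.622   # -listg with its extra CPU work removed

# Time attributable to loading all the modules.
module_load_s = LISTG_HACKED_S - LISTF_S
# Extra CPU work that -listg normally does on top of loading.
extra_cpu_s = LISTG_S - LISTG_HACKED_S

print(f"module load: {module_load_s:.3f} s")                      # module load: 0.522 s
print(f"extra -listg work: {extra_cpu_s:.3f} s")                   # extra -listg work: 0.055 s
print(f"share of hacked run: {module_load_s / LISTG_HACKED_S:.0%}")  # share of hacked run: 84%
```

So on this system about 84% of the 0.622 s worst-case start-up goes to loading modules, and only 0.100 s is left for everything else.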
|
|
|
Posted on Fri Jul 13 22:50:08 2007
by mjc
in response to 5718
|
| Re: daemon mode for exiftool |
OpenEXR does have metadata, but I don't know anyone who uses the format. So it's just a curiosity...
- Mike
|
|