Any tips for speeding up large catalog?

Discussions, questions, comments and suggestions regarding Capture One PRO, Capture One PRO For Sony / Fujifilm, Capture One DB and Capture One Express For Sony / For Fujifilm 12.x for Mac

Any tips for speeding up large catalog?

Postby NNN637069218598050848 » Thu Oct 17, 2019 2:14 pm

Hello everyone,
I have a catalog with 15,000 photos. Just opening All Images, let alone filtering, makes things slow to a crawl, especially when first opening the program. Previews are already generated.


I have an overclocked i5-8600K and 16 GB of RAM, the catalog is on an SSD, and the storage drives run in parallel with DrivePool, so they should be faster than a normal single HDD.

REALLYYYYYY want to leave Lightroom but good Lord it's so much faster at viewing large catalogs
Last edited by NNN637069218598050848 on Sat Oct 19, 2019 6:44 am, edited 1 time in total.
NNN637069218598050848
 
Posts: 2
Joined: Thu Oct 17, 2019 4:10 pm

Re: Any tips for speeding up large catalog?

Postby tenmangu81 » Thu Oct 17, 2019 3:50 pm

Hi,

You should think about leaving "All Images" and going to some album or smaller set before quitting C1. Otherwise, it can take some time when launching it.
For my part, I have a 22k+ image catalog, and going to "All Images" from a smaller set, once C1 is open, takes 5 seconds. Filtering takes less than 2 seconds. Some on this forum suggest moving the Filters tool off the Library tool tab; it could help. For me, I have left it as it is (in the Library tool tab), and everything is almost as smooth and fast as Lightroom.
Robert
tenmangu81
 
Posts: 861
Joined: Thu Dec 04, 2014 6:33 pm
Location: Paris, France

Re: Any tips for speeding up large catalog?

Postby Eric Nepean » Fri Oct 18, 2019 6:59 am

Remove the Filter Tool from the Library tool tab.

If you have many images with unique metadata, for example a Title, Caption or Original File ID that is different for each image, the Filter tool takes a long time to index these fields, and it reindexes frequently.

I have put Filter tool on a custom tool tab.

I've also removed some of the Metadata that was unique for each image, and which I never used.

I wouldn't say that my 16000 image catalog is fast, but it doesn't crawl.
Eric Nepean
 
Posts: 611
Joined: Sat Oct 25, 2014 8:02 am
Location: Ottawa

Re: Any tips for speeding up large catalog?

Postby SFA » Fri Oct 18, 2019 10:40 am

The other thing to consider is which Metadata fields are currently set for active use in the Filters tool.

It may be possible, based on individual requirements, to fine tune the fields that are always active and so retain the most used values but still activate other fields as and when required.

As Eric has said, unique-value fields do not make logical pre-calculated filters - you may as well just search.

This can also be true for sorting of course, especially in combination with filters and active searches. Some combinations of settings for Smart albums might be challenging in that respect if a lot of them are being updated at the same time - i.e. actively being reassessed when a session or catalogue is opened and generating a significant load on system resources.

For the databases (using industry-standard DB systems) that I have worked with when doing volume testing for business database systems, there have always been questions about the optimal "on balance" settings for the database configuration, depending on expected use. This is for desktop use rather than IT-team-run corporate databases. The compromise between perceived instant performance for a relatively small number of records and the needs when working with millions of records has been a continuous point of discussion. No matter what the tuners did, one ended up with a perceptible fall-off in "speed" at around 40k to 50k records. Bear in mind these would normally be quite small records in terms of data size, so as time passed and system memory options grew, the memory limit effect and the need for virtual memory swapping could be eliminated; but although the processes were faster, user perception simply adjusted to the improved performance and still perceived a 'slow' result once more than 50k records were selected.

In effect, as I understood the discussions, these were mainly technical limitations that were part of the way the selected database tool worked. One could always find a faster tool, but at a cost. Work with a server-based tool and everything could be much faster (using a well-configured server cluster). However, there was a price attached to that - usually a very large price, especially for the resources required to run the more complex server database hardware and configuration needed to deliver the performance.

That is one of the reasons I have stayed with sessions - another being that most of my shoots are more logically 'sessions' as shot. One of the things I was not comfortable with in the early days of LR was the need for everything to be in a catalogue. However I think Adobe are, more often than not, pretty cute at managing user perception of performance wherever they can and one has to give them credit for that.

I'm not sure how other applications work at that level. I have often read that Aperture users loved the DAM speed available, but one has to remember that it was a single-platform application created by developers with access to all of the resources and knowledge in the heart of Apple and, presumably, the opportunity to have things enhanced for their own purposes; it would have been rather disappointing if they had not used that to their advantage in what was, for them, a primary target marketplace of users.

Today's challenge is that the sensors in newer cameras continue to produce ever larger files (RAW or JPEG) and so absorb whatever hardware-based performance increases we may invest in. Running to stand still, as it often seems.

Keeping the settings to what we need for everyday use, rather than everything that is possible, seems to make sense for things like filters and Smart Albums when using large catalogues. It's much less of a perceived problem in sessions, which are typically much smaller and well below that 40k to 50k record boundary.


Just my thoughts.


Grant
SFA
 
Posts: 6953
Joined: Tue Dec 20, 2011 9:32 pm

Re: Any tips for speeding up large catalog?

Postby Eric Nepean » Sat Oct 19, 2019 5:16 pm

One of the CO inefficiencies I have observed is that the filter tool groups, counts and indexes EVERY Metadata field, whether Filter Tool is configured to show that Metadata field or not.

In my case I had the original import date/time and file name saved as text into two otherwise unused Metadata fields, Jobid and Instructions.

Both of these were unique for every file.
The Filter tool was not configured to show/select/filter these fields, and there were no Smart Albums or searches referring to them.

Nevertheless, there was an enormous slowdown whenever Filter Tool was active in All Images which disappeared only when I emptied both those Metadata fields.

Over several tickets/consultations regarding incredibly slow operation, CO support didn’t appear to be aware of this effect and was unable to help me, suggesting for example that I purchase a Mac Pro because all other Macs were supposedly underpowered.

I discovered the cause and solution on my own and brought it to their attention. Support agreed that it was a fruitless waste of users' time and CPU cycles for the Filter Tool to index/group/count unused Metadata fields with unique data, but engineering did not act on it.

I also observe that during the period when the Filter Tool is reindexing, only one CPU core is in use. It is bad design to use only one core for a bottleneck task. It is feasible to use multiple cores: there are a number of sort algorithms based on sorting sublists and then merging them, and the sublists could be assigned to separate cores.

The Filter Tool repeats its indexing and counting operation after every event that could change the list of displayed images, which is quite a number of events, including every user keystroke in the search window.

It seems that CO has forgone a competitive advantage (speed) through short-sighted triaging of development tasks.
Eric Nepean
 
Posts: 611
Joined: Sat Oct 25, 2014 8:02 am
Location: Ottawa

Re: Any tips for speeding up large catalog?

Postby IanL » Tue Oct 29, 2019 4:01 pm

OK, this is interesting information. I will have to try playing with the Filter tool to see if it helps my problem with the All Images album.

Eric Nepean wrote:I discovered the cause and solution on my own and brought it to their attention. Support agreed that it was a fruitless waste of users' time and CPU cycles for the Filter Tool to index/group/count unused Metadata fields with unique data, but engineering did not act on it.


Actually, why is it re-indexing at all? Just do it on import. Unless you have another program changing metadata, there is no need to be doing that work at all. If you do have an outside program changing metadata, let me update it manually or *choose* to have the re-indexing done all the time. In fact, isn't there an option for this already? Is it just a case of not respecting the setting?
IanL
 
Posts: 285
Joined: Sat Oct 21, 2017 12:24 am
Location: Ontario, Canada

Re: Any tips for speeding up large catalog?

Postby Eric Nepean » Tue Oct 29, 2019 6:14 pm

IanL wrote:OK, this is interesting information. I will have to try playing with the Filter tool to see if it helps my problem with the All Images album.

Eric Nepean wrote:I discovered the cause and solution on my own and brought it to their attention. Support agreed that it was a fruitless waste of users' time and CPU cycles for the Filter Tool to index/group/count unused Metadata fields with unique data, but engineering did not act on it.


Actually, why is it re-indexing at all? Just do it on import. Unless you have another program changing metadata, there is no need to be doing that work at all. If you do have an outside program changing metadata, let me update it manually or *choose* to have the re-indexing done all the time. In fact, isn't there an option for this already? Is it just a case of not respecting the setting?

The Capture One user may change the IPTC metadata, rating or color tag on one or more variants.
The scope of the Filter Tool is the current collection; in a user collection, the Capture One user may add or remove images or create additional variants.

That said, I believe it is reindexing too often.
Eric Nepean
 
Posts: 611
Joined: Sat Oct 25, 2014 8:02 am
Location: Ottawa

Re: Any tips for speeding up large catalog?

Postby Skids » Wed Oct 30, 2019 8:46 am

As someone who is struggling with Capture One, I find this thread both interesting and worrying. The interesting part is the confirmation that the database is not very well designed. The worrying part is the attitude of CO support. I have had a similar experience of being told my hardware is not good enough, to which I responded that their database was at least seven times slower than Adobe's and prone to self-destruction. The next suggestion was a complete removal and re-install, which achieved nothing.

It seems that the filtering is causing many of the speed issues, especially if this re-indexing means thousands of file read operations. I read in another post made by Eric that there are, or possibly were, issues with catalogues where large numbers of unique metadata values were being used. It begs the question of what sort of testing CO conducts before releasing their software.

best wishes
Simon
Skids
 
Posts: 20
Joined: Wed Apr 22, 2015 2:38 pm

Re: Any tips for speeding up large catalog?

Postby SFA » Wed Oct 30, 2019 1:32 pm

Skids wrote:As someone who is struggling with Capture One, I find this thread both interesting and worrying. The interesting part is the confirmation that the database is not very well designed. The worrying part is the attitude of CO support. I have had a similar experience of being told my hardware is not good enough, to which I responded that their database was at least seven times slower than Adobe's and prone to self-destruction. The next suggestion was a complete removal and re-install, which achieved nothing.

It seems that the filtering is causing many of the speed issues, especially if this re-indexing means thousands of file read operations. I read in another post made by Eric that there are, or possibly were, issues with catalogues where large numbers of unique metadata values were being used. It begs the question of what sort of testing CO conducts before releasing their software.

best wishes
Simon


Simon,

Sorting things by unique reference where possible and in some way useful, is always a good idea.

Indexing a lot of unique identifiers is usually pointless for filtering though useful for searching. But the designers can only offer the facilities and not necessarily dictate how they are used. At least not in an understandable way for a consumer/desktop product. Most users, I suspect, will have no interest in the details of how to run their systems and probably do not populate any of the possible key search fields that are available to them. If left empty they are relatively easy to deal with.

And of course the ultimate source for the base data on which to build the filter is text in an xml file - a compromise, perhaps, for flexibility and portability and to support things like the Session concept that was the basis of C1 for many years.


Grant
SFA
 
Posts: 6953
Joined: Tue Dec 20, 2011 9:32 pm

Re: Any tips for speeding up large catalog?

Postby Skids » Wed Oct 30, 2019 11:57 pm

SFA wrote:Sorting things by unique reference where possible and in some way useful, is always a good idea.

Indeed sorting data can be very useful but I believe that Eric was referring to unnecessary rebuilds of database indices when the data was not being used and had not changed.

SFA wrote:Indexing a lot of unique identifiers is usually pointless for filtering though useful for searching. But the designers can only offer the facilities and not necessarily dictate how they are used. At least not in an understandable way for a consumer/desktop product. Most users, I suspect, will have no interest in the details of how to run their systems and probably do not populate any of the possible key search fields that are available to them. If left empty they are relatively easy to deal with.


I don't understand the distinction you are making between filtering and searching, as I see them as the same. As to the design, the C1P database provides storage for the standard metadata, yet when Eric used it the result was a significant slowdown. This observation implies that some time-consuming internal database action is being triggered when it is not needed.

SFA wrote:And of course the ultimate source for the base data on which to build the filter is text in an xml file - a compromise, perhaps,

I believe that the XMP sidecars were designed by Adobe to facilitate the exchange of metadata about raw files between image-processing tools. Ideally the XMP data should be read once and stored inside the database, where it can be updated and managed as required.

I think that, in the first instance, an application should strive to protect its users from making errors, and if and when problems arise, the software house should provide timely support to help rectify the error.

best wishes

Simon
Skids
 
Posts: 20
Joined: Wed Apr 22, 2015 2:38 pm

Re: Any tips for speeding up large catalog?

Postby SFA » Thu Oct 31, 2019 4:46 am

SFA wrote:Sorting things by unique reference where possible and in some way useful, is always a good idea.

Indeed sorting data can be very useful but I believe that Eric was referring to unnecessary rebuilds of database indices when the data was not being used and had not changed.

SFA wrote:Indexing a lot of unique identifiers is usually pointless for filtering though useful for searching. But the designers can only offer the facilities and not necessarily dictate how they are used. At least not in an understandable way for a consumer/desktop product. Most users, I suspect, will have no interest in the details of how to run their systems and probably do not populate any of the possible key search fields that are available to them. If left empty they are relatively easy to deal with.


I don't understand the distinction you are making between filtering and searching, as I see them as the same. As to the design, the C1P database provides storage for the standard metadata, yet when Eric used it the result was a significant slowdown. This observation implies that some time-consuming internal database action is being triggered when it is not needed.

SFA wrote:And of course the ultimate source for the base data on which to build the filter is text in an xml file - a compromise, perhaps,

I believe that the XMP sidecars were designed by Adobe to facilitate the exchange of metadata about raw files between image-processing tools. Ideally the XMP data should be read once and stored inside the database, where it can be updated and managed as required.

I think that, in the first instance, an application should strive to protect its users from making errors, and if and when problems arise, the software house should provide timely support to help rectify the error.

best wishes

Simon


(It gets difficult to deal with nested comments here since the degree of nesting is limited.)

Point one.

This would be true of course but can be a little trickier with dynamic activities.

Using a catalogue there are, in theory, checkpoints at which indexes can be created: for example, some aspects of loading, the concept of drag and drop, and the synchronisation of a folder or set of folders, or whatever is allowed.

However, there is also the possibility of active metadata synchronisation to cope with, and whatever might be happening with external editing. And for any single image there can be multiple associations and groupings to deal with, plus multiple variants with different metadata entries for user-variable data.

If one could provide some sort of control panel for people to indicate, just for example, which metadata fields they want to actively use with instant answers, it would be one way to reduce the overhead - unless people did not understand the relevance and simply ticked everything anyway.

However, the strange thing is that whilst some users seem to suffer greatly, as reported, others seem not to suffer at all (based on self-reported comments). Assuming the reality for each is as reported - why would things be so different from one user to the next?

Because C1 does not force users to store the source image files in the database, it needs to check on opening which files it can "see". And to some extent that also implies making a decision about the numbers reported as "available" (depending perhaps on one's view of these things - the only certainty is that different people will have different opinions).


Point 2 - Filtering and Searching

There are various metadata fields populated with known values, and these can be aggregated and a count kept. The count can vary when a filter is applied. The filter may have multiple rules and affect the count result in different ways.

However in C1 terminology there are pre-assumed filters related to certain metadata fields that are assumed to exist and are listed in the Filter Tool information with counts provided.

By default not all of the fields are displayed but it's easy to add them for display in the tool.


On the other hand, Smart Albums (for example) are interactive searches, as defined, which may also have a filter or combination of filters applied. The filters can be embedded and so used as part of the search. But filters (presumably different ones) can also be applied after the search to further refine the selection if required.

This entire area of terminology suffers from industry inconsistency, depending on whether one thinks of a Filter as something that EXCLUDES or something that INCLUDES.


Point 3. XML Sidecars

I'm not thinking of XMP files here.

C1 started off without Catalogues but with Sessions.

I like it for that reason and still use sessions. To me they make more sense.

In a session one has a "database file", but it is not the full DAM functionality that might be expected from a catalogue-based design. However, the underpinnings of both approaches seem to be quite similar and up to a point can share information easily enough when required.

In a session each image will have an XML-based sidecar file that contains not only the metadata that is not embedded in the source file but all of the edit data as well.

In terms of keeping things at a level of easily accessible commonality, that seems to work pretty well. Obviously there are pros and cons to each approach, but the point is that it simplifies interoperability between sessions and catalogues and allows for backwards compatibility.

Exchanging data with other applications via XMP files is a separate issue although it may well have some further influence on design and performance.

The other factor when comparing performance between two applications (other than allowing for the possibly different approaches taken in design decisions) is whether the perceived performance differences are real, or whether an application doing things a different way might make some aspects appear faster but others slower.


But that is probably a discussion for a different time.


I agree with your last sentence in principle, but as things become more complex it may become more difficult (and costly) to protect users from their own actions all of the time - and likewise from the vendors of software and hardware on which the successful use of an application seems to depend. It can be very expensive and time-consuming to attempt to achieve the ultimate level of user protection. Even giants of the industry with almost unlimited resources seem to struggle.

How much are we prepared to spend to obtain such flaw-free code, and how long will buyers be patient waiting for their shiny new functionality?


Are we prepared to pay for premium support?


Grant
SFA
 
Posts: 6953
Joined: Tue Dec 20, 2011 9:32 pm

Re: Any tips for speeding up large catalog?

Postby Eric Nepean » Thu Oct 31, 2019 3:57 pm

Some comments to points raised above.

Difference between filtering and searching

The Filter Tool, for each of the Metadata fields which are enabled, shows a list of all the available values of that field. Each value has a count and a "radio" button to select it.

Consider a large collection of say 2000 images.

If there are 5 possible values or even 20 possible values of a Metadata field (e.g. Category) Filter Tool's list is quite useful to the user.

If the Category field has a unique value for each image, then the Filter Tool's list has 2000 items, which is many pages long.

Such a long list is not useful because a particular item is difficult to find, and flipping through the pages is slow and time-consuming. Further, you now have to know in advance the value you are looking for.

In this case the Search tool is a better choice.

The Search tool provides no list of values, but it does show, as you type the entry, the variants that are selected so far.

The Problem

The Filter Tool by default has only a few Metadata fields enabled; there are almost 50 metadata fields IIRC. Any or all of them can be enabled or disabled.

However, I observed by direct experiment that the Filter Tool does the sorting operation needed to create the list of items on every Metadata field, including those not shown.

In the case of a large number of variants, and a Metadata field with unique values for most, the user can configure the Filter Tool not to show the list. But the Filter Tool still does the very tedious, time-consuming (and unnecessary) work of creating the list in the background.

Consider a list of 10000 unique values (but you don’t know they are unique). You want to make a list of unique values. Here’s how it goes:

To start the result list is empty, you add the first item from the input list.
Then you check the second item in the input against the one item of the result, and add it to the result.
Then you check the third item in the input against the two items of the result, and add it to the result.
Then you check the fourth item in the input against the three items of the result, and add it to the result.
......
......
Then you check the 10,000th item in the input against the 9999 items of the result, and add it to the result.

About 50,000,000 comparisons are required. That’s a big number, even for a computer.
Eric Nepean
 
Posts: 611
Joined: Sat Oct 25, 2014 8:02 am
Location: Ottawa

Re: Any tips for speeding up large catalog?

Postby mli20 » Fri Nov 01, 2019 11:07 am

Skids wrote:...
the database is not very well designed.
...


Sadly this is reflected in much of the DAM functionality of Capture One.

Earlier this year I did an in-depth analysis of keywording, which can be found here:

https://forum.phaseone.com/En/viewtopic.php?f=52&t=29890

I believe a similar analysis begs to be done for filtering and searching in Capture One, and that it will come to the same conclusion: without a complete database redesign to 2019 standards, the DAM functionality in Capture One will never reach the professionalism and maturity we can reasonably expect from "Pro"-designated software.

It appears that keywording will be a prominent feature of DxO PhotoLab 3. Let's see what they come up with. If DxO can do it, surely P1 can as well.
mli20
 
Posts: 317
Joined: Wed May 01, 2013 5:07 pm

Re: Any tips for speeding up large catalog?

Postby NNN634221267024107395 » Tue Nov 05, 2019 1:00 pm

Sadly there are no claims in the release notes for the beta version of CO20 mentioning any improvements to catalog speed. I find the performance of Capture One unusable if I want to filter the entire catalog (about 50,000 images). I cannot recall this ever being a problem in Lightroom.

Very disappointing.
NNN634221267024107395
 
Posts: 9
Joined: Fri Oct 08, 2010 10:24 am

