Ticket #4438 (closed Bug: fixed)
We need a script to rebuild the catalog from scratch
| Reported by: | Anonymous User | Owned by: | alecm |
|---|---|---|---|
| Priority: | major | Milestone: | 2.5 |
| Component: | Infrastructure | Keywords: | |
| Cc: | | | |
Description (last modified by hannosch) (diff)
After a successful migration, it seems that every object from the portal root and portal_skins/custom is displayed in the folder's contents. Perhaps this is a problem with getFolderContents?
Change History
comment:2 Changed 7 years ago by geoff
I've seen this too when I blow away the catalog and reindex everything. I think it would be nice (say for 2.1.x) if we added a method that would let you reindex all content types only.
comment:3 Changed 7 years ago by Anonymous User
Ah, the catalog-based folder_contents and folder_listing completely slipped my mind. The site is quite old and has been migrated and screwed with many times. I am sure at some point the catalog was totally rebuilt. An easy method to make it sane again would be nice. What if when querying the catalog, getFolderContents (or whatever is used) only asked for types registered with portal_types? Or would this be too expensive?
comment:4 Changed 7 years ago by alecm
We do at least one catalog-based listing on essentially every Plone page by default. Adding a portal_types lookup to each of those calls wouldn't be acceptable, I think, especially as it's only necessary when the catalog has been improperly maintained. Geoff's suggestion is a good one, but there are some problem types which, though they are in portal_types, aren't really cataloged content (criteria are the first things that come to mind, but there are other types which may not want to be cataloged in portal_catalog by default).
The only way to be sure is to rely on the reindexObject methods of the instances themselves, and walking the tree and doing obj.reindexObject() would be no fun really. Not for 2.1 though, maybe we can add such a helper later.
comment:5 Changed 7 years ago by geoff
Well, it's not like you'd want to rebuild the catalog frequently, but every now and then you really mess things up and it's necessary. The reindexObject method call thing is not that bad when you consider that building the catalog from scratch basically entails walking the tree and calling indexObject.
comment:6 Changed 7 years ago by alecm
I agree, we should have such a method, and a big button for it in the ZMI which clearly states it's the proper way to rebuild. But I'm not going to be writing it before 2.1 comes out. :)
comment:7 Changed 7 years ago by bhirsch
What would be entailed in doing this? Clearing the catalog, then walking through the site and calling reindexObject() on each object registered in portal_types? Is there a general rule which would avoid things like Criteria objects?
comment:8 Changed 7 years ago by geoff
Use the ZopeFindAndApply method from ~Zope/lib/python/Products/ZCatalog/ZCatalog.py (IIRC) to walk the ZODB. For each object, do something like
    if shasattr(obj, 'reindexObject'):
        obj.reindexObject()
where shasattr is found in Archetypes/utils.py
comment:9 Changed 7 years ago by alecm
Use ZopeFind to find the objects in portal_types (though doing all objects should be safe, just very slightly slower). Iterate through and see if each object has a callable reindexObject attribute; if so, call it. Special types that shouldn't be cataloged either won't implement reindexObject or will override it to do nothing.
sketch:
    objs = ZopeFind()
    for obj in objs:
        if getattr(aq_base(obj), 'reindexObject', None) is not None:
            if safe_callable(obj.reindexObject):
                obj.reindexObject()
Of course such a thing in the plone core would need a few unit tests to go with it.
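For readers without a Zope instance at hand, the walk-and-reindex idea discussed in this thread can be sketched in plain Python. This is only an illustration under stated assumptions: `Node`, `Content`, `walk` and `rebuild` are hypothetical stand-ins, the "catalog" is just a list of names, and nothing here is the code that was eventually committed to Plone.

```python
class Node:
    """Hypothetical stand-in for a folderish object (not a real Zope class)."""

    def __init__(self, name, children=()):
        self.name = name
        self._children = list(children)

    def objectValues(self):
        # Mimics the idea of listing contained objects.
        return list(self._children)


class Content(Node):
    """Hypothetical content object that knows how to catalog itself."""

    def __init__(self, name, catalog, children=()):
        super().__init__(name, children)
        self._catalog = catalog

    def reindexObject(self):
        self._catalog.append(self.name)


def walk(obj):
    """Yield obj and all of its descendants, depth first."""
    yield obj
    for child in obj.objectValues():
        yield from walk(child)


def rebuild(root, catalog):
    """Clear the catalog, then let every object that can reindex itself do so."""
    del catalog[:]
    count = 0
    for obj in walk(root):
        reindex = getattr(obj, 'reindexObject', None)
        if callable(reindex):
            reindex()
            count += 1
    return count


# A toy site: the catalog starts out polluted with a non-content entry.
catalog = ['stale_skin_template']
root = Node('portal', [
    Content('front-page', catalog),
    Node('portal_skins', [Node('custom')]),
    Content('news', catalog, [Content('item1', catalog)]),
])
reindexed = rebuild(root, catalog)
```

Objects without a `reindexObject` method (the toy `portal_skins` tree here) are simply skipped, which is the property the thread relies on instead of a portal_types lookup.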
comment:10 Changed 7 years ago by alecm
geoff,
You'll be pleased to know that Plone has its own safe methods now:
safe_hasattr, base_hasattr (using aq_base, which my example should have used), and safe_callable,
all importable into py scripts from CMFPlone.utils IIRC.
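To illustrate why checking the attribute on the unwrapped object matters, here is a toy model of acquisition in plain Python. `Wrapper`, `toy_aq_base` and `toy_base_hasattr` are hypothetical names made up for this sketch; the real helpers live in Acquisition and CMFPlone.utils and their implementations may differ.

```python
_marker = object()


class Wrapper:
    """Toy acquisition wrapper: attributes missing on the base come from the parent."""

    def __init__(self, base, parent):
        self._base = base
        self._parent = parent

    def __getattr__(self, name):
        # Only reached when normal lookup on the wrapper itself fails.
        try:
            return getattr(self._base, name)
        except AttributeError:
            return getattr(self._parent, name)


def toy_aq_base(obj):
    """Strip the toy wrapper, if any."""
    return obj._base if isinstance(obj, Wrapper) else obj


def toy_base_hasattr(obj, name):
    """True only if the unwrapped object itself provides the attribute."""
    return getattr(toy_aq_base(obj), name, _marker) is not _marker


class Folder:
    def reindexObject(self):
        return 'folder reindexed'


class Criterion:
    """Has no reindexObject of its own."""


folder = Folder()
wrapped = Wrapper(Criterion(), folder)

acquired = hasattr(wrapped, 'reindexObject')           # found via the parent
own = toy_base_hasattr(wrapped, 'reindexObject')       # not on the object itself
```

In this model a plain `hasattr` on the wrapped criterion finds the folder's `reindexObject`, so a naive walk would reindex the parent again instead of skipping the child; the base check avoids that.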
comment:11 Changed 6 years ago by limi
+1 for this script to be added to 2.1.x. :)
We can not check for portal_types, that would be too expensive - you shouldn't index stuff that isn't content in the first place (although I know that the Zope UI makes it all too easy to clear the catalog ;)
comment:12 Changed 6 years ago by hannosch
- Component changed from Infrastructure to Catalog
- Description modified (diff)
comment:13 Changed 6 years ago by hannosch
- Priority changed from minor to major
- Milestone changed from 2.1.x to 2.5
- Summary changed from Tools being reported by getFolderContents after 2.0.5->2.1rc{1,2} migration to We need a script to rebuild the catalog from scratch
comment:15 Changed 6 years ago by alecm
- Status changed from new to closed
- Resolution set to fixed
Fixed on trunk -r9310, feel free to backport to 2.1 branch :-)

My guess is that at some point when you had 2.0 installed you told your catalog to recatalog everything in the site, including things which should never have been in the catalog, like skin templates and portal tools. 2.1 now uses the catalog to do these listings for a number of reasons. The downside is that you need to have a sane catalog for this to work. If you can demonstrate that none of these objects were cataloged on your 2.0 site pre-migration, then I will consider this a bug; otherwise, it is not. You may resolve it by clearing your portal_catalog and having it reindex only the true content types.