1. We recently upgraded to Ariba 91 (Buyer and Sourcing) SP20.
2. In our setup, we have 4 physical WINTEL (Windows 2008 R2) application servers running WebLogic 10 MP1. We have 8 Buyer nodes and 2 Sourcing nodes. There are also 2 catalogsearch cum UI nodes.
3. For the past few weeks we have been encountering this behaviour:
a. Whenever we try to upload and activate catalogs, one of the catalogsearch nodes eventually becomes unresponsive, i.e. it cannot be shut down from the WebLogic console or from the command prompt. Any UI node on the same server as the catalogsearch node also becomes unresponsive. We are also unable to kill the running java.exe process(es). The only recourse is to reboot the physical server.
b. After the reboot, the catalogs are shown as ACTIVATED (before the reboot their status is ACTIVATING). If we try to upload/activate catalogs after the reboot, it usually works until, some days later, we hit the same issue again.
c. We turned on debugging (perflog.trace, perflog.exception, cataloglow) and looked at database lock activity; neither turned up anything.
d. The strange thing is that before we went live on 1 Apr 13, we were able to load and activate a few hundred catalog files through batch load without encountering any node failure.
4. Has anyone encountered this before, or does anyone have any thoughts as to what may have gone wrong?