We've been running two legs of OpenMQ 5.0 successfully for, heck, years. It's been really hands off.
We have 2 legs in a cluster.
We had to restart one of the machines, and when OpenMQ stopped, the other leg went in to a recovery mode to take over the loss of the other leg.
The problem was that this was taking absolutely forever. What we didn't at the time realize, was that the MQ was performing some SELECTs on the backing PostGres DB.
PostGres does not return empty pages back to the database after all of the rows are deleted from them, so anything that requires a table scan (in this case I think it was doing a SELECT DISTINCT on a session id or something), will cause the DB to load all of the empty pages.
We have 3 rather large tables:
mqconstate41cmqcluster -- 6GB
mqmsg41cmqcluster -- 6GB
mqtxn41cmqcluster -- 32GB
These tables had few rows (< 1000), but large amounts of dead space.
When we discovered this, we took some down time, and used the PostGres Vacuum facility to recover disk pages and such. We did that a week ago Sunday. We thought these table sizes were anomalies.
However, we had a different failure, ran out of connections on the database which caused a failover, and we see that these tables have exploded again.
Now the first two have few rows -- ~2000
The final one, has over 21M rows.
And our recovery are taking forever and shutting everything down.
Now, sure, we do a lot of JMS transactions, but even though our queues and such are persistent, we keep up, our queue depths are usually very small (< 5) but that 21M number is pretty crazy. All of these numbers are pretty crazy.
I'm tempted to TRUNCATE all of these tables and restart, not particularly caring about any in flight messages and what not at the moment. Just so I can restart.
Why are these tables so large? They literally exploded over a week, I thought the sizes were anomalies but I guess not.
What would happen if I truncated those tables and restarted? I've dredged through the source code, but it's quite difficult to find out what's being done where to these tables.