Re: Enormous transaction tables in Version: 5.0 (Build 14-e)

Amy Kang
 

> will there be instructions posted on how to build OpenMQ from github?

OpenMQ can be built with "mvn install" after a git clone from github.com. For example, in my case (directly connected to the internet, no proxy settings, and no .m2/settings.xml file):
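(The clone URL below is my assumption of where the repository lives; adjust it if the project has moved.)

# repository location assumed, not taken from the original post
git clone https://github.com/javaee/openmq.git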

cd openmq/mq
mvn install

it builds successfully.  There is a mq/README file.

Regards,
Amy

On 9/29/17, 3:04 PM, Will Hartung wrote:
For the time being, we have fully vacuumed the database, and literally have a simple shell "while" loop running the normal vacuum in a continuous cycle (a rough sketch of that kind of loop follows the table sizes below). The vacuums take about 5s now. The three largest tables:

mqtxn41cmqcluster 14MB total space
mqmsg41cmqcluster 50MB total space
mqconstate41cmqcluster 19MB total space.
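
A minimal sketch of that kind of loop, assuming a database named "mqdb" (a placeholder) and an arbitrary pause between passes:

# continuously vacuum the three MQ tables; "mqdb" and the 60s pause are placeholders
while true; do
  for t in mqtxn41cmqcluster mqmsg41cmqcluster mqconstate41cmqcluster; do
    psql -d mqdb -c "VACUUM $t"
  done
  sleep 60
done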

We're going to research the autovacuum settings so that we can hopefully use those.
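
For what it's worth, autovacuum can also be tuned per table with storage parameters; a sketch, where the database name "mqdb" and the numbers are placeholders rather than recommendations:

# trigger autovacuum after a fixed number of dead rows instead of a fraction of the table
psql -d mqdb -c "ALTER TABLE mqtxn41cmqcluster SET (autovacuum_vacuum_scale_factor = 0.0, autovacuum_vacuum_threshold = 1000)"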

I still think we may consider MySQL for this, however. I just loathe the idea of having to introduce another database, especially one we're not that familiar with.

We peak at around 60 messages/sec across both legs.

The broker legs share the same machine with the application, so each application only "talks" to the local broker (rather than random or round robin). We figure if the app machine goes down, we don't really need the "other" JMS leg anyway, as long as the messages will be delivered to the live leg. Seems to work for us. More for reliability than anything else.

Also, will there be instructions posted on how to build OpenMQ from GitHub? Right now there's a dependency on some glassfish plugin for Maven, but no information on where that is (and it's apparently not in the public repos). I can't even load the source code (easily) in NetBeans.

Thanks again for your help, Amy.

P.S. Also, thanks for having the GitHub source trunk build cleanly with a simple mvn command. All good stuff.



On Thu, Sep 28, 2017 at 8:46 AM, Amy Kang <amy.kang@...> wrote:
PS.  MQ enhanced cluster is not tested with Postgres, so you are basically running an untested configuration.


On 9/28/17, 8:41 AM, Amy Kang wrote:
Will,

>Can we pretty much just download it and use the config files that we already have?

Yes. However, as with any upgrade, you should test it in a staging area first.

Regards,
amy

On 9/27/17, 12:00 PM, Will Hartung wrote:
Hi Amy,

Yeah, it's most definitely the Postgres vacuum facility. We're in the process of trying to get these tables more aggressively vacuumed. We're specifically looking at:

mqmsg41cmqcluster
mqconstate41cmqcluster
mqtxn41cmqcluster

We had the recovery happen again; the table soared to 32GB of empty space over a week. This time, our DB had run out of connections, and when OpenMQ tried for one more it got an error, reported that the storage system was failing, and started transferring to the other broker, which stalled the other leg and everything collapsed.

So I'm looking at pinning the connection pool to a fixed size, so it gets all the connections it needs up front and zealously hangs on to them.
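
If I'm reading the broker configuration right, the JDBC pool bounds live in the broker's config.properties; the property names below are my assumption, so verify them against the admin guide before relying on them:

# assumed property names for the broker's JDBC connection pool; values are placeholders
imq.persist.jdbc.min_connections=20
imq.persist.jdbc.max_connections=20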

We use an enhanced cluster.

Is the upgrade straightforward? Can we pretty much just download it and use the config files that we already have?

Thanks Amy!



On Tue, Sep 26, 2017 at 1:09 PM, Amy Kang <amy.kang@...> wrote:
Hello Will,

On 9/25/17, 6:07 PM, Will Hartung wrote:
We've been running two legs of OpenMQ 5.0 successfully for, heck, years. It's been really hands off.

It almost sounds like your recent issue described below had something to do with Postgres behavior regarding deleted rows / "empty pages", especially since you mentioned below that the "tables had few rows, but large amounts of dead space" and that the Postgres vacuum facility was run because of that.

You mentioned below the unexpectedly large size of the 3rd table, mqtxn41cmqcluster. This table records all transactions, and the broker lazily and periodically purges completed (committed/rolled back) transactions. Postgres's handling of row deletions and empty-space compaction is likely a helpful area to look at. You can also run 'imqcmd list txn' (example below) to see how many transactions are in the broker(s). Are you using an MQ cluster? If yes, enhanced or regular? Since the data in these tables is internal to the MQ broker(s), it shouldn't be arbitrarily truncated.
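
For example, assuming the default broker port and admin user:

# list transactions known to the broker; host, port, and user are the usual defaults
imqcmd list txn -b localhost:7676 -u admin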

By the way, many bugs have been fixed since 5.0; please upgrade to the latest MQ 5.1.1 release.

Thanks,
amy


We have 2 legs in a cluster.

We had to restart one of the machines, and when OpenMQ stopped, the other leg went into a recovery mode to take over for the lost leg.

The problem was that this was taking absolutely forever. What we didn't realize at the time was that MQ was performing some SELECTs on the backing Postgres DB.

Postgres does not return empty pages to the operating system after all of the rows are deleted from them, so anything that requires a table scan (in this case I think it was doing a SELECT DISTINCT on a session ID or something) will cause the DB to load all of the empty pages.

We have 3 rather large tables:
mqconstate41cmqcluster -- 6GB
mqmsg41cmqcluster -- 6GB
mqtxn41cmqcluster -- 32GB

These tables had few rows (< 1000), but large amounts of dead space.
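
A quick way to confirm that, assuming the database is named "mqdb" (a placeholder), is to compare live rows, dead rows, and on-disk size:

# show live rows, dead rows, and total size for the MQ tables
psql -d mqdb -c "SELECT relname, n_live_tup, n_dead_tup, pg_size_pretty(pg_total_relation_size(relid)) FROM pg_stat_user_tables WHERE relname LIKE 'mq%'"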

When we discovered this, we took some downtime and used the Postgres vacuum facility to recover disk pages and such. We did that a week ago Sunday. We thought these table sizes were anomalies.
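
For reference, plain VACUUM only marks the space as reusable; the variant that actually shrinks the files on disk (and needs an exclusive lock, hence the downtime) would be something like the following, with "mqdb" as a placeholder database name:

# rewrite the table and return free space to the OS; takes an exclusive lock on the table
psql -d mqdb -c "VACUUM FULL mqtxn41cmqcluster"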

However, we had a different failure -- we ran out of connections on the database, which caused a failover -- and we see that these tables have exploded again.

Now the first two have few rows -- ~2000.
The final one has over 21M rows.

And our recoveries are taking forever and shutting everything down.

Now, sure, we do a lot of JMS transactions, but even though our queues and such are persistent, we keep up -- our queue depths are usually very small (< 5). But that 21M number is pretty crazy. All of these numbers are pretty crazy.

I'm tempted to TRUNCATE all of these tables and restart, not particularly caring about any in-flight messages and whatnot at the moment, just so I can restart.

Why are these tables so large? They literally exploded over a week, I thought the sizes were anomalies but I guess not.

What would happen if I truncated those tables and restarted? I've dredged through the source code, but it's quite difficult to find out what's being done where to these tables.

Thanks!