21 January 2010

How to Use Automatic Failover In an ActiveMQ Network of Brokers

Last week I tested a new feature in ActiveMQ 5.3.0 that supports automatic failover/reconnect in a network of brokers. Besides adding this information to the ActiveMQ book, someone suggested that I also post it on my blog for easier access, so here you go!

Folks familiar with ActiveMQ already know that a network of brokers allows many broker instances to be networked together for massive scalability. Prior to the addition of this feature in ActiveMQ 5.3, if one of the brokers in the network went down, reestablishing a connection with that broker when it came back up was a manual process fraught with difficulty. With support for failover in the network of brokers, any broker in the network can come and go at will without any manual intervention. A very powerful feature, indeed. Although this post is long, the outcome of the testing is well worth it.



The first thing to note is the topology for the network of brokers. I used a network of three brokers named amq1, amq2 and amq3. The attached diagram explains the topology, including the consumers and producers. amq1 and amq2 are standalone with no network connector. amq3 defines a network connector with failover to amq1 and amq2. Consumers exist on amq1 and amq2, and the producer connects to amq3. To start with, I have configured only a uni-directional network connector in amq3. Later I will change the configuration to a bi-directional network connector.

Thanks to the ability to upload any file to Google Docs this week, you can download the configuration files for the three brokers.
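To give you an idea of what is in those files, the important piece is the network connector element in the amq3 configuration, which wraps the failover transport inside a static transport pointing at amq1 and amq2:

<networkConnector name="amq3-nc"
    uri="static:(failover:(tcp://0.0.0.0:61616,tcp://0.0.0.0:61617))"
    dynamicOnly="true"
    networkTTL="3" />

The failover transport inside the static URI is what gives the network connector its automatic reconnect behavior.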

The next thing to do is outline the steps I used to test out this feature. These steps were performed on Mac OS X (Unix) but could easily be adapted for Windoze. Below are those steps:

1) Open six terminal windows as defined below:
1a) Terminal 1 = cd into the amq1 dir
1b) Terminal 2 = cd into the amq2 dir
1c) Terminal 3 = cd into the amq3 dir
1d) Terminal 4 = cd into the amq1/example dir
1e) Terminal 5 = cd into the amq1/example dir
1f) Terminal 6 = cd into the amq1/example dir

2) Terminal 1: start up amq1 (./bin/activemq)
3) Terminal 2: start up amq2 (./bin/activemq)
4) Terminal 3: start up amq3 (./bin/activemq)

Thanks to the configuration of the ActiveMQ logging interceptor, you should see that amq3 makes a network connection to either amq1 or amq2. For the rest of these steps, let's assume that amq3 connected to amq1.

5) Terminal 4: start up a consumer on amq1 (ant consumer -Durl=tcp://0.0.0.0:61616)
6) Terminal 5: start up a consumer on amq2 (ant consumer -Durl=tcp://0.0.0.0:61617)
7) Terminal 6: start up a producer on amq3 (ant producer -Durl=tcp://0.0.0.0:61618)

You should see 2000 messages sent to amq3. The messages should be forwarded to amq1. The consumer connected to amq1 should have received the 2000 messages and shut down.

8) Terminal 1: shut down amq1 (ctrl-c)

Note the logging that shows the failover taking place successfully. Let's test it to see if the demand forwarding bridge actually got started.

9) Terminal 6: start up a producer on amq3 (ant producer -Durl=tcp://0.0.0.0:61618)

You should see 2000 messages sent to amq3. The consumer connected to amq2 receives the 2000 messages and shuts down.

10) Terminal 1: start up amq1 (./bin/activemq)

11) Terminal 2: shut down amq2 (ctrl-c)

Again, the failover took place successfully. Let's continue just a bit further to see if it will continue to failover if I bring up amq1 again.

12) Terminal 4: start up a consumer on amq1 (ant consumer -Durl=tcp://0.0.0.0:61616)

13) Terminal 6: start up a producer on amq3 (ant producer -Durl=tcp://0.0.0.0:61618)

You should see 2000 messages sent to amq3. The consumer connected to amq1 receives the 2000 messages and shuts down.

This proves that the failover transport is supported in a network connector and that it works correctly with a uni-directional network connector. In addition to a uni-directional network connector, I also tested a bi-directional network connector. This only requires a slight change to the configuration of the network connector in amq3. In the amq3 XML configuration file, in the network connector element, add a duplex="true" attribute. Below is the network connector element for amq3 with the change:


<networkConnector name="amq3-nc"
    uri="static:(failover:(tcp://0.0.0.0:61616,tcp://0.0.0.0:61617))"
    dynamicOnly="true"
    networkTTL="3"
    duplex="true" />


With this minor change in configuration, the network connector is now bi-directional. That is, communication between amq3 and whichever broker it connects to is two-way instead of one-way, so messages can be sent in either direction, not just in the direction originating from amq3.

Below are the steps I used to test this specific change:

1) Open five terminal windows as defined below:
1a) Terminal 1 = cd into the amq1 dir
1b) Terminal 2 = cd into the amq2 dir
1c) Terminal 3 = cd into the amq3 dir
1d) Terminal 4 = cd into the amq1/example dir
1e) Terminal 5 = cd into the amq1/example dir

2) Terminal 1: start up amq1 (./bin/activemq)
3) Terminal 2: start up amq2 (./bin/activemq)
4) Terminal 3: start up amq3 (./bin/activemq)

You should see that amq3 makes a network connection to either amq1 or amq2. For the rest of these steps, let's assume that amq3 connected to amq1.

5) Terminal 4: start up a consumer on amq1 (ant consumer -Durl=tcp://0.0.0.0:61616)
6) Terminal 5: start up a producer on amq3 (ant producer -Durl=tcp://0.0.0.0:61618)

You should see 2000 messages sent to amq3. The messages should be forwarded to amq1. The consumer connected to amq1 should receive the 2000 messages and shut down.

Let's test the duplex capability of the network connector in amq3 now. To do this we'll send messages to amq1 and consume those messages from amq3.

7) Terminal 4: start up a consumer on amq3 (ant consumer -Durl=tcp://0.0.0.0:61618)
8) Terminal 5: start up a producer on amq1 (ant producer -Durl=tcp://0.0.0.0:61616)

You should see 2000 messages sent to amq1. The messages should be forwarded to amq3. The consumer connected to amq3 should receive the 2000 messages and shut down. This proves that the duplex feature is working. Now let's cause a failover/reconnect to take place and run through this same set of steps with amq3 and amq2.

9) Terminal 1: shut down amq1 (ctrl-c)

Notice the logging that shows the failover taking place successfully so that amq3 connects to amq2 now.

10) Terminal 4: start up a consumer on amq2 (ant consumer -Durl=tcp://0.0.0.0:61617)
11) Terminal 5: start up a producer on amq3 (ant producer -Durl=tcp://0.0.0.0:61618)

You should see 2000 messages sent to amq3. The messages should be forwarded to amq2. The consumer connected to amq2 should receive the 2000 messages and shut down.

Now let's test the duplex feature in the network connector.

12) Terminal 4: start up a consumer on amq3 (ant consumer -Durl=tcp://0.0.0.0:61618)
13) Terminal 5: start up a producer on amq2 (ant producer -Durl=tcp://0.0.0.0:61617)

You should see 2000 messages sent to amq2. The messages should be forwarded to amq3. The consumer connected to amq3 should receive the 2000 messages and shut down.

This proves that the duplex feature of the network connector works after a failover/reconnect to amq2.

This is a great addition to ActiveMQ that really improves the usability of a network of brokers. I already have some very large clients using this feature successfully, some of which are using a network of over 2000 brokers.

Hopefully these steps are clear enough to follow for your own use. If you need any clarifications, please contact me.

70 comments:

  1. where is the example dir, have you posted it?

    ReplyDelete
  2. I just updated the blog post with a link to the examples. Please refresh the blog post and see the very end for the link to the tarball.

    ReplyDelete
  3. Thanks for posting but unable to download.

    ReplyDelete
  4. My apologies, I guess Google Docs cannot handle a file that big (92 MB). Please send me an email (bruce DOT snyder AT gmail DOT com) and I will transfer the file to you via TransferBigFiles.com.

    ReplyDelete
  5. I have the following use case:
    The brokers amq1, amq2 and amq3 need to be linked to each other such that:
    1) I should be able to send a message to any of the brokers.
    2) Assuming I have only one listener, it should get the messages that were sent to any of the 3 brokers.
    3) Assuming there are 2 consumers, they should get all the messages sent to any of the 3 servers without duplicates (that is, multiple consumers should not consume the same message).

    Could you help in understanding on how to setup such a system?

    ReplyDelete
  6. @Rakesh, for the benefit of the entire ActiveMQ user community, it would be best if you posted your question to the ActiveMQ user mailing list. Information about the ActiveMQ mailing lists can be found here:

    http://activemq.apache.org/mailing-lists.html

    Bruce

    ReplyDelete
  7. hi bruce,
    are the messages you sent persistent or non-persistent?

    ReplyDelete
  8. For this demo, I used non-persistent messages.

    ReplyDelete
  9. Bruce:

    I am new to ActiveMQ and I was struggling with correctly setting up a failover scenario. By looking at your examples I was able to get a network of brokers going and set up failover within it. The only problem I had was with a master/slave scenario. Just wanted to say thanks for the article; it really helped.

    John

    ReplyDelete
  10. @John, I'm glad to hear that this information helped you!

    Bruce

    ReplyDelete
  11. snyder,

    I have the same issue with IBM MQ. Whenever MQ restarts or goes down, my app loses MQ connections. How can I do this with IBM MQ?

    Any help with this is highly appreciated

    ReplyDelete
  12. @Prem, The Spring Framework's DefaultMessageListenerContainer will re-establish connections that have failed when a listener fails. Is this the functionality you are seeking? If so, you would need to use the Spring DefaultMessageListenerContainer to consume messages from IBM WebSphere MQ. This has nothing to do with ActiveMQ.

    Bruce

    ReplyDelete
  13. Bruce,

    Thanks for your quick reply. Your help with this is highly appreciated.

    Our situation is like this:

    Our application is using WAS 6.1, Spring JMS and IBM MQ 6.0, and the code is deployed in WebSphere pointing to a remote queue.

    1) We are using Spring JMS - DefaultMessageListenerContainer and connecting through JNDI, which is defined in WAS 6.1
    2) Queue connection factories and queue connections are defined in WebSphere App Server (WAS 6.1)
    3) IBM MQ is our JMS provider, which is running on a mainframe.

    Whenever there is a queue manager restart or it goes down, our application deployed in WebSphere loses its MQ connections, and currently we restart the app server to regain those MQ connections and start receiving and sending messages.

    I am copying my config for your reference:

    [Spring XML configuration stripped by the comment form; only the bean description "JMS Message listener container waiting on incoming requests to scheduler jobs" and a literal "false" survived.]
    ReplyDelete
  14. JMS Message listener container waiting on incoming requests
    to scheduler jobs.

    [Second attempt at posting the Spring XML configuration; it was stripped by the comment form again, leaving only a literal "false".]

    ReplyDelete
  15. @Prem, I'm willing to bet that your problem is with the configuration of the JndiObjectFactoryBean. I will respond to your email with the specifics.

    Bruce

    ReplyDelete
  16. Hi Bruce,

    Shouldn't you also set conduitSubscriptions to false so that you have failover *and* load-balancing?

    Mathias

    ReplyDelete
  17. @Mathias, You are correct. If you want even load-balancing of messages across all consumers, then you should set conduitSubscriptions=false. Here's more info on using conduitSubscriptions:

    When to use and not use Conduit subscriptions
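    As a minimal sketch (broker names taken from this post), that just means adding the attribute to the network connector element:

    <networkConnector name="amq3-nc"
        uri="static:(failover:(tcp://0.0.0.0:61616,tcp://0.0.0.0:61617))"
        dynamicOnly="true"
        networkTTL="3"
        conduitSubscriptions="false" />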

    Bruce

    ReplyDelete
  18. Hello, I am trying to configure failover and prefetchPolicy in my broker's URI. In this case my project is a web project, and I don't understand how to configure it. See my URI:

    If I use the URI with only prefetchPolicy, it works:
    tcp://localhost:61616?jms.prefetchPolicy.queuePrefetch=1

    Or, if I use only failover, it works too:
    failover:(tcp://localhost:61616)?maxReconnectAttempts=-1&timeout=10000

    But when I try to merge the two configurations, as follows:
    failover:(tcp://localhost:61616?jms.prefetchPolicy.queuePrefetch=1)?maxReconnectAttempts=-1&timeout=10000

    it doesn't work.

    ReplyDelete
  19. @Maikel, I think your problem is with the second question mark. I think it should be an ampersand symbol (&) instead.

    Bruce

    ReplyDelete
  20. In your example, you still have AMQ3 as a point of failure for producers.

    What if you want just 2 brokers running in a networked grid, for full redundancy?

    So:

    PROD1 -> AMQ1 -> CONS1
    PROD2 -> AMQ2 -> CONS2

    But AMQ1 and AMQ2 are connected together so that messages on AMQ1 are picked up by CONS2 too, and vice versa CONS1 can also pick up from AMQ2.

    Would you recommend having 2 network connectors (one on each AMQ instance), or one network connector on one of the instances in duplex mode?

    I am slightly inclined to the first alternative (also to have deployment as identical as possible between instances).

    Thanks.

    ReplyDelete
  21. The advantages of using a duplex network connector include a savings on the number of sockets that are opened by the brokers in the network, a simpler configuration (albeit sometimes kinda confusing - see below) and the ability to traverse a firewall from a hub broker when using a hub and spoke broker topology.

    The disadvantages of the duplex network connector include the fact that it can cause some confusion for folks who are not familiar with the way that the network of brokers is configured. For example, if they only look at the configuration of the broker without a network connector, they may be inclined to think that there is no networked configuration. Comments in the configuration file can alleviate this disadvantage. Also, you must be careful using conduitSubscriptions on a duplex network connector if your consumers are using message selectors. If so, you need to disable conduitSubscriptions so that those selectors will be respected.

    ReplyDelete
  22. OK, so given there isn't an issue with the number of sockets, it would work to have 2 AMQ instances with 2 non-duplex sockets connected to each other?

    I am looking at a simple dual node configuration, with 2 identical producer applications, each with embedded AMQ inside them, and a number of consumers (say 2 just for the sake of the example) picking off messages from these 2 queues. I want the 2 AMQ instances to look as one, and the consumers would load-balance (and failover between themselves) irrespective of which queue they're connected to.

    ReplyDelete
  23. Yes, two brokers, each of which defines a non-duplex network connector pointing at the other, will work just fine.

    You will need to use the failover transport in each consumer, containing the broker URIs for the two brokers. That way, if one broker goes down, the consumer will automatically reconnect to the other broker and continue its job.
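    As a rough sketch (the host names here are illustrative), each broker's config would contain something like:

    <!-- on broker A -->
    <networkConnector name="a-to-b" uri="static:(tcp://hostB:61616)" />

    <!-- on broker B -->
    <networkConnector name="b-to-a" uri="static:(tcp://hostA:61616)" />

    and each consumer would then use a broker URL such as failover:(tcp://hostA:61616,tcp://hostB:61616).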

    ReplyDelete
  24. I am configuring a slightly different scenario (bad char art): Hub1 and Hub2 have simplex connections pointing to each other. AMQ1 and AMQ2 connect to the hubs using duplex connectors, and producers/consumers only connect to AMQ1/2.

    [HUB1] <--> [HUB2]
       ^           ^
       |           |
       v           v
    [AMQ1]      [AMQ2]
       ^           ^
       |           |
    (producer)  (consumer)


    In my attempts I can get it working if AMQ1/2 use load-balance connectors to the Hubs:

    [networkConnector XML stripped by the comment form]
    But each AMQ here statically connects to BOTH hubs. I want to configure a failover connection so that only ONE connection to the hubs is present. However, the following configuration does not work:

    [networkConnector XML stripped by the comment form]
    as the startup process of AMQ got stuck when trying to start the network connector.

    Anything I am doing wrong?

    ReplyDelete
  25. Bruce,

    Thanks for the great post.

    I have set up everything as you wrote, except that I placed advisorySupport="false" in the broker tag of all the amq*.xml files and started all the brokers.

    Then I start the consumer using command:
    ant consumer -Durl=tcp://0.0.0.0:61616
    and finally started producer using this command:
    ant producer -Durl=tcp://0.0.0.0:61617

    All the produced messages then go into a pending state on the broker running on port 61617 and the consumers never get the messages.

    Does the network of brokers stop forwarding messages without advisorySupport enabled? I thought that was fixed in a recent release. (I'm using the ActiveMQ 5.4.2 release.)

    Thanks,
    Anubhava

    ReplyDelete
  26. @Anubhava, When using an ActiveMQ dynamic network of brokers, advisory messages are required in order for the brokers to communicate state to one another. When you disable support for advisory messages in a dynamic broker network, messages will stop flowing. This is not a bug, it is by design.

    If you want to disable advisory messages in a network of brokers and still allow messages to flow properly, you must configure the broker network statically.
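    As a sketch of what a static configuration can look like (the queue name here is illustrative), the destinations to forward can be statically included on the network connector so the brokers do not need advisories to discover demand:

    <networkConnector name="amq1-nc" uri="static:(tcp://0.0.0.0:61617)">
        <staticallyIncludedDestinations>
            <queue physicalName="TEST.FOO" />
        </staticallyIncludedDestinations>
    </networkConnector>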

    For more information, check out the page about networks of brokers.

    Hope that helps.

    Bruce

    ReplyDelete
  27. Hi Bruce,

    Thanks a lot for your reply. I have read the http://activemq.apache.org/networks-of-brokers.html page a few times.

    I thought I was already using a static network of brokers. My broker config is the same as you've provided in this post. One of my networkConnector tags looks like this:

    [networkConnector XML stripped by the comment form]

    So even though the uri here starts with static:, is it not static?

    cheers,
    Anubhava

    ReplyDelete
    Sorry, the XML got wiped out, trying again:

    <networkConnector name="amq2-nc"
        uri="static:(failover:(tcp://0.0.0.0:61616))"
        dynamicOnly="true"
        networkTTL="2"
        duplex="true" />

    Thanks,
    Anubhava

    ReplyDelete
  29. @Anubhava, You are correct, that is a static URI for the network connector.

    I just found this issue that seems to match what you're describing; perhaps it is the same problem:

    https://issues.apache.org/jira/browse/AMQ-2640

    This applies to a situation where the default credentials.properties file is being utilized from the activemq.xml config file. The solution is to make sure to use the default username and password in the network connector configuration.

    Try that out to see if it solves your problem.

    Bruce

    ReplyDelete
  30. Bruce,

    Thanks so much for the quick response. I have actually been having this problem for a long time and have scanned many open/closed AMQ JIRA tickets on this site. However, I tried it again today with my networkConnector like this (I even removed failover):

    <networkConnector name="amq1-nc"
    uri="static:(tcp://localhost:61617)"
    userName="system"
    password="manager"
    />

    But it still didn't forward the messages to the other networked broker when advisorySupport is false.

    cheers,
    Anubhava

    ReplyDelete
  31. @Anubhava, I'm sorry that my suggestions have not helped your situation. My advice at this point is to post a message to the ActiveMQ user mailing list so that Dejan or Gary can have a look at your problem. See the doc about the mailing lists/forums.

    ReplyDelete
  32. Hi Bruce,
    I have a question: can I have a network of brokers as well as Shared File System Master/Slave (http://activemq.apache.org/shared-file-system-master-slave.html) with the same set of brokers (amq1, amq2 and amq3)?
    Right now in your example, the amq3 broker XML file has entries for amq1 and amq2 in its network connector, and the producer and consumer use the URIs of the brokers they are directly connected to. Now, let's say I also configure all of them in shared-file-system master/slave, and then give the set of URIs to clients in a failover transport URI. Is that fine?


    Deepak

    ReplyDelete
  33. @Deepak, Please see my answers below:

    > I have a question, can I have a network of brokers as well as the
    > Shared File System Master Slave (http://activemq.apache.org/shared-
    > file-system-master-slave.html) with the same set of brokers (amq1,
    > amq2 and amq3)?

    In an ActiveMQ master/slave configuration, a slave broker does not fully start up until the master broker fails. You can still include a networked configuration in slave brokers so that, when they do start up, they will use it. But a slave broker is not available until the master fully fails.

    > Like, right now in your example, amq3 broker xml file has the entry
    > about amq1 and amq2 as Network-connector, and producer and
    > consumer are using the URIs as their respective brokers they are
    > directly connected to?

    Yes, that's correct.

    > Now, lets say I do configure all of them in shared-file-system-
    > master-slave also, and then give the set of URI's to clients in failover
    > transport URI, Is that fine?

    Yes, when you configure a master/slave scenario, you must give the clients the URIs to the master as well as the slave using the failover transport. In the event that the master fails and the slave takes over, the failover transport on the client will be able to automatically reconnect to the slave when it comes up.

    Bruce

    ReplyDelete
  34. Thanks Bruce for the detailed info.
    So, as you said, a slave broker does not fully start up until the master broker fails. It means I cannot have master/slave between amq1, 2, 3 with clients connected to all three, right?

    If the above is correct, then I have to have amq 1m, 1s, 2m, 2s, 3m, 3s - six brokers in total, with every m & s pair sharing a file system. Now, every corresponding connected client has to be given a transport URI containing both m & s.
    (a) As in your example, amq3 has entries for amq1 and amq2 in its network connector; how will that look in this case?
    (b) I know a network of brokers is for scalability, and for HA we already have m & s, but let's say both 1m & 1s fail (go down). Is there a way to connect a client of amq1 to another amq m & s pair without manual intervention, through prior configuration?

    ReplyDelete
  35. @Deepak, If you are deploying each broker in a master/slave pair with six brokers in total, then you can certainly have client apps connect to all three. You will want to make sure that each client app uses the failover transport with a list of all the broker URIs.

    One thing that you cannot do is share the data directory between active brokers. The shared filesystem master/slave configuration is meant to share the data directory between a master broker and slave brokers where only one of these brokers is active at any given time. You cannot allow two active brokers to share the same data directory. Doing this will result in very unpredictable behavior, as this is not how ActiveMQ is designed to operate.

    > a) Like in your ex, amq3 has entry about amq1 and amq2 as network
    > connector, how will that be in this case

    In a situation where you are deploying master/slave pairs instead of single brokers, the network connector configuration will need to make use of the failover transport along with the broker URI for the master and the slave. This way, if the master broker fails, the failover transport will automatically try to reconnect to one of the brokerURIs in the list. With the proper failover transport configuration options, when the slave broker becomes the new master, the failover transport will establish a connection to it.
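    A sketch of such a network connector (the host names here are illustrative):

    <networkConnector name="amq3-nc"
        uri="static:(failover:(tcp://master1:61616,tcp://slave1:61616))"
        dynamicOnly="true"
        networkTTL="3" />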

    > (b) I know network of brokers is for scalabilty, for HA, we already have
    > m & s, but lets say both 1m & 1s fail (down), then is there a way to
    > connect client of amq1 to other set of amq- m & s without manual
    > intervention through a prior-config?

    The common solution to this situation is to use the shared filesystem configuration because you can point as many brokers at the same data directory as you like, but only one of those brokers will ever become fully active at a time to be the master. The first broker to grab the lock on the data directory is automatically considered the master broker and all other brokers that are pointed at the same data directory are automatically considered slaves. If the master fails, the first slave to grab the lock on the data directory automatically becomes the new master.
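    As a minimal sketch (the directory path is illustrative), pointing multiple brokers at the same data directory is just a matter of the persistence adapter configuration in each broker's activemq.xml:

    <persistenceAdapter>
        <kahaDB directory="/san/shared/activemq-data" />
    </persistenceAdapter>

    Whichever broker grabs the lock on that directory first becomes the master; the rest wait as slaves.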

    When using such a configuration, it's very common for folks to use some type of a watchdog process (such as daemontools). The purpose of a watchdog process is to automatically restart a broker if it goes down. This minimizes the number of brokers that you need to run pointing at the same data directory. But it's common for folks who are using a watchdog process to still run brokers in pairs so that if the master goes down, the slave will automatically become the new master and the watchdog process will restart the broker that went down allowing it to become the new slave.

    As described above briefly, when using a master/slave configuration, all client apps should utilize the failover transport with a list of broker URIs for all brokers involved (i.e., both master brokers and slave brokers). If the broker to which a client app is connected fails, the failover transport will automatically start trying to connect to other brokers in the list of URIs.

    ReplyDelete
  36. Thanks Bruce for the detailed replies. I will get back to you after I experiment with all the combinations of scenarios I have in mind as stated above.
    I have another question:
    (1) We implement the onMessage() method of MessageListener, which runs in a different thread. The question is whether that thread is blocked until we are done processing the message in onMessage(), or whether there is a set of threads for the next messages so that onMessage() doesn't block its own read?

    Deepak

    ReplyDelete
    @Deepak, the session thread that executes a message listener's onMessage method does so in a serial manner. This means that only one message is handled by the onMessage method at a time. This is a rule from the JMS specification (see section 4.4.14 of the JMS 1.1 spec).

    Bruce

    ReplyDelete
  38. Thanks Bruce. So it means that onMessage() will be blocking its own read.
    Deepak

    ReplyDelete
    @Deepak, A MessageListener does not read messages from the broker (i.e., pull messages). Instead, the broker pushes messages to the client, and the session's dispatch thread invokes the MessageListener's onMessage() method. There is no blocking behavior; it's simply a matter of serial execution in that thread. Again, this is mandated by the JMS spec, it is not ActiveMQ specific.

    Hope that helps.

    Bruce

    ReplyDelete
  40. Hello, I've got a problem with ActiveMQ and maybe you can help me:

    I have a network of brokers topology like this:

    producer-EmbeddedBrokerLeft <--> BrokerBridge <--> EmbeddedBrokerRight-DurableConsumer

    Everything works well at startup; I can see the consumer subscription on the bridge and the left broker.

    The problem: when I disconnect the bridge and start it again (or use a file system master/slave), both left and right brokers resume the connection, however:

    - In the leftBroker, the durable consumer (attached to the rightBroker) is offline (although it's online in the bridge) and the leftBroker doesn't forward the stored messages to the bridge (as from its perspective there is no demand on the bridge)

    ReplyDelete
  41. @Bakhti, how have you defined the bridge? If you're using the jms-to-jms bridge feature in ActiveMQ, I recommend using the ActiveMQ component for Apache Camel instead. The jms-to-jms bridge has not been maintained in a while because the Camel component basically replaced it.

    ReplyDelete
  42. Hi Bruce,

    From your response to Deepak:

    @Deepak, If you are deploying each broker in a master/slave pair with six brokers in total, then you can certainly have client apps connect to all three. You will want to make sure that each client app uses the failover transport with a list of all the broker URIs.


    It's not clear to me what network connections need to be established when I have m1, s1, m2, s2, m3 and s3. Could you please tell me the networkConnectors on each master and slave? Should they be duplex? FYI, I'm trying to set up topology as shown on this blog http://edelsonmedia.com/?p=143

    ReplyDelete
  43. @javalearnerny, What I was referring to in that response was the use of the ActiveMQ failover transport. Using this transport from a JMS client will allow you to specify as many broker URIs as is necessary for your situation via a comma separated list of URIs. The failover transport will then connect to one of those ActiveMQ brokers via its URI. If that broker becomes unreachable (i.e., goes down) then the failover transport will automatically attempt to connect to another URI in the list. With the proper configuration for reconnect delays, retries, etc. on the failover transport, this can alleviate connectivity issues from the client side.

    Regarding your question about whether they should duplex -- this seems to be a confusion between master/slave config and a network of brokers. These are two wholly different concepts whereby the master/slave config is for high availability and the network of brokers is for message broker clustering purposes. The duplex option is available on a network connector when configuring a network of brokers. Whether or not you should use this is not clear to me based on your limited description.

    Regarding your reference to Holly's blog (she is a friend of mine and co-worker at SpringSource/VMware), what she describes in that blog post is a network of brokers where each broker is also using a master/slave config. Based on the comment in the element of the ActiveMQ config, I have to guess that she means that each broker defines its own network connector for its two peer brokers on either side (see the diagram). An example might look like this:

    <networkConnectors>
        <networkConnector name="brokerA"
            uri="static:(failover:(tcp://hostB:61616,tcp://hostD:61616))"
            conduitSubscriptions="true"
            duplex="true"
            dynamicOnly="true"
            networkTTL="3"
            suppressDuplicateQueueSubscriptions="true" />
    </networkConnectors>

    This network connector would be defined in the activemq.xml config file for a broker whose brokerName is brokerA. Notice that it points to both hostB and hostD (where brokerB and brokerD respectively reside) but not to hostC. This is because the diagram shows a ring topology where each broker connects to only the broker on either side of it. Of course, this is a guess and based on assumptions, but hopefully it helps you out nonetheless.

    ReplyDelete
  44. hi Bruce,
    I went through the documentation and this blog several times. Could you please help me with the following:
    We are using ActiveMQ 5.5.1 and have decided to use the KahaDB file store (which is on a SAN) for persistence.
    We have 2 brokers, 1 each on a different server. We are trying to achieve failover as well as load balancing. Could you please explain how this is possible with just one file store and 2 brokers?
    With only one file store (shared file system master/slave), I understand that I can only have one broker active (the master) at any given time, and the slave gets activated only when the master is down.
    With a file store on a SAN, how do I achieve both failover and load balancing with 2 brokers? Could you please guide me?

    Thanks
    Suresh

    ReplyDelete
    @Suresh, First I need to understand what you mean by the term load balancing. Are you referring to the ability to balance the load of messages across multiple consumers? If so, then you just need to configure each consumer to use the failover transport, which allows each consumer to specify the broker URL for both message brokers. Below is an example of this:

    ...
    String brokerUrl = "failover://(tcp://broker1:61616,tcp://broker2:61616)?initialReconnectDelay=100";
    ActiveMQConnectionFactory connectionFactory = new ActiveMQConnectionFactory(brokerUrl);
    Connection connection = connectionFactory.createConnection();
    Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
    ...

    Notice that the brokerUrl uses the failover transport and inside it both broker URLs are specified. This tells the JMS client to connect to the first broker URL but if that one becomes unreachable, automatically reconnect to the next one. This will handle automatic failover between the master broker and the slave broker in the event that the master broker goes down.

    The one thing that you need to know about this scenario is that the master broker will not automatically be restarted. You will need to find a way to bring it back up using some sort of watchdog process. I often recommend using daemontools for this purpose and its supervise utility.
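    For reference, a minimal daemontools setup is just a service directory containing a run script that supervise executes and automatically restarts whenever the process exits. A sketch, assuming ActiveMQ is installed under /opt/activemq (the path and exact start command depend on your installation and ActiveMQ version):

    ```shell
    #!/bin/sh
    # Hypothetical run script at /service/activemq/run.
    # supervise re-executes this script whenever the broker process dies,
    # which brings a crashed master back up without manual intervention.
    exec /opt/activemq/bin/activemq console
    ```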

    ReplyDelete
  46. Hi Bruce,
    Thanks for your quick response.
    I should have asked my question in detail.
    I have 2 brokers, A and B, running on two different servers. I would like Broker A and Broker B to use the same filestore while having a duplex connection between these 2 brokers.
    Enabling a duplex connection will help to distribute the load between the two brokers (store and forward).
    I understand failover can be achieved by clients specifying both A and B when making the call.
    The challenge seems to be how I can have these 2 brokers share the same filestore while both of them are running and active.
    Please let me know.

    ReplyDelete
  47. @Suresh, Your previous question mentioned master/slave which is why I provided the answer I did. What you are now asking is about creating a network of brokers between the two message brokers and having them share the same data store.

    Creating a network of brokers is certainly possible, but two active brokers cannot share the same data store. The reason for this is that the broker state is kept in the data store and if you point two active brokers at the same data store you will get unexpected results. The only way for two brokers to share the same data store is when configuring master/slave capabilities because only one of the brokers is up and running at any given time. In a network of brokers, each broker has its own data store.
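    To make the shared-storage master/slave part concrete: both brokers point their persistence adapter at the same directory, and whichever broker acquires the file lock first becomes the master while the other waits as the slave. A sketch, assuming /san/activemq-data is your shared SAN mount point:

    ```xml
    <!-- Identical in both brokers' activemq.xml;
         /san/activemq-data is an assumed SAN mount point -->
    <persistenceAdapter>
      <kahaDB directory="/san/activemq-data"/>
    </persistenceAdapter>
    ```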

    So you can configure master/slave between two brokers. You can also configure a network of brokers between two different brokers. But these are separate concepts, and based on the way your questions overlap them, I think you're mixing the two together.

    ReplyDelete
  48. hi Bruce,
    great article. So we are implementing slightly different from what you have done.

    We have only one consumer, talking to either amq1 or amq2 to receive messages.
    amq1 is the master and amq2 is the slave, so if amq1 goes down, the consumer successfully connects to amq2 and receives the messages. The only problem is that we want the consumer to connect back to amq1 after amq1 is back up.

    how do we achieve that ?

    What about any messages left over in amq1 or amq2 during failover, will those be processed? Let's say 100 messages are left over in amq1 and failover happens. Now the consumer is connected to amq2; how can we make sure that those 100 messages on amq1 are processed, and vice-versa?

    ReplyDelete
  49. Hi Bruce,
    In the diagram that you have, is it possible to have one consumer that listens to messages from both amq1 and amq2 at the same time?

    ReplyDelete
  50. @Mohan, the way to achieve what you want is by using the updateClusterClients and rebalanceClusterClients features that I outlined in another blog post titled New Features in ActiveMQ 5.4: Automatic Cluster Update and Rebalance.
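    For reference, those features are enabled via attributes on the broker's transportConnector (they require ActiveMQ 5.4 or later). A sketch:

    ```xml
    <!-- In each broker's activemq.xml (ActiveMQ 5.4+) -->
    <transportConnectors>
      <transportConnector name="openwire" uri="tcp://0.0.0.0:61616"
          updateClusterClients="true"
          rebalanceClusterClients="true"/>
    </transportConnectors>
    ```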

    ReplyDelete
    @Sunil, the only way to have a single JMS client listen for messages from two brokers concurrently is to create a custom JMS client that manually makes a connection to each message broker. There is no inherent support for such a thing in JMS.

    ReplyDelete
  52. Hi Bruce,
    Thanks for you previous reply. It was very helpful.

    We are trying to group messages by a group ID from 0 to n, which we set before sending the messages.

    My Spring DefaultMessageListenerContainer configuration is like this (the XML did not survive posting):

    This runs inside a stand alone java application. We have one instance running on each of the n machines.
    With this setting, the listener container on machine 1 processes messages with JMSXGroupID='0', and so on for the other machines.
    It works fine with n=2 in our stage environment, i.e. the message with group ID 0 is processed by the consumer running on machine 1. Somehow, with n=6 in the production environment, the message group ID is not working as expected. For example, a message with a group ID of 3 is getting processed by the listener on machine 5, which means that the JMSXGroupID is somehow not honored by the consumers. Not sure what is going on.

    Thanks in advance,
    Sunil

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete
    2. My spring bean definition xml doesn't show up there. But basically we are using messageSelector property with value JMSXGroupID='n' to ensure that the message with a particular group Id is only processed by a particular listener.

      Thanks,
      Sunil

      Delete
  53. This comment has been removed by the author.

    ReplyDelete
    @Sunil, For what it's worth, to post XML to Blogger, you need to manually convert the < and > characters to the HTML character entities &lt; and &gt;. More information can be found here:

    http://www.w3schools.com/html/html_entities.asp

    At any rate, your question sounds rather complex and ActiveMQ specific. For any complex situations, it always helps to construct a test case that is stripped down to the bare minimum so that it only focuses on the problematic area. Not only is this good to have when asking questions on a mailing list but oftentimes I work through my own problems by doing this.

    Because your question is ActiveMQ specific, I recommend that you post a question to the ActiveMQ user mailing list. You will find many people there that are more up-to-date on ActiveMQ than me. Information about subscribing can be found here:

    http://activemq.apache.org/mailing-lists.html

    ReplyDelete
  55. @Bruce: It was a nice blog. Can you help me with a clustering issue that I face with my AMQ Cluster configuration? I have posted it in the mailing list as below:

    http://activemq.2283324.n4.nabble.com/ActiveMQ-Clustering-Issue-td4655306.html

    I would appreciate if you could give me some ideas as to why my Consumer dies out rather than failing over to the slave?

    ReplyDelete
    Replies
    1. @joesan, It looks like you've gotten help on this issue via the mailing list.

      Delete
  56. Hi!

    Is it possible to set up AMQ failover so that the master is on one PC and the secondary is on another PC?

    If yes, how should I configure?

    Thanks in advance.

    ReplyDelete
    Replies
    1. Sure, it's designed to be easy to set up each broker on a different machine. You simply configure the networkConnectors and transportConnectors to use IP addresses (or hostnames) instead of localhost.
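      A sketch of what that looks like, assuming the two machines are at the hypothetical addresses 192.168.1.10 and 192.168.1.11:

      ```xml
      <!-- On the broker running at 192.168.1.10; both addresses are assumptions -->
      <transportConnectors>
        <transportConnector name="openwire" uri="tcp://192.168.1.10:61616"/>
      </transportConnectors>
      <networkConnectors>
        <networkConnector uri="static:(tcp://192.168.1.11:61616)"/>
      </networkConnectors>
      ```

      The broker on the other machine mirrors this with the addresses swapped.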

      Delete
  57. Hi Bruce,

    We've currently set up ActiveMQ 5.7.0 in a network of brokers configuration with just 2 brokers. BrokerA has the networkConnector block commented out; BrokerB has the networkConnector listing BrokerA as shown below. We are experiencing an issue where queues get "stuck" after some time... meaning, consumers are connected to the brokers but nothing is being consumed off the queue. The queue size just remains stagnant. We've configured our clients' broker URL as failover:(tcp://nybeta01:7550,tcp://njbeta01:7550)?randomize=false&timeout=10000 ... The clients also use Spring JMS and the PooledConnectionFactory provided by ActiveMQ. We've searched numerous forums to see if there was anything else we could try. We've even introduced the conduitSubscriptions=false and enableAudit=false properties in the destination policy, but that didn't seem to work either. Do you have any insight as to what could be the issue?

    ReplyDelete
    Replies
    1. Based on your description, it sounds vaguely like you may be experiencing consumer starvation. Here is a quote describing this problem from the ActiveMQ documentation from FuseSource:

      'If you are using a collection of consumers to distribute the workload (many consumers processing messages from the same queue), you typically want this limit to be small. If one consumer is allowed to accumulate a large number of unacknowledged messages, it could starve the other consumers of messages. Also, if the consumer fails, there would be a large number of messages unavailable for processing until the failed consumer is restored.'

      (Source: http://fusesource.com/docs/broker/5.4/tuning/GenTuning-Consumer-Prefetch.html)

      The solution here is to lower the prefetchLimit to either 0 or 1 (the default is 1000). Lowering the prefetchLimit to such a small number tells ActiveMQ to only allow the consumer to prefetch zero or one messages, which results in faster processing of acks and prevents messages from accumulating on any one consumer.
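      The prefetch limit can be set broker-side with a destination policy, or per-client by appending jms.prefetchPolicy.queuePrefetch=1 to the broker URL. A broker-side sketch applying a queue prefetch of 1 to all queues:

      ```xml
      <!-- Inside the broker element of activemq.xml -->
      <destinationPolicy>
        <policyMap>
          <policyEntries>
            <policyEntry queue=">" queuePrefetch="1"/>
          </policyEntries>
        </policyMap>
      </destinationPolicy>
      ```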

      Delete
  58. Here is the networkConnector configuration -->

    <networkConnectors>
      <networkConnector
          uri="static:(tcp://njbeta01:7550)"
          duplex="true"
          conduitSubscriptions="false"
          dynamicOnly="true"
          networkTTL="2"/>
    </networkConnectors>

    Any suggestions?

    ReplyDelete
    Hi Bruce, Thanks for the last post regarding the prefetchLimit setting. We are using that in our setup and it has helped with the stuck messages. We are implementing another setup in which we have 4 machines (brokers) running as a network of brokers. Can you recommend a topology where consumers/producers can connect to any one of the 4 brokers? We've started off by configuring the networkConnector for each broker to point to its neighbor, as in A -> B -> C -> D -> A (a box pattern), with duplex=true for all the brokers. We're not sure if this configuration would help in the event of a broker failure (shutdown). Are we going in the right direction?

    ReplyDelete
    Are you also utilizing the broker-side failover support (i.e., rebalancing of cluster clients)? There are some features supporting broker-side failover that allow clients to move from one broker to another in the event of a broker failure.

    ReplyDelete
  61. Hi Bruce -- Yes, we are making use of the broker-side failover options as well...do you have any other suggestions as per our current box pattern setup?

    ReplyDelete
    Replies
    1. Because each environment is different, it is difficult for me to say if this is going to work best for you. The only way to know for sure is through comprehensive testing using the exact type of traffic that this architecture will experience in the production environment. Never, ever roll out a system to a production environment with a question in your mind about how the system will perform.

      Delete
    The reconnect mechanism for the JMS consumer is not working

    ReplyDelete
    We are using message-driven beans to process messages and using the container to receive them. When the application server is started, for each deployed message-driven bean, its container keeps a connection to the JMS provider. When the connection is broken (ActiveMQ is stopped), the container is not able to receive messages from the JMS provider and, therefore, is unable to deliver messages to its message-driven bean instances.
    This is the reason we had to restart the application whenever ActiveMQ was restarted.

    ReplyDelete
    Replies
    1. This sounds like a perfect question for the ActiveMQ user mailing list, where many ActiveMQ experts hang out. Information on subscribing is available here:

      http://activemq.apache.org/mailing-lists.html

      Delete