Saturday, March 14, 2015

Creating a Pentaho BI server cluster

Note - This is applicable to Pentaho BI server community edition 5.x only.

Pentaho BI server provides a large set of features that are essential for BI applications. To use it in production, we may need to create a CDA cluster to provide high availability and load balancing.
To create a cluster, we need to configure the BI server instances to use a common data source to store their configuration. I used the following setup for this.



Follow these steps to create the cluster.

  1. Install the MySQL servers and set up master-master replication.
  2. Make sure Oracle Java 7 is installed on all nodes. Using other Java versions will cause runtime errors.
  3. Follow this document to install a CDA instance. Make sure to follow the document named "Install with Your Own BA Repository" and apply the configuration related to MySQL.
  4. Start the server and install all the components needed.
  5. Modify the cluster configuration as described in this document: https://help.pentaho.com/Documentation/5.2/0P0/000/060
    1. This document is missing the information related to Quartz clustering with MySQL; only the PostgreSQL configuration is there. Use the following configuration instead.
      1. #_replace_jobstore_properties
        org.quartz.jobStore.misfireThreshold = 60000
        org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
        org.quartz.jobStore.useProperties = false
        org.quartz.jobStore.dataSource = myDS
        org.quartz.jobStore.tablePrefix = QRTZ5_
        org.quartz.jobStore.isClustered = true
        org.quartz.jobStore.clusterCheckinInterval = 20000
    2. When configuring Jackrabbit clustering, replace the unique ID in <Cluster id="Unique_ID"> with an ID like CDA1.
  6. Recompress the CDA folder and copy it to all other nodes.
  7. Extract the file and replace the unique ID in <Cluster id="Unique_ID"> accordingly, e.g. CDA2, CDA3.
  8. Start each node and make sure there are no error logs in tomcat/logs/pentaho.log.
  9. Make sure all CDA changes are replicated between cluster nodes.
  10. When configuring the ELB, make sure to enable sticky sessions: http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/US_StickySessions.html
  11. Set up alerts to monitor the CDA instances as well as MySQL replication.
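
Steps 5.2 and 7 (giving each node its own Jackrabbit cluster ID) can be scripted instead of edited by hand. The following is only a minimal sketch: the function name set_cluster_id and the sample XML line are made up for illustration, and it assumes the <Cluster> element carries its id as a plain double-quoted attribute, as in the snippet quoted above. It works on the file text with a regular expression, so the rest of the file is left untouched.

```python
import re

# Matches the id attribute of the opening <Cluster ...> tag.
# Group 1 is everything up to the opening quote, group 2 the closing quote.
CLUSTER_ID_RE = re.compile(r'(<Cluster\b[^>]*?\bid=")[^"]*(")')


def set_cluster_id(xml_text: str, node_id: str) -> str:
    """Return xml_text with the first <Cluster id="..."> set to node_id."""
    new_text, count = CLUSTER_ID_RE.subn(
        lambda m: m.group(1) + node_id + m.group(2), xml_text, count=1
    )
    if count == 0:
        # Fail loudly so a node never starts with the placeholder ID.
        raise ValueError('no <Cluster id="..."> element found')
    return new_text


# Illustrative input only; a real Jackrabbit config has more around this tag.
sample = '<Cluster id="Unique_ID" syncDelay="2000">'
print(set_cluster_id(sample, "CDA2"))  # prints <Cluster id="CDA2" syncDelay="2000">
```

On each node you would read the Jackrabbit configuration file, pass its contents through set_cluster_id with that node's ID (CDA2, CDA3, and so on), and write the result back before starting the server.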

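For the replication alert in step 11, one possible approach is to run SHOW SLAVE STATUS\G on each MySQL node (in a master-master setup every node is also a replica) and check the output. The sketch below only parses that text. The helper names and the 60-second lag threshold are assumptions for illustration, and how you collect the output and raise the alert depends on your monitoring tool.

```python
def parse_slave_status(text: str) -> dict:
    """Parse 'Key: value' lines from SHOW SLAVE STATUS\\G output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon, such as the row separator
            fields[key.strip()] = value.strip()
    return fields


def replication_healthy(text: str, max_lag_seconds: int = 60) -> bool:
    """True if both replication threads run and the lag is within the limit."""
    fields = parse_slave_status(text)
    if fields.get("Slave_IO_Running") != "Yes":
        return False
    if fields.get("Slave_SQL_Running") != "Yes":
        return False
    # Seconds_Behind_Master is NULL when replication is not running properly.
    lag = fields.get("Seconds_Behind_Master", "NULL")
    return lag.isdigit() and int(lag) <= max_lag_seconds


# Illustrative output only, trimmed to the fields this check reads.
sample = """*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0
"""
print(replication_healthy(sample))  # prints True
```

Treating a missing or NULL lag value as unhealthy is a deliberate choice here: it is safer for an alert to fire on unexpected output than to stay silent.
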
4 comments:

  1. Hi Prabhath,
    Can Pentaho CE be configured for HA?

    1. Hi Prabhath,
      Which ELB did you use for Pentaho CE?

  2. Good job, Prabhath. You wrote some valuable information on Pentaho BI Server community edition.
