/ pentaho

PDI JNDI Configuration Part 1

The Pentaho Data Integration toolset is a Java-based application, and hence, it is not surprising that it offers the Java Database Connectivity (JDBC) as one of the principle ways of connecting to a large selection of different database servers (including Oracle, MySQL, PostgreSQL, and MS SQL Server).

Not having a centralised configuration of your database configuration makes ongoing maintenance a much higher effort, particularly if your data analytics or data source environment changes. You have to change each individual transformations database connection details. Not good, particularly in a large, complex build.

Using the Java Naming and Directory Interface would however, be a more maintainable option over the longer term for all your PDI data projects. It offers a centralised database connection configuration choice which ensure that any database server or account maintenance can be maintained in one location. You can trust me on this one, as I've been there.

Not having a centralised configuration of your database configuration makes ongoing maintenance a much higher effort, particularly if your data analytics or data source environment changes. You have to change each individual transformations database connection details. Not good, particularly in a large, complex build.

When I was looking into this topic, I came across a blog post at the kettle-pentaho blogger site, which was a good start, however I believe a bit more elaboration is in order for the type of readership I am writing for.

I will cover using the PDI JNDI in three posts:

  1. The first post covers configuring the PDI JNDI properties
  2. The second post second post describes how to configure a JNDI connection within Spoon and
  3. A third one address how to troubleshoot a PDI JNDI configuration that is not working

Note that this post has assumed you have already downloaded the database vendor's JDBC jar file and copied it to the pdi-install-directory/lib directory. Of course, it also assumes you have the database connection details to which you wish to connect (including the server's IP or DNS, the database port, database name and the account details).

Configuring the PDI JNDI properties

JNDI functionality within PDI is realised through the support of of the simple-jndi library. Configuration is performed by editing the jdbc.properties file located within the directory pdi-install-directory/simple-jndi

When opening this file you will be presented with a number of entries already entered as part of the standard PDI application. There are five key-value properties that need to be entered in order for a JNDI connections. Each of the five entries have the convention of:


jndi-connection-name/type=javax.sql.DataSource
jndi-connection-name/driver=java-package-path-of-library
jndi-connection-name/url=the-jdbc-url-components
jndi-connection-name/user=db-user
jndi-connection-name/password=db-password

Of these key-value pairs, the most complex one is the JNDI URL. This is due to the URL being a multipart value including the JDBC driver, database vendor, database host, port number and database name. So, the pattern is typically something like this

jdbc:[connection-api]:[vendor-database-name]://[database-ip-or-dns-entry]:[port-number]/[databaseName]

Anyone who comes from a Java development background or who has configured JDBC connections would be familiar with these keys.

Single instance databases on default ports

Where your organisation is using a server to host a single version of a database, then the configuration is straightforward, especially so if the port number used is the standard one associated with the vendor. The two configurations below are for PostgreSQL and Microsoft SQL Server. (Note that the default port number is not entered.)

PostgreSQL


myJNDIname/type=javax.sql.DataSource
myJNDIname/driver=org.postgresql.Driver
myJNDIname/url=jdbc:postgresql://localhost:/myDBname
myJNDIname/user=myUserName
myJNDIname/password=myPassword

MS Sql Server


myJNDIname/type=javax.sql.DataSource
myJNDIname/driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
myJNDIname/url=jdbc:sqlserver://localhost:/myDBname
myJNDIname/user=myUserName
myJNDIname/password=myPassword

The configuration above can be written as the example below where an explicit entry of the default port numbers of the respective database engines. The result is the same as that above.

PostgreSQL


myJNDIname/type=javax.sql.DataSource
myJNDIname/driver=org.postgresql.Driver
myJNDIname/url=jdbc:postgresql://localhost:5432/myDBname
myJNDIname/user=myUserName
myJNDIname/password=myPassword

MS Sql Server


myJNDIname/type=javax.sql.DataSource
myJNDIname/driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
myJNDIname/url=jdbc:sqlserver://localhost:1433/myDBname
myJNDIname/user=myUserName
myJNDIname/password=myPassword

Non-Default Ports and Multiple Database Instances

In some organisations, a single instance database may be run on a non-default port, or the host server may have multiple versions of a database installed, having instance names listening on non-default port numbers. In this case, JNDI uses the port number rather than the instance name to connect to the database. An example for MS SQL Server is illustrated below.


myJNDIname/type=javax.sql.DataSource
myJNDIname/driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
myJNDIname/url=jdbc:sqlserver://localhost:3901/myDBname
myJNDIname/user=myUserName
myJNDIname/password=myPassword

Restart Spoon

Be aware that after you have configured the JNDI configuration in the PDI jdbc.properties file, you need to restart the Spoon in order to reload the JNDI configuration into the PDI environment.

This competes Part 1 of our PDI JNDI configuration. Look to Part 2 where the configuring of the database connection in Spoon is explained.

PDI JNDI Configuration Part 1
Share this