Where I work we ETL our data into our DB2 data pile via the the use of bash scripts that I wrote that wrap a reasonably handy Java based data tool, that look at an DB2 Express C database that records the jobs running and meta data about the jobs. It has its issues, mainly around the limitations of bash and the amount of errors it can handle at any given time. The other limitation is around the Java data tool, that we have now lost the developers to maintain this tool and sort out bugs.
DB2 ETL Solutions
When you Google for “DB2 ETL Tools” then the first hit is on Developer Works which by the title (ETL solutions for IBM DB2 Universal Database) you think great there’s going to be a list on here, but instead it just points you too the EXPORT, LOAD and IMPORT commands and some very basic script examples of how to get these to work. The next few links are not that good either and googling around the subject you get to various products, non of which are designed to work with DB2 other than through a Java or worse ODBC driver, apart from IBM Data Stage which is very expensive.
The Question
I think to myself after seeing this then “How does everyone else do it?”. Having come from a SQL Server background with SSIS and DTS (nearly 4 years ago now) then initially I was shocked at how few tools there are for DB2, without using some verbose logging in text file and emailing you the content, this leads me to think “This is not the way the wide world can do it?”. I have over 400 ETL jobs where ~85% of them need to run everyday, therefore with another developer that has now left we deigned a way to load and record the loading of the jobs. He has now left and support for the tool he made is now becoming harder to do and even when it is it rarely goes smoothly.
So my question to the community is:
Future Posts
Looking at the results of this survey and considering what to do next to move away from our Java tool