Hoarding Data: October 2013

Thursday, October 17, 2013

ODI 12c debugger

Debugging ODI is hard

Why is it so hard? Well, the following are a few reasons. ODI has tools such as OdiStartScen that will let an ODI Session launch another ODI Session on a wholly different agent. ODI interfaces (mappings in 12c) can use DB-native utilities for pumping data. This means that ODI does not have a view into the data being transferred. In ODI 12c we have introduced multi-threading in ODI Tasks. This means that like ODI Loadplans, ODI Tasks can be run in serial or parallel mode.

What have we done then?

While designing and implementing the debugger was a blast, it was disheartening to succumb to some limitations. However there are not that many of them. Just to bring it up in front, ODI debugger will let you suspend the execution flow before/after an ODI Step or before/after an ODI Task. Breakpoints can be set on design-time artifacts like Package or Procedure or on runtime artifact - ODI Blueprint.

Know your Sessions

The problem of distributed debugging is solved for the client by ODI detecting a new Session launch that is associated with a parent Session and making the client aware of the same. A caveat here is that the client must be actively debugging the parent Session. Otherwise there is no place for us to send the notification. So the good thing is that the client gets a transparent debugging experience. Don't worry, client has the option to not connect to the child Session. There are options so that even upfront you can say that the child Session is not candidate for debugging. In this case child Session is allowed run on and client cannot debug it.

See my data before or after

Data debugging is only allowed at the beginning and ending of an ODI Task. The fact that was mentioned earlier viz. ODI may use native utilities to pump data in either direction prevents ODI from providing a transfer-by-transfer data debugging facility. However, for most usecases, the current data debugging facility will be more than sufficient, since at the end of the ODI Task we will be letting the client see uncommitted data. Note that this is a view-only tool. No modification of data is allowed. You may think of this as a limitation, but there is a way around it. ODI Debugger allows JIT modification of code. For example, if you suspend execution before an ODI Task and then modify the code of the Task, you can push that code to the repository. So when ODI gets to the point of actually executing that Task it is your modified code that will get executed. How can this help you? Well, sprinkle some dummy ODI Tasks in your ODI flow. Then when and if you see problems with your data and the next Task is a dummy Task, pause at the beginning of the Task and insert code to modify your data.

ODI threads not Java threads

ODI debugger also transparently takes care of multi-threaded tasks for the client. It provides you with a view of the threads of execution that are tied to the current ODI Session. You can see the execution stack of each of these threads and also open the Session viewer to see the precise point at which the execution has been paused. Right now, if there are multiple threads of execution in a Session and more than one of threads has been paused it is a bit difficult to make out from the Session editor's Blueprint tab as to which suspend point belongs to which thread, but it is a minor point.

See your variables' state

You will also be glad to see a variables view. Historically ODI variables have had a lot of bugs associated with it - be it actual bugs or user perception problem. The debugger can help here to a large extent in that it lets you see the ODI Variables and updates the displayed values as and when they change. Not only that the debugger will even let you dynamically modify the value of the Variable. That should be a pretty powerful tool.

So there you have it - a feature-rich debugger for ODI that should help customers and DW developers immensely.

Wednesday, October 9, 2013

ODI JMS (XML) data server peculiarities

ODI JMS data servers

ODI has JMS Queue, JMS Topic, JMS XML Queue and JMS XML Topic drivers. The Queue/Topic drivers are slightly different from XML Queue/Topic drivers. Of course the obvious difference is that the former is for Delimited/Fixed width message types and the latter is for XML message types. In addition to this there is one more subtle difference.

Go on and create a new JMS Queue data server. Give it some name and then switch to the JNDI tab. You will see drop-down for authentication type, text fields for user name and password, drop-down for JNDI protocol, text fields for JNDI Driver (the fully qualified name of the InitialContextFactory class for your JNDI server), JNDI URL and JNDI Resource.

Gotcha 101

Many people mistakenly think that JNDI Resource is where you put the JNDI name of your Queue/Topic. This is wrong. This field is for entering the JNDI name of your JMS connection factory. It can be a custom connection factory created on the JMS server by you or an in-built one for the JMS server. If you are using Weblogic you will have the following connection factories available without any configuration : javax.jms.QueueConnectionFactory, javax.jms.TopicConnectionFactory, weblogic.jms.ConnectionFactor and weblogic.jms.XAConnectionFactory. This concludes all the configuration you can do the data server.

Gotcha 102

Did you notice anything funny? Even on the Physical or Logical schema you have no place to provide the name of JMS Queue/Topic. For this you actually have to jump through another hoop. It will be explained below. But first think about the impact of this limited configuration. You have provided JNDI URL to your JNDI server, credentials, authentication mechanism and connection factory, but no Queue/Topic information. This means that when you press that nice helpful 'Test connection' button all it is testing is creation of the InitialContext object. It does not test actual connectivity to the Queue/Topic.

Now let us say you want to connect to your Queue. Create a Model baased on the Logical Schema of your JMS Queue data server. Again, you cannot reverse engineer a JMS Queue/Topic. So you have two options :

create a Datastore under this model and define the column structure by hand or
make a copy of a Delimited/Fixed width File Datastore

After the Datastore has been created there is one more thing to do : there is a text field on the Definition tab of the Datastore named 'Resource name'. For a File Datastore this would point to the actual file. This is where you enter the JNDI name of your JMS Queue/Topic.

Phew! That was a tortuous setup.

What about JMS XML Queue/Topic? Now that has some differences. For example the JNDI URL takes a key-value pair where the key is JMS_DESTINATION and the value is the JNDI name of your Queue/Topic. So all the JMS configuration is in one place. Also, a JMS XML Model can be reverse engineered.
You may now be saying to yourself 'All right!!!!!! Let me make all my JMS Queue/Topics XML-based messages'. Hold your horses, because there is a down side : you cannot right-click a JMS XML Queue/Topic Datastore and do 'Data' or 'View data'. It is not supported.

So there you are. A little the worse for wear owing to this trip over ODI JMS. Wait for the next few blog posts that will reveal to you some more hidden pitfalls and capabilities.