Talend: tips and tricks part 1

This blog contains some convenient tips and tricks that will make working with the open source tool Talend for data integration a lot more efficient. This blogpost will be especially useful for people who are just discovering this amazing tool, yet I am sure that people who have been using it for a while will also find it very helpful. This series of tips will be spread over multiple blog entries, so make sure to check back often for future tips!

1. Testing expressions in the tMap component

The tMap component lets you test your expressions. This way you can easily see whether or not the result is what you expected it to be. You can also use this to determine whether or not your expression will throw an error. Let’s create an example.

We’ve got details of employees as input for our tMap. We would like the first name to be shown in uppercase. First of all, go into the expression builder by clicking the ellipsis next to your expression.

Ellipsis expression builder

To convert the first name to uppercase, we have to use the StringHandling function “UPCASE”. This will result in the following expression: StringHandling.UPCASE(employee.First_name)

Fill in test values for the input columns, click on the “Test!” button and wait for the result. If everything goes as expected, you should see the first name in uppercase on the right side of the window.

Test expression builder tMap
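To make the expression concrete outside the Studio, here is a minimal Java sketch of what the uppercase conversion amounts to. The upcase helper is a hypothetical stand-in written for this example, not Talend's own routine, and the null check is an assumption about how you would want the expression to behave when a first name is missing.

```java
import java.util.Locale;

// Hypothetical stand-in for the tMap expression, so the sketch runs without Talend Studio.
class UpcaseSketch {
    // Null-safe uppercase: returns null for a null input instead of throwing.
    static String upcase(String value) {
        return value == null ? null : value.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(upcase("john")); // a filled-in first name
        System.out.println(upcase(null));   // a missing first name
    }
}
```

Testing with both a regular value and an empty one in the “Test!” window is a quick way to find out whether your real expression needs a similar guard.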

2. Optimizing the appearance of the tLogRow component output

tLogRow is one of the most frequently used components. It is recommended that you learn how to optimize its use. Firstly, make sure that you always have the right appearance selected for your output. You can find this property in the basic settings of your tLogRow-component.

tLogRow modes

There are three modes that you can choose between:

  • Basic

Basic will generate a new line for each record, with the fields separated by the “Field Separator” you’ve chosen (see image above). When using basic mode, I highly recommend checking the “Print header” option when working with multiple column records or multiple outputs, purely for visibility reasons.

basic mode output

  • Table (print values in cells of a table)

The table mode shows the records and their headers in a table-format, including the name of the component that generated this output (in our case: “tLogRow_1”). This emphasizes the importance of properly naming everything, especially when you have multiple components that generate output. In this case, it would have been better to rename our component to “EMPLOYEES”. Personally, I prefer this mode.

table mode output

  • Vertical (each row is a key/value list)

Vertical mode will show a table for each one of your records.

vertical mode output

The output mode you decide to use depends on what you’re trying to visualize. For example, when your goal is to show a single string, I would recommend using the basic mode. But when you have multiple table outputs (for example: departments, customers and employees in a single output), I’m certain the table mode would be the best option.

Sometimes your data is spread over multiple lines, resulting in an unclear output, as shown in the image below.

output with wrap

To force the output to put all the data on one single line, you can uncheck the “Wrap” option. This option is located underneath your output and will enable a horizontal scrollbar.

output without wrap

Do you also want to be able to get data regarding tweets using Talend, as shown in the image above? Read my previous blogpost and find out how!

3. Resetting windows and maximizing/minimizing them

Sometimes you accidentally close a window and have a hard time finding a way to get it back. You can very easily reset your environment by clicking on “Window” – “Reset Perspective”.

reset perspective

You can see all of the views by clicking on “Window” – “Show View” – “Talend”. Some of the views are not shown by default, such as “Modules”. Modules can be used to import .jar-files without having to restart your studio, which will most likely save you some time.

Lastly, because Talend is Eclipse-based, you have the possibility to maximize and minimize windows. I personally use this function when examining the output of a tLogRow-component that contains a lot of data. You can achieve this by either double-clicking on the window or by right-clicking on it and selecting “Minimize”/“Maximize”.

That’s it for now. I hope you enjoyed reading this blog and make sure to return soon for future blogs!

New in Java 8 : Default and static methods in interfaces

Default methods (aka defender methods) in interfaces are new in Java 8. They enable you to define a default implementation of a method in the interface itself.

If an interface is implemented by several classes, it’s hard to add methods afterwards, as doing so breaks the code: all implementing classes are required to define the new method as well. Adding the method to the interface together with a default implementation resolves this problem.

Here’s a code example :


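public interface Draw {

   // abstract method: every implementing class has to define it
   void drawCircle();

   // default method: inherited by implementing classes that do not override it
   default void drawRectangle() {
      System.out.println("draw a rectangle");
   }
}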

Implementing classes that have not defined the drawRectangle() method will print “draw a rectangle” when drawRectangle() is executed on them.

Interfaces that extend this interface can

  • define nothing, in which case the method will be inherited
  • declare the default method again with no implementation, which will make it abstract
  • redefine the default method so it gets overridden
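The three options can be sketched as follows. The Shape interface and its sub-interfaces are hypothetical names made up for this illustration; they return strings instead of printing so the differences are easy to see.

```java
class DefaultMethodOptions {
    interface Shape {
        default String describe() { return "generic shape"; }
    }

    // Option 1: define nothing, so the default method is inherited.
    interface InheritingShape extends Shape { }

    // Option 2: declare the method again without a body, which makes it abstract.
    interface AbstractShape extends Shape {
        @Override
        String describe();
    }

    // Option 3: redefine the default method, which overrides the inherited one.
    interface RoundShape extends Shape {
        @Override
        default String describe() { return "round shape"; }
    }

    public static void main(String[] args) {
        InheritingShape inherited = new InheritingShape() { };
        // AbstractShape has exactly one abstract method again, so a lambda can implement it.
        AbstractShape implemented = () -> "described by the implementer";
        RoundShape overridden = new RoundShape() { };

        System.out.println(inherited.describe());
        System.out.println(implemented.describe());
        System.out.println(overridden.describe());
    }
}
```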

These default methods were added to Java in order to be able to implement the new Streams API: the Collection interface had to be extended with the stream() and parallelStream() methods. Without default methods, every class that implements the Collection interface would have had to be updated.
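As a small illustration of that point, the sketch below defines a collection the way it could have been written before Java 8: it only supplies iterator() and size(). The Names class is a hypothetical example; stream() is never defined in it, yet it is available because Collection provides it as a default method.

```java
import java.util.AbstractCollection;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;

class LegacyCollectionSketch {
    // Written in pre-Java 8 style: no mention of streams anywhere.
    static class Names extends AbstractCollection<String> {
        private final List<String> items = Arrays.asList("ann", "bob");

        @Override
        public Iterator<String> iterator() { return items.iterator(); }

        @Override
        public int size() { return items.size(); }
    }

    public static void main(String[] args) {
        // stream() comes from the default method on the Collection interface.
        List<String> upper = new Names().stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
        System.out.println(upper);
    }
}
```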

Static methods

Also new in Java 8 is the use of static methods in an interface.

So now, drawRectangle()  could also be defined as a static method, but that would give the impression that it is a utility or helper method, and not part of the essential core interface. So in that case, it’s better to go for the default method.
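A minimal sketch of the difference, assuming a hypothetical helper method describe() that is not part of the original example: the default method is called on an instance and can be overridden, while the static method is called on the interface itself and is not inherited by implementing classes.

```java
class StaticInterfaceMethodSketch {
    interface Draw {
        // Core behaviour: inherited by implementers, and overridable.
        default String drawRectangle() { return "draw a rectangle"; }

        // Helper: belongs to the interface, called as Draw.describe(...).
        static String describe(int width, int height) {
            return width + "x" + height + " rectangle";
        }
    }

    static class Canvas implements Draw { }

    public static void main(String[] args) {
        System.out.println(new Canvas().drawRectangle());
        System.out.println(Draw.describe(2, 3));
    }
}
```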

You could argue that an abstract class would have done the job as well. But as Java has single inheritance, that choice would narrow down our design possibilities. And as the poster above your bed is shouting every day: ‘Favor composition over inheritance!!’, right? So we want to avoid inheritance anyway.

So what will happen if you try to implement 2 interfaces with the same default methods ? Well, you will get the following compile time error :

Duplicate default methods named [methodname] with the parameters () and () are inherited from the types [interface1] and [interface2]

To avoid this error, choose an implementation of one of the interfaces :

interface Draw{
   default void circle() {
     System.out.println("draw circle");
   }
}
interface Print{
   default void circle() {
     System.out.println("print circle");
   }
}

class MyClass implements Draw, Print {
   @Override
   public void circle() {
     Draw.super.circle();
   }
}

That’s it, a quick overview of this new feature in Java 8.

Use of contexts within Talend

When developing jobs in Talend, it’s sometimes necessary to run them on different environments. For other business cases, you need to pass values between multiple sub-jobs in a project. To solve these kinds of issues, Talend introduced the notion of “contexts”.

In this blogpost we elaborate on the usage of contexts for easily switching between a development and a production environment by storing the connection data in context variables. This allows you to determine on which environment the job should run, at runtime, without having to recompile or modify your project.

To start using contexts in Talend you have two options:
1) you can create a new context group and its corresponding context variables manually, or
2) you can export an existing connection as a context.
In this example we’ll go over exporting an existing Oracle connection as a context.

Double-click an existing database connection to edit it and click Next. Click Export as context.

Image

NOTE There are some connections that don’t allow you to export them as a context. In that case you’ll have to create the context group and its variables manually, add the group/variables to your job, and use the variables in the properties of the components of your job.

After you’ve clicked the Export as context button you’ll see the Create/Edit context group screen. Enter a name, purpose and description and click Next.

Image

Now you’ll see all the context variables that belong to this context group. Notice that Talend has already created all the context variables that are needed for the HR connection. If you want to change their names you can simply click them and they become editable.

Click the Values as table tab.

Image

In the Values as table tab you can edit the values of the context variables by simply clicking the value and changing it. To add a new context, click the context symbol in the upper right corner.

Image

The window that pops up is used to manage contexts. To create a new context, click New, enter the name of the context, in our example Production, and click Ok. To rename the Default context, select it, click Edit, enter Development and click Ok. When you’re done editing, click Ok.

Image

After the window closes, you’ll see that an extra column appeared. Enter the connection data of the production environment in the Production column and click Finish.

Image

In the connection window it’s possible to check the connection again, but this time you’ll be prompted to choose which context you want to check.

Image

Verify that both the connections work and click Finish.

Now that we’ve exported the connection as a context, it’s possible to use it in a job. Create a new job, use the connection that has been exported as a context and connect it to a tLogRow component. Your job should look something like this:

Image

When using a connection that has been exported as a context in a job, you have to include the context variables in order for your job to be able to run. Go to the context tab and click the context button in the bottom left.

NOTE When using one of the newer versions, Talend proposes to add missing context variables whenever you try to run a job. Because of this, you don’t need to add them manually as described in this example.

Image

Select the context group that contains the context variables, in our case the HR context group.

Image

Select the contexts you want to include and click OK

Image

NOTE A context group can also be added to a job by simply selecting the context from the repository, dragging it towards the context tab of the job, and dropping it there.

Once you’ve added the context group to the job, it’s possible to run the job for both the development and production environment by selecting the context in the dropdown menu of the Run tab.

Image
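Conceptually, a context group is a set of named variables with one value per context, and the job simply reads the values of whichever context is selected at runtime. The plain Java sketch below mimics that idea outside Talend. All names in it (HR_Server, HR_Port, HR_Sid, the host names and SIDs) are assumptions made up for this illustration, not the variables Talend generates for your connection.

```java
import java.util.HashMap;
import java.util.Map;

class ContextSketch {
    // Hypothetical values; in Talend these live in the context group in the repository.
    static Map<String, Map<String, String>> contexts() {
        Map<String, String> development = new HashMap<>();
        development.put("HR_Server", "dev-db.example.com");
        development.put("HR_Port", "1521");
        development.put("HR_Sid", "HRDEV");

        Map<String, String> production = new HashMap<>();
        production.put("HR_Server", "prod-db.example.com");
        production.put("HR_Port", "1521");
        production.put("HR_Sid", "HRPROD");

        Map<String, Map<String, String>> all = new HashMap<>();
        all.put("Development", development);
        all.put("Production", production);
        return all;
    }

    // Builds an Oracle thin JDBC URL from the variables of the selected context.
    static String jdbcUrl(String contextName) {
        Map<String, String> vars = contexts().get(contextName);
        if (vars == null) {
            throw new IllegalArgumentException("Unknown context: " + contextName);
        }
        return "jdbc:oracle:thin:@" + vars.get("HR_Server") + ":"
                + vars.get("HR_Port") + ":" + vars.get("HR_Sid");
    }

    public static void main(String[] args) {
        // The context is chosen at runtime; the code that uses the variables stays the same.
        String selected = args.length > 0 ? args[0] : "Development";
        System.out.println(jdbcUrl(selected));
    }
}
```

The point of the design is the same as in the job above: the logic is written once against variable names, and only the selected context decides which environment it talks to.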

Introduction to Websockets and JSON-P API in JEE7

Websockets (JSR 356) and the JSON-Processing API (JSR 353) are both introduced in the JEE7 specification. Together with JavaScript and HTML5, they enable web applications to deliver a richer user experience.

Websockets allow bidirectional, full-duplex communication over TCP between your server and different kinds of clients (browsers, JavaFX… ). It’s basically a push technology: events or data originating from the server or a client can, for example, be pushed to all the other connected clients.

In our demo, JSON strings are sent between client and server, so that’s where the JSON Processing API comes in. It’s a portable API that allows you to parse, generate, transform and query JSON by using the streaming or model API. You could also send XML or any other proprietary format instead.

Serverside components

  1. A Java class annotated with
    @ServerEndpoint(value="/endpoint", decoders=EncodeDecode.class, encoders=EncodeDecode.class)
    with the following method annotations:
    @OnOpen : when a connection is opened
    @OnMessage : when a message comes in
    @OnClose : when a connection is closed
  2. A Java class that encodes/decodes the messages between JSON and Java objects. (That’s where the JSON-P API comes in.)

Clientside component

An html file that contains JavaScript to communicate with our server endpoint. Communication is done through a WebSocket object, declared as follows :

connection = new WebSocket('ws://localhost:8080/mywebsocket/endpoint');

Creating this object triggers the @OnOpen method of our server-side endpoint.

connection.onmessage : fired when a message comes in

connection.send : will trigger the @OnMessage annotated method of our endpoint

connection.close : will trigger the @OnClose annotated method of our endpoint

Demo

It’s a screen that sends messages to all the connected clients, including itself. When the client opens a connection on the server, his session is added to a list of active sessions. When a client sends a message to the server, it is distributed to all the sessions in the list. When the client closes his browser tab or window, his session is removed from the list. The data that we send, can be any complex JSON or XML model. To keep it simple, we just send a simple string.

This application needs to be deployed on a JEE7-compliant server. So at this moment (May 2014) it will only run on Glassfish 4.0 or WildFly 8.

The war file can be found here. After deployment, open url (for Glassfish) http://localhost:8080/mywebsocket/socket.html.

 The Code

Java endpoint

package be.iadvise.mywebsocket;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import javax.websocket.EncodeException;
import javax.websocket.OnClose;
import javax.websocket.OnMessage;
import javax.websocket.OnOpen;
import javax.websocket.Session;
import javax.websocket.server.ServerEndpoint;

@ServerEndpoint(value="/endpoint", decoders=EncodeDecode.class, encoders=EncodeDecode.class)
public class MyEndPoint {
    // list of active sessions, shared by all endpoint instances
    private static final List<Session> sessions = Collections.synchronizedList(new ArrayList<Session>());

    @OnOpen
    public void onOpen(Session s) {
        sessions.add(s);
        System.out.println("Open session : no of sessions = " + sessions.size());
    }

    @OnMessage
    public void onMessage(MyMessage msg, Session s) throws IOException, EncodeException {
        // a synchronizedList has to be locked manually while iterating over it
        synchronized (sessions) {
            for (Session session : sessions) { // loop over active sessions and send the message
                if (session.isOpen()) {
                    session.getBasicRemote().sendObject(msg);
                }
            }
        }
    }

    @OnClose
    public void onClose(Session s) {
        sessions.remove(s); // remove session from the active session list
    }
}

Java Decode/Encode message

package be.iadvise.mywebsocket;

import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;

import javax.json.Json;
import javax.json.JsonObject;
import javax.json.JsonReader;
import javax.json.stream.JsonGenerator;
import javax.websocket.DecodeException;
import javax.websocket.Decoder;
import javax.websocket.Encoder;
import javax.websocket.EndpointConfig;

/**
 * This class will encode/decode the messages from/to the client.
 * Decoder : from client to server -> converts the JSON to MyMessage object
 * Encoder : from server to client -> converts MyMessage object to JSON
 *
 * We are using JSON, but you can use XML or any other format.
 */
public class EncodeDecode implements Decoder.Text<MyMessage>, Encoder.Text<MyMessage> {

    @Override
    public MyMessage decode(String txt) throws DecodeException {
        Reader reader = new StringReader(txt);
        // try-with-resources closes the JsonReader once the message has been parsed
        try (JsonReader jsonReader = Json.createReader(reader)) {
            JsonObject object = jsonReader.readObject();
            String text = object.getString("text");
            return new MyMessage(text);
        }
    }

    // Should return false when the text cannot be decoded; this demo accepts every message
    @Override
    public boolean willDecode(String s) {
        System.out.println("Will decode asked for " + s);
        return true;
    }

    @Override
    public void init(EndpointConfig config) {
        System.out.println("init called on chatdecoder");
    }

    @Override
    public void destroy() {
        System.out.println("destroy called on chatdecoder");
    }

    @Override
    public String encode(MyMessage object) {
        System.out.println("I have to encode " + object);
        StringWriter sw = new StringWriter();
        // closing the generator flushes the generated JSON into the StringWriter
        try (JsonGenerator generator = Json.createGenerator(sw)) {
            generator.writeStartObject();
            generator.write("text", object.getText());
            generator.writeEnd();
        }
        String answer = sw.toString();
        System.out.println("I encoded an object: " + answer);
        return answer;
    }
}

Java message


package be.iadvise.mywebsocket;

public class MyMessage {
    private String text;

    public MyMessage(String text) {
        this.text = text;
    }

    public String getText() {
        return text;
    }

    public void setText(String text) {
        this.text = text;
    }

    @Override
    public String toString() {
        return "MyMessage [text=" + text + "]";
    }
}

The html file


<html>
<head>
<script type="text/javascript">
var connection;
function openSocket() {
    connection = new WebSocket('ws://localhost:8080/mywebsocket/endpoint');
    connection.onmessage = function(evt) {
        var x = JSON.parse(evt.data);
        var mytext = x.text;
        var chld = document.createElement("p");
        // textContent instead of innerHTML, so received text is never interpreted as HTML
        chld.textContent = mytext;
        var messages = document.getElementById("messages");
        messages.appendChild(chld);
    };
}

function talk() {
    var txt = document.getElementById("msg").value;
    var message = {
        'text': txt
    };
    connection.send(JSON.stringify(message));
}

function closeSocket() {
    alert('closing socket');
    connection.onclose = function () {}; // disable onclose handler first
    connection.close();
}
</script>

<script type="text/javascript">
if (window.addEventListener) { // all browsers except IE before version 9
window.addEventListener ("beforeunload", closeSocket, false);
}
else {
if (window.attachEvent) { // IE before version 9
window.attachEvent ("onbeforeunload", closeSocket);
}
}
</script>
</head>
<body onLoad="openSocket();">
<p>
SimpleWebSocket
</p>
<!-- <table id="chatbox" style="display:none"> -->
<table id="chatbox">
<tr><th width="400">messages</th></tr>
<tr>
<td width="400" id="messages">
</td>
</tr>
<tr>
<td>
<input type="text" id="msg"/>
<input type="submit" value="send" onclick="talk(); return false;"/>
</td>
</tr>
</table>
</body>
</html>

 Conclusion

Websockets are a huge improvement for building rich applications. This is the first time that push technology is actually built into the JEE framework. Before that, we had to use polling or other techniques in order to get the same results. In this blog, I showed that you don’t need much code to get started. Once you get this working, you can gradually go further, building more complex sockets.

 

Doxxy 1.2 has been released

Today we have great news for you: Docufy becomes Doxxy !
And not only the name improved!

Doxxy is a RAD-tool for generating operational reports. With its intuitive APEX UI, you easily configure your documents by adding DOCX-templates and SQL-queries. The engine is written in PL/SQL, which makes installation, integration and maintenance very straightforward. The tool comes as a packaged application for APEX 4.x.

The main concepts and principles are still the same:

  • Simple architecture and installation
  • User-friendly RAD-tool
  • Gathering data via MS Word templates
  • Datasets via SQL statements
  • Generation of DOCX documents
  • Easy integration with the development software of your choice
  • Master-detail structures possible

On the occasion of APEX World of last month, we released Doxxy 1.2. This version includes some interesting new features.

What is new?

First of all, Doxxy is a tool for developers: for APEX developers … surely, but in fact for anyone who is developing against an Oracle database and who needs a printable output. Until now, the reporting engine generated a .DOCX file as printable document. In version 1.2 there is an extra option available which makes it possible to have a PDF-document as output.

Other new features we added to the product are:

  1. Possibility to add some PL/SQL logic at the beginning or at the end of the generation process.
    Possible use-cases can be:
    a) At the beginning of a report: set an Oracle context with a language indicator, or prepare your data in temporary tables to make querying easier.
    b) At the end: update a print status or flag on given records.
  2. Performance optimization for documents with a lot of content or with a lot of IF-statements
  3. The export and import mechanism is XML based. It is now also possible to export/import multiple documents from a given folder in one run.
  4. Easy search-box to quickly find a document in the object tree
  5. Template visualization and validation: when you do an upload of a template, the system does some basic validations on the ‘formal’ content of the template, especially on the names of the tags.
    From within the Doxxy-UI you may also visualize the formal structure of your templates. Errors are visually emphasised in red.
  6. Simplified mechanism to include images (coming from a BLOB-column) into the report-output.
  7. Extra page to maintain your Doxxy-specific private synonyms.

Give Doxxy a try and request a free trial.

Follow @doxxyNews on twitter.
Website: www.doxxy.eu.

OGH APEX World 2014

Last week we attended the 5th annual APEX World event in Zeist. As every year, it was very nice to meet the growing APEX community in the Benelux, combined with some excellent international and Dutch APEX presentations.
The keynote was given by Joel Kallman about APEX 5.0, followed by 18 very interesting sessions about customer business cases, technical developments and international presentations by APEX specialists from all over the world.

APEX 5.0

The key focus in the new APEX 5.0 is improved developer productivity.
The page builder is completely new. Through this interface developers will be able to do more in less time and, most importantly, in fewer clicks. With a properties sidebar on the right side of the screen it will be possible to quickly change elements and regions on a page, even multiple elements at the same time! Regions and items can be created through drag and drop, which increases the development speed.

Other new features

Tabs
Improved tab navigation. The current tab system isn’t user friendly enough, so it’s better to use lists. Now you can create new pages and define their hierarchy in the application. When this is done, an automatic tab will be created with dropdown submenus to display the hierarchy.

Interactive reports
Two important improvements for interactive reports. First and foremost it’s possible to have multiple interactive reports on one page, something we’ve all been waiting for since APEX 4.x. And secondly there is a new format function to pivot your report. Joel Kallman presented this feature: in a couple of clicks he created a nice pivoted table on the screen.

jQuery Mobile integration
With jQuery Mobile your SQL reports will have the possibility to be responsive. You have the option to:
a) only display the most important columns on a small screen, or
b) to switch to some kind of single record view. The result is something similar to what you can see here: http://elvery.net/demo/responsive-tables/

Modal popup
Instead of using a plugin to let your pages open in a modal window, users can now set this feature as a property of the page. Whenever the user navigates to this page, it will open in a modal window.

Be sure to take a look at the APEX Early Adopter release: apexea.oracle.com

 

Presentations

After the APEX 5.0 demonstration, there were 3 parallel tracks, all with very different and interesting sessions.  Read our impressions …

Going public with your APEX application
FOEX delivered this presentation very well. Their problem scenario was the following one: if you want to make a public APEX application, you are always stuck with the typical APEX URL like “apex/f?p=100:1:5039230103::::”. During the demo they showed how to create a nice and readable URL like “apex/demo/customers”. To accomplish this they used aliases, REST services, PL/SQL and a few lines of JavaScript.

The best of both worlds: going hybrid with your mobile APEX application
Roel Hartman gave a presentation about Phonegap in combination with APEX. He showed a nice demo on how to sync the contacts from a database with the ones from his cell phone through a Phonegap app. It was surprising how easily this could be set up without too much code or in-depth knowledge. He used REST services to sync the data between APEX and his cell phone.

Using AngularJS in Oracle Application Express
Dan McGhan of Enkitec (USA) brought a technical session about combining AngularJS and APEX. He showed us a single page application containing a to do list with advanced calendar features. The end result was very nice and the demo illustrated the power of AngularJS, but it certainly requires some time to understand this framework. Maybe an interesting idea is to include AngularJS natively in APEX 6.0?

A B2B webshop with APEX!
iAdvise did two presentations. The first one dealt with a B2B webshop we developed in APEX for Billiet. Justine Ghekiere gave a brief introduction about the core business of her company, Biliet. Our colleague Tuur Hendrickx showed a lot of features he implemented in the webshop with APEX. Topics he showcased included special advertisements, restricted products for different customers, the use of a shopping cart and a stunning layout.

APEX & HTML5
We also attended a nice presentation by Martin Giffy D’Souza about APEX and HTML5. He showed the advantages of HTML5 and the typical use cases in APEX. During a live demo he showed how to record a video within APEX and stream the feed to another frame in the same screen. Really impressive! It was also nice to see how easy it is to implement voice recognition by using HTML5.

Dutch immigration services (IND) monitor XML messages with Oracle APEX
A department of the Dutch government has built an application which provides residence permits to immigrants or refugees. Before they could start building the APEX application there was a lot of effort necessary in the Oracle database for dealing with all the XML files. It was not just a problem with the size of the XML files, but there were also issues with differences between Oracle 10.2 and 11.2 in the way the database handles XML files.

Reporting solutions for Oracle APEX – choose your weapons
During this session Dietmar Aust gave us an overview of possible reporting solutions for APEX applications. Many solutions were covered in an objective way: BI Publisher, Jasper Reports, Apache FOP, APEX PDF printing, PL/PDF, … Dietmar even demonstrated our own tool Doxxy (www.doxxy.eu). Nice to hear that he likes Doxxy! He also showed us his own solution for typical problems related to exporting data from interactive reports to MS Excel, especially regarding the proper data types: OPAL:XP (for eXPorting to MS Excel).

Single-click deployment in APEX development
One of the last tracks we visited was about single-click deployment of APEX applications across DTAP (development, test, acceptance, production) environments. They talked about the use of Bamboo, in combination with Git and APEX. It was nice to see how they solved the problem of continuous integration with APEX.

A logistic data portal with APEX!
In the second iAdvise customer case Robert Esseling explained why Bas Logistics needed a data portal. Those requirements were then demonstrated by Menno Hoogendijk.
The portal has an admin module to manage the data import and mapping settings. In the very straight-forward  front-end, users drill down from dashboards to detailed data.

 

Thanks to the organization for hosting this great event, really one of the best conferences in the Benelux!
See you at APEX World 2015!

wpg_docload.download_file : mime type not recognized by client

For a project we are currently working on, we needed to generate a Word 2010 document and send it to the client. The document was generated by a great PL/SQL document generation tool called Doxxy, and was sent to the client using the wpg_docload package. This is a standard Oracle PL/SQL package that can be used to download files, BLOBs and BFILEs.

Before the download, we set the Content-type in the http header as follows :

owa_util.mime_header('application/vnd.openxmlformats-officedocument.wordprocessingml.document',FALSE);

When sending the document to the client, we got the following popup in our browser :

Image

So it looked like our browser didn’t recognize that this was a Word 2010 document.

Looking at the response header, using Firebug, we got the following result :

Image

Somehow the content type for Word 2010 was overwritten to text/html; charset=utf-8.

So, time for the good old trial and error approach, which, after a while, paid off.

Before setting the response header with owa_util.mime_header('….', FALSE); we need to issue the following commands:

htp.flush();
htp.init();

Now the code looks like this  :

-- first clear the header
 htp.flush;
 htp.init;
 -- set up HTTP header
 owa_util.mime_header('application/vnd.openxmlformats-officedocument.wordprocessingml.document', FALSE);
 -- set the size so the browser knows how much to download
 htp.p('Content-length: ' || DBMS_LOB.getlength(v_blob));
 -- the filename will be used by the browser if the user does a save as
 htp.p('Content-Disposition:attachment; filename="'||nvl(v_filename,'export')||v_ext||'"');
 -- Set COOKIE (for javascript download plugin)
 htp.p('Set-Cookie: fileDownload=true; path=/');
 -- close the headers
 owa_util.http_header_close;
 -- download the BLOB
 wpg_docload.download_file(v_blob);

After adding these two lines, we got the correct mime type:

Image

Many thanks to Willem Albert and Bjorn Fraeys for delivering the content for this blog !