HTTP/2 server push

HTTP/2 general

In general HTTP/2 is about optimizing server resource usage. This is mainly achieved by using 1 connection between server and client and re-using that connection for all requests/responses for the duration of the session.

This is in sharp contrast with HTTP/1, where a connection is created, a request is sent, a response is received and the connection is terminated. This overhead is more or less hidden from the user because multiple connections are used in parallel.

HTTP/2 server push

HTML pages reference many other resources. Using HTTP/1 the client needs to parse the HTML page, identify the referenced resources and fetch them, with every round trip incurring the connection setup/breakdown overhead.

Using HTTP/2 the server can send these referenced resources to the client before they are needed, and because the same connection is used there is no connection setup/breakdown overhead.

Use cases

Pushing page resources before they are needed will make a site/application more responsive to its users. But doing this manually for all pages of an application is only going to be feasible for the smallest of applications.

Web/UI component frameworks may push the framework resources that are needed. Multiple approaches can be taken in this space: all framework resources can be pushed, or only the resources needed by the framework components that are actually used.

What is available

HTTP/2 is still a draft spec, so it is early days. Currently there is no standard Servlet API for server push, but that can't stop us: Jetty already has an experimental API.

Google Chrome Canary has support for HTTP/2 when started with --enable-spdy4 as a start parameter.

Firefox has support for HTTP/2 when the network.http.spdy.enabled.http2draft preference is switched on.

Test case

In order to test server push I’ve taken one of my panoramic vacation photos and sliced it up into 400 parts. This may be a little over the top but as with all tests we want to test the limits.

The test has been executed using 2 web-modules:

  • blog-http1-no-push – containing a servlet on URL /nopush that does not perform any pushes.
  • blog-http2-push – containing a servlet on URL /push that executes server pushes for the image slices.

The blog-http1-no-push web module was deployed to a Jetty server containing only the http, annotations and deploy modules running on port 8080.

The blog-http2-push web module was deployed to a Jetty server containing only the http2, annotations and deploy modules running on port 8443.

Both setups are available as Docker images (see the Do It Yourself section below).

Both web modules contain a single servlet. The servlets take rows and columns parameters, which allows us to control the number of resources contained in the generated page. The same parameters also control the number of resources that are pushed by the blog-http2-push web module.

During testing I did notice that the server sometimes becomes unstable when trying to push all 400 image slices. I've contacted the Jetty users mailing list, perhaps some additional configuration needs to be set when pushing a lot of resources. I'm waiting for their reply.

How do you use HTTP/2 in code

Initially there was a push method on the Dispatcher class, but while writing this blog the Jetty project deprecated that method and made a PushBuilder available via the Request class.

final Request jettyRequest = (Request) getRequest(); // org.eclipse.jetty.server.Request

jettyRequest
    .getPushBuilder()       // obtain Jetty's experimental PushBuilder for this request
    .push(resourcePath);    // push the referenced resource over the existing connection
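
To put the snippet in context, below is a minimal sketch of what a push servlet along these lines might look like. The class name, URL mapping, slice naming scheme and parameter handling are illustrative assumptions; only the cast to Jetty's Request and the getPushBuilder()/push() calls come from the snippet above.

import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.eclipse.jetty.server.Request;

@WebServlet("/push")
public class PushServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        // Assumes both parameters are always present, as in the test setup.
        int rows = Integer.parseInt(request.getParameter("rows"));
        int columns = Integer.parseInt(request.getParameter("columns"));

        // Cast to Jetty's Request to reach the experimental push API;
        // depending on the Jetty version the request may need to be unwrapped first.
        Request jettyRequest = (Request) request;

        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html><body>");
        for (int row = 0; row < rows; row++) {
            for (int column = 0; column < columns; column++) {
                String slice = "/images/slice_" + row + "_" + column + ".jpg";
                // Push the slice before the browser has even parsed the <img> tag that needs it.
                jettyRequest.getPushBuilder().push(slice);
                out.println("<img src=\"" + slice + "\"/>");
            }
        }
        out.println("</body></html>");
    }
}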

Check out the sources on GitHub: https://github.com/teyckmans/http2-push

Performance difference

To get a realistic test case I've deployed the Docker images in the Google Cloud us-central1-a zone, so that real network overhead is involved. Measurements have been taken in Google Chrome Canary with the cache disabled, using the load time reported by the browser.

HTTP/1 – no push
http://[host-external-ip]:8080/blog-http1-no-push/nopush?rows=5
3.01s (average of 6)

HTTP/2 – push
https://[host-external-ip]:8443/blog-http2-push/push?rows=5
1.51s (average of 6)

Pretty spectacular difference if you ask me.

Do It Yourself

I've uploaded the Docker images to Docker Hub so you can try it out and experience the difference yourself.
Use the following command lines to run the test web-modules.

HTTP/1 – no push
docker pull teyckmans/blog-http1-no-push
docker run --name blog-http1-no-push-1a -i -t -p 8080:8080 teyckmans/blog-http1-no-push

HTTP/2 – push
docker pull teyckmans/blog-http2-push
docker run --name blog-http2-push-1a -i -t -p 8443:8443 teyckmans/blog-http2-push

Using Apex Authorization schemes in PL/SQL

The problem of using APEX authorization schemes in PL/SQL has been addressed several times in blogs and forums, but we occasionally still get questions on how to solve this:

I have a page where users with admin roles can modify data and other users can only view it. Hiding the button to save the record is easily done with an authorization scheme:

[Screenshot: authorization scheme assigned to the Save button]

However, now I want my items to be displayed as "Read Only" too. There is no option to select your authorization scheme there, but Apex wouldn't be Apex if there wasn't an easy solution.

The function apex_authorization.is_authorized('authorization_scheme') does the trick. It will check the authorization scheme and return a boolean. Add a small PL/SQL block in the Read Only part of your item like this:

[Screenshot: PL/SQL condition in the item's Read Only attribute]

Now your item is read only for persons without the admin role.


Some additional information:

With this function it’s also possible to combine multiple authorization schemes:

IF apex_authorization.is_authorized('isAdmin')
   OR apex_authorization.is_authorized('isWrite')
   OR :P3000_USER = 'TEST' THEN
  RETURN FALSE;
ELSE
  RETURN TRUE;
END IF;

Attention: if you want to use this functionality prior to Apex 4.2, you need to use "apex_util.public_check_authorization"!

5 hidden gems in Java 8

There's been a lot of attention paid to the major new features of Java 8: lambdas, the streaming API and the new Date and Time API. Of course these are the ones that make this a game-changing release, but there's more to Java 8. Inspired by José Paumard's Devoxx talk 50 new things we can do with Java 8 I'd like to shed some light on 5 smaller features that will make your life as a Java developer easier. I won't go to 50 like José (and actually… neither did he), but these are the ones you definitely need to see.

Join me!

So we've probably all had our fair share of recreating the same boilerplate code over and over again: iterating over a list of values in order to concatenate them with a delimiter into a single String. In Java 7 this would probably look something like this:

List<String> values = ...
StringBuilder result = new StringBuilder("[");
boolean first = true;
for(String item : values) {
  if(first) {
    first = false;
  } else {
    result.append(", ");
  }
  result.append(item);
}
result.append("]");
System.out.println(result.toString());

While this is certainly more concise and more readable than how the code would have looked in the 1.4 era, before generics and enhanced for-loops, it is still a hideous pile of boilerplate for something very simple. So now for the Java 8 solution:

StringJoiner joiner = new StringJoiner(", ", "[", "]");
values.forEach(joiner::add);
System.out.println(joiner.toString());

This actually showcases not only the new StringJoiner, but two other Java 8 features as well: method references and the forEach() method on the Iterable interface.

While StringJoiner is actually meant for some behind-the-scenes processing for a Collector in the streaming API (http://blog.joda.org/2014/08/stringjoiner-in-java-se-8.html), it does eliminate a lot of boilerplate in more traditional code.
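
For comparison, the same result can be obtained via the streaming API, where Collectors.joining(delimiter, prefix, suffix) uses a StringJoiner behind the scenes:

// requires: import java.util.stream.Collectors;
String result = values.stream()
    .collect(Collectors.joining(", ", "[", "]"));
System.out.println(result);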

Longing for hashCode

When implementing your own hashCode() method for a class that has long fields, in the past you had to calculate the hash yourself or wrap the long value in a Long object and then call its hashCode() method. In order to avoid unnecessary creation of objects, Java 8 allows you to use the static method Long.hashCode(long value) for this.
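
A minimal sketch of how that might look in practice; the Account class and its fields are made up for illustration:

public class Account {

    private final long id;
    private final long balanceInCents;

    public Account(long id, long balanceInCents) {
        this.id = id;
        this.balanceInCents = balanceInCents;
    }

    @Override
    public int hashCode() {
        // Java 8: hash the primitive longs directly, no boxing into Long objects needed
        int result = Long.hashCode(id);
        result = 31 * result + Long.hashCode(balanceInCents);
        return result;
    }
}

(A real class would of course also override equals() consistently.)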

Line it up!

Java 7 gave us the very convenient Files class with a lot of useful static helper methods for working with files, amongst which Files.readAllLines(Path path). This class has also been updated to make the best use of the streaming API. So now we get the even more useful Files.lines(Path path), which does not return a List of all the lines, but a Stream. This is probably a better programming model in almost all cases: when you read all the lines in a file, you will probably want to process them somehow instead of keeping them in memory. Below is an example of reading all the lines in a file and printing out only those lines that start with an "A".

Path file = Paths.get("path", "to", "file.txt");
try (Stream<String> lines = Files.lines(file)) {
  lines
    .filter(s -> s.startsWith("A"))
    .forEach(System.out::println);
} catch (IOException ioe) {
  // ...
}

Repeat after me

A new feature that will probably find most of its use in the Java EE world is @Repeatable. By annotating your annotation type with the meta-annotation @Repeatable, it can be placed on the same element more than once. Under the hood this still wraps the separate annotations in a container annotation, but it reads a lot better.

Since the annotation is not used within Java SE 8 itself, there are only a lot of imaginary examples circulating on the internet. But then again this feature was introduced with Java EE in mind. So the snippet below (derived from the Java EE 7 spec) will likely be a valid JAXB example in Java EE 8:

public class Foo {
  @XmlElement(name="A", type=Integer.class)
  @XmlElement(name="B", type=Float.class)
  public List items;
}

Below the surface this is translated to the current notation:

public class Foo {
  @XmlElements({
    @XmlElement(name="A", type=Integer.class),
    @XmlElement(name="B", type=Float.class)
  })
  public List items;
}
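
For completeness, this is roughly how such a repeatable annotation pair is declared. The @Tag and @Tags names below are hypothetical and not part of any existing API; they only illustrate the container mechanism described above:

import java.lang.annotation.Repeatable;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// The repeatable annotation points at its container type...
@Retention(RetentionPolicy.RUNTIME)
@Repeatable(Tags.class)
@interface Tag {
    String value();
}

// ...and the container type is what the compiler uses to wrap repeated @Tag annotations.
@Retention(RetentionPolicy.RUNTIME)
@interface Tags {
    Tag[] value();
}

// With that in place, @Tag can simply be applied more than once to the same element.
class Article {
    @Tag("java")
    @Tag("annotations")
    void publish() { }
}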

By default

Default methods on interfaces are often presented as a by-product of lambdas, but they can make it a lot easier to create a sustainable, future-proof API. A default method is a method on an interface for which a (default) implementation is already provided.

Say for instance your public API exposes the following interface:

public interface Foo {
  void addListener(FooListener listener);
}

Now say that you want to add the possibility to add multiple listeners in one go without breaking the implementations of your customers. This can be achieved by adding a default method:

public interface Foo {
  void addListener(FooListener listener);

  default void addListeners(Collection<FooListener> listeners) {
    listeners.forEach(this::addListener);
  }
}
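
An existing implementation that only knows about the single-listener method keeps compiling and picks up the new default method for free. The EventBus class below is a made-up example, not part of the original API:

import java.util.ArrayList;
import java.util.List;

public class EventBus implements Foo {

    private final List<FooListener> listeners = new ArrayList<>();

    @Override
    public void addListener(FooListener listener) {
        listeners.add(listener);
    }
}

Callers can now write eventBus.addListeners(Arrays.asList(listenerA, listenerB)) without the EventBus class ever being touched.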

And many more

While I stick to five here, there are many, many more additions to Java 8. You can find the full list here: http://www.oracle.com/technetwork/java/javase/8-whats-new-2157071.html.

Apex Interactive Report: The difference between CIR and RIR

You've probably already used the reset functionality in an Interactive Report, but do you know the exact difference between CIR (Clear Interactive Report) and RIR (Reset Interactive Report)?

RIR or CIR, what is it?

First, let's explain what CIR and RIR are and how you can use them.

With CIR and RIR you can clear or reset your Interactive Report after using filters. This can be handy if you’ve applied a filter on a page, but also want to (re)use the full report.

If you want to apply a filter on an interactive report when linking from another page, you can use the following operators.

Operator               Meaning
IREQ_<COLUMN_NAME>     Equals
IR_<COLUMN_NAME>       Same as IREQ
IRLT_<COLUMN_NAME>     Less than
IRLTE_<COLUMN_NAME>    Less than or equal to
IRGT_<COLUMN_NAME>     Greater than
IRGTE_<COLUMN_NAME>    Greater than or equal to
IRLIKE_<COLUMN_NAME>   Like operator
IRN_<COLUMN_NAME>      Is Null
IRNN_<COLUMN_NAME>     Is not Null
IRC_<COLUMN_NAME>      Contains
IRNC_<COLUMN_NAME>     Not Contains

You can also use the above operators on a saved report; in that case you have to use IR_REPORT_<ALIAS>.

Sometimes the page with the interactive report can be accessed from multiple pages and you would like to reset the filters already applied to the interactive report.

For example: you have a page with Countries and you want to filter your existing Customer Report to all customers in the selected state. But when you click on Customers on another page, you want to see all customers, not only the ones you just filtered. In this case you can use CIR or RIR.

You can simply enter these options in the URL, or in the Clear Cache option of your button or branch.
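
For example, a link that navigates to a hypothetical customer report page, clears its interactive report with CIR and at the same time applies an equals filter on a STATE column could look roughly like this (the page number and column name are made up):

f?p=&APP_ID.:2:&SESSION.::NO:CIR:IREQ_STATE:NY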


But what exactly is the difference between using CIR and RIR?

As the name suggests, CIR clears and RIR resets. But isn't that the same?

Almost. The main difference is that CIR clears the report: it removes all breakpoints, filters and other actions defined on your report, ignoring the settings of the primary report. RIR resets the interactive report back to the primary report.

In the following table you can see which user defined modifications to an Interactive Report will be lost or kept when using the CIR/RIR clear cache option.

                        CIR                         RIR
Main function           Clears Interactive Report   Resets Interactive Report
Maintains:
–  Column visibility    YES*                        NO
–  Primary Report       NO                          YES
–  Filters              NO                          NO
–  Breakpoints          NO                          NO
–  Pagination           NO                          NO
–  Sort                 YES                         NO
–  Highlight            NO                          NO
–  Computation          NO                          NO
–  Aggregate            NO                          NO
–  Chart                NO                          NO
–  Group by             NO                          NO

* Please note that after a CIR the columns displayed are still the columns of the primary report, but 'stripped' of their settings. If you alter the shown columns as a user, those columns will be displayed instead.


Conclusion

In conclusion, use CIR to clear all filters and other settings set by the user or in the primary report. Use RIR if you want to reset to the primary report keeping all filters and columns of the primary report.

Tableau: Delineate Belgian provinces explained

Tableau Software is a self-service BI tool for data visualization, meaning that even business users should be able to easily visualise their data (without needing help from IT). You can check out the Tableau website for more information about this great tool.

As you may have already seen in several guides or tutorials, Tableau is able to link certain dimensions like postal codes, countries, etc. to a geographical area. Unfortunately, most of these tutorials use data related to the United States of America. As of today, support for Belgian geography is rather limited. One thing we can easily do, though, is show data per province using the province borders.

In this blog we’ll be showing the Belgian population per province in the year 2012 on a map. The data I used (and which you can see in the image below) can be downloaded from the Belgian Open Data website.

Data used in Excel

I've slightly altered the data so that we have the English name for each province (otherwise Tableau will not recognize it; you can also choose French or German).

I've opened Tableau and loaded the data. Make sure that province is set as a geographical data type (State/Province). If this is not the case you can change it by right-clicking on province and then selecting "Geographic Role" -> "State/Province".

Geographic Role - State Province

When using filled maps (a type of map visualization offered by Tableau that fills each area according to your chosen data), you can only use one measure. Therefore I've added a new calculated field "Total" (see "Analysis" -> "Create Calculated Field") based on the number of males and females.

calculated field - Total population

Now we will select both the province and the total and click on “Filled maps”.

Select province and total for map2

Tableau will automatically colour the provinces according to the number of people who live there.

Now the only thing left to do is format your layout and then you're done! I have coloured the provinces based on the % of the total (red -> high %, green -> low %). I've put the % of total in the label because I think showing only the absolute total would be unclear, and I've also added the number of males, females, the total and the total percentage to the label.

end result

You can view the dashboard that I made for this blog on Tableau public. Tableau public is a free tool that Tableau offers which allows you to publish your data on the web for other people to see.

Extra tip: you can Ctrl-click multiple provinces to view the sum of the total % for the selected provinces.

Thank you for reading!

Talend MDM: How to use validation rules

When creating an MDM data model, Talend offers you standard constraint possibilities.
You can choose whether fields are mandatory or not by setting the minimum and maximum occurrence. You may also set fixed values for a field.
Sometimes these options aren't enough: you also want custom validations such as email validation and URL validation. To meet this kind of requirement, Talend gives you the possibility to use validation rules.

To illustrate this, I've created the entity 'Customer'. As you can see, FirstName and LastName are mandatory fields. However, for my business case this data model doesn't meet all my requirements: I also want to validate the PostalCode (Belgian format: '9999') and the Email before the record is saved.

Datamodel

How can you solve this within Talend?

You can create a validation rule by right-clicking on the Customer entity and selecting Set the Validation Rule as shown below.

set validation rules

A window will pop up where you need to fill in a name for your validation rule. After you’ve chosen a name the following window will be shown.

set validation rule builder

You can add rules by clicking on the "select XPath" button.

For each rule you have to select an XPath, create an expression and set a message. You can add these by clicking on "…" in the corresponding field.

When creating an expression you can use the expression builder by clicking on “…”. The builder provides lots of predefined functions.

expression builder

After you have created and set your validation rules, you have to deploy your model to the MDM server. Once the data model has been deployed, you can test the validation rule by creating a new record. When you enter an invalid postal code or an invalid email address you'll get the following message:

invalid postalcode

When entering the correct information you’ll get the following message:

save successfully

As you can see we’ve created a single validation rule set with different rules. The record can only be saved if all the rules of the validation set are met.

Talend: Schema compatibility check

Most of the time when talking about Talend jobs, people think of standard ETL (Extract, Transform, Load). But in some cases there's a need to check the incoming data before loading it into the target, rather than just transforming it. We refer to this process as E-DQ-L (Extract, Data Quality, Load).

One of the things that you might want to check before loading is schema compatibility. For example: you expect to get a String that is 5 characters long. If you, for any reason, receive a String that is longer than 5 characters, it will generate an error. Or perhaps you expect a percentage (as a BigDecimal like 0.19), but you receive it as a string ("19%"). This example will result in a failing job with an error saying "Type mismatch: cannot convert from dataType to otherDataType".

Before I continue this blog I would like to emphasize that all the solutions below are possible with the Data Integration version of Talend, except for the last one. The last option requires a Talend Data Quality license.

Let's create an example case: we want to extract data on a regular basis from a third-party source which we cannot fully trust in terms of schema settings. We know how many columns we can expect and we have a rough idea of what they contain, but we do not fully trust the source not to give us incompatible data. We want to load the records that are valid and we want to store the 'corrupt' data separately for logging purposes. I've gathered several solutions for this problem:

  1. Use the reject flow on an input component

One thing you can do is reject the records as soon as you import them. Disable "die on error" on the basic settings tab of your input component and then right-click it and select "Reject". The rows will be rejected based on the schema of the file. In the example below we defined phone number as an integer, and as you can see one record is being rejected because the phone number contains characters and therefore cannot be read as an integer. If you did not disable the "die on error" option, this component would make the job fail.

reject on input

  2. If the target is a database: use reject links

You can also choose to directly input the data into your database, but to reject any rows that would create an error. You can then create a separate flow to determine what to do with these rejected records.

In your database output component (for example tOracleOutput) change the following:

  • Basic settings: Uncheck “Die on error”
  • Advanced settings: Uncheck “Use batch size”

Now, right-click on your component, select "Row-Reject" and connect it to an output component. The output you'll receive consists of the rejected rows and the error that would have been generated if you had tried to insert them, as you can see in the picture below.

rejected rows databank

  3. Use a tFilterRow component

You can make the data go through a filter component before inserting it into your target. You can (manually) decide what is allowed to go through. This can be useful when your destination is not a database, in which case option 2 is most likely not available.

Schema compatibility check with tFilterRow

A tFilterRow-component also has the possibility to output the rejected rows, including the reason why they got rejected. You can enable this by right-clicking on your filter and selecting “Row-Reject”. An example of rejected rows by the filter:

rejected rows tFilterRow

Note – You can also use self-defined routines in the tFilterRow component by checking "Use advanced mode". This can be useful when you want to check whether or not a conversion is possible. For example: you could define a routine called "isInteger" that returns true if the conversion is valid and false if it's impossible.
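
As a sketch of what such a routine could look like (Talend routines are plain Java classes with static methods), here is a hypothetical CheckUtils class; the class name, method name and the column used in the example condition are assumptions:

public class CheckUtils {

    /**
     * Returns true if the given string can safely be converted to an integer.
     */
    public static boolean isInteger(String value) {
        if (value == null) {
            return false;
        }
        try {
            Integer.parseInt(value.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}

In the tFilterRow advanced mode the condition could then look something like CheckUtils.isInteger(input_row.phoneNumber), so that only convertible rows pass through.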

  4. Use a tSchemaComplianceCheck component

Another way of making sure that your schema is compatible is by using the tSchemaComplianceCheck-component. Unfortunately, this component is only integrated in the Data Quality version of Talend.

It's a very easy component to use. The only thing you have to do is connect the incoming data to the tSchemaComplianceCheck component and then continue the flow to the destination. You can get the rejected rows the same way as before (by right-clicking on it and then selecting "Row->Reject").

tSchemaComplianceCheck job

The rejected rows and their error message look like this:

rejected rows tSchemaComplianceCheck

That's it for now. There are probably a lot of other ways to check schema compatibility. Feel free to comment if you know any. Thank you for reading!