Talend Open Studio Cookbook

Creating binary error codes to store multiple test results

Note

Prior to doing this exercise, it is recommended that you first jump forward to Chapter 4, Mapping Data, and do the exercises related to ternary operators and using variables in tMap.

Sometimes, it is desirable to perform multiple checks on a row at the same time, so that when a row is rejected, all of the problems with the data can be identified from a single error message. An excellent method of recording this is to create a binary error code.

A binary error code is a binary number in which each digit position represents the result of one validation test: 0 for pass and 1 for fail.

For example, 1101 means that test 1 (the rightmost digit), test 3, and test 4 failed, and test 2 passed. This binary value can be held as a decimal integer, in this case 13.
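The correspondence between the binary pattern and its decimal value can be checked in plain Java, outside Talend. This is a minimal sketch; the class and method names are illustrative and are not part of the recipe's job.

```java
// Illustrative only: the class and method names are not part of the Talend job.
class BinaryCodeDemo {

    // Parse a string of binary digits into the integer it represents.
    static int toDecimal(String bits) {
        return Integer.parseInt(bits, 2);
    }

    // Render an integer as binary digits (without leading zeros).
    static String toBinary(int value) {
        return Integer.toBinaryString(value);
    }

    public static void main(String[] args) {
        System.out.println(toDecimal("1101")); // prints 13
        System.out.println(toBinary(13));      // prints 1101
    }
}
```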

Getting ready

Open the job jo_cook_ch03_0070_binaryErrorCode.

How to do it…

  1. Open tMap and create six new Integer type variables: nameTest, dateOfBirthTest, timestampTest, ageTest, countryOfBirthTest and errorCode.
  2. Copy the following lines into the Expression column, one per variable in the order listed:
    customer.name.equals("") ? 1 << 0 : 0
    customer.dateOfBirth == null ? 1 << 1 : 0
    customer.timestamp == null ? 1 << 2 : 0
    customer.age == null ? 1 << 3 : 0
    customer.countryOfBirth.equals("") ? 1 << 4 : 0
    Var.nameTest + Var.dateOfBirthTest + Var.timestampTest + Var.ageTest + Var.countryOfBirthTest
  3. Add a filter condition to the ValidRows output:
    Var.errorCode == 0
  4. Set the tMap Settings for the rejects output to Catch output reject.
  5. Your tMap should now look like this:
    [Screenshot: the configured tMap]
  6. Run the job. You should see that the error code is populated for every row where at least one field is null or empty.

How it works…

The << operator shifts the bits of a value to the left by the given number of places. For example, 1 << 3 places a 1 in position 3 of a binary number, which is the fourth digit from the right because positions are counted from 0.
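To see the effect of the shift for the five positions used in this recipe, the following sketch prints each shifted value in decimal and binary. The class name ShiftDemo and the helper mask are hypothetical names chosen for illustration.

```java
// Illustrative only: ShiftDemo and mask are hypothetical names.
class ShiftDemo {

    // Returns the integer whose only set bit is at the given zero-based position.
    static int mask(int position) {
        return 1 << position;
    }

    public static void main(String[] args) {
        // Positions 0 to 4 correspond to the five tests in the recipe.
        for (int position = 0; position <= 4; position++) {
            int value = mask(position);
            System.out.println("1 << " + position + " = " + value
                    + " (binary " + Integer.toBinaryString(value) + ")");
        }
    }
}
```

Each position yields a distinct power of two, which is why the individual test results can later be combined without interfering with one another.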

So if a field fails its test (it is null or empty), its variable is assigned a bit-shifted value; otherwise, it is set to 0.

By adding the numbers together, we arrive at a decimal value whose binary form has a 1 in each position where a test failed.

This may be simpler to explain using an example. The following is the output from tLogRow. In this case, it is one of the rejects where three nulls have been found:

[Screenshot: tLogRow output for a rejected row in which dateOfBirth, timestamp, and age are null]

So from this output the binary value will be built as shown:

  • The nameTest variable is assigned 0
  • The dateOfBirthTest variable is assigned 1 << 1 = 10 (Binary) = 2 (Decimal)
  • The timestampTest variable is assigned 1 << 2 = 100 (Binary) = 4 (Decimal)
  • The ageTest variable is assigned 1 << 3 = 1000 (Binary) = 8 (Decimal)
  • The countryOfBirthTest variable is assigned 0

So the decimal total is 0+2+4+8+0 = 14, which is 01110 in binary.
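The same calculation can be reproduced outside tMap. The sketch below mirrors the five expressions from the recipe in a plain Java method; the class name, the parameter types (String, Date, Integer), and the sample values "Alice" and "UK" are assumptions made for illustration, since the job's schema is not shown here.

```java
import java.util.Date;

// Illustrative stand-in for the tMap variables; the field types are assumptions.
class ErrorCodeBuilder {

    static int errorCode(String name, Date dateOfBirth, Date timestamp,
                         Integer age, String countryOfBirth) {
        // Each test contributes its own bit on failure, or 0 on success.
        int nameTest = name.equals("") ? 1 << 0 : 0;
        int dateOfBirthTest = dateOfBirth == null ? 1 << 1 : 0;
        int timestampTest = timestamp == null ? 1 << 2 : 0;
        int ageTest = age == null ? 1 << 3 : 0;
        int countryOfBirthTest = countryOfBirth.equals("") ? 1 << 4 : 0;
        return nameTest + dateOfBirthTest + timestampTest + ageTest + countryOfBirthTest;
    }

    public static void main(String[] args) {
        // Hypothetical row: name and country present, the other three fields null.
        int code = errorCode("Alice", null, null, null, "UK");
        System.out.println(code + " = binary " + Integer.toBinaryString(code)); // 14 = binary 1110
    }
}
```

Like the recipe's expressions, this sketch tests name and countryOfBirth only against the empty string, so a null in either of those fields would raise a NullPointerException rather than set a bit.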

There's more…

An alternative to using the << operator is to assign the decimal values directly: 1, 2, 4, and 8 (2 to the power 0, 2 to the power 1, and so on) for positions 0 to 3. Again, adding the values gives the desired integer result.
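A sketch of this alternative, using named constants instead of shifts, might look as follows. The constant names and the combine helper are hypothetical, and the value 16 for the fifth test simply continues the pattern of powers of two.

```java
// Illustrative only: the constant names and combine helper are hypothetical.
class DecimalFlags {

    static final int NAME_FAIL = 1;              // 2 to the power 0
    static final int DATE_OF_BIRTH_FAIL = 2;     // 2 to the power 1
    static final int TIMESTAMP_FAIL = 4;         // 2 to the power 2
    static final int AGE_FAIL = 8;               // 2 to the power 3
    static final int COUNTRY_OF_BIRTH_FAIL = 16; // 2 to the power 4

    // Adds the supplied flags; each flag must be passed at most once.
    static int combine(int... flags) {
        int code = 0;
        for (int flag : flags) {
            code += flag;
        }
        return code;
    }

    public static void main(String[] args) {
        System.out.println(combine(DATE_OF_BIRTH_FAIL, TIMESTAMP_FAIL, AGE_FAIL)); // prints 14
    }
}
```

Addition gives the right answer only because each flag occupies its own bit and is added once; combining the flags with the bitwise OR operator (|) produces the same result and stays correct even if a flag is applied twice.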

Decrypting the error code

Decrypting a binary error code is achieved by testing the individual bits of the integer. Use the shift operator to create a mask for the bit position, then perform a bitwise AND between the mask and the integer value. If the result is greater than 0, the bit at that position is set.

For instance, if we have the value 0101 (decimal 5) in an integer column:

0101 & 1 (where 1 equates to 1 << 0) = 1 (test 1 failed)

0101 & 10 (where 10 equates to 1 << 1) = 0 (test 2 passed)

0101 & 100 (where 100 equates to 1 << 2) = 100 (test 3 failed)

0101 & 1000 (where 1000 equates to 1 << 3) = 0 (test 4 passed)

So the logic for our errors will look like this:

if ((errorCode & (1<<0)) > 0) {
  System.out.println("name is empty");
}
if ((errorCode & (1<<1)) > 0) {
  System.out.println("dateOfBirth is null");
}
if ((errorCode & (1<<2)) > 0) {
  System.out.println("timestamp is null");
}
if ((errorCode & (1<<3)) > 0) {
  System.out.println("age is null");
}
if ((errorCode & (1<<4)) > 0) {
  System.out.println("countryOfBirth is empty");
}
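The repeated if blocks can also be expressed as a loop over the bit positions. The following is a sketch rather than the recipe's own code: the class ErrorCodeDecoder and its decode method are hypothetical, and the messages are taken from the statements above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: ErrorCodeDecoder and decode are hypothetical names.
class ErrorCodeDecoder {

    // Message for each bit position, in the same order as the tests.
    private static final String[] MESSAGES = {
        "name is empty",
        "dateOfBirth is null",
        "timestamp is null",
        "age is null",
        "countryOfBirth is empty"
    };

    // Returns the messages for every bit that is set in errorCode.
    static List<String> decode(int errorCode) {
        List<String> failures = new ArrayList<>();
        for (int bit = 0; bit < MESSAGES.length; bit++) {
            if ((errorCode & (1 << bit)) != 0) {
                failures.add(MESSAGES[bit]);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // 14 is the error code from the worked example (tests 2, 3, and 4 failed).
        System.out.println(decode(14)); // [dateOfBirth is null, timestamp is null, age is null]
    }
}
```

Keeping the messages in an array indexed by bit position means a new test only needs one more entry, provided the array order matches the shift values used when the code was built.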

Tip

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.