Talend Open Studio Cookbook

Validating against the schema

The tSchemaComplianceCheck component ensures that the data passing downstream conforms to the defined schema.

This simple exercise demonstrates how rows can be rejected using this component.

Getting ready

Open the job jo_cook_ch03_0020_schemaCompliance.

How to do it…

  1. Run the job. You should see two rows being rejected.
  2. Add a tSchemaComplianceCheck component and two tLogRow components. Right-click tSchemaComplianceCheck_1, select Row, then Rejects, and join this flow to one of the new tLogRow components. Connect the main flow to the other tLogRow.
  3. Now, when you run the job, you will see an additional reject row being output from the tSchemaComplianceCheck component.

How it works…

The tFileInputDelimited component detects only some of the anomalies within the data, whereas the tSchemaComplianceCheck component performs a much more thorough validation, checking each row against the schema for data type, nullability, and length.

If you look at the output, you will see a log entry showing that the name field has exceeded the maximum length defined in the schema.

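As a rough illustration of this kind of row validation, the following Java sketch checks each row against a small schema for data type, nullability, and maximum length, and routes failing rows to a reject output. It is not the code that Talend generates: the class SchemaCheckSketch, the Column type, the sample schema (an id and a name limited to 10 characters), and the wording of the error messages are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the code that Talend generates.
public class SchemaCheckSketch {

    // Hypothetical column definition used by this example.
    static final class Column {
        final String name;
        final Class<?> type;
        final boolean nullable;
        final int maxLength; // checked for String values only; 0 means no limit

        Column(String name, Class<?> type, boolean nullable, int maxLength) {
            this.name = name;
            this.type = type;
            this.nullable = nullable;
            this.maxLength = maxLength;
        }
    }

    // Returns an empty string for a compliant row,
    // otherwise the violations joined by ';'.
    static String check(List<Column> schema, Map<String, Object> row) {
        List<String> errors = new ArrayList<>();
        for (Column col : schema) {
            Object value = row.get(col.name);
            if (value == null) {
                if (!col.nullable) {
                    errors.add(col.name + ":null not allowed");
                }
            } else if (!col.type.isInstance(value)) {
                errors.add(col.name + ":wrong type");
            } else if (col.maxLength > 0 && value instanceof String
                    && ((String) value).length() > col.maxLength) {
                errors.add(col.name + ":exceeds max length");
            }
        }
        return String.join(";", errors);
    }

    // Helper that builds a sample row with the two columns of the example schema.
    static Map<String, Object> row(Object id, Object name) {
        Map<String, Object> r = new HashMap<>();
        r.put("id", id);
        r.put("name", name);
        return r;
    }

    public static void main(String[] args) {
        List<Column> schema = Arrays.asList(
                new Column("id", Integer.class, false, 0),
                new Column("name", String.class, true, 10));

        List<Map<String, Object>> rows = Arrays.asList(
                row(1, "Alice"),
                row(2, "A name that is far too long"));

        // Compliant rows go to the "main" output, the rest to "reject"
        // together with a description of what failed.
        for (Map<String, Object> r : rows) {
            String error = check(schema, r);
            if (error.isEmpty()) {
                System.out.println("main:   " + r.get("id") + "|" + r.get("name"));
            } else {
                System.out.println("reject: " + r.get("id") + "|" + r.get("name")
                        + " -> " + error);
            }
        }
    }
}
```

In this sketch, the first sample row passes to the main output, while the second is sent to the reject output with the message name:exceeds max length, which mirrors the split between the two tLogRow components in the job.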