How to Use the MaxFileSize Parameter to Split the Incoming File into a Number of Files on the Target in CSV Format
Question
How can I use the MaxFileSize parameter to split the incoming file into many files on the target in CSV format?
Environment
HVR 5
Answer
The /MaxFileSize parameter bundles incoming rows into a single file until the specified size threshold (in bytes) is reached. HVR then sends that file to the target and starts writing subsequent rows to a new file. Use this parameter when you want smaller files that are easier to manage.
Prerequisites
- A channel f_rep is already present on the HVR Hub
- This example uses a CSV file as the target
Steps
Create the channel to replicate a table from an Oracle database to a file in CSV format. The channel will look like:
Add the action Integrate /MaxFileSize to set the size, in bytes, at which files are split. In this example, MaxFileSize is set to 1000 bytes, so the files created on the target are approximately 1000 bytes each.
Perform Initialize so that this action takes effect on the channel.
Perform Refresh. This creates a number of files on the target, each approximately 10KB or slightly larger.
NOTE: For efficiency reasons, HVR decides whether to start writing a new file based on the length of the previous row, not the current row. This means that the actual size of a file may slightly exceed the specified value.
If you look at the properties of each file, you can see that its size is approximately 10KB.
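Instead of opening the properties of each file by hand, the sizes can also be listed with a short script. The following is a minimal Python sketch, not part of HVR. The function name list_csv_sizes and the directory path in the usage comment are hypothetical, and the sketch assumes the target files end in .csv and sit in a single directory.

```python
from pathlib import Path


def list_csv_sizes(target_dir):
    """Return (file name, size in bytes) for each .csv file in target_dir, sorted by name."""
    return sorted(
        (path.name, path.stat().st_size)
        for path in Path(target_dir).glob("*.csv")
        if path.is_file()
    )


# Example usage (hypothetical target directory):
#
# for name, size in list_csv_sizes("/data/hvr_target"):
#     print(f"{name}: {size} bytes")
```

Comparing the listed sizes with the configured MaxFileSize value shows which files went slightly over the threshold.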
Start the Integrate job. This creates files of approximately 10KB each on the target.
NOTES:
- With integration, if the incoming data is smaller than MaxFileSize and the integration cycle completes, a file smaller than MaxFileSize is created on the target. The parameter only comes into effect when the incoming data is larger than MaxFileSize.
- The refresh operation behaves similarly. The last file created is usually smaller than MaxFileSize.
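The behavior described in these notes can be illustrated with a small simulation. This is a simplified Python model, not HVR's actual implementation. The function split_rows is hypothetical and assumes that the rollover check looks only at what has already been written to the current file, so a file is closed only after it has reached the threshold.

```python
def split_rows(rows, max_file_size):
    """Group encoded rows into files, starting a new file once the
    current one has already reached max_file_size bytes."""
    files = []
    current = []
    current_size = 0
    for row in rows:
        # The check uses the size written so far (up to the previous row),
        # not the row about to be added, so a file can slightly exceed
        # max_file_size.
        if current and current_size >= max_file_size:
            files.append(current)
            current, current_size = [], 0
        current.append(row)
        current_size += len(row)
    if current:
        # Whatever is left when the data ends becomes the last file,
        # even if it is smaller than max_file_size.
        files.append(current)
    return files


# Ten rows of 30 bytes each, split with a 100-byte threshold.
rows = [b"x" * 29 + b"\n"] * 10
sizes = [sum(len(r) for r in f) for f in split_rows(rows, 100)]
print(sizes)  # [120, 120, 60]
```

In this model the first two files go slightly over the 100-byte threshold, and the last file is smaller because the data ran out before the threshold was reached. If the input is smaller than the threshold, a single file smaller than the threshold is produced.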