Channel: SCN : All Content - All Communities

Parsing Error when loading csv files that contain double quotes


I am loading a data file using HANA Studio. The data file has four fields, all of them NVARCHAR. The data comes from a Teradata database.

Some of the records' field values contain double quotes. Some also contain commas, and for this reason the data file uses a pipe ('|') as the field delimiter.

 

When I import using either a control file or a SQL import command from HANA Studio, I specify that the field delimiter is '|'.

Some of the records load and some fail. The failed records show up in the error log with the message 'Parsing error after 'xxx' column'. The error record indicates that the problem is in the space right after the double quote mark.

Other details:

  • The data file is created using Teradata SQL Assistant against a Teradata database.
  • The records are delimited by a newline character, but no carriage return.
  • The fields are delimited by a pipe ('|').
  • The file format option in SQL Assistant is 'UTF-8', and the output is written to a *.txt file on a Windows system.
  • The file is FTP'd to the HANA Linux box.
  • The file is imported using these options:
    • record delimited by '\n'
    • field delimited by '|'
    • optionally enclosed by '"'

 

  • Some records contain a control character (hex value '8D'). I don't know yet where it comes from.
    • Records in which a field contains two double quotes do not load.
    • Records in which a field contains two double quotes and also has the control character (hex value '8D') preceding the first double quote do load.
  • Example records
    • Column names: Prod_ID, Prod_Desc, SKU, SKU_DESC
      • 12345   |Product xyz - "100" Daily|0A1234ZZ|"100" Daily
      • 12346   |Product abc - "200" Daily|0B1234|"200 Daily
    • The first record will not load.
    • The second record will load. It has the control character (Hex value '8D') in the position between the hyphen and the first quote.
    • The error record from the .err file has this message:

               Parsing error: incorrect delimiter for the next column of PRODUCT_DESC field: "100"

               12345   |Product xyz - "100" Daily|0A1234ZZ|"100" Daily

                                                            ^
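
The details listed above (four pipe-delimited fields, newline-only record ends, double quotes inside fields, the stray hex '8D' byte) can be checked against the raw bytes of the file before an import is attempted. The following is a minimal Python sketch of such a check, not part of the original load process. The function name inspect_records and the sample bytes are hypothetical, and the exact position of the '8D' byte in the second sample record is an assumption based on the description above. The sketch works on bytes rather than decoded text because a lone 0x8D byte is not valid UTF-8 by itself, so decoding could fail or hide it.

```python
def inspect_records(raw: bytes, field_sep: bytes = b"|"):
    """Summarise each newline-delimited record of a pipe-delimited export."""
    report = []
    for number, record in enumerate(raw.split(b"\n"), start=1):
        if not record:
            # Skip the empty piece that follows the final newline.
            continue
        report.append({
            "line": number,
            "fields": record.count(field_sep) + 1,  # separators + 1
            "quotes": record.count(b'"'),           # double quotes in the record
            "has_8d": b"\x8d" in record,            # stray control byte present?
            "has_cr": b"\r" in record,              # unexpected carriage return?
        })
    return report


# Hypothetical sample modelled on the two example records.
# The placement of \x8d directly before the first quote is an assumption.
sample = (
    b'12345   |Product xyz - "100" Daily|0A1234ZZ|"100" Daily\n'
    b'12346   |Product abc - \x8d"200" Daily|0B1234|"200 Daily\n'
)

for row in inspect_records(sample):
    print(row)
```

To run it on the real export, read the file in binary mode (for example open(path, "rb").read(), with path being wherever the file was FTP'd) and pass the bytes to inspect_records.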

 

I have tried loading both with and without the option optionally enclosed by '"'. The result is the same.
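
To make the role of an enclosure character concrete, the sketch below parses the first example record with Python's standard csv module, once with '"' treated as an optional enclosure and once with quoting disabled. This is only an analogy: Python's csv module is a stand-in and is not the HANA import parser, so nothing here shows how HANA itself behaves. The helper name parse is hypothetical.

```python
import csv

record = '12345   |Product xyz - "100" Daily|0A1234ZZ|"100" Daily'


def parse(line, **options):
    """Parse one pipe-delimited line with Python's csv module (not HANA)."""
    return next(csv.reader([line], delimiter="|", **options))


# '"' acts as an optional enclosure; text after a closing quote is tolerated.
lenient = parse(record)

# Quoting disabled: every '"' is kept as ordinary data.
literal = parse(record, quoting=csv.QUOTE_NONE)

# Strict mode: a closing quote must be followed by a delimiter or end of line.
try:
    parse(record, strict=True)
    strict_error = None
except csv.Error as exc:
    strict_error = str(exc)

print(lenient)
print(literal)
print(strict_error)
```

In this stand-in parser, only a quote at the very start of a field opens an enclosure, so the fourth field is the one affected: the lenient parse strips its quotes, the literal parse keeps them, and the strict parse rejects the record because the closing quote is followed by a space instead of a delimiter.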

 

What am I doing wrong?

Thanks

