Ruby Mechanize: how to read a downloaded binary CSV file
I am not very familiar with handling binary data in Ruby. I'm using Mechanize to download large CSV files to my local disk, and I then need to search these files for specific strings.
I use Mechanize's save_as method to save each file (which writes it as binary). The content type of the file (according to Mechanize) is:
application/vnd.ms-excel; charset=x-UTF-16LE-BOM
From here, I'm not sure how to read the file. I tried reading it as a normal file in Ruby, but I just get binary data. I also tried searching it with standard UNIX tools (strings/grep), without any luck.
When I run the 'file' command on one of the files, I get:
foo.csv: Little-endian UTF-16 Unicode Pascal program text, with very long lines, with CRLF, CR, LF line terminators
I can see the data just fine with cat or vi. With vi I also see some control characters.
I have tried both the CSV and FasterCSV Ruby libraries, but both raise an 'IllegalFormatError' exception. I have tried other approaches too, without any luck.
Any help would be greatly appreciated. Thanks.
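You can use the 'iconv' command to convert the file to UTF-8, redirecting the output to a new file (OUTPUT_FILE below is a placeholder for whatever name you choose):
# iconv -f 'UTF-16LE' -t 'UTF-8' bad_file.csv > OUTPUT_FILE
There is also a wrapper for iconv in the standard library; you can use it to convert the file after reading it into your program.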
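On Ruby 1.9 and later, the same conversion can probably be done without iconv, using the encoding support built into File.open. The following is only a minimal sketch under a few assumptions: the file really is UTF-16LE (as the 'file' output suggests), it may start with a byte-order mark, and the helper names each_utf8_line and grep_utf16le are made up for illustration and are not part of Mechanize or the standard library.

```ruby
# Hypothetical helper: yields each line of a UTF-16LE file as a UTF-8 string.
def each_utf8_line(path)
  # "rb" is needed because UTF-16LE is not ASCII-compatible.
  # "UTF-16LE:UTF-8" means: decode the file as UTF-16LE, hand back UTF-8.
  File.open(path, "rb:UTF-16LE:UTF-8") do |file|
    file.each_line.with_index do |line, index|
      # Drop the byte-order mark if the first line starts with one.
      line = line.sub(/\A\uFEFF/, "") if index.zero?
      # chomp removes a trailing CRLF, LF, or CR.
      yield line.chomp
    end
  end
end

# Hypothetical helper: returns every line that contains `needle`.
def grep_utf16le(path, needle)
  matches = []
  each_utf8_line(path) { |line| matches << line if line.include?(needle) }
  matches
end

# Example usage (the file name and search string are placeholders):
#   grep_utf16le("foo.csv", "some string").each { |line| puts line }
```

Because the file is read line by line, a large download does not have to fit in memory. Each yielded line is ordinary UTF-8, so it could also be passed to CSV.parse_line, assuming no quoted field spans more than one line.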