I know, not very exciting, but I thought I'd capture this before I forget. I was having some character encoding issues with a particular DBF file. Apparently it was encoded as ISO-8859-15, but everything seems to try to read it as UTF-8. So, with Nick's help, I came up with a quick way to convert DBF columns from one encoding to another.
How to convert a DBF file from one encoding to another: in this example, only the first column (NAME) is converted from ISO-8859-15 to UTF-8, while all other columns are kept the same.
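Before converting anything, it can help to confirm that the data really is not UTF-8. The sketch below is only an illustration: is_utf8 is a hypothetical helper name, and the sample string is made up. It relies on iconv failing when asked to read invalid UTF-8 as UTF-8.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if stdin is valid UTF-8.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Made-up sample: "Café" with é stored as the single ISO-8859-15 byte 0xE9 (octal 351).
latin9_sample=$(printf 'Caf\351')

if printf '%s' "$latin9_sample" | is_utf8; then
    sample_status="valid UTF-8"
else
    sample_status="not valid UTF-8"
fi
echo "sample is $sample_status"

# Convert the sample and confirm the result now passes the same check.
converted=$(printf '%s' "$latin9_sample" | iconv -f ISO-8859-15 -t UTF-8)
printf '%s' "$converted" | is_utf8 && echo "converted sample is valid UTF-8"
```

The same check can be pointed at the CSV from Step 1 (is_utf8 < NEWFILE.csv) to see whether a conversion is needed at all.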
Step 1: Convert the DBF file to CSV
ogr2ogr -f "CSV" NEWFILE.csv OLDFILE.dbf
Step 2: Run the perl script convert.pl (listed below), feeding it the new CSV file on stdin:
./convert.pl < NEWFILE.csv > NEWFILECONVERT.csv
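Step 3: Convert the newly encoded CSV file back to a DBF file (written here as NEWFILE.dbf so the original is left untouched; ogr2ogr takes the destination first, then the source)
ogr2ogr -f "ESRI Shapefile" NEWFILE.dbf NEWFILECONVERT.csv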
====== convert.pl ==========
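#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(from_to);

# Re-encode the first CSV column (NAME) from ISO-8859-15 to UTF-8.
# Caveat: a plain split on commas does not handle quoted fields that contain commas.
while (my $line = <>)
{
    chomp $line;
    # The -1 limit keeps trailing empty columns so the column count is preserved.
    my @fields = split /,/, $line, -1;
    # Convert in-process instead of shelling out to iconv, so quotes or $ in a name cannot break a shell command.
    from_to($fields[0], 'ISO-8859-15', 'UTF-8') if @fields && $fields[0] =~ /\S/;
    print join(',', @fields) . "\n";
}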
====== convert.pl ==========
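If every text column in the file uses the same source encoding, a simpler variant might be to run iconv over the whole CSV instead of just the first column. This is a sketch under that assumption, not what the script above does; the temporary directory and the sample rows are made up for illustration.

```shell
#!/bin/sh
# Build a tiny stand-in for NEWFILE.csv; the rows are made up for illustration.
workdir=$(mktemp -d)
printf 'NAME,POP\nOrl\351ans,114000\n' > "$workdir/NEWFILE.csv"

# Re-encode the entire file, not just the NAME column.
iconv -f ISO-8859-15 -t UTF-8 "$workdir/NEWFILE.csv" > "$workdir/NEWFILECONVERT.csv"

# Rows and columns are unchanged; only the byte count grows, because each
# accented character now takes more than one byte.
before_bytes=$(wc -c < "$workdir/NEWFILE.csv" | tr -d ' ')
after_bytes=$(wc -c < "$workdir/NEWFILECONVERT.csv" | tr -d ' ')
echo "bytes before: $before_bytes, after: $after_bytes"
```

This avoids the comma-splitting caveat entirely, but it would be wrong for a file where some columns are already UTF-8, since those would end up double-encoded.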