[Feat] Import CSV files with column headers and correct column data types #4450
Comments
Did you see https://grass.osgeo.org/grass-stable/manuals/db.in.ogr.html#import-csv-file (using |
Yes. I've done that. It is equally a pain, especially with multi-column tables, and sometimes does not work correctly. Most other data analysis programs manage to open and read these ASCII text files pretty well. v.in.ascii comes very close; the only thing missing is recognizing column names when creating points. If it had the ability to do this and the option to import the table without creating points, it would cover >95% of the cases. |
The OGR CSV driver has an AUTODETECT_TYPE=YES open option
|
Interesting. Could this be a flag in db.in.ogr? To test, is this something that can be entered in the ogr_doo argument? |
don't know the GRASS side of things, but from https://grass.osgeo.org/grass84/manuals/v.in.ogr.html, I assume this should be |
Yes. This works! It should be the default in db.in.ogr, perhaps with a flag to disable it and use a *.csvt file instead. Too bad that we can't do this with v.in.ogr, since it won't accept CSV files with defined x and y coordinate columns. But if v.in.ascii can just keep the header field, then that would be a good alternative too. |
@cmbarton I fully agree with your points. One thing is that the column names are stored in the history of the vector layer.
|
Thanks for the workaround. |
CSV files are the most common and widely readable file format for tabular data. But GRASS has no easy way to import them correctly.
db.in.ogr now imports CSV files with correct column headers (see #943). But it imports all columns as strings, regardless of the data in the columns. Because tables imported this way do not have any numeric columns, they cannot be used to generate points or to connect with an existing spatial dataset via a common cat field.
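For anyone hitting this in the meantime: the OGR CSV driver reads column types from a `.csvt` sidecar file, which is a single line of quoted type names in column order. Below is a minimal sketch of generating that line automatically. The function names `infer_ogr_type` and `csvt_line` are made up for illustration, and the inference rule (everything parses as int, else everything parses as float, else string) is a simplifying assumption, not what OGR or GRASS does internally.

```python
import csv
import io


def infer_ogr_type(values):
    """Guess an OGR type name for one column (hypothetical helper).

    Empty cells are ignored; a column with no usable values is a String.
    """
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "String"
    # Try the narrowest type first, then widen.
    for ogr_type, cast in (("Integer", int), ("Real", float)):
        try:
            for v in non_empty:
                cast(v)
            return ogr_type
        except ValueError:
            continue
    return "String"


def csvt_line(csv_text):
    """Return the single line a .csvt sidecar would hold for this CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    types = []
    for i in range(len(header)):
        # Short rows simply contribute nothing to the missing columns.
        column = [row[i] for row in data if i < len(row)]
        types.append(infer_ogr_type(column))
    return ",".join(f'"{t}"' for t in types)


# Made-up sample table, only for demonstration.
sample = "cat,x,y,name\n1,10.5,20,site a\n2,11,21.25,site b\n"
print(csvt_line(sample))
```

Saving that line as a file with the same base name as the CSV and a `.csvt` extension should let the OGR driver pick up the types. Dates and times would need extra rules that this sketch leaves out.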
v.in.ascii can import CSV files and create vector points. It does correctly recognize data types in CSV columns--including a CAT column. But for unknown reasons it does not import the column headers. Of course, you can create new column headers, but this becomes a PITA with many columns--especially as it requires an arcane format to do so.
A GRASS extension, v.in.csv, does some of this. However, it requires a separate Python package, pyproj, to do so. It also does not automatically recognize data types in columns like v.in.ascii does, although you can specify which columns are numeric (integer or real). It also seems to assume that the CSV table only has point coordinates in latlon.
So db.in.ogr needs to recognize column data types, or allow a user to specify them, and v.in.ascii needs to recognize column headers. Ideally, perhaps, there would be a combination of these--something that could import a CSV table correctly and optionally create vector points from that table, optionally identify a CAT column, and optionally allow a user to define column data types to override the automatic data type recognition.
Related issue: #943