Best Practices for Complex Types (Arrays, Unions, etc.) #6

Open
wrbriggs opened this issue Jun 2, 2015 · 2 comments

@wrbriggs

wrbriggs commented Jun 2, 2015

First off, thank you for creating these examples, as they are a great starting point for working with Avro and Parquet in Spark!

One suggestion I have is that it would be really nice if this example could be updated to show best practices when using Avro schemas that contain complex nested types, unions and arrays. Spark does not seem to play well with these by default (e.g., https://issues.apache.org/jira/browse/SPARK-3601), and while it's possible to cobble something together by digging through mailing lists and JIRA tickets, it would be really helpful to have it officially documented somewhere.

@sryza
Owner

sryza commented Jun 4, 2015

I agree, though unfortunately I don't have the bandwidth to work on this in the near future. If you have any interest in taking this on, I'd be happy to review and merge it.

@prateek

prateek commented Jan 26, 2016

Ran into this today. Links 1 & 2 discuss the issue and the solution well.
