Indexing JSON logs with Parquet



We frequently use Spark SQL and EMR to analyze terabytes of JSON request logs. Spark's built-in JSON support is easy to use and works well for most use cases. For example, this small piece of code will infer the schema of the files and provide a table that can be queried with standard SQL:
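
The original snippet is not included in this excerpt. As a rough stand-in, here is a minimal PySpark sketch of the idea, assuming Spark 2.x or later. The helper name `register_json_logs`, the view name `requests`, and the S3 path are hypothetical placeholders, not taken from the post.

```python
def register_json_logs(spark, path, view_name="requests"):
    """Read JSON logs, let Spark infer the schema, and expose them as a SQL view.

    `spark` is expected to be a pyspark.sql.SparkSession.
    """
    # spark.read.json scans the input to infer a schema for the records.
    df = spark.read.json(path)
    # Register the DataFrame under a name that SQL queries can reference.
    df.createOrReplaceTempView(view_name)
    return df


def main():
    # Imported here so the helper above can be loaded without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("json-logs").getOrCreate()
    # Hypothetical bucket and prefix; substitute the real log location.
    df = register_json_logs(spark, "s3://example-bucket/request-logs/")
    df.printSchema()
    spark.sql("SELECT COUNT(*) AS request_count FROM requests").show()
    spark.stop()
```

Calling `main()` on a cluster with access to the logs would print the inferred schema and a row count. Because the schema is inferred by scanning the JSON input, this convenience comes at a cost that grows with the size of the data set.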
