Hive - SerDe and LazySerde

2. Where is SerDe?

Data flow (diagram labels):
  File on HDFS <-> Writable, via FileFormat / Hadoop Serialization
  Writable <-> Hierarchical Object, via SerDe
  Hive Operators in the Mapper and Reducer work on Hierarchical Objects through an ObjectInspector
  The same conversions apply to the Map Output File and to the Stream to and from a User Script

Example data at each layer:
  File on HDFS:
    imp 1.0 3 54
    Imp 0.2 1 33
    clk 2.2 8 212
    Imp 0.7 2 22
    or: thrift_record<…> thrift_record<…> thrift_record<…> thrift_record<…>
  Writable:
    BytesWritable(3F647200)
    Text('imp 1.0 3 54')  // UTF8 encoded
  Hierarchical Object:
    Java Object: object of a Java class
    Standard Object: uses ArrayList for struct and array, HashMap for map
    LazyObject: lazily deserialized

3. SerDe, ObjectInspector and TypeInfo

Writable:
  BytesWritable(3F647200)
  Text('a=av:b=bv 23 1:2=4:5 abcd')

TypeInfo:
  struct
    map<string, string>
    int
    list<struct<int, int>>
    string

Hierarchical Object, as a Java Object:
  class HO { HashMap<String, String> a; Integer b; List<ClassC> c; String d; }
  class ClassC { Integer a; Integer b; }

Hierarchical Object, as a Standard Object:
  List(
    HashMap("a" -> "av", "b" -> "bv"),
    23,
    List(List(1, null), List(2, 4), List(5, null)),
    "abcd"
  )

SerDe and ObjectInspector operations (diagram labels):
  SerDe: deserialize, serialize, getOI
  ObjectInspector1 (the struct): getType, getFieldOI, getStructField
    field a, declared as HashMap<String, String> a: HashMap("a" -> "av", "b" -> "bv")
  ObjectInspector2 (the map): getType, getMapValueOI, getMapValue
    map value: "av"
  ObjectInspector3 (the String Object): getType

4. LazySimpleSerDe components

Hierarchical Object / LazyObject (one per SerDe instance):
  byte[] data: byte[]('a=av:b=bv 23 1:2=4:5 abcd')
  LazyStruct
    LazyMap
      LazyString, LazyString, LazyString, LazyString
    LazyInteger
    LazyArray
      LazyStruct: LazyInteger, LazyInteger
      LazyStruct: LazyInteger, LazyInteger
    LazyString

LazyObjectInspector (singleton):
  LazyStructOI(" ")
    LazyMapOI(":", "=")
      StandardStringOI, StandardStringOI
    StandardIntegerOI
    LazyArrayOI(":")
      LazyStructOI("=")
        StandardIntegerOI
    StandardStringOI
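
The lazy-deserialization idea behind the LazyObject can be illustrated with a minimal sketch. This is not Hive's LazyStruct or LazySimpleSerDe code: the class LazyRowSketch, its methods, and the decodedFields counter are hypothetical names made up for illustration, and only the top-level separator and a flat map field are handled. It keeps the row as a byte[] and decodes a field only when that field is first read, using the example row and separators from the slides.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch of a lazily-deserialized row; not Hive's LazyStruct. */
final class LazyRowSketch {
    private final byte[] data;     // raw row bytes, left untouched until a field is read
    private final byte separator;  // top-level field separator
    private int[] bounds;          // separator positions; built on first field access
    private Object[] cache;        // decoded values, one slot per field
    int decodedFields = 0;         // number of fields that were actually decoded

    LazyRowSketch(byte[] data, char separator) {
        this.data = data;
        this.separator = (byte) separator;
    }

    /** One pass over the bytes to record separator positions; decodes nothing. */
    private void findBounds() {
        if (bounds != null) return;
        int count = 1;
        for (byte b : data) if (b == separator) count++;
        bounds = new int[count + 1];
        bounds[0] = -1;                      // virtual separator before the row
        int k = 1;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == separator) bounds[k++] = i;
        }
        bounds[count] = data.length;         // virtual separator after the row
        cache = new Object[count];
    }

    /** Text of field i: the bytes between separator i and separator i + 1. */
    private String text(int i) {
        int start = bounds[i] + 1;
        return new String(data, start, bounds[i + 1] - start, StandardCharsets.UTF_8);
    }

    String getString(int i) {
        findBounds();
        if (cache[i] == null) { cache[i] = text(i); decodedFields++; }
        return (String) cache[i];
    }

    Integer getInt(int i) {
        findBounds();
        if (cache[i] == null) { cache[i] = Integer.valueOf(text(i)); decodedFields++; }
        return (Integer) cache[i];
    }

    @SuppressWarnings("unchecked")
    Map<String, String> getMap(int i, String entrySep, String keyValueSep) {
        findBounds();
        if (cache[i] == null) {
            Map<String, String> m = new LinkedHashMap<>();
            for (String entry : text(i).split(Pattern.quote(entrySep))) {
                String[] kv = entry.split(Pattern.quote(keyValueSep), 2);
                m.put(kv[0], kv.length > 1 ? kv[1] : null);
            }
            cache[i] = m;
            decodedFields++;
        }
        return (Map<String, String>) cache[i];
    }

    public static void main(String[] args) {
        byte[] row = "a=av:b=bv 23 1:2=4:5 abcd".getBytes(StandardCharsets.UTF_8);
        LazyRowSketch s = new LazyRowSketch(row, ' ');
        System.out.println(s.getMap(0, ":", "=").get("a"));
        System.out.println(s.getInt(1));
        System.out.println(s.decodedFields);
    }
}
```

A query that touches only the map and the integer never pays for decoding the list or the trailing string, which is the point of keeping the byte[] around. In this sketch each field slot is read with a single accessor that matches its declared type, since the cache stores whatever was decoded first.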