In relational databases, such as oracle, the database will automatically insert the data into the corresponding partition according to the value of the partition field.
Existing test, partition table test_partitioned, need to be I & erted into a partitioned table by month, day
Target table: test_partitioned
CREATETABLE t_lxw1234_partitioned ( url STRING ) PARTITIONED BY (month STRING,day STRING) stored AS textfile;
Data in test follows the month ( month ), date ( day ), and inserted into the appropriate partition of the target table test partitioned.
This requirement isn't easy to implement if you follow the insert data in the previous section described earlier.
It's time to use dynamic partitions.
Use dynamic partitioning to be aware of setting the following paramete:
Hive. Exec. Dynamic. Partition.
Do you want to turn on dynamic partition features, default false.
When using dynamic partitions, this parameter must be set to true;
Hive. Exec. Dynamic. Partition. Mode.
Dynamic partition mode, default strict, indicating that at least one partition must be specified as static partition, nonstrict mode indicates that all partition fields can be used for dynamic partitioning.
General needs to be set to nonstrict
Hive. Exec. Max. Dynamic. Partitions. Pernode.
Default: 100, the maximum number of dynamic partitions can be created on each node on which the mr is executed.
This parameter needs to be set according to actual data.
For example, the source data contains a year of data, that's, the day field has 365 values, then the parameter needs to be set to greater than 365, if the default value 100.
Hive. Exec. Max. Dynamic. Partitions.
In all the nodes that execute mr, the maximum total number of dynamic partitions can be created.
Hive. Exec. Max. Create. Files.
In the entire mr job, the maximum number of hdfs files can be created.
General default is sufficient, unless your amount of data is large, the number of files you need to create is greater than 100000, and can be adjusted according to the actual situation.
Hive. Error. On. Empty. Partition.
Whether an exception is thrown when an empty partition is generated.
No need to set up.
SET hive.exec.dynamic.partition=true;SET hive.exec.dynamic.partition.mode=nonstrict;SET hive.exec.max.dynamic.partitions.pernode = 1000;SET hive.exec.max.dynamic.partitions=1000;INSERT overwrite TABLE t_lxw1234_partitioned PARTITION (month,day) SELECT url,substr(day,1,7) ASmonth,dayFROM t_lxw1234;
Note: the partition field name is specified in partition ( month, day ).
In the last two fields of the select clause, you must correspond to the partition fields specified in the previous partition ( month, day ), including order.