Skip to content

Tag: hadoop

When to use lateral view explode in hive

if you have a table with one or more column with array datatype  and if you want it to expand into multiple rows, you can use lateral view explode function. 

Let’s consider we have following table, where one employee has multiple phone numbers which are stores as part of array (list). 

emp_namephone_numbers
user1[“546-487-3384″,”383-767-2238”]
user2[“373-384-1192″,”374-282-1289″,”332-453-5566”]

However as a output if we want to convert this Array (list) into multiple rows, we can use lateral view explode function a mentioned below 

select emp_name, phone_number 
from 
    temp.test_laterla_view_explode
lateral view explode(phone_numbers) p as phone_number

This will generate the output as mentioned below

emp_namephone_number
user2373-384-1192
user2374-282-1289
user2332-453-5566
user1546-487-3384
user1383-767-2238
1 Comment

Update Hive Table

Hive is a append only database and so update and delete is not supported on hive external and managed table. 

From hive version 0.14 the have started a new feature called transactional.  Which allows to have ACID properties for a particular hive table and allows to delete and update. but let’s keep the transactional table for any other posts. 

Here let’s discuss how to update hive table which is not transaction, either external or managed ( External table couldn’t be transactional).

Chances are if you have tried to update the hive table, external or managed (non transactional), you might have got below errors, depends on your hive version.

 select * from temp.test_udpate;
+-----------------+-------------------+--+
| test_update.id  | test_update.name  |
+-----------------+-------------------+--+
| 1               | test user 1       |
| 2               | test user 2       |
| 2               | test user 3       |
+-----------------+-------------------+--+
delete from temp.test1 where id=1;
Error: Error while compiling statement: FAILED:
SemanticException [Error 10297]: Attempt to do update or
delete on table temp.test1 that does not use an
AcidOutputFormat or is not bucketed (state=42000,code=10297)

Then the question is how to update or delete a record in hive table?

Deleting Records in Hive Table

Deleting rerecords is easy,  you can use insert overwrite Syntax for this. Let’s says we want to delete a record from above hive table which has name as “test user 3”.  Then we need to select all the records which does not have name as “test user 3” and overwrite into same table. 

 
insert overwrite table temp.test_update \
select * from temp.test_update \ 
where name = "test user 3";

Update Records in Hive Table

updating the record consist of three steps as mentioned below. Let’s say in your test.update table we want to update the id to for all the records which has name as “test user 3”

Create a temporary table which has updated record

 
create temporary table temp.test \
as select 3 as id, name from temp.test_update \
where name="test user 3";

Delete the records which you want to update from original table

  insert overwrite table temp.test_update \
select * from temp.test_update \
where name="test user 3";

Insert the Updated Record(s)

 insert into table temp.test_update select * from temp.test;

3 Comments