Hi,
Yesterday, one of my customers asked for help to tune a procedure and when I investigated the problem I find out that their method to get a specific “column” value in csv file was the problem. So we discuss the situation and adopt a new version of the code. While doing that I just create a pipe lined function to get same data just for fun and wanted to share it here.
So let’s say have a CSV (comma separated values) file and you need to parse it line by line and maybe even you need to get the “columns” (and by saying columns I meant every comma separated values in a line of csv file) row by row. To achieve that I created a pipe lined function, here it is:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
create or replace function sf_parse(p_clob clob, p_separator varchar2) return sys.ODCIVARCHAR2LIST pipelined as v_size number; v_start_pos number := 1; v_new_position number := 0; v_line varchar2(4000); x_clob clob := p_clob || to_clob(p_separator); begin v_size := dbms_lob.getlength(x_clob); while v_start_pos <= v_size loop v_new_position := nvl(instr(x_clob, p_separator, v_start_pos), 4000); v_line := substr(x_clob, v_start_pos , v_new_position-v_start_pos ); v_start_pos := v_new_position+length(p_separator); pipe row(v_line); end loop; return; end; / |
this function gets csv data as clob and parse to lines by a separator. Definition of Line is quite simple but you must be careful the platform. On Windows OS every line is ended with a carriage return (chr(13)) and a new line character (chr(10)) but on Linux OS it is just a new line character so if your CSV file is created on Windows then you must use chr(13) || chr(10) as new line (or enter) character and on Linux just chr(10).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
create table tmp_clob (id number, myclob clob); insert into tmp_clob values (1, to_clob('1,my first line,some data,123') || to_clob(chr(10)) || to_clob('2,my second line,some more data,456') || to_clob(chr(10)) || to_clob('3,my third line,some much more data,789') ); commit; select * from sf_parse((select myclob from tmp_clob where id = 1), chr(10)); --------------------------------------------------------------------- 1,my first line,some data,123 2,my second line,some more data,456 3,my third line,some much more data,789 |
as you see we have all three lines. we can even take this further and use one of the lines as a input parameter of sf_parse:
1 2 3 4 5 6 7 |
select * from sf_parse('1,my first line,some data,123', ','); ------------------------------------------------------- 1 my first line some data 123 |
we have all “columns” in csv as rows for the first line. you don’t have copy and paste the line info of course:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
select * from sf_parse((select myclob from tmp_clob where id = 1), chr(10)) MyLines, sf_parse(MyLines.Column_value, ','); ---------------------------------------------------- Column_value Column_value_1 1,my first line,some data,123 1 1,my first line,some data,123 my first line 1,my first line,some data,123 some data 1,my first line,some data,123 123 2,my second line,some more data,456 2 2,my second line,some more data,456 my second line 2,my second line,some more data,456 some more data 2,my second line,some more data,456 456 3,my third line,some much more data,789 3 3,my third line,some much more data,789 my third line 3,my third line,some much more data,789 some much more data 3,my third line,some much more data,789 789 |
have a nice day.
PS: I still can not believe that we have to talk/protest about “racism” in 2020 and I am deeply sorry for days. Life matters. We always tend to believe that we are the superior creature but we can not even understand the importance of life. Remember George Floyd. Peace.