7

Reading an OS file from inside an Oracle Database using an External Table

 3 years ago
source link: https://mikesmithers.wordpress.com/2020/04/05/reading-an-file-from-inside-an-oracle-database-using-an-external-table/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Reading an OS file from inside an Oracle Database using an External Table

Posted on April 5, 2020

Two weeks into lockdown and some are looking to break the monotony by varying their diet…

teddy_sandwich.jpg?w=584&h=778

For my part, I’ve decided to amuse myself by making a couple of “Top five” lists – the five best things about the lockdown and five things I’ve now finally got time to watch.

These lists are in text files are somewhere on my Database server and therefore provide the basis for the examples that follow.

When using Oracle, if you find yourself in a situation where you don’t have access to the Database Server file system, or would just rather use SQL to find what you’re looking for in a data set, you’re in luck.
The technique that used to be a neat way of viewing the database alert log from inside the database can now be re-purposed to look at any text file that’s located in a directory referenced by a Directory Object in the database.

What we’re going to look at here is :

  • Setting up an external table for reading from files
  • How to point the external table specific files and directories using EXTERNAL MODIFY
  • How achieve the same thing in 11gR2 databases with the help of DBMS_LOCK

First of all though, just in case you’re wondering….

The best way to view the alert log from SQL

Before 11g, we had to rely on an external table if we wanted to achieve this. Since that release however, we now have access to the X$DBGALERTEXT fixed table. So, if we want to see all the entries in the alert log today, we can do something like :

select indx, originating_timestamp, message_text
from x$dbgalertext
where trunc(originating_timestamp) = trunc(sysdate)
order by indx
/

Setting up the external table and pointing it at a file

If we think about a text file in terms of a relational table structure, it consists of two columns.
The line number acts as a synthetic key, the actual text is simply a single VARCHAR.
OK, so it could be a CLOB, but 4000 characters should be plenty for a line in a text file.

I’d like to use DATA_PUMP_DIR as the default directory for the table because that directory is one which is created with the database itself.
This means that, in order to create the table, the schema that owns it will need to have READ access to the directory :

grant read on directory data_pump_dir to hr
/
create table view_file_xt(
line number,
text varchar2(4000))
organization external(
type oracle_loader
default directory data_pump_dir
access parameters (
records delimited by newline
nologfile
nobadfile
nodiscardfile
fields terminated by '~'
missing field values are null(
line recnum,
text char(4000))
)
location('')
)
reject limit unlimited
/

Now you may be wondering why we don’t also need WRITE access to the default directory, which is what you’d expect with an External Table.
The simple reason is that the table is defined as not having a logfile, a badfile or a discardfile so this privilege is not required.
Also, it’s probably worth mentioning that we can use the RECNUM pseudo-column to generate the value for LINE.
Having a line number comes in handy when you want to run more sophisticated queries against your external table.

If I didn’t have access to the directory via the Operating System, I’d probably need to rely on something like this to get a file listing.

Anyhow, I happen to know that there’s a file called lockdown_catchup_top5.txt that I’m quite interested in.

In the olden days ( 11gR2), you’d need to do some DDL to point the table at the file. Since 12c however, you can do this :

select text
from view_file_xt
external modify(
default directory app_files
location('lockdown_catchup_top5.txt'))
where text is not null       
/
TEXT                                                                           
--------------------------------------------------------------------------------
The Post
Frozen 2
Killing Eve Series 1 and 2
Killing Eve Series 3 ( out at Easter on BBC in the UK <img draggable="false" role="img" class="emoji" alt="🙂" src="https://s0.wp.com/wp-content/mu-plugins/wpcom-smileys/twemoji/2/svg/1f642.svg">
Good Omens - to remind me that the Apocalypse doesn't have to be the end of the
world.

Despite the fact that we’ve pointed our table at a different directory and location from that it was created with, we can see that our use of the EXTERNAL MODIFY clause has had no lasting effect :

select directory_name, location
from user_external_locations
where table_name = 'VIEW_FILE_XT'
/
DIRECTORY_NAME       LOCATION           
-------------------- --------------------
DATA_PUMP_DIR                           

On my database, the HR user now has access to two directories :

select directory_name
from all_directories
/
DIRECTORY_NAME     
--------------------
DATA_PUMP_DIR
APP_FILES

Incidentally, if you’d rather see the directories to which you have access, together with the privileges that you have on them then you could just run :

select priv.table_name as directory_name,
listagg(priv.privilege,', ') within group (order by priv.privilege) as privileges
from user_tab_privs priv
inner join all_directories dir
on dir.directory_name = priv.table_name
group by priv.table_name   
order by 1,2
/
DIRECTORY_NAME       PRIVILEGES                   
-------------------- ------------------------------
APP_FILES            READ, WRITE                  
DATA_PUMP_DIR        READ                         

I have a file in the DATA_PUMP_DIR directory called lockdown_top5.txt.
If I want to view this file using our new table, I can use the EXTERNAL MODIFY clause again :

select text
from view_file_xt
external modify(
default directory data_pump_dir
location('lockdown_top5.txt'))
where text is not null       
/

As expected, I’ll now get the contents of the new file in the new directory :

TEXT                                                                                               
----------------------------------------------------------------------------------------------------
The commute into work is much easier.
Every day is dress-down day, whilst only some days are get dressed days.
I'm pretty sure my webcam is switched off.
The BBC are showing classic sporting events that I remember for first time round.
It's finally stopped raining in Devon.

If you’re on 11gR2 or earlier, you can still take advantage of external tables to view OS files. However, if you’re likely to have multiple concurrent sessions using this table, you’ll probably need to do a little more legwork…

Alter table statements and DBMS_LOCK

In 11g, we don’t have EXTERNAL MODIFY. Fortunately, we can change the table to point to different locations and directories :

alter table view_file_xt default directory MY_FILES location('lockdown_catchup_top5.txt');

…followed by…

select text
from view_file_xt
where text is not null;

To view the file that’s in the DATA_PUMP_DIR directory, we’d need to run :

alter table view_file_xt default directory DATA_PUMP_DIR location('lockdown_top5.txt');
select text
from view_file_xt
where text is not null;

As well as extra typing – and the need to be aware of the current transaction terminating as a result of DDL commands being issued, we also need to account for the fact that our “query” is split across multiple operations rather than a single select statement.
Depending on how this table is being used, you may want to ensure that it remains unchanged across these multiple statements.
For example, it would be unfortunate if you thought you were about to look at a list of comforting movies, only for another session to re-point the table to a data pump log file between you setting it to the movies file and issuing the select statement.
Fortunately, there is a package which you can use to address issues like this – DBMS_LOCK.

Locking the “table” until you’re done with it

In order to ensure that access to the table is serialized, we’ll need to implement some kind of locking mechanism.
The steps we want to follow are :

  • lock the table for exclusive use
  • point the table at a given file in a given directory
  • execute our query against the table
  • release the lock

To do this, we’re going to have to create a wrapper for the table rather than allowing direct access to it.

First of all, we need to grant execute on DBMS_LOCK to HR :

grant execute on dbms_lock to hr;

Now we can implement our locking by means of a package :
NOTE : this package is written in a linear fashion in order to more clearly illustrate the steps we’re taking here. You may consider it more appropriate to modularise it into multiple package members if you’re planning to use it in anger…

create or replace package view_files as
type typ_fline is table of view_file_xt%rowtype;
function cat(
i_dir in user_external_locations.directory_name%type,
i_fname in user_external_locations.location%type)
return typ_fline pipelined;
end view_files;
/
create or replace package body view_files as
GC_TIMEOUT constant number := 30;
function cat(
i_dir in user_external_locations.directory_name%type,
i_fname in user_external_locations.location%type)
return typ_fline pipelined
is
fline view_file_xt%rowtype;
stmnt varchar2(4000);
-- Locking variables
lh varchar2(128);
req_ret pls_integer;
rel_ret pls_integer;
msg varchar2(4000);
-- We're doing some DDL here so...
pragma autonomous_transaction;
begin
if i_dir is null or i_fname is null then
raise_application_error(-20000, 'A directory object and a file name must be specified');
end if;
-- Get the handle for the lock we want to use
dbms_lock.allocate_unique(
lockname => 'CAT_FILE_LOCK',
lockhandle => lh);
-- Request the lock
-- Note that, in 11g, the default for release_on_commit is FALSE so our posession of this
-- lock will persisit across transaction boundaries.
req_ret := dbms_lock.request( lockhandle => lh, timeout => GC_TIMEOUT);
-- Return code 0 is Success. Return Code 4 is that we already own the lock.
if req_ret not in (0, 4) then
msg := 'Error requesting lock : '||
case req_ret
when 1 then 'Timeout'
when 2 then 'Deadlock'
when 3 then 'Parameter error'
when 5 then 'Illegal lock handle'
end;
raise_application_error(-20000, msg);
end if;
-- Point the external table to the file we want to read...
stmnt := 'alter table view_file_xt default directory '||i_dir ||q'[ location(']'||i_fname||q'[')]';
execute immediate stmnt;
-- Now the good bit - read the contents of the file
for r_text in ( select line, text from view_file_xt) loop
fline.line := r_text.line;
fline.text := r_text.text;
pipe row(fline);
end loop;
-- Finally, release the lock  
rel_ret := dbms_lock.release( lockhandle => lh);
end cat;   
end view_files;
/

If we begin with the lockdown_top5.txt file in the DATA_PUMP_DIR :

select text
from table( view_files.cat('DATA_PUMP_DIR', 'lockdown_top5.txt'))
where text is not null -- strip out any blank lines
order by line
/
TEXT                                                                                                                             
----------------------------------------------------------------------------------------------------------------------------------
The commute into work is much easier.
Every day is dress-down day, whilst only some days are get dressed days.
I'm pretty sure my webcam is switched off.
The BBC are showing classic sporting events that I remember for first time round.
It's finally stopped raining in Devon.

We can see that there have been a couple of changes. Predictably, the the external table now is pointing to the Directory Object and file which we specified in our call to the package:

select directory_name, location
from user_external_locations
where table_name = 'VIEW_FILE_XT';
DIRECTORY_NAME                 LOCATION                     
------------------------------ ------------------------------
DATA_PUMP_DIR                  lockdown_top5.txt            

If we look at the SYS table DBMS_LOCK_ALLOCATED we can see that a new entry has been created for our lock, with an expiration set for 10 days from today :

select name, lockid, expiration
from dbms_lock_allocated
where name = 'CAT_FILE_LOCK';
NAME                               LOCKID EXPIRATION
-------------------- -------------------- -----------
CAT_FILE_LOCK                  1073741851 15-APR-2020

Just to prove that the locking works, let’s run two separate sessions using the same table to view the contents of the two different files.
We’ll add in a sleep to the package so that the second session will wait an appreciable amount of time before acquiring the lock.
To start with, let’s see how long a normal run takes :

set timing on
select text
from table( view_files.cat('DATA_PUMP_DIR', 'lockdown_top5.txt'))
where text is not null
/
TEXT                                                                                               
----------------------------------------------------------------------------------------------------
The commute into work is much easier.
Every day is dress-down day, whilst only some days are get dressed days.
I'm pretty sure my webcam is switched off.
The BBC are showing classic sporting events that I remember for first time round.
It's finally stopped raining in Devon.
5 rows selected.
Elapsed: 00:00:00.039

Now we’re going to add a sleep to the CAT function directly before the lock is released.
This will have the effect of causing the function to wait for 10 seconds before releasing the lock…

...
-- Finally, release the lock  
dbms_lock.sleep(10);
rel_ret := dbms_lock.release( lockhandle => lh);
end cat;   

Now, in session 1, we re-run our first query.
More-or-less simultaneously, we kick off the query in session 2.

Session 1 looks like this :

clear screen
select sys_context('userenv', 'sessionid') as session_id from dual;
set timing on
set feedback 1
select text
from table( view_files.cat('DATA_PUMP_DIR', 'lockdown_top5.txt'))
where text is not null
/

Session 2 is similar, but points to a different file and directory :

clear screen
select sys_context('userenv', 'sessionid') as session_id from dual;
set timing on
set feedback 1
select text
from table( view_files.cat('APP_FILES', 'lockdown_catchup_top5.txt'))
where text is not null
/

If we look at the output from Session 1, we can see that roughly 10 seconds was added to the runtime. This is the length of time we set the sleep for :

SESSION_ID         
--------------------
325844
TEXT                                                                                               
----------------------------------------------------------------------------------------------------
The commute into work is much easier.
Every day is dress-down day, whilst only some days are get dressed days.
I'm pretty sure my webcam is switched off.
The BBC are showing classic sporting events that I remember for first time round.
It's finally stopped raining in Devon.
5 rows selected.
Elapsed: 00:00:10.082

Session 2 takes much longer :

SESSION_ID         
--------------------
325912
TEXT                                                                                               
----------------------------------------------------------------------------------------------------
The Post
Frozen 2
Killing Eve Series 1 and 2
Killing Eve Series 3 ( out at Easter on BBC in the UK <img draggable="false" role="img" class="emoji" alt="🙂" src="https://s0.wp.com/wp-content/mu-plugins/wpcom-smileys/twemoji/2/svg/1f642.svg">
Good Omens - to remind me that the Apocalypse doesn't have to be the end of the world.
5 rows selected.
Elapsed: 00:00:17.908

This is because, as well as executing the 10 second sleep, it had to wait almost 8 seconds to acquire the lock ( the other two seconds was me switching between sessions to execute the query).
Obviously, this lock will only be invoked when you access the table via the package. Therefore, users needing to use this functionality should be granted execute on the package but not have privileges to query the table directly.

Acknowledgement

There are quite a few articles around about how to use DBMS_LOCK, but I am particularly indebted to Jeff Kemp for taking the time to record his adventures with this package for posterity.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK