Reading an OS file from inside an Oracle Database using an External Table
source link: https://mikesmithers.wordpress.com/2020/04/05/reading-an-file-from-inside-an-oracle-database-using-an-external-table/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Reading an OS file from inside an Oracle Database using an External Table
Two weeks into lockdown and some are looking to break the monotony by varying their diet…
For my part, I’ve decided to amuse myself by making a couple of “Top five” lists – the five best things about the lockdown and five things I’ve now finally got time to watch.
These lists are in text files are somewhere on my Database server and therefore provide the basis for the examples that follow.
When using Oracle, if you find yourself in a situation where you don’t have access to the Database Server file system, or would just rather use SQL to find what you’re looking for in a data set, you’re in luck.
The technique that used to be a neat way of viewing the database alert log from inside the database can now be re-purposed to look at any text file that’s located in a directory referenced by a Directory Object in the database.
What we’re going to look at here is :
- Setting up an external table for reading from files
- How to point the external table specific files and directories using EXTERNAL MODIFY
- How achieve the same thing in 11gR2 databases with the help of DBMS_LOCK
First of all though, just in case you’re wondering….
The best way to view the alert log from SQL
Before 11g, we had to rely on an external table if we wanted to achieve this. Since that release however, we now have access to the X$DBGALERTEXT fixed table. So, if we want to see all the entries in the alert log today, we can do something like :
select
indx, originating_timestamp, message_text
from
x$dbgalertext
where
trunc(originating_timestamp) = trunc(sysdate)
order
by
indx
/
Setting up the external table and pointing it at a file
If we think about a text file in terms of a relational table structure, it consists of two columns.
The line number acts as a synthetic key, the actual text is simply a single VARCHAR.
OK, so it could be a CLOB, but 4000 characters should be plenty for a line in a text file.
I’d like to use DATA_PUMP_DIR as the default directory for the table because that directory is one which is created with the database itself.
This means that, in order to create the table, the schema that owns it will need to have READ access to the directory :
grant
read
on
directory data_pump_dir
to
hr
/
create
table
view_file_xt(
line number,
text varchar2(4000))
organization external(
type oracle_loader
default
directory data_pump_dir
access parameters (
records delimited
by
newline
nologfile
nobadfile
nodiscardfile
fields terminated
by
'~'
missing field
values
are
null
(
line recnum,
text
char
(4000))
)
location(
''
)
)
reject limit unlimited
/
Now you may be wondering why we don’t also need WRITE access to the default directory, which is what you’d expect with an External Table.
The simple reason is that the table is defined as not having a logfile, a badfile or a discardfile so this privilege is not required.
Also, it’s probably worth mentioning that we can use the RECNUM pseudo-column to generate the value for LINE.
Having a line number comes in handy when you want to run more sophisticated queries against your external table.
If I didn’t have access to the directory via the Operating System, I’d probably need to rely on something like this to get a file listing.
Anyhow, I happen to know that there’s a file called lockdown_catchup_top5.txt that I’m quite interested in.
In the olden days ( 11gR2), you’d need to do some DDL to point the table at the file. Since 12c however, you can do this :
select
text
from
view_file_xt
external
modify
(
default
directory app_files
location(
'lockdown_catchup_top5.txt'
))
where
text
is
not
null
/
TEXT
--------------------------------------------------------------------------------
The Post
Frozen 2
Killing Eve Series 1
and
2
Killing Eve Series 3 (
out
at
Easter
on
BBC
in
the UK <img draggable=
"false"
role=
"img"
class=
"emoji"
alt=
""
src=
"https://s0.wp.com/wp-content/mu-plugins/wpcom-smileys/twemoji/2/svg/1f642.svg"
>
Good Omens -
to
remind me that the Apocalypse doesn't have
to
be the
end
of
the
world.
Despite the fact that we’ve pointed our table at a different directory and location from that it was created with, we can see that our use of the EXTERNAL MODIFY clause has had no lasting effect :
select
directory_name, location
from
user_external_locations
where
table_name =
'VIEW_FILE_XT'
/
DIRECTORY_NAME LOCATION
-------------------- --------------------
DATA_PUMP_DIR
On my database, the HR user now has access to two directories :
select
directory_name
from
all_directories
/
DIRECTORY_NAME
--------------------
DATA_PUMP_DIR
APP_FILES
Incidentally, if you’d rather see the directories to which you have access, together with the privileges that you have on them then you could just run :
select
priv.table_name
as
directory_name,
listagg(priv.privilege,
', '
) within
group
(
order
by
priv.privilege)
as
privileges
from
user_tab_privs priv
inner
join
all_directories dir
on
dir.directory_name = priv.table_name
group
by
priv.table_name
order
by
1,2
/
DIRECTORY_NAME
PRIVILEGES
-------------------- ------------------------------
APP_FILES
READ
, WRITE
DATA_PUMP_DIR
READ
I have a file in the DATA_PUMP_DIR directory called lockdown_top5.txt.
If I want to view this file using our new table, I can use the EXTERNAL MODIFY clause again :
select
text
from
view_file_xt
external
modify
(
default
directory data_pump_dir
location(
'lockdown_top5.txt'
))
where
text
is
not
null
/
As expected, I’ll now get the contents of the new file in the new directory :
TEXT
----------------------------------------------------------------------------------------------------
The commute into work is much easier.
Every day is dress-down day, whilst only some days are get dressed days.
I'm pretty sure my webcam is switched off.
The BBC are showing classic sporting events that I remember for first time round.
It's finally stopped raining in Devon.
If you’re on 11gR2 or earlier, you can still take advantage of external tables to view OS files. However, if you’re likely to have multiple concurrent sessions using this table, you’ll probably need to do a little more legwork…
Alter table statements and DBMS_LOCK
In 11g, we don’t have EXTERNAL MODIFY. Fortunately, we can change the table to point to different locations and directories :
alter
table
view_file_xt
default
directory MY_FILES location(
'lockdown_catchup_top5.txt'
);
…followed by…
select
text
from
view_file_xt
where
text
is
not
null
;
To view the file that’s in the DATA_PUMP_DIR directory, we’d need to run :
alter
table
view_file_xt
default
directory DATA_PUMP_DIR location(
'lockdown_top5.txt'
);
select
text
from
view_file_xt
where
text
is
not
null
;
As well as extra typing – and the need to be aware of the current transaction terminating as a result of DDL commands being issued, we also need to account for the fact that our “query” is split across multiple operations rather than a single select statement.
Depending on how this table is being used, you may want to ensure that it remains unchanged across these multiple statements.
For example, it would be unfortunate if you thought you were about to look at a list of comforting movies, only for another session to re-point the table to a data pump log file between you setting it to the movies file and issuing the select statement.
Fortunately, there is a package which you can use to address issues like this – DBMS_LOCK.
Locking the “table” until you’re done with it
In order to ensure that access to the table is serialized, we’ll need to implement some kind of locking mechanism.
The steps we want to follow are :
- lock the table for exclusive use
- point the table at a given file in a given directory
- execute our query against the table
- release the lock
To do this, we’re going to have to create a wrapper for the table rather than allowing direct access to it.
First of all, we need to grant execute on DBMS_LOCK to HR :
grant
execute
on
dbms_lock
to
hr;
Now we can implement our locking by means of a package :
NOTE : this package is written in a linear fashion in order to more clearly illustrate the steps we’re taking here. You may consider it more appropriate to modularise it into multiple package members if you’re planning to use it in anger…
create
or
replace
package view_files
as
type typ_fline
is
table
of
view_file_xt%rowtype;
function
cat(
i_dir
in
user_external_locations.directory_name%type,
i_fname
in
user_external_locations.location%type)
return
typ_fline pipelined;
end
view_files;
/
create
or
replace
package body view_files
as
GC_TIMEOUT constant number := 30;
function
cat(
i_dir
in
user_external_locations.directory_name%type,
i_fname
in
user_external_locations.location%type)
return
typ_fline pipelined
is
fline view_file_xt%rowtype;
stmnt varchar2(4000);
-- Locking variables
lh varchar2(128);
req_ret pls_integer;
rel_ret pls_integer;
msg varchar2(4000);
-- We're doing some DDL here so...
pragma autonomous_transaction;
begin
if i_dir
is
null
or
i_fname
is
null
then
raise_application_error(-20000, 'A directory object
and
a file
name
must be specified
');
end if;
-- Get the handle for the lock we want to use
dbms_lock.allocate_unique(
lockname => '
CAT_FILE_LOCK
',
lockhandle => lh);
-- Request the lock
-- Note that, in 11g, the default for release_on_commit is FALSE so our posession of this
-- lock will persisit across transaction boundaries.
req_ret := dbms_lock.request( lockhandle => lh, timeout => GC_TIMEOUT);
-- Return code 0 is Success. Return Code 4 is that we already own the lock.
if req_ret not in (0, 4) then
msg := '
Error requesting lock :
'||
case req_ret
when 1 then '
Timeout
'
when 2 then '
Deadlock
'
when 3 then '
Parameter error
'
when 5 then '
Illegal lock handle
'
end;
raise_application_error(-20000, msg);
end if;
-- Point the external table to the file we want to read...
stmnt := '
alter
table
view_file_xt
default
directory
'||i_dir ||q'
[ location(
']'
||i_fname||q
'['
)]';
execute
immediate stmnt;
-- Now the good bit - read the contents of the file
for
r_text
in
(
select
line, text
from
view_file_xt) loop
fline.line := r_text.line;
fline.text := r_text.text;
pipe row(fline);
end
loop;
-- Finally, release the lock
rel_ret := dbms_lock.release( lockhandle => lh);
end
cat;
end
view_files;
/
If we begin with the lockdown_top5.txt file in the DATA_PUMP_DIR :
select
text
from
table
( view_files.cat(
'DATA_PUMP_DIR'
,
'lockdown_top5.txt'
))
where
text
is
not
null
-- strip out any blank lines
order
by
line
/
TEXT
----------------------------------------------------------------------------------------------------------------------------------
The commute
into
work
is
much easier.
Every
day
is
dress-down
day
, whilst
only
some
days are get dressed days.
I
'm pretty sure my webcam is switched off.
The BBC are showing classic sporting events that I remember for first time round.
It'
s finally stopped raining
in
Devon.
We can see that there have been a couple of changes. Predictably, the the external table now is pointing to the Directory Object and file which we specified in our call to the package:
select
directory_name, location
from
user_external_locations
where
table_name =
'VIEW_FILE_XT'
;
DIRECTORY_NAME LOCATION
------------------------------ ------------------------------
DATA_PUMP_DIR lockdown_top5.txt
If we look at the SYS table DBMS_LOCK_ALLOCATED we can see that a new entry has been created for our lock, with an expiration set for 10 days from today :
select
name
, lockid, expiration
from
dbms_lock_allocated
where
name
=
'CAT_FILE_LOCK'
;
NAME
LOCKID EXPIRATION
-------------------- -------------------- -----------
CAT_FILE_LOCK 1073741851 15-APR-2020
Just to prove that the locking works, let’s run two separate sessions using the same table to view the contents of the two different files.
We’ll add in a sleep to the package so that the second session will wait an appreciable amount of time before acquiring the lock.
To start with, let’s see how long a normal run takes :
set
timing
on
select
text
from
table
( view_files.cat(
'DATA_PUMP_DIR'
,
'lockdown_top5.txt'
))
where
text
is
not
null
/
TEXT
----------------------------------------------------------------------------------------------------
The commute
into
work
is
much easier.
Every
day
is
dress-down
day
, whilst
only
some
days are get dressed days.
I
'm pretty sure my webcam is switched off.
The BBC are showing classic sporting events that I remember for first time round.
It'
s finally stopped raining
in
Devon.
5
rows
selected.
Elapsed: 00:00:00.039
Now we’re going to add a sleep to the CAT function directly before the lock is released.
This will have the effect of causing the function to wait for 10 seconds before releasing the lock…
...
-- Finally, release the lock
dbms_lock.sleep(10);
rel_ret := dbms_lock.release( lockhandle => lh);
end
cat;
Now, in session 1, we re-run our first query.
More-or-less simultaneously, we kick off the query in session 2.
Session 1 looks like this :
clear screen
select
sys_context(
'userenv'
,
'sessionid'
)
as
session_id
from
dual;
set
timing
on
set
feedback 1
select
text
from
table
( view_files.cat(
'DATA_PUMP_DIR'
,
'lockdown_top5.txt'
))
where
text
is
not
null
/
Session 2 is similar, but points to a different file and directory :
clear screen
select
sys_context(
'userenv'
,
'sessionid'
)
as
session_id
from
dual;
set
timing
on
set
feedback 1
select
text
from
table
( view_files.cat(
'APP_FILES'
,
'lockdown_catchup_top5.txt'
))
where
text
is
not
null
/
If we look at the output from Session 1, we can see that roughly 10 seconds was added to the runtime. This is the length of time we set the sleep for :
SESSION_ID
--------------------
325844
TEXT
----------------------------------------------------------------------------------------------------
The commute into work is much easier.
Every day is dress-down day, whilst only some days are get dressed days.
I'm pretty sure my webcam is switched off.
The BBC are showing classic sporting events that I remember for first time round.
It's finally stopped raining in Devon.
5 rows selected.
Elapsed: 00:00:10.082
Session 2 takes much longer :
SESSION_ID
--------------------
325912
TEXT
----------------------------------------------------------------------------------------------------
The Post
Frozen 2
Killing Eve Series 1 and 2
Killing Eve Series 3 ( out at Easter on BBC in the UK <img draggable="false" role="img" class="emoji" alt="" src="https://s0.wp.com/wp-content/mu-plugins/wpcom-smileys/twemoji/2/svg/1f642.svg">
Good Omens - to remind me that the Apocalypse doesn't have to be the end of the world.
5 rows selected.
Elapsed: 00:00:17.908
This is because, as well as executing the 10 second sleep, it had to wait almost 8 seconds to acquire the lock ( the other two seconds was me switching between sessions to execute the query).
Obviously, this lock will only be invoked when you access the table via the package. Therefore, users needing to use this functionality should be granted execute on the package but not have privileges to query the table directly.
Acknowledgement
There are quite a few articles around about how to use DBMS_LOCK, but I am particularly indebted to Jeff Kemp for taking the time to record his adventures with this package for posterity.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK