<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>neilkodner.com &#187; oracle</title>
	<atom:link href="http://www.neilkodner.com/tag/oracle/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.neilkodner.com</link>
	<description>Data Driven.  Since 1971.</description>
	<lastBuildDate>Sun, 23 Oct 2011 16:40:04 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>An analysis of Oracle errors in the leaked 9/11 Pager Data</title>
		<link>http://www.neilkodner.com/2009/11/an-analysis-of-oracle-errors-in-the-leaked-911-pager-data/</link>
		<comments>http://www.neilkodner.com/2009/11/an-analysis-of-oracle-errors-in-the-leaked-911-pager-data/#comments</comments>
		<pubDate>Tue, 01 Dec 2009 01:59:40 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[regex]]></category>
		<category><![CDATA[regexp]]></category>

		<guid isPermaLink="false">http://www.neilkodner.com/?p=84</guid>
		<description><![CDATA[Yes, you read that correctly. Here&#8217;s how it started: I&#8217;m working on some text analysis in Python and was looking for some test data. Someone recommended I use the 9/11 Pager Data from Wikileaks. I downloaded the data, ran my program against it (which is the subject of another post) and all was well. Got [...]]]></description>
			<content:encoded><![CDATA[<p>Yes, you read that correctly.  Here&#8217;s how it started:</p>
<p>I&#8217;m working on some text analysis in Python and was looking for some test data.  Someone recommended I use the <a href="http://911.wikileaks.org/">9/11 Pager Data from Wikileaks</a>.  I downloaded the data, ran my program against it (which is the subject of another post) and all was well.  Got some great insight and I&#8217;ll share that later.</p>
<p>I then started browsing the raw data in vi.  After paging down a few times, what did I see?</p>
<p><img src="http://www.neilkodner.com/wp-content/uploads/2009/11/oracle-error.jpg" alt="oracle error" title="oracle error" width="808" height="48" class="alignnone size-full wp-image-85" /></p>
<p>Paging down some more yielded this gem:</p>
<p><img src="http://www.neilkodner.com/wp-content/uploads/2009/11/another-error.jpg" alt="another error" title="another error" width="849" height="143" class="alignnone size-full wp-image-88" /></p>
<p>The gears are now spinning&#8230;<br />
<a href="http://twitter.com/neilkod/status/6219088850"><img src="http://www.neilkodner.com/wp-content/uploads/2009/11/dork.jpg" alt="dork" title="dork" width="584" height="185" class="alignnone size-full wp-image-101" /></a></p>
<p>I wondered how many of these Oracle errors polluted the NYC messaging system.  Let&#8217;s find out &#8211; Python to the rescue!</p>
<pre class="brush: plain; title: ; notranslate">
Error		Frequency		Description
ORA-00255	1	error archiving log %s of thread %s, sequence # %s
ORA-00333	1	redo log read error block %s count %s
ORA-00334	1	archived log: '%s'
ORA-01035	1	ORACLE only available to users with RESTRICTED SESSION privilege
ORA-01089	1	immediate shutdown in progress - no operations are permitted
ORA-01401	1	inserted value too large for column
ORA-01410	1	invalid ROWID
ORA-01652	1	unable to extend temp segment by %s in tablespace %s
ORA-01722	1	invalid number
ORA-02050	1	transaction %s rolled back, some remote DBs may be in-doubt
ORA-02068	1	following severe error from %s%s
ORA-03114	1	not connected to ORACLE
ORA-1146	1	cannot start online backup - file %s is already in backup
ORA-12154	1	TNS:could not resolve the connect identifier specified
ORA-1534	1	rollback segment '%s' doesn't exist
ORA-1537	1	cannot add file '%s' - file already part of database
ORA-1553	1	MAXEXTENTS must be no smaller than the %s extents currently allocated
ORA-1593	1	command no longer valid, see ALTER USER
ORA-19502	1	write error on file &quot;%s&quot;, blockno %s (blocksize=%s)
ORA-20012	1	User-defined
ORA-24324	1	service handle not initialized
ORA-27063	1	number of bytes read/written is incorrect
ORA-7445	1	exception encountered: core dump [%s] [%s] [%s]
ORA-00312	2	online log %s thread %s: '%s'
ORA-10		2	no data found
ORA-11		2	invalid value %s for attribute %s, must be between %s and %s
ORA-16038	2	log %s sequence# %s cannot be archived
ORA-20000	2	The stored procedure 'raise_application_error'
ORA-301		2	error in adding log file '%s' - file cannot be created
ORA-959		2	tablespace '%s' does not exist
ORA-00060	3	deadlock detected while waiting for resource
ORA-07445	3	exception encountered: core dump [%s] [%s] [%s]
ORA-12012	3	error on auto execute of job %s
ORA-00600	4	internal error code, arguments: [%s], [%s], [%s]
ORA-1652	4	unable to extend temp segment by %s in tablespace %s
ORA-00917	10	missing comma
ORA-01013	12	user requested cancel of current operation
ORA-1650	12	unable to extend rollback segment %s by %s in tablespace %s
ORA-20011	21	User-defined error: Execute_system: Err
ORA-1142	33	cannot end online backup - none of the files are in backup
</pre>
<p>Final analysis?  Where can I send my resume? </p>
<p>The Python code is simple &#8211; it loops through each line of the 49MB file (448k lines) and checks for an Oracle error using the regexp ORA-[0-9]{1,5}, which I intended to mean the letters ORA, followed by a dash, followed by between one and five digits.  Please feel free to correct/improve my regex-fu.  If a line contains a match, the first error code on that line is added to a dictionary as the key, with its count as the value; if the key is already present, the count is incremented.  Finally, the contents of the dictionary are displayed, sorted by frequency.  Note that codes are counted exactly as they appear in the data, which is why ORA-1652 and ORA-01652 show up as separate rows above.</p>
<pre class="brush: python; title: ; notranslate">
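#!/usr/bin/python
import re

# the letters ORA, a dash, then one to five digits
pattern = re.compile(r'ORA-[0-9]{1,5}')
errors={}
f=open('messages_all.txt')
for line in f:
	err = pattern.findall(line)
	if err:
		# only the first error code on each line is counted
		errors[err[0]] = errors.get(err[0],0)+1
f.close()

# display the counts sorted by frequency, then by error code
for k,v in sorted(errors.items(), key=lambda kv:(kv[1],kv[0])):
	print('%s\t%d' % (k,v))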
</pre>
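<p>One possible refinement of the regex, offered as a sketch rather than a replacement: count every match on a line instead of only the first, and zero-pad each code so that ORA-1652 and ORA-01652 land in the same bucket.  The function name count_ora_errors and the sample lines are made up for illustration, and the word-boundary anchors are an assumption about how the codes appear in the data.</p>

```python
import re
from collections import Counter

# \b keeps the pattern from matching inside longer tokens, and from
# silently truncating a run of more than five digits.
ORA_PATTERN = re.compile(r'\bORA-(\d{1,5})\b')


def count_ora_errors(lines):
    """Count every ORA-nnnnn code in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        for digits in ORA_PATTERN.findall(line):
            # Normalize to five digits: ORA-1652 and ORA-01652 are one error.
            counts['ORA-%05d' % int(digits)] += 1
    return counts


if __name__ == '__main__':
    # Hypothetical sample lines, not taken from the pager data.
    sample = [
        'ORA-1652: unable to extend temp segment',
        'job failed with ORA-01652 after ORA-00600',
        'no database error on this line',
    ]
    totals = count_ora_errors(sample)
    for code, freq in sorted(totals.items(), key=lambda kv: (kv[1], kv[0])):
        print('%s\t%d' % (code, freq))
```

<p>Passing an open file object instead of the sample list would scan a file line by line in the same way.</p>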
]]></content:encoded>
			<wfw:commentRss>http://www.neilkodner.com/2009/11/an-analysis-of-oracle-errors-in-the-leaked-911-pager-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Generating multiple Oracle TKPROF reports using Python</title>
		<link>http://www.neilkodner.com/2009/11/generating-multiple-oracle-tkprof-reports-using-python/</link>
		<comments>http://www.neilkodner.com/2009/11/generating-multiple-oracle-tkprof-reports-using-python/#comments</comments>
		<pubDate>Tue, 24 Nov 2009 17:44:58 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[dba]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.neilkodner.com/?p=76</guid>
		<description><![CDATA[Recently, a customer told me that they felt a batch job was taking too long each night, I gave them a few commands to add to their nightly run. These commands named the tracefile and enabled 10046 logging. Since I&#8217;m lazy(the good kind), I figured I&#8217;d use Python to build the commands to run TKPROF [...]]]></description>
			<content:encoded><![CDATA[<p>Recently, a customer told me that they felt a batch job was taking too long each night, so I gave them a few commands to add to their nightly run.</p>
<pre class="brush: sql; title: ; notranslate">
alter session set tracefile_identifier='charging_batch';
exec dbms_monitor.session_trace_enable;
</pre>
<p>These commands named the tracefile and enabled 10046 tracing.</p>
<p>Since I&#8217;m lazy (the good kind), I figured I&#8217;d use Python to build the commands to run TKPROF for each process.  The program expects to be run from the udump directory.  When I get more time I&#8217;ll enhance it to grab the location of udump from the database automatically.</p>
<p>The script takes an optional parameter for a tracefile identifier.  If the parameter is passed, only filenames containing the identifier text are processed.  Otherwise, all tracefiles are processed.  A to-do item is to make sure the tracefile is an actual 10046 trace before running TKPROF against it.</p>
<p>Each output file is named after the process id, prefixed with the tracefile identifier when one is supplied: identifier_processid.out.  The file suffix can be overridden with the variable tkprof_suffix.  I use .out as an homage to Michael Levy, wherever he may be, who showed me how to use the tool way back in 1998.</p>
<pre class="brush: python; title: ; notranslate">
#!/usr/local/bin/python
# tkprof.py
# kodner 2009
# finds the .trc tracefiles in the current directory
# and runs a simple tkprof on them
import sys
import os
import re
sort = &quot;fchqry&quot; #parameterize this
tkprof_suffix = 'out' #this too

# find a string to be used as a tracefile identifier
# to limit the tracefiles processed
try:
  tracefile_identifier = sys.argv[1]
  print(&quot;\n\ntracefile identifier supplied is: %s\n\n&quot; % (tracefile_identifier))
except IndexError:
  # no identifier on the command line: process every tracefile
  tracefile_identifier = None

# list the files in the current directory that have the suffix .trc
traces=[x for x in os.listdir('.') if x.endswith('.trc')]

for file in traces:
  tracefile = None

  # extract the process id from the filename.
  # I'm sure this could be done better.  i split it into multiple
  # lines for readability.

  processNum = re.findall(r'ora_[0-9]+',file)
  if not processNum:
    # filename isn't in the expected $ORACLE_SID_ora_$PROCESSNUM.trc form
    continue
  processNum = processNum[0].split('_')[1]

  # if a tracefile_identifier is supplied then make sure our current file
  # contains the string.  we'll also make sure the output filename contains
  # the tracefile identifier.

  if tracefile_identifier:
    if tracefile_identifier in file:
      tracefile = file
      outputfile=tracefile_identifier + '_' + processNum + '.' + tkprof_suffix
  else:
    tracefile=file
    outputfile=processNum + '.' + tkprof_suffix

  if tracefile:
    print(&quot;processing tracefile %s ...&quot; % (tracefile))

    # using regexp, find the process number of the file.
    # the process number will be used to name the tkprof output file

    # we will assume that the tracefile name is in the format
    # $ORACLE_SID_ora_$PROCESSNUM.trc
    # and that the tracefile name may contain a tracefile identifier
    # set by using alter session set tracefile_identifier = 'foo';

    # generate the tkprof command use flags sys=no and waits=yes
    command=&quot;tkprof %s %s sys=no waits=yes sort=%s&quot; % (tracefile,outputfile,sort)

    # execute the command
    os.system(command)
</pre>
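<p>The filename handling can also be pulled into a function that is easy to test without TKPROF installed.  This is a minimal sketch under the same assumption as the script, namely that tracefiles are named $ORACLE_SID_ora_$PROCESSNUM.trc with an optional identifier.  The names build_tkprof_jobs and run_jobs are hypothetical, and building an argument list for subprocess.call is a swapped-in alternative to os.system, not what the script above does.</p>

```python
import re
import subprocess

SORT = 'fchqry'         # same sort option as the script above
TKPROF_SUFFIX = 'out'   # same output suffix as the script above


def build_tkprof_jobs(filenames, identifier=None, sort=SORT, suffix=TKPROF_SUFFIX):
    """Return one tkprof argument list per matching tracefile name."""
    jobs = []
    for name in sorted(filenames):
        if not name.endswith('.trc'):
            continue
        match = re.search(r'ora_([0-9]+)', name)
        if match is None:
            # Not in the assumed $ORACLE_SID_ora_$PROCESSNUM.trc form.
            continue
        if identifier and identifier not in name:
            continue
        prefix = identifier + '_' if identifier else ''
        output = '%s%s.%s' % (prefix, match.group(1), suffix)
        jobs.append(['tkprof', name, output,
                     'sys=no', 'waits=yes', 'sort=' + sort])
    return jobs


def run_jobs(jobs):
    """Hypothetical runner: execute each command without a shell."""
    for argv in jobs:
        subprocess.call(argv)
```

<p>From the udump directory, run_jobs(build_tkprof_jobs(os.listdir('.'), 'charging_batch')) would then do roughly what the script does (with import os added).  Passing a list instead of one formatted string means a filename containing spaces or shell characters is handed to tkprof unchanged.</p>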
]]></content:encoded>
			<wfw:commentRss>http://www.neilkodner.com/2009/11/generating-multiple-oracle-tkprof-reports-using-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Process the whole world, or just one item</title>
		<link>http://www.neilkodner.com/2009/11/process-the-whole-world-or-just-one-item/</link>
		<comments>http://www.neilkodner.com/2009/11/process-the-whole-world-or-just-one-item/#comments</comments>
		<pubDate>Thu, 05 Nov 2009 22:26:41 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[plsql]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://www.neilkodner.com/?p=37</guid>
		<description><![CDATA[One of my favorite PL/SQL techniques used in batch job development is to add an extra parameter, default it to NULL, so that I can test a single account at a time. example So the real trick is the NVL in the where clause &#8211; If I supply a value for in_employer_id that&#8217;s the only [...]]]></description>
			<content:encoded><![CDATA[<p>One of my favorite PL/SQL techniques used in batch job development is to add an extra parameter that defaults to NULL, so that I can test a single account at a time.  </p>
<p>Example:</p>
<pre class="brush: sql; title: ; notranslate">
CREATE PROCEDURE bill_employers(in_employer_id IN employer.employer_id%TYPE DEFAULT NULL)
IS
BEGIN
  FOR employer IN (
    SELECT  emp.employer_id
         ,  emp.account_number
         ,  emp.bill_amt
      FROM  employer emp
     WHERE  emp.employer_id = NVL(in_employer_id,emp.employer_id)
  )
  LOOP
    NULL; --Do stuff here
  END LOOP;
END;
</pre>
<p>So the real trick is the NVL in the where clause &#8211; If I supply a value for in_employer_id that&#8217;s the only one that&#8217;ll get processed.  </p>
<pre class="brush: sql; title: ; notranslate">
exec bill_employers(in_employer_id =&gt; 12345);
</pre>
<p>If I don&#8217;t pass a value at all, we&#8217;ll process the whole set because of the NVL &#8211; if in_employer_id is NULL, the predicate becomes emp.employer_id = emp.employer_id, which is true for every row with a non-NULL employer_id.</p>
<pre class="brush: sql; title: ; notranslate">
exec bill_employers;
</pre>
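<p>The same optional-filter pattern can be tried without an Oracle instance.  The sketch below uses Python&#8217;s built-in sqlite3 module, with COALESCE standing in for NVL (SQLite has no NVL).  The table contents are invented for illustration, and the Python function merely borrows the procedure&#8217;s name.</p>

```python
import sqlite3


def make_demo_db():
    """Build an in-memory employer table with made-up rows."""
    conn = sqlite3.connect(':memory:')
    conn.execute(
        "CREATE TABLE employer ("
        "employer_id INTEGER PRIMARY KEY, account_number TEXT, bill_amt REAL)"
    )
    conn.executemany(
        "INSERT INTO employer VALUES (?, ?, ?)",
        [(12345, 'A-1', 100.0), (67890, 'B-2', 250.0)],
    )
    return conn


def bill_employers(conn, in_employer_id=None):
    """Return the rows to process: one employer if an id is given, else all."""
    # COALESCE(?, employer_id) plays the role of NVL(in_employer_id, emp.employer_id):
    # a NULL parameter turns the predicate into employer_id = employer_id.
    return conn.execute(
        "SELECT employer_id, account_number, bill_amt "
        "FROM employer "
        "WHERE employer_id = COALESCE(?, employer_id) "
        "ORDER BY employer_id",
        (in_employer_id,),
    ).fetchall()


if __name__ == '__main__':
    db = make_demo_db()
    print(bill_employers(db, 12345))  # just the one employer
    print(bill_employers(db))         # the whole world
```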
]]></content:encoded>
			<wfw:commentRss>http://www.neilkodner.com/2009/11/process-the-whole-world-or-just-one-item/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Manager: How big is my table?  Me: What do you mean?</title>
		<link>http://www.neilkodner.com/2009/11/manager-how-big-is-my-table-me-what-do-you-mean/</link>
		<comments>http://www.neilkodner.com/2009/11/manager-how-big-is-my-table-me-what-do-you-mean/#comments</comments>
		<pubDate>Tue, 03 Nov 2009 20:33:03 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[dba]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[oracle]]></category>
		<category><![CDATA[plsql]]></category>
		<category><![CDATA[script]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://www.neilkodner.com/?p=3</guid>
		<description><![CDATA[Recently, a data warehouse manager sent me a list of 49 tables; he wants the approximate size of each. For an end-user, this is no easy task. Sure, GUI interfaces such as Toad or OEM will give this information, but not all managers (or developers for that matter) know how to get this information. Furthermore, [...]]]></description>
			<content:encoded><![CDATA[<p>Recently, a data warehouse manager sent me a list of 49 tables; he wanted the approximate size of each.  For an end-user, this is no easy task.  Sure, GUI tools such as Toad or OEM will give this information, but not all managers (or developers, for that matter) know how to find it. </p>
<p>Furthermore, what constitutes table size?  Is it the space occupied by data?  What about indexes?  LOBs?  Oracle Text indexes?  Fortunately, this schema didn&#8217;t contain any LOBs; otherwise I&#8217;d have had to look those sizes up.</p>
<p>Knowing what I do about both the manager and the project, I assumed he wanted size info on these tables to make sure we have enough room to clone them somewhere.  I chose the safest route and opted to produce data + index size for him. </p>
<p>My goal was to write a query that would present the data in a manager-friendly way.  A secondary goal (sounds like I&#8217;ve been playing too much Command &#038; Conquer) would be to provide him with a query so he could get this information on his own.</p>
<p>Here&#8217;s my first cut at the query (explanation follows the SQL):</p>
<pre class="brush: sql; title: ; notranslate">with table_list as (select  table_name
                         ,  owner
                         ,  num_rows
                         ,  avg_row_len
                         ,  last_analyzed
            from  all_tables
           where  owner = 'SCHEMANAME'
             and  table_name in ( 'TABLE1'
                                , 'TABLE2'
                                , 'TABLE3'
...
                                , 'TABLE49'))
select  a.owner &quot;table owner&quot;
     ,  a.table_name &quot;table name&quot;
     ,  avg(ROUND(b.bytes/1024/1024,2)) &quot;table size in mb&quot;
     ,  avg(ROUND(b.bytes/1024/1024/1024,2)) &quot;table size in gb&quot;
     ,  COUNT(DISTINCT c.index_name) &quot;number of indexes&quot;
     ,  SUM(ROUND(d.bytes/1024/1024,2)) &quot;index size in mb&quot;
     ,  SUM(ROUND(d.bytes/1024/1024/1024 , 2)) &quot;index size in gb&quot;
     ,  ROUND((avg(b.bytes) + sum(d.bytes))/1024/1024,2) &quot;table + indexes in mb&quot;
     ,  ROUND((avg(b.bytes) + sum(d.bytes))/1024/1024/1024,2) &quot;table + indexes in gb&quot;
     ,  a.num_rows &quot;number of rows&quot;
     ,  a.avg_row_len &quot;average row length&quot;
     ,  a.last_analyzed &quot;last analyzed&quot;
  from  table_list a
     ,  dba_segments b
     ,  all_indexes c
     ,  dba_segments d
 where  a.table_name = b.segment_name
   and  b.segment_type = 'TABLE'
   and  a.owner = b.owner
   and  a.owner = c.table_owner(+)
   and  c.table_name(+) = a.table_name
   and  c.index_name = d.segment_name(+)
   and  c.owner = d.owner(+)
   and  d.segment_type(+) = 'INDEX'
GROUP BY  a.owner
       ,  a.table_name
       ,  a.num_rows
       ,  a.avg_row_len
       ,  a.last_analyzed
order by a.table_name</pre>
<p>I chose to put the hard-coded schema name and table list in a WITH query only to keep it at the top of the statement &#8211; so that the manager, or anyone else editing this query, could easily see where the editable items are.  </p>
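<p>One detail of the query worth spelling out is the avg() around the table bytes.  Joining a table&#8217;s segment row to its indexes repeats that row once per index, so a plain sum would overcount the table, while the average of identical values gives the size back.  The sketch below reproduces the effect with Python&#8217;s sqlite3 module; the segments and indexes tables and their byte counts are invented stand-ins, not the real dictionary views.</p>

```python
import sqlite3


def make_demo_db():
    """One table segment and three index segments, with made-up sizes."""
    conn = sqlite3.connect(':memory:')
    conn.execute(
        "CREATE TABLE segments (segment_name TEXT, segment_type TEXT, bytes INTEGER)"
    )
    conn.execute("CREATE TABLE indexes (index_name TEXT, table_name TEXT)")
    conn.executemany("INSERT INTO segments VALUES (?, ?, ?)", [
        ('CLAIMS', 'TABLE', 100),
        ('CLAIMS_PK', 'INDEX', 10),
        ('CLAIMS_IX1', 'INDEX', 20),
        ('CLAIMS_IX2', 'INDEX', 30),
    ])
    conn.executemany("INSERT INTO indexes VALUES (?, ?)", [
        ('CLAIMS_PK', 'CLAIMS'),
        ('CLAIMS_IX1', 'CLAIMS'),
        ('CLAIMS_IX2', 'CLAIMS'),
    ])
    return conn


def size_report(conn):
    """Return (table, avg table bytes, sum table bytes, index count, index bytes)."""
    return conn.execute("""
        SELECT t.segment_name,
               AVG(t.bytes),
               SUM(t.bytes),
               COUNT(DISTINCT i.index_name),
               SUM(s.bytes)
          FROM segments t
          LEFT JOIN indexes i
                 ON i.table_name = t.segment_name
          LEFT JOIN segments s
                 ON s.segment_name = i.index_name
                AND s.segment_type = 'INDEX'
         WHERE t.segment_type = 'TABLE'
         GROUP BY t.segment_name
    """).fetchall()


if __name__ == '__main__':
    for row in size_report(make_demo_db()):
        print(row)
```

<p>With three indexes the table row appears three times in the join, so the summed table bytes come out at three times the averaged value, while the index bytes are summed once each.</p>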
<p>The query&#8217;s results look like this:<br />
<img src="http://www.neilkodner.com/images/skitch/table_sizes-20091103-152323.jpg" alt="table size query results" /></p>
<p>So we&#8217;re off to a good start.  But then I wanted to make the process even more user-friendly.  That&#8217;s where the idea of putting this into a stored procedure came into play.  I created a pipelined table function which displays all of the table + index information, now with LOB support.</p>
<p>So instead of running the query above, it&#8217;s just a matter of executing a function call (one per table):</p>
<pre class="brush: sql; title: ; notranslate">
SQL&gt; select * From table(table_info_pkg.table_info('CLAIMS'));

TABLE_OWNER                      TABLE_NAME
-------------------------------- --------------------------------
TOTAL_TABLE_SIZE_IN_MB TOTAL_TABLE_SIZE_IN_GB TABLE_DATA_SIZE_IN_MB
---------------------- ---------------------- ---------------------
TABLE_DATA_SIZE_IN_GB NUMBER_OF_INDEXES INDEX_SIZE_IN_MB INDEX_SIZE_IN_GB
--------------------- ----------------- ---------------- ----------------
LOB_SIZE_IN_MB LOB_SIZE_IN_GB NUMBER_OF_LOBS LOBINDEX_SIZE_IN_MB
-------------- -------------- -------------- -------------------
LOBINDEX_SIZE_IN_GB  ROW_COUNT AVERAGE_ROW_LENGTH LAST_ANAL
------------------- ---------- ------------------ ---------
CUBS_OWNER                       CLAIMS
                     0                                          108
                  .11                14               89              .09

TABLE_OWNER                      TABLE_NAME
-------------------------------- --------------------------------
TOTAL_TABLE_SIZE_IN_MB TOTAL_TABLE_SIZE_IN_GB TABLE_DATA_SIZE_IN_MB
---------------------- ---------------------- ---------------------
TABLE_DATA_SIZE_IN_GB NUMBER_OF_INDEXES INDEX_SIZE_IN_MB INDEX_SIZE_IN_GB
--------------------- ----------------- ---------------- ----------------
LOB_SIZE_IN_MB LOB_SIZE_IN_GB NUMBER_OF_LOBS LOBINDEX_SIZE_IN_MB
-------------- -------------- -------------- -------------------
LOBINDEX_SIZE_IN_GB  ROW_COUNT AVERAGE_ROW_LENGTH LAST_ANAL
------------------- ---------- ------------------ ---------
             0              0              0                   0
                  0     379029                257 01-JUN-09
</pre>
<p>The code for table_info_pkg can be found <a href="http://code.google.com/p/table-info-pkg/">on Google Code</a>, and I welcome any and all feedback, complaints, or anything else you care to throw at me.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.neilkodner.com/2009/11/manager-how-big-is-my-table-me-what-do-you-mean/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

