Friday, January 18, 2013

Function-Based indexes

Indexes in Oracle are like indexes in books and documents. It is simply a way of finding things quicker than looking through the whole thing. But, there can be some tricky little gotchas when it comes to Oracle data and real world scenarios.

I was recently tasked with adding an index to a column in a table. At first glance this is a stupid simple task. Think of the dumbest, most clueless fucktard you know. They could create an index. It is that simple. The confusion and trickiness and why I have a job is knowing when to use an index and how to use an index.

CREATE INDEX index_name ON table_name (column_name [, another_column_name...]);

Not heavy lifting is it? Bro, do you even index?

So, as any good DB dude should do, I ran some queries to get to know the data I was tasked with indexing. So, it turns out that the column I was supposed to index was largely null. This is a problem. Nulls are not included in indexes. How do you index the absence of something? Now, I could go ahead and create an index on the column and all of the non-null values would be indexed appropriately. But, would that be enough? I needed to find out how the column is referenced in the system. I ran a query to look through all of the database objects for references to the column and/or table. I might cover this in another post. I was able to tell that the column was not used in any WHERE clauses in the DB code. I went to an App guy to get perspective on the application and how it interacted with the column. So, it turns out that most of the calls from the App were checking on the NULL status of the column. Well, that's no good. All of those calls would not use our shiny new index and the Application would continue to be dog-ass slow.

Now we get to the meat of this post. Recent versions of Oracle are able to create indexes on function calls on columns. I believe it was added in 8i but wouldn't swear to it. No matter, we are on 11g. The statement I wrote is:

CREATE INDEX FBIDX_MYTAB_NVL_MYCOLUMN ON MYTABLE( NVL(MYCOLUMN, 'some value'));

I named the column FB to indicate Function-Based and IDX so I can know at a glance that it is an Index if I am looking at multiple objects in a list or something. Remember, meaningful names are never a bad idea. The interesting part of this is the NVL function on MYCOLUMN. NVL says, if MYCOLUMN is null, pretend it is 'some value' so I can compare it to other things. So, even when MYCOLUMN is null, NVL(MYCOLUMN, 'some value') is equal to 'some value'. And 'some value' can be indexed. Of course, to fully utilize the Funciton-based index, all of the calls that check the NULL status of the column have to be changed to NVL(MYCOLUMN, 'some value') = 'some value'. But, the bottom line is that the NULL values get indexed and interaction with the table is improved.


Wednesday, January 9, 2013

Letting the Data Dictionary do some of the leg work

Let's say that you have a column that is on several tables and you need to modify the length of that column. One method of doing that would be to figure out which tables have the column and manually write a shit pot load of alter statements to do what needs to be done. Now, let's suppose you are a shade or two brighter than Corky from Life Goes On. You are going to let the data dictionary do the work for you. Oracle comes with a plethora of tables, views, synonyms and what not that is chock fucking full of information. They are broken up based on item type and user access type. If you are curious, check public synonyms in your schema or if you have read access to SYS, check out the views that look like one of the following:

ALL_%
DBA_%
USER_%

Where the % is some string like TABLES, PROCEDURES, etc. Let's focus on USER_% first. You connect to Oracle as a user and the SYS views that begin with USER_ are the ones that you own. They exist in your schema - the one you logged in with anyway. So, USER_TABLES will list the tables you own, USER_PROCEDURES will list the procedures you own, and so on and so forth. Whereas, the SYS views with ALL_ are the ones that you have access to read, write, execute, etc. The DBA's and a lucky few with elevated privileges will have access to DBA_ views and can see everything in the database. Therefore, DBA_TABLES will list every stinking table in the entire goddamn database.

Now, here is an amazing little trick that will save you a Honey Boo Boo's mom sized chunk of work. You can write queries against those tables where the results are SQL commands. Let's say you have 3 procedures entitled MYPROC1, MYPROC2, and MYPROC3. Now let's suppose you want to compile all of them. You would write select 'ALTER PROCEDURE ' || procedure_name || ' COMPILE;'  as alterallmyprocs from USER_PROCEDURES; and execute that bad boy. Your results will look like:

ALTERALLMYPROCS
ALTER PROCEDURE MYPROC1 COMPILE;
ALTER PROCEDURE MYPROC2 COMPILE;
ALTER PROCEDURE MYPROC3 COMPILE;

So, you can paste those results into a sql window, execute as script and Oracle will compile all of your procedures. Pretty handy right? Well, imagine that you have 25 schemas with similar structures and there are 100 tables in each. Every one of those tables has a column called LAST_UPDATER that tracks via the application which user last updated the record. And now, let's suppose that the USERNAME column is getting enlarged so all of these LAST_UPDATER columns would need to expand to contain the enlarged USERNAME. Ignore for the moment that ideally you would be using the corresponding user id instead of the actual username. This is for instructional purposes, capisce? Would you really want to write the 2500 required statements by hand? Well, maybe you are a masochist. Or maybe you are a contractor. Or maybe you like to sandbag and appear industrious while actually slacking. In those cases, and for you Corky's out there, continue to work hard, not smart. I'll be using sql writing sql to get the same result in a fraction of the time. Just for fun, here is the sql writing sql statement to perform the above task:
select 'alter table ' || owner || '.' || table_name || ' modify ( ' || column_name || ' varchar2(256) );' it from all_tab_cols where column_name in ('USERNAME', 'LAST_UPDATER') and owner like '%MYAPP%';

I was recently tasked with some Year End reports. I needed to run a query in every production database for a specific app or set of apps and I used this technique to meet the narrow deadline. Of course the underlying requirement is that I am able to log into a schema created for me but given read access everywhere for purposes just like this.

Of course, with great power comes great responsibility. Use this for the good of all mankind. Do not fuck with people using this. Do not write something like select 'delete from ' || owner || '.' || table_name || ';' it from all_tables where owner = 'ThatDickThatAggravatesTheFuckOutOfMeAndFoolishlyGaveMePrivsOnHisTables';
Or even worse: select 'delete from ' || owner || '.' || table_name || ' where mod(' || column_name || ', 13) = 0;' from all_cons_columns where constraint_name like '%PK%' and column_name like '%ID%' and owner = 'ThatDickThatAggravatesTheFuckOutOfMeAndFoolishlyGaveMePrivsOnHisTables';
The former is going to delete everything from tables that do not get 'Child Record Found' errors but the latter is wicked nefarious. It will delete every row whose primary key id is evenly divisible by 13 and there are no child record conflicts. Sure, I could work out how to eliminate those conflicts but I don't want to make it that easy for you to really fuck with someone.

All kidding aside, properly used, this can save a ton of time and heartache. Start small, pay attention to syntax, and test, test, test!