Monday, February 13, 2006
Using ROWNUM Properly for Pagination
I learned a very interesting thing about Oracle's ROWNUM variable behaves and how it affects queries that have on ORDER BY clause.
My goal was to create a Java Servlet that showed query results in a list format, with links to more pages of results and PREV and NEXT links - much like a search engine. For example, if there were 100 results, and 10 results per page, there would be PREV and NEXT links around links for the individual pages ( 1 2 3 4 5 6 7 8 9 10). Something we've all seen before.
Not seeing any problem with Oracle's ROWNUM, my basic thinking was to pass a START and END to the Servlet, which would dynamically create my queries and they would basically look like this:
For example, the first page results query would be:
The next page would be:
and so on.
What I didn't realize is that ROWNUM cannot be used this way because ROWNUM doesn't behave as you think it does, and it cannot be used reliably with any ORDER BY clause. See below
Let's start by getting a known set of results, in this case there are 10 results for employees with last name "smith". Note the value of ROWNUM
Great! That looks fine ... 1 through 10, they're all in order. Let's try getting just the first 5 results.
Still looks good. In reality though, i want to be able to change how the results are sorted, so i create a dynamic ORDER BY clause. In this example, let me use LOCATION:
Holy cow! Why is my ROWNUM out of order? Anyway, let me see if my ROWNUM filtering clause still works for results 1 - 5. I should expect to see CLARA, JEFF, GERTRUDE, MARYLYNN, and KEITH as my results, in order.
OH NO! This isn't going to work! From my research, ROWNUM is a value that is inherent to that row of data, and *not* a value that is computed on the fly. WHAT A BUMMER!
From some more research, I found a trick that is 100% effective and does not have any performance impact.
What you have to do is embed your original query with NO USE OF ROWNUM in 2 embedded SQL calls
These are the results I expected in the first place when re-ordering by location.
And as an aside, MYSQL offers the LIMIT BY clause which works with the behavior you'd expect, without having to jump through hoops to get the subset you want.
Example:
This would return me exactly what I want. Makes you wonder why ORACLE is so expensive and doesn't offer this seemingly intuitive functionality.
My goal was to create a Java Servlet that showed query results in a list format, with links to more pages of results and PREV and NEXT links - much like a search engine. For example, if there were 100 results, and 10 results per page, there would be PREV and NEXT links around links for the individual pages ( 1 2 3 4 5 6 7 8 9 10). Something we've all seen before.
Not seeing any problem with Oracle's ROWNUM, my basic thinking was to pass a START and END to the Servlet, which would dynamically create my queries and they would basically look like this:
SELECT * from table where rownum between <START> and <END>
For example, the first page results query would be:
SELECT * from table where rownum between 1 and 5
The next page would be:
SELECT * from table where rownum between 6 and 10
and so on.
What I didn't realize is that ROWNUM cannot be used this way because ROWNUM doesn't behave as you think it does, and it cannot be used reliably with any ORDER BY clause. See below
Let's start by getting a known set of results, in this case there are 10 results for employees with last name "smith". Note the value of ROWNUM
SQL> select first_name, last_name, rownum
from employee
where last_name = 'SMITH';
FIRST_NAME LAST_NAME ROWNUM
--------------- ------------------------------ ----------
CLARA SMITH 1
DEBORAH SMITH 2
GERTRUDE SMITH 3
JEFF SMITH 4
KEITH SMITH 5
KENN SMITH 6
MARK SMITH 7
MARYLYNN SMITH 8
MATTHEW SMITH 9
NICOLE SMITH 10
10 rows selected.
Great! That looks fine ... 1 through 10, they're all in order. Let's try getting just the first 5 results.
SQL> select first_name, last_name, rownum
from employee
where rownum between 1 and 5 and
last_name = 'SMITH';
FIRST_NAME LAST_NAME ROWNUM
--------------- ------------------------------ ----------
CLARA SMITH 1
DEBORAH SMITH 2
GERTRUDE SMITH 3
JEFF SMITH 4
KEITH SMITH 5
5 rows selected.
Still looks good. In reality though, i want to be able to change how the results are sorted, so i create a dynamic ORDER BY clause. In this example, let me use LOCATION:
SQL> select first_name, last_name, location, rownum
from employee
where last_name = 'SMITH'
order by location;
FIRST_NAME LAST_NAME LOC ROWNUM
--------------- ---------------------------- ----- ----------
CLARA SMITH 211 1
JEFF SMITH 329 6
GERTRUDE SMITH 335 4
MARYLYNN SMITH 335 8
KEITH SMITH 348 5
MARK SMITH 349 7
MATTHEW SMITH 450 9
NICOLE SMITH 450 10
DEBORAH SMITH 813 3
KENN SMITH 847 2
10 rows selected.
Holy cow! Why is my ROWNUM out of order? Anyway, let me see if my ROWNUM filtering clause still works for results 1 - 5. I should expect to see CLARA, JEFF, GERTRUDE, MARYLYNN, and KEITH as my results, in order.
SQL> select first_name, last_name, location, rownum
from employee
where rownum between 1 and 5 and
last_name = 'SMITH'
order by location;
FIRST_NAME LAST_NAME LOC ROWNUM
--------------- ------------------------------ ----- ----------
CLARA SMITH 211 1
GERTRUDE SMITH 335 4
KEITH SMITH 348 5
DEBORAH SMITH 813 3
KENN SMITH 847 2
5 rows selected.
OH NO! This isn't going to work! From my research, ROWNUM is a value that is inherent to that row of data, and *not* a value that is computed on the fly. WHAT A BUMMER!
From some more research, I found a trick that is 100% effective and does not have any performance impact.
What you have to do is embed your original query with NO USE OF ROWNUM in 2 embedded SQL calls
SQL> select * from (
SELECT x.*, rownum as r FROM (
select first_name, last_name, location, rownum
from employee
where last_name = 'SMITH'
order by location
) x
)
where r between 1 AND 5;
FIRST_NAME LAST_NAME LOC ROWNUM R
--------------- ------------------------------ ----- ---------- ----------
CLARA SMITH 211 1 1
JEFF SMITH 329 6 2
GERTRUDE SMITH 335 4 3
MARYLYNN SMITH 335 8 4
KEITH SMITH 348 5 5
5 rows selected.
These are the results I expected in the first place when re-ordering by location.
And as an aside, MYSQL offers the LIMIT BY clause which works with the behavior you'd expect, without having to jump through hoops to get the subset you want.
Example:
select first_name, last_name, location, rownum
from employee
where
rownum between 1 and 5 and
last_name = 'SMITH'
LIMIT BY 1, 5
order by location;
This would return me exactly what I want. Makes you wonder why ORACLE is so expensive and doesn't offer this seemingly intuitive functionality.
Test Well-Formedness of XML with Perl
Sometimes you run into a problem where a tool that you are using reports an error with an XML file, but doesn't report the exact location. When this happens you need to use some other tool to text the structure of the XML file, and report the exact location of the error. When you don't have a tool available for this task, you can use Perl at the command line.
This will read the file config.xml, and report any errors that it finds. If no errors are found, it will return nothing.
Sample output of the command, reporting the error at the 6th character of line 36:
This command takes it a little further. Not only will it show the location of the error, but it will also print out debugging information for the parser, which may be useful on some cases.
This will read the file config.xml, and report any errors that it finds. If no errors are found, it will return nothing.
perl -MXML::Parser -e '$x = XML::Parser->new;$x->parsefile($ARGV[0])' config.xml
Sample output of the command, reporting the error at the 6th character of line 36:
mismatched tag at line 32, column 6, byte 1078
at /usr/lib/perl5/vendor_perl/5.8/cygwin/XML/Parser.pm line 187
This command takes it a little further. Not only will it show the location of the error, but it will also print out debugging information for the parser, which may be useful on some cases.
perl -MXML::Parser -e '$x = XML::Parser->new(Style=>"Debug");$x->parsefile($ARGV[0])' config.xml
Sunday, February 12, 2006
Get Shared Windows Network Resources with Perl
The following code can be run on Windows using ActiveState's Perl. It will print data about each shared network resources on the machine.
Sample Output:
See also: Perldoc for Win32::NetResource
use Win32::NetResource qw(:DEFAULT GetSharedResources);
if (GetSharedResources(my $resources, RESOURCETYPE_ANY,
{RemoteName => "\\\\" . Win32::NodeName()}))
{
foreach my $href (@$resources) {
print "-----\n";
foreach (keys %$href) {
print "$_: $href->{$_}\n";
}
}
}
Sample Output:
C:\>perl getresources.pl
-----
LocalName:
Scope: 2
DisplayType: 3
Type: 1
Provider: Microsoft Windows Network
Usage: 1
Comment:
RemoteName: \\RHANSON\outgoing
See also: Perldoc for Win32::NetResource
Saturday, February 11, 2006
Replace Text In Multiple Files With Perl
You can use Perl at the command line (or even a short script) to make text replacements in a list of files. The key is to use three of Perl's command line switches.
These three switches combined will open each file that is passed as an argument, execute your code for each line in the file, and replace the original file with the output of rhe code. The -i switch takes an optional extension which it will use to create backups of the original (recommended!).
Replace all occurences of "foo" with "bar"
Fix all href attributes in all HTML files to use "http://www.foo.com" instead of "http://www.bar.com", recursively.
See perldoc perlrun for more information on Perl's command line switches.
-i[extension] In place editing of files.
-e 'code block' Evaluates the specified code.
-p Loops over code for each argument, prints $_.
These three switches combined will open each file that is passed as an argument, execute your code for each line in the file, and replace the original file with the output of rhe code. The -i switch takes an optional extension which it will use to create backups of the original (recommended!).
Replace all occurences of "foo" with "bar"
perl -pi.bak -e 's/foo/bar/g' *
Fix all href attributes in all HTML files to use "http://www.foo.com" instead of "http://www.bar.com", recursively.
# broken into multiple lines for readability
perl -pi.bak
-e 's!(href=(["']))http://www.foo.com\2!$1http://www.bar.com!g'
`find . -name \*.html`
See perldoc perlrun for more information on Perl's command line switches.
Thursday, February 09, 2006
Text Wrapping Function for Java
Note that the Jakarta commons-lang package has similar functionality. I would only use this were using commons-lang was not an option.
This function takes a string value and a line length, and returns an array of lines. Lines are cut on word boundaries, where the word boundary is a space character. Spaces are included as the last character of a word, so most lines will actually end with a space. This isn't too problematic, but will cause a word to wrap if that space pushes it past the max line length.
This function takes a string value and a line length, and returns an array of lines. Lines are cut on word boundaries, where the word boundary is a space character. Spaces are included as the last character of a word, so most lines will actually end with a space. This isn't too problematic, but will cause a word to wrap if that space pushes it past the max line length.
String text = "this is a long line of text that needs to be wrapped";
String [] lines = wrapText(text, 20);
for (int i = 0; i < lines.length; i++) {
System.out.println(lines[i]);
}
static String [] wrapText (String text, int len)
{
// return empty array for null text
if (text == null)
return new String [] {};
// return text if len is zero or less
if (len <= 0)
return new String [] {text};
// return text if less than length
if (text.length() <= len)
return new String [] {text};
char [] chars = text.toCharArray();
Vector lines = new Vector();
StringBuffer line = new StringBuffer();
StringBuffer word = new StringBuffer();
for (int i = 0; i < chars.length; i++) {
word.append(chars[i]);
if (chars[i] == ' ') {
if ((line.length() + word.length()) > len) {
lines.add(line.toString());
line.delete(0, line.length());
}
line.append(word);
word.delete(0, word.length());
}
}
// handle any extra chars in current word
if (word.length() > 0) {
if ((line.length() + word.length()) > len) {
lines.add(line.toString());
line.delete(0, line.length());
}
line.append(word);
}
// handle extra line
if (line.length() > 0) {
lines.add(line.toString());
}
String [] ret = new String[lines.size()];
int c = 0; // counter
for (Enumeration e = lines.elements(); e.hasMoreElements(); c++) {
ret[c] = (String) e.nextElement();
}
return ret;
}
Load XML File With JAXP
The following is a short example of loading a file using JAXP.
import javax.xml.parsers.*;
import org.xml.sax.*;
import org.w3c.dom.*;
public class Example {
static Document document;
public static void main (String[] ARGS) {
try {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
document = db.parse("./BAKLO211.xml");
}
catch (Exception e) {
e.printStackTrace(System.err);
}
}
}
Loading XML File With Xerces
Here is a simple example of using Xerces to load an XML document into the DOM.
import org.apache.xerces.parsers.*;
import org.w3c.dom.traversal.*;
import org.w3c.dom.*;
public class Example {
static Document document;
public static void main (String[] ARGS) {
try {
DOMParser parser = new DOMParser();
parser.parse("./sample.xml");
document = parser.getDocument();
}
catch (Exception e) {
e.printStackTrace(System.err);
}
}
}
Supress SQL*Plus Formatting
When you want to dump data using SQL*Plus to a file you can use the spool command. This also writes all of the special formatting, including column headers ever 10 lines or so. Often you just want the data, and not the extra stuff. The following is a set of SQL*Plus commands that I usually use to surpress all of the extra formatting, and set a column seperator.
set newpage 0
set space 0
set linesize 80
set pagesize 0
set echo off
set feedback off
set heading off
set colsep |
IsText and IsBinary Functions for VBScript
The following functions determine if a file is text or binary, returing a true or false value. The results are determined by reading the first 512 characters in the file, and if less than 1/3 of the characters seem binary it is flagged as a text document. The algorythm comes from Perl's implementation of this function. A null character that is found will always indicates a binary file. Any character outside of the ASCII range 32-127 (except 8, 9(tab), 10(lf), 12, 13(cr), 27) are considered binary-like. This function will not properly handle UTF-8 encoded files.
WScript.Echo(CStr(IsText("foo.xls")))
WScript.Echo(CStr(IsBinary("foo.xls")))
Function IsBinary(strCheckFileName)
IsBinary = Not IsText(strCheckFileName)
End Function
Function IsText(strCheckFileName)
Dim testFile, fileSpec, len, i, buf, char, odd
odd = 0
Set fso = CreateObject("Scripting.FileSystemObject")
On Error Resume Next
Set testFile = fso.OpenTextFile(strCheckFileName, 1, False, 0)
If Err.Number <> 0 Then
WScript.Echo "Unable to open file. Error: " & Err.Description
Err.Clear
IsText = False
Exit Function
End If
Set fileSpec = fso.GetFile(strCheckFileName)
len = fileSpec.Size
' read a max of 512 bytes
If (len > 512) Then
len = 512
End If
buf = testFile.Read(len)
If Err.Number <> 0 Then
WScript.Echo "Unable to read file. Error: " & Err.Description
Err.Clear
IsText = False
Exit Function
End If
For i = 1 To len
char = Asc(Mid(buf, i, 1))
If char = 0 Then
' text can't contain nulls
odd = len
Exit For
ElseIf char > 127 Then
odd = odd + 1
ElseIf char < 32 _
And char <> 8 And char <> 9 And char <> 10 _
And char <> 12 And char <> 13 And char <> 27 Then
odd = odd + 1
End If
Next
' allow for up to 1/3 odd
If (odd * 3) > len Then
IsText = False
Else
IsText = True
End If
testFile.Close
Set fso = Nothing
End Function
Automate Perl Module Installation
This utility allows you to automate the installation of a set of Perl modules. List the modules in the @modules array and run the script. It uses the CPAN module to download and install the modules, as well as requisite modules.
#!/usr/bin/perl
use CPAN;
my @modules = qw(
HTML::Template
XML::EasyObj
);
# install needed modules
for my $mod (@modules) {
print "Checking module '$mod' ...\n";
my $obj = CPAN::Shell->expand('Module',$mod);
$obj->install;
}
Print Outlook Email using Perl/OLE
This little code snippet is a short example of how Perl can be used to read mails in Outlook using the OLE interface. This particular example looks for a specific mailbox name, then seeks some sub folders by name, then prints out some information about the mails in that folder.
use strict;
use Win32::OLE;
# use existing instance if Outlook is already running, or launce a new one
my $ol;
eval {$ol = Win32::OLE->GetActiveObject('Outlook.Application')};
die "Outlook not installed" if $@;
unless (defined $ol) {
$ol = Win32::OLE->new('Outlook.Application', sub {$_[0]->Quit;})
or die "Oops, cannot start Outlook";
}
my $mailbox = seekFolder($ol->Session, 'Mailbox - Hanson, Robert');
my $inbox = seekFolder($mailbox, 'Inbox');
my $folder = seekFolder($inbox, 'Perl Class');
for (my $i = 1; $i <= $folder->Items->Count; $i++) {
print $folder->Items->Item($i)->SenderName . ":";
print $folder->Items->Item($i)->Subject . "\n";
print $folder->Items->Item($i)->Body . "\n";
}
sub seekFolder {
my $obj = shift;
my $target = shift;
for (my $i = 1; $i <= $obj->Folders->Count; $i++) {
if ( $obj->Folders->Item($i)->Name eq $target ) {
return $obj->Folders->Item($i);
}
}
}
Load XML file with JDOM
Here is a quick template for using the JDOM to load an XML file. I like JDOM over other DOM models because it is easy to use, takes less code, and is easy to read once you are done.
import java.io.File;
import java.io.IOException;
import org.jdom.Document;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;
public class SomeClass
{
public SomeMethod ()
{
Document doc = null;
SAXBuilder sb = new SAXBuilder();
try {
doc = sb.build(new File("/path/to/some/file.xml"));
}
catch (JDOMException e) {
e.printStackTrace();
}
catch (IOException e) {
e.printStackTrace();
}
}
}