building block: using captive searches


NB: This document, like others in the "building blocks" series, assumes as part of its design that you will make frequent use of the "view source" feature in your browser to see how things have been done.

What are captive searches?

Captive or embedded searches are a useful species of hyperlink in situations where you need to retrieve a range of resources (say, a number of judicial opinions on a common topic) or where the body of material from which you want to retrieve is expected to change over time -- in other words, in those situations where a dynamic search retrieving a list of documents is more useful than a static link to a single case or document. The idea is to construct an ordinary link which, when clicked on, permits the student or user to launch a predetermined search of some body of material. Here's an example which gets recent Supreme Court cases related to bankruptcy.

For example, a law teacher constructing a Web page for a constitutional law course might well want to point to all recent cases of the US Supreme Court where the term "civil rights" appears in the syllabus -- and would also want to be certain that any new cases decided by the court will be captured as they are added to the corpus. We use this technique often in the construction of our topical pages, where we need links which will capture (say) every bankruptcy decision of the New York Court of Appeals, and continue to do so even as our collection grows.

You can puzzle out how to construct such a link for almost any searchable collection of data; some collections are easier than others. A few providers like the LII have constructed interfaces and engines to make this task easier.

LII services, collections, and helpers

The LII provides you with two kinds of help in constructing captive searches. Our own collections are designed to make it easy to build links which reach into our search engines. We have also designed software "helpers" to make construction of captive searches for some non-LII collections (such as the CFR at the House of Representatives Law Library) easier for you to build.

In no particular order, here are instructions for several important collections here and elsewhere. If your favorite isn't in the list, we provide help for you in rolling your own captive search.

LII collections and services:

Decisions of the US Supreme Court, current and historic, at the LII and elsewhere

Decisions of the New York Court of Appeals

LII Topical Pages

The United States Code

The Uniform Commercial Code

The Federal Rules of Evidence

The Federal Rules of Civil Procedure

US Administrative Procedure Act

Introduction to Basic Legal Citation

Civil Rights statutes of the United States

US Copyright Act, Berne Convention, and selected cases

US Patent Act and selected cases

Lanham Act and selected cases

GATT 1994

Legal Ethics materials

Securities Act of 1933

Securities Exchange Act of 1934

Helpers for non-LII collections

Decisions of the US Supreme Court (multiple sources)

Code of Federal Regulations (House of Representatives)

Decisions of the US Circuit Courts of Appeal (multiple providers)

Geek Techneek: rolling your own

Basics:

Figuring out how to make your own captive searches on a new collection isn't all that difficult. Most of the time, it's a matter of looking at whatever form the collection uses to enter search terms, and moving the information into appropriate syntax. Here's a simple example, similar to what we might see if we selected View/Page source while looking at the search screen for a caselaw collection:


<FORM METHOD="POST" ACTION="http://www.wdist.ecentral.gov/cgi-bin/getcases.cgi">
<INPUT TYPE="hidden" NAME="what_db" VALUE="caselaw">
Enter your search terms:<BR>
<INPUT TYPE="text" SIZE="60" NAME="user_query">
<INPUT TYPE="submit">
</FORM>

The trick, of course, is to build a static URL which incorporates all the information which a browser would send to the search engine if the user had filled in the form and pressed the 'submit' button. This is actually pretty easy to do, if you know the format in which the browser is supposed to submit the information. The URL for an entire query should consist of:

the URL appearing in the "ACTION" attribute of the FORM tag, followed by
a question mark, followed by
"name=value" pairs representing the values of all the fields (straightforward in our example but a little trickier when you start dealing with things like radio buttons, SELECT lists, and checkboxes), separated by ampersands (&).

So in our example, a captive link to search on the term 'bankruptcy' would look like this:

<A HREF="http://www.wdist.ecentral.gov/cgi-bin/getcases.cgi?what_db=caselaw&user_query=bankruptcy">
My captive query.
</A>
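
The recipe above can also be sketched in a few lines of Python, which may be handy if you have many captive links to generate. This is only an illustration: build_captive_url is a hypothetical helper name, not an LII tool, and the sketch assumes the search engine will accept its fields as part of the URL.

```python
from urllib.parse import urlencode

def build_captive_url(action, fields):
    """Join a form's ACTION URL and its name=value pairs into one link.

    `fields` is a list of (name, value) tuples, one per form field.
    """
    # urlencode joins the pairs with '&' and escapes characters that are
    # not safe in a URL; spaces come out as '+'.
    return action + "?" + urlencode(fields)

url = build_captive_url(
    "http://www.wdist.ecentral.gov/cgi-bin/getcases.cgi",
    [("what_db", "caselaw"), ("user_query", "bankruptcy")],
)
print(url)
```

The printed URL is what goes inside the HREF attribute of the link.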

Fine points involving search terms

Note that values are not quoted (in other words, you don't say user_query="bankruptcy"). The astute reader is doubtless thinking that there has to be some special syntax for indicating multiple terms, and the astute reader is, of course, right. To capture a search for "civil rights", you'd say user_query=civil+rights. Other problems are presented by nonalphabetic characters in search strings -- even things like parentheses and forward slashes, which show up pretty commonly with search engines which support booleans and proximities. These get encoded with hexadecimal escapes (aren't you glad you have that ASCII table handy?). The sort-of-Lexis-ish query:

(lawyers w/2 guns) and money

becomes

%28lawyers%20w%2F2%20guns%29+and+money

To be perfectly honest, I'm not sure exactly when you'd use a %20 escape to replace a space, and when you'd use the + character; the above represents a guess. It may well vary by search engine, and some experimentation is in order.
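
One way to take some of the guesswork out of the escaping is to let a library do it. Here is a minimal sketch using Python's standard urllib.parse module: quote_plus follows the convention browsers use when submitting forms (spaces become +), while quote writes spaces as %20. Whether a particular search engine accepts both forms is still something to test.

```python
from urllib.parse import quote, quote_plus

query = "(lawyers w/2 guns) and money"

# Form-style encoding: spaces become '+', other unsafe characters become %XX.
form_style = quote_plus(query)

# Percent-only encoding: spaces become '%20'.
# safe="" makes sure the '/' is escaped as well.
percent_style = quote(query, safe="")

print(form_style)     # %28lawyers+w%2F2+guns%29+and+money
print(percent_style)  # %28lawyers%20w%2F2%20guns%29%20and%20money
```

If one style fails against a given engine, try the other before concluding that the query itself is at fault.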

Hijacking more sophisticated forms

The preceding examples show how simple it is to "take over" forms which only use simple INPUT tags and build captive URLs from them. Other kinds of fields are a little harder to deal with, but not much. For a group of radio buttons, you send the name of the group plus the value of the button which is checked, e.g. radiogroup=my_value. For a checkbox, send nothing if the checkbox isn't checked; if it is, send its value, which is "on" when the checkbox has no VALUE attribute of its own, e.g. checked_checkbox=on. For SELECT fields, send the name of the SELECT field and the value of the option to be selected, e.g.:

      <SELECT name="my_select">
      <OPTION value="1"> One
      <OPTION value="2"> Two
      </SELECT>

would yield my_select=1 if we were trying to force the first option.
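
Putting these rules together, here is a rough sketch of how the query string for a more complicated form might be assembled. The field names are the illustrative ones used above (plus a hypothetical unchecked_checkbox), not fields from any real search form.

```python
from urllib.parse import urlencode

# Each entry is (field name, value to send). A value of None means
# "leave this field out", which is how an unchecked checkbox behaves.
fields = [
    ("radiogroup", "my_value"),     # radio group: value of the chosen button
    ("checked_checkbox", "on"),     # checked checkbox
    ("unchecked_checkbox", None),   # unchecked checkbox: nothing is sent
    ("my_select", "1"),             # SELECT: value of the chosen OPTION
]

# Drop the omitted fields, then join the rest as name=value pairs with '&'.
query = urlencode([(name, value) for name, value in fields if value is not None])
print(query)  # radiogroup=my_value&checked_checkbox=on&my_select=1
```

Appending this string to the form's ACTION URL after a question mark gives the captive link.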
