Mounting and markup of Supreme Court decisions


Overview: the life cycle of an opinion

The opinions issued under the auspices of Project Hermes are bench opinions, the text of the opinion as it is handed down from the bench on the day of release by the Court. Those not familiar with the legal publication process are often surprised to learn that this is not the final word.

A few days after the bench opinion is issued through Hermes, the Court makes available a "slip opinion", so called because originally these hot-off-the-press opinions appeared on "slips" which could be inserted in printed volumes until the new, bound version of the opinions appeared. The slip opinion may contain corrections or minor amendments made by the court after the bench opinion is issued. These may be more or less significant.

What is certain is that this slip opinion is the first version of an opinion with full information allowing it to be cited in other legal proceedings, under current citation norms. Significantly, its numbered pages provide the "official" way to refer to a particular portion of a decision. The court does not distribute slip opinions through Hermes, but instead via its dialup BBS.

After release, Supreme Court decisions begin to appear in numerous commercial versions, print and electronic. Many of them have their own parallel citation systems. Case citations that include "S.Ct." or "L.Ed.2d" point to widely used commercial editions.

Sometime between 12 and 18 months after the electronic versions are handed down, the official version of the opinion is distributed in printed form in the US Reports. Once a decision is published in the US Reports, its volume and page number in that series ( __ U.S. __ ) become its permanent or archival citation reference. This version is, for all intents and purposes, the last word on the subject from the Court.

However, other things may act over time to change the meaning of what appears in an opinion. For example, opinions frequently cite sections of the United States Code or of the Code of Federal Regulations, two compilations of laws which change with amazing rapidity. What was Section 108 of Title 5 of the US Code at the time an opinion was issued may not be Section 108 any longer, or its wording and effect may have substantially changed.


The HERMES process: what happens before we get an opinion

HERMES opinions are released simultaneously to the entire list of subscribers, and transmitted individually to subscriber sites using the UUCP protocol. Thus, all subscribers receive their transmissions as close to simultaneously as possible. Typically, the entire transmission takes place between 10 a.m. and noon on the day of release. Because we wait for the entire transmission from the court before we convert and publish any single document in the "batch" for the day, our release may be a few minutes behind commercial services like WESTLAW and LEXIS. Those who receive our e-mail bulletin version may have to wait still longer, both because it receives editorial approval by an actual LII staffer before it is released and because the subscriber list is large enough that, even at the speed of e-mail, delivery takes a significant amount of time.


Conversion and markup: what happens after we get an opinion

Normally, we wait for the entire day's transmission from HERMES before beginning any conversion. This delays matters slightly, but not more than a few minutes. Once the entire transmission is received, a number of things are done to each file sent by the court:

the file is converted from WordPerfect 5.1 to an intermediate, HTML-ish format using our custom version of WP2X;

the intermediate format is converted to a final HTML format using a perl script (cleanup.pl or cleanup-etc.pl) which does the final formatting, extraction of cites, and addition of metainformation;

a "cluster document" is constructed; it contains links to the syllabus, opinion, dissents (if any), concurrences (if any), WordPerfect versions, and so on;

if the document is an order list, a pointer to it is added in our monthly compilation of order lists; if an opinion, a pointer is added in the monthly listing of most-recent cases;

the file is indexed to make it full-text searchable;

in a final, manual step, we construct a PGP digital signature which can be used (up to a point) to guarantee the authenticity of the WordPerfect version of the opinion.

Needless to say, this conversion process is not perfect. It has to handle a wide variety of input formats, and is thrown off by variations in the material sent by the court. We try to correct any such markup problems manually as quickly as possible. Also, you should bear in mind that automated links have inevitable problems of context and "freshness". We mark up US Code cites literally -- a cite in the opinion to Section 109 of Title 10 will point to that section as it currently appears on our server, which may or may not be the version the author of the opinion was looking at when the opinion was written.
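To make the "literal" markup of US Code cites concrete, here is a minimal Python sketch of the idea. It is not our cleanup script: the regular expression, the function name link_uscode_cites, and the URL layout in USCODE_URL are all assumptions made for illustration. Note that the link is built from the title and section numbers alone, which is exactly why it points at the current text of the section rather than the text the author saw.

```python
import re

# Assumed URL layout for the current text of a US Code section;
# the real layout on any given server may differ.
USCODE_URL = "https://www.law.cornell.edu/uscode/text/{title}/{section}"

# Matches cites such as "10 U.S.C. §109" or "5 U. S. C. § 108".
USC_CITE = re.compile(r"\b(\d+)\s+U\.\s?S\.\s?C\.\s+§\s?(\d+[a-z]?)")


def link_uscode_cites(text: str) -> str:
    """Wrap each recognised US Code cite in an anchor to the current section."""

    def to_anchor(match: re.Match) -> str:
        url = USCODE_URL.format(title=match.group(1), section=match.group(2))
        return f'<A HREF="{url}">{match.group(0)}</A>'

    return USC_CITE.sub(to_anchor, text)


if __name__ == "__main__":
    print(link_uscode_cites("See 10 U.S.C. §109 and 5 U. S. C. § 108."))
```

A pattern this simple will miss many cite forms (ranges, subsections, "et seq."), which is one reason automated markup needs manual correction.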


Hermes naming conventions

At the moment docket numbers provide the only unique identifiers for opinions as they are issued by the court, and so we retain them as the basis of our naming scheme. A document distributed by the Hermes system will have a filename based on the docket number, for instance:

Base file name    Significance

95-1234           WordPerfect version of some document in the case with docket number 95-1234
95-1234A          ASCII version of a document in case 95-1234
9ORIG             Document in an "original jurisdiction" case. These are comparatively rare.

The file extension indicates the type of document:

File extension (e.g. 95-1234.ZZ)    Significance

.ZS                                 Syllabus (synopsis) of the opinion
.ZO                                 Main opinion
.ZC                                 Concurrence
.ZX                                 Concurrence in part, dissent in part
.ZD                                 Dissent
.ZE                                 Decree
.ZPC                                Per-curiam opinion
.ZOR                                Monday order list
.ZR                                 Irregular order list
.ZA                                 Attachment to Monday order list
.ZO1, .ZC2, .ZD4, .ZX1, etc.        Indicates one of a series, for example where one opinion gives rise to multiple concurrences: ZC, ZC1, ZC2, ZC3, etc.
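The conventions above are regular enough to parse mechanically. The following Python sketch shows one way to do it; the names HermesName and parse_hermes_name are hypothetical, and the sketch assumes that a trailing "A" on a docket-number base always marks the ASCII version and that filenames can safely be upper-cased before matching.

```python
import re
from typing import NamedTuple, Optional

# Extension codes as listed in the conventions above.
DOCUMENT_TYPES = {
    "ZS": "Syllabus",
    "ZO": "Main opinion",
    "ZC": "Concurrence",
    "ZX": "Concurrence in part, dissent in part",
    "ZD": "Dissent",
    "ZE": "Decree",
    "ZPC": "Per-curiam opinion",
    "ZOR": "Monday order list",
    "ZR": "Irregular order list",
    "ZA": "Attachment to Monday order list",
}


class HermesName(NamedTuple):
    base: str               # docket number, or a special base such as "9ORIG"
    is_ascii: bool          # True when the base carried a trailing "A"
    doc_type: str           # human-readable meaning of the extension
    series: Optional[int]   # 2 for ".ZC2"; None when no number is present


NAME = re.compile(r"^(?P<base>[^.]+)\.(?P<code>Z[A-Z]+)(?P<series>\d*)$")
DOCKET = re.compile(r"^(\d+-\d+)(A?)$")


def parse_hermes_name(filename: str) -> HermesName:
    """Split a Hermes filename into base, ASCII flag, document type, and series."""
    match = NAME.match(filename.upper())
    if match is None or match.group("code") not in DOCUMENT_TYPES:
        raise ValueError(f"not a recognised Hermes filename: {filename!r}")
    base = match.group("base")
    docket = DOCKET.match(base)
    is_ascii = bool(docket and docket.group(2))
    if docket:
        base = docket.group(1)  # drop the trailing "A", if any
    series = int(match.group("series")) if match.group("series") else None
    return HermesName(base, is_ascii, DOCUMENT_TYPES[match.group("code")], series)


if __name__ == "__main__":
    for name in ("95-1234.ZS", "95-1234A.ZO", "95-1234.ZC2", "9ORIG.ZE"):
        print(name, parse_hermes_name(name))
```

Unrecognised extensions raise an error rather than being guessed at, since the set of codes has changed over the life of the project.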


The Hermes "back list"

Case Western Reserve University has been a participant in the Hermes distribution program from the beginning, and it is from them that we get the "back list" of opinions distributed via Hermes before our own subscription was activated on January 1, 1997. The back list has its idiosyncrasies, including missing cases (which may have disappeared for a variety of reasons) and variations in format which have occurred over the lifetime of the Hermes project. Of these format changes, the "Continental Divide" is represented by the court's move from an ATEX system to WordPerfect 5.1 sometime in June of 1991. We aren't terribly sure what happened at this point; there is some evidence that the files from this period at CWRU represent an attempt to convert ATEX to XyWrite format which was only partially successful. To the extent possible, we've converted these opinions to HTML using software of our own devising, but there are some problems with this approach, notably with the appearance of case names and HTML document titles; some other things which appear odd (underlining of all occurrences of the names of the Justices, for example) may be conventions which were in use by the court at the time.


Our use of PGP: how it helps you and what it means

Soon after the first legal document was mounted on the Net, lawyers quite naturally began worrying about the accuracy and reliability of legal texts distributed by this means. Our use of PGP signatures is a species of guarantee regarding the accuracy of what we have put on our site. Briefly, a PGP digital signature is a kind of notarization of the document; you can use it to determine that what you have is what we originally received and placed on our server. It works by using mathematical functions to transform the content of the document into a string of digits which is unique for that content; word-processing files which are different in any way generate different signatures (and, conversely, files which are the same always generate the same signature).

The digital signature files are created using PGP at the point where we receive the documents in question. If the file is subsequently altered in any way, the altered file will not generate the same PGP signature, and thus a comparison of the two signatures will reveal the alteration. Specifically, our signatures guarantee that:

documents in our collection and issued by the court prior to January 14, 1997, are as we downloaded them from the CWRU Hermes archive or from the Court's BBS;

documents after that date are as we received them from the court via Hermes transmission or from the Court's BBS.

You are doubtless thinking that all this would be scads more useful if the court were to create the signatures prior to transmission. We agree. All we can do is provide this "notarization" for the documents as we receive them.
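The "same content, same string of digits" property described above can be illustrated with an ordinary message digest. The Python sketch below uses SHA-256 from the standard library purely as a stand-in: it is not PGP and not the procedure we use. A real PGP signature additionally ties the digest to the signer's key, so it must be checked with a PGP tool and our public key. The function name fingerprint and the sample byte strings are made up for the example.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 digest of the content as a hexadecimal string."""
    return hashlib.sha256(content).hexdigest()


# Sample data standing in for a word-processing file.
original = b"Opinion text as received"
identical_copy = b"Opinion text as received"
altered_copy = b"Opinion text as received."  # one extra character

# Identical content always yields the same digest ...
assert fingerprint(original) == fingerprint(identical_copy)
# ... while even a one-character change yields a different one.
assert fingerprint(original) != fingerprint(altered_copy)

if __name__ == "__main__":
    print(fingerprint(original))
```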


CGI scripts: how to talk to our search engines

We made some tools for the Supreme Court collection which may be of use to others. Feel free to use them to enhance your pages; they're meant to help you build on what we've done. Neophyte web page designers (and maybe some not-so-neo ones as well) may want to take a look at our tutorial on using captive searches, which offers some context.

Using our engine to find Supreme Court opinions by cite

We use a small script to translate US Reports cites into actual document locations on the Net. Someday we'll do this with an actual URC server, but for now we take the cite and return a 'choice-of-viewing' page which permits the user to select a collection in which to view the opinion. Users can also enter preferences about which collection to use. Example syntax for doing this in your own documents might look like:

<A HREF="https://www.law.cornell.edu/supremecourt/text/467/883">467 US 883</A>

if you were after 467 U.S. 883. In other words, the search engine takes the volume and page number of the US Reports cite as input and returns a choice of locations for the case. Note that this only works with US Reports cites, which are not typically assigned to an opinion for eighteen months or more after the bench opinion is handed down. The only identifier which "sticks to" a case from the day of issue is the docket number; see our explanation of file naming conventions to see how you might use the docket number as a means of addressing a particular opinion, dissent, etc.
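If you generate pages programmatically, the same link can be built from the cite itself. This Python sketch follows the URL pattern in the example above; the function names cite_to_url and cite_to_anchor and the set of accepted cite spellings are assumptions for illustration, not part of our script.

```python
import re

# Base taken from the example link above.
BASE_URL = "https://www.law.cornell.edu/supremecourt/text"

# Accepts "467 US 883", "467 U.S. 883", and "467 U. S. 883".
US_REPORTS_CITE = re.compile(r"^\s*(\d+)\s+U\.?\s?S\.?\s+(\d+)\s*$")


def cite_to_url(cite: str) -> str:
    """Turn a US Reports cite into a URL built from its volume and page."""
    match = US_REPORTS_CITE.match(cite)
    if match is None:
        raise ValueError(f"not a US Reports cite: {cite!r}")
    volume, page = match.groups()
    return f"{BASE_URL}/{volume}/{page}"


def cite_to_anchor(cite: str) -> str:
    """Wrap the cite in an HTML anchor pointing at the URL for that cite."""
    return f'<A HREF="{cite_to_url(cite)}">{cite.strip()}</A>'


if __name__ == "__main__":
    print(cite_to_anchor("467 US 883"))
```

Docket numbers and parallel cites such as "S.Ct." are deliberately rejected, since only US Reports volume and page numbers are accepted as input.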



The nuts and bolts of PDF conversion

At the beginning of the term in October 1997, the Court began distributing its opinions in the Adobe Acrobat PDF format. This format has the advantage of preserving the original appearance of Court documents, but it has the disadvantage of being a somewhat difficult format to convert. We use a three-step process. The PDF file is converted to an intermediate, character-by-character ASCII format using Ghostscript. It is then condensed to a phrase-by-phrase intermediate format, and finally run over with a Perl script which converts the intermediate format to HTML, marks up important text features, and so on. We are willing to make these software tools available to others; contact us. Note, however, that the tools are very specifically designed to work with Supreme Court materials and are not a general purpose PDF-to-HTML conversion package. Also, they are primarily intended for filtering batches of files in real time; those who need to convert the opinions to another format from PDF for editing purposes would probably be better advised just to pull up our HTML version in an HTML-aware word processor (Word or WordPerfect, among others) and proceed in that way.
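The middle step, condensing character-by-character output into phrases, can be sketched roughly as follows. This Python example is an illustration only, not our tool: the Glyph and Phrase records, the max_gap threshold, and the assumption that characters arrive in reading order with exact baseline coordinates are all invented for the example, and the real intermediate formats differ.

```python
from typing import List, NamedTuple


class Glyph(NamedTuple):
    x: float      # left edge of the character
    y: float      # baseline of the character
    width: float  # horizontal advance of the character
    char: str


class Phrase(NamedTuple):
    x: float      # left edge of the first character
    y: float      # shared baseline
    text: str


def condense(glyphs: List[Glyph], max_gap: float = 1.0) -> List[Phrase]:
    """Merge consecutive glyphs on one baseline into phrases.

    A new phrase starts when the baseline changes or when the horizontal
    gap to the previous glyph exceeds max_gap.
    """
    phrases: List[Phrase] = []
    run: List[Glyph] = []

    def flush() -> None:
        if run:
            text = "".join(g.char for g in run)
            phrases.append(Phrase(run[0].x, run[0].y, text))
            run.clear()

    for glyph in glyphs:
        if run:
            last = run[-1]
            gap = glyph.x - (last.x + last.width)
            if glyph.y != last.y or gap > max_gap:
                flush()
        run.append(glyph)
    flush()
    return phrases


if __name__ == "__main__":
    sample = [
        Glyph(0, 100, 5, "H"), Glyph(5, 100, 5, "i"),
        Glyph(30, 100, 5, "J"),
        Glyph(0, 88, 5, "K"),
    ]
    for phrase in condense(sample):
        print(phrase)
```

Working at the phrase level keeps position information available for later markup while giving the final script far fewer records to examine.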

In writing the software we have noticed a number of problems with the PDF distribution as it appears in the Acrobat viewer. For some reason, certain special characters (notably open and close quotes, and some forms of em dash) don't appear in the viewer, or in its printed output. Mostly this is not a problem, though the lack of an open quote can sometimes make it difficult to know whether the Court is speaking in a given opinion, or if someone else is.