jan ekenberg

Zulu History Petrol

Some years ago, while searching the Web, I noticed an unusual and nonsensical page description recurring on a search results page. The search page displayed:

zulu history %7BPetrol Chlorine%7D silverchair tab silverchair.com sim Copter sim city sim...
It seemed peculiar, and when I requested the pages I ended up on student home pages and adult sites: generic stuff with no trace of the unusual word combination displayed on the search page. I then took a longer search string from one of the descriptions, repeated the search and went for the first result. The title did not reveal anything unusual and the document took a very long time to load, but after a while my browser indicated that the page was fully loaded. Black. Nothing. I was puzzled. I scrolled. It was a very large page. All black...

I did "select all" and lots of text appeared. Lots. This was the first time I encountered the Zulu History Petrol Document.

The Zulu History Petrol Document is a text of around 70,000 words. It was created for the Web, at a time when search engines were a little less sophisticated, to get search engine crawlers to index the text, so that anyone who searched for a word in the document would be pointed to pages indexed with all the words in it. A common tactic was then to put a link at the top of the page, or to redirect the traffic to where it was wanted. Sometimes the text was the same color as the background, and sometimes the text was removed from the page after the page had been indexed by the targeted search engine.
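The mechanism can be made concrete with a toy word index. What follows is only a minimal sketch under loud assumptions: the function name build_index, the example.org addresses and the page texts are all invented here for illustration, and no real search engine of the period is being reconstructed.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


# Hypothetical pages: one ordinary page, and one stuffed with unrelated words.
pages = {
    "http://example.org/zulu": "a short history of the zulu kingdom",
    "http://example.org/trap": "zulu history petrol chlorine silverchair sim city icons",
}

index = build_index(pages)
hits_zulu = sorted(index["zulu"])    # both pages answer a search for "zulu"
hits_icons = sorted(index["icons"])  # only the stuffed page answers "icons"
print(hits_zulu)
print(hits_icons)
```

An index this naive cannot tell a page about a word from a page that merely contains it, so the more words a page carries, the more searches lead to it.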

Today the Zulu History Petrol Document still exists on the Web, though its presence has diminished. The links I collected a few years ago no longer go to pages with the text. The strategy probably just doesn't work that well any more. Today a Google search for "zulu history %7Bpetrol" ("%7B" is the URL encoding of "{", whose ASCII code is 7B in hexadecimal; the brace is part of the text fragment {Petrol Chlorine}, a reference to a song by Silverchair) yields only five results.
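The "%7B" and "%7D" in the fragment are percent-encoded braces, which suggests the words were at some point copied out of URLs or logged query strings. A small sketch with Python's standard urllib.parse module shows the round trip; the variable names are mine, and the strings are taken from the search result quoted earlier.

```python
from urllib.parse import quote, unquote

fragment = "{Petrol Chlorine}"

# Hexadecimal ASCII code of the opening brace.
code = format(ord("{"), "X")

# Percent-encode everything that is not an unreserved character.
encoded = quote(fragment, safe="")

# Decode the text as it appeared on the search results page.
decoded = unquote("zulu history %7BPetrol Chlorine%7D")

print(code)
print(encoded)
print(decoded)
```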

The origin of the "Zulu History Petrol" document is obscure. No clear characteristic that stays consistent throughout the document is easy to detect. Obviously not authored in the old-fashioned sense, the document drifts in and out of styles and different kinds of order. An alphabetical structure runs through the whole document, but it is not strictly followed. Exactly how the ZHP Document was created or generated is unclear.

There is no one Zulu History Petrol Document, but several. For example, the "eccentrica" permutation (first retrieved from eccentrica.org) starts 40 words into the "common" ZHP Document and ends halfway through. The "German" permutation (retrieved from the now "dead" URL "mitglied.tripod.de/~orbi") starts like the "common" ZHP Document and contains all of it, but adds a strictly alphabetical addendum of 8,452 words.

The first 100 words in the "common" ("common" as in most frequent) Zulu History Petrol Document are:
"zulu history %7BPetrol Chlorine%7D silverchair tab silverchair.com sim Copter sim city sim city 2000 sim towaer simant *.pkerfan *.rbs *.wav *bob marley* *ruise
GEEZZ%21%21%21%21%21%21%21++ + + +webcralwer HAING MAC HAIRSTYLES HARD HARDCORE HARDCORE ANIMATION +HOW+MANY+PEOPLE+FALL+FOR+IT
+AND+TYPE+IN+THIS+STEAD+OF+WEBCRAWLER +YOU+PAY+FOR+THE+DOMAIN+NAME+JUST+FOR+THIS+LITTLE+JOKE
+SEND+ME+IF+YOU+GET+THIS+IM+CURIOUS..PIGS.SIN +and+smoking+or+austrlia .ico icons graphics buttons icp icq ics software cd ictus vigor id id games id software id
spoofer id4 .jpg latinas lattjo lajban laughlin laughlin afb laura Nyro laura ingalls laura ingells laura numeroff .mov File on Star Trek .mov Files .mov Files .mov NEAR 30
troopers .mov downloads .jpg .mov files .mreply.rc .ppt viewer .ra .rar .s3m .scr scr .wad"

The diversity is striking. More or less obscure music references are mixed with file extensions, and a couple of spellings of "Laura Ingalls" ("Little House on the Prairie") follow the confused all-caps-and-pluses "testimony" (from the author?). This, together with the Swedish fragment "lattjo lajban" and a whole lot of nonsense, misspelled words and unidentified character combinations, sets the tone for the document. The unwieldy style is typical of the rest of the text as well.

61,000 words into the document, in "chapter R" we can read the rhythmically and "beatniky" elegant:

"roman emperors roman empire romance romance books romance chat romance romne jim stienmen jim travis jimbo jimi hemdrix jimi hendrix jimi hendrix sound bytes jimmy jimmy buffett romy and michelle picture female picture and alien picture and rocket picture od the detroit lions picture room the park chat rooms the park chat site the park free chat."

This is just one of many beautiful passages in the document.

13,000 words later the document ends.

"tic tic tac toe tic-tic-tac tick"

A placid tribute to the passage of time.

A Dirty Crawlspace

Falling into the trap by "actively" crawling the "crawling trap" that the ZHP Document constitutes is an act of submission. By indexing a 500-link "crawlspace" (if you don't mind) one finds that the pocket of the Web the crawler explores is unusually homogeneous.

Information gathering by crawling large numbers of links most often results in a fairly heterogeneous collection. The reason is of course that Web pages with diverse listings of other pages are common on the Web (home page bookmarks, portals etc.).

The results gathered by crawling out from the ZHP Document are surprisingly uniform. The trap was set; we fell in. Five hundred URLs, and almost every single one is created and owned by "the only business that has been able to monetize the Web from day one": the adult industry. We see here a strategy of success.
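The kind of crawl described here can be sketched as a breadth-first walk that stops at a fixed number of pages and then tallies what it found. This is a toy under stated assumptions, not the crawler actually used: the link graph is an invented in-memory dictionary, the page names are hypothetical, and the "adult"/"other" labelling is a stand-in for reading each page's meta keywords.

```python
from collections import Counter, deque


def crawl(graph, start, limit=500):
    """Breadth-first walk over a link graph, collecting at most `limit` pages."""
    seen = [start]          # pages in the order they were discovered
    visited = {start}
    queue = deque([start])
    while queue and len(seen) < limit:
        page = queue.popleft()
        for link in graph.get(page, []):
            if link not in visited and len(seen) < limit:
                visited.add(link)
                seen.append(link)
                queue.append(link)
    return seen


# Hypothetical link graph standing in for the pages reached from the trap.
graph = {
    "trap": ["adult-a", "adult-b"],
    "adult-a": ["adult-b", "adult-c"],
    "adult-b": ["adult-a", "portal"],
    "adult-c": [],
    "portal": [],
}

found = crawl(graph, "trap")
# Tally everything reached from the starting page (the start itself excluded).
tally = Counter("adult" if name.startswith("adult") else "other" for name in found[1:])
print(found)
print(tally)
```

A heterogeneous pocket of the Web would spread the tally over many labels; a homogeneous one piles it onto a single label.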

An interesting discrepancy in the level of "creativity" can be noted in the comparison between domain names and meta keywords. The keywords are the expected ones (sometimes with amusing alternative spellings):

free porn sex hardcore nude teen amateur amatuer amature pics never pay naked (etc.)

But among the domain names (being easy to remember seems to matter less in this business than in, let's say, e-commerce) the inventiveness is remarkable. How about:

www.waanaadoous.com
www.handcuffs-bondage-whips-cock-ring-tie-up.com
www.adultrevenueservice.com
www.hardcoredigital.com

Just to mention a few. Some non-adult sites also ended up on the list. That the ubiquitous Yahoo.com "made the list" is not very surprising; that fbi.gov did, maybe more so.

Here is a 74,781 word long common version of the ZHP Document originally copied from a Colombian home page.
